Small Datum: Using sysbench to measure how Postgres performance changes over time, November 2025 edition

This has results for the sysbench benchmark on a small and big server for Postgres versions 12 through 18. Once again, Postgres is boring because I search for perf regressions and can't find any here. Results from MySQL are here and MySQL is not boring.

While I don't show the results here, I don't see regressions when comparing the latest point releases with their predecessors -- 13.22 vs 13.23, 14.19 vs 14.20, 15.14 vs 15.15, 16.10 vs 16.11, 17.6 vs 17.7 and 18.0 vs 18.1.

tl;dr

a few small regressions
many more small improvements
for write-heavy tests at high concurrency there are many large improvements starting in PG 17

Builds, configuration and hardware

I compiled Postgres from source for versions 12.22, 13.22, 13.23, 14.19, 14.20, 15.14, 15.15, 16.10, 16.11, 17.6, 17.7, 18.0 and 18.1.

I used two servers:

small

an ASUS ExpertCenter PN53 with AMD Ryzen 7735HS CPU, 32G of RAM, 8 cores with AMD SMT disabled, Ubuntu 24.04 and an NVMe device with ext4 and discard enabled.

big

an ax162s from Hetzner with an AMD EPYC 9454P 48-Core Processor with SMT disabled
2 Intel D7-P5520 NVMe storage devices with RAID 1 (3.8T each) using ext4
128G RAM
Ubuntu 22.04 running the non-HWE kernel (5.15.0-118-generic)

Configuration files for the small server

Configuration files are here for Postgres versions 12, 13, 14, 15, 16 and 17.
For Postgres 18 I used io_method=sync and the configuration file is here.

Configuration files for the big server

Configuration files are here for Postgres versions 12, 13, 14, 15, 16 and 17.
For Postgres 18 I used io_method=sync and the configuration file is here.

Benchmark

I used sysbench and my usage is explained here. I now run 32 of the 42 microbenchmarks listed in that blog post. Most test only one type of SQL statement. Benchmarks are run with the database cached by Postgres.

The read-heavy microbenchmarks are run for 600 seconds and the write-heavy for 900 seconds. On the small server the benchmark is run with 1 client and 1 table with 50M rows. On the big server the benchmark is run with 12 clients and 8 tables with 10M rows per table.
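
The two setups can be summarized in a short Python sketch. The class and field names are made up for illustration and are not part of sysbench or of my scripts; only the numbers come from the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SysbenchSetup:
    """Hypothetical container for the per-server benchmark parameters."""
    clients: int
    tables: int
    rows_per_table: int

    @property
    def total_rows(self) -> int:
        # Total rows loaded across all tables
        return self.tables * self.rows_per_table


# Numbers are from the text above; the names are illustrative only.
SETUPS = {
    "small": SysbenchSetup(clients=1, tables=1, rows_per_table=50_000_000),
    "big": SysbenchSetup(clients=12, tables=8, rows_per_table=10_000_000),
}

# Run time in seconds for each type of microbenchmark
RUN_SECONDS = {"read-heavy": 600, "write-heavy": 900}

for name, setup in SETUPS.items():
    print(f"{name}: clients={setup.clients} total_rows={setup.total_rows}")
```

So the small server has 50M rows in one table while the big server has 80M rows spread over 8 tables.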

The purpose is to search for regressions from new CPU overhead and mutex contention. The small server with low concurrency is used to find regressions from new CPU overhead. The big server with high concurrency is used to find regressions from both new CPU overhead and mutex contention.

Results

The microbenchmarks are split into 4 groups -- 1 for point queries, 2 for range queries, 1 for writes. For the range query microbenchmarks, part 1 has queries that don't do aggregation while part 2 has queries that do aggregation.

I provide charts below with relative QPS. The relative QPS is the following:

(QPS for some version) / (QPS for Postgres 12.22)

When the relative QPS is > 1 then some version is faster than Postgres 12.22. When it is < 1 then there might be a regression. When the relative QPS is 1.2 then some version is about 20% faster than Postgres 12.22.
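
As a minimal sketch of that arithmetic, the Python below computes relative QPS against a base version. The function names and the QPS values are made up for illustration; they are not results from this post and this is not the script used to produce the charts.

```python
def relative_qps(qps_by_version, base_version="12.22"):
    """Return QPS for each version divided by QPS for the base version."""
    base_qps = qps_by_version[base_version]
    if base_qps <= 0:
        raise ValueError("base QPS must be positive")
    return {version: qps / base_qps for version, qps in qps_by_version.items()}


def describe(rel):
    """Map a relative QPS value to the wording used in the text."""
    if rel > 1:
        return "faster than base"
    if rel < 1:
        return "possible regression"
    return "same as base"


# Made-up QPS values, only to show the arithmetic.
example_qps = {"12.22": 1000.0, "17.7": 1200.0, "18.1": 950.0}
rel = relative_qps(example_qps)
for version, value in rel.items():
    print(f"{version}: {value:.2f} ({describe(value)})")
```

A value below 1 is only a hint that there might be a regression. Small differences can also come from run-to-run variance.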

Values from iostat and vmstat divided by QPS are here for the small server and the big server. These can help to explain why something is faster or slower because they show how much HW is used per request, including CPU overhead per operation (cpu/o) and context switches per operation (cs/o). The context switch rate is often a proxy for mutex contention.
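
The per-operation metrics are simple divisions, roughly as in the Python sketch below. The function names and sample numbers are made up for illustration, and any scaling applied to the published cpu/o values is not shown here, so treat this as the general idea rather than the exact formula.

```python
def per_operation(rate_per_second, qps):
    """Divide a per-second rate from vmstat or iostat by QPS."""
    if qps <= 0:
        raise ValueError("QPS must be positive")
    return rate_per_second / qps


def ratio(new_value, old_value):
    """Express a new per-operation value as a fraction of the old one."""
    return new_value / old_value


# Made-up numbers, only to show the arithmetic.
# Old version: 200,000 context switches/s at 50,000 QPS
old_cs_per_op = per_operation(rate_per_second=200_000, qps=50_000)
# New version: 150,000 context switches/s at 75,000 QPS
new_cs_per_op = per_operation(rate_per_second=150_000, qps=75_000)

print(f"old cs/o={old_cs_per_op} new cs/o={new_cs_per_op}")
print(f"new / old = {ratio(new_cs_per_op, old_cs_per_op)}")
```

Dividing by QPS matters because a faster version does more work per second, so raw per-second rates from vmstat are not directly comparable across versions.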

The spreadsheet and charts are here and in some cases are easier to read than the charts below. Converting the Google Sheets charts to PNG files does the wrong thing for some of the test names listed at the bottom of the charts below.

Results: point queries

This is from the small server.

a large improvement arrived in Postgres 17 for the hot-points test
otherwise results have been stable from 12.22 through 18.1

This is from the big server.

a large improvement arrived in Postgres 17 for the hot-points test
otherwise results have been stable from 12.22 through 18.1

Results: range queries without aggregation

This is from the small server.

there are small improvements for the scan test
otherwise results have been stable from 12.22 through 18.1

This is from the big server.

there are small improvements for the scan test
otherwise results have been stable from 12.22 through 18.1

Results: range queries with aggregation

This is from the small server.

there are small improvements for a few tests
otherwise results have been stable from 12.22 through 18.1

This is from the big server.

there might be small regressions for a few tests
otherwise results have been stable from 12.22 through 18.1

Results: writes

This is from the small server.

there are small improvements for most tests
otherwise results have been stable from 12.22 through 18.1

This is from the big server.

there are large improvements for half of the tests
otherwise results have been stable from 12.22 through 18.1

From vmstat results for update-index the per-operation CPU overhead and context switch rate are much smaller starting in Postgres 17.7. The CPU overhead is about 70% of what it was in 16.11 and the context switch rate is about 50% of the rate for 16.11. Note that context switch rates are often a proxy for mutex contention.

Saturday, November 29, 2025
