Small Datum: Postgres 17rc1 vs sysbench on small & large servers: looking great

Monday, September 9, 2024

This post has benchmark results for Postgres 15.8, 16.4 and 17 (beta3, rc1) using sysbench on small and large servers. A recent result for Postgres 17 beta3 from a large server is here. The large server in this case is an ax162-s from Hetzner.

This work was done by Small Datum LLC.

tl;dr

17rc1 looks great - there are no big regressions and several big improvements
There might be small regressions (~2%) from Postgres 15 and 16 to 17 but this benchmark was not set up to diagnose that.

Builds, configuration and hardware

I compiled Postgres versions 15.8, 16.4, 17beta3 and 17rc1 from source using -O2 -fno-omit-frame-pointer.

The servers are:

small

The server is named v5 or Beelink SER7 here and has 8 AMD cores with SMT disabled, 16G of RAM and uses Ubuntu 22.04 and ext4 with 1 NVMe device.

large

An ax162-s from Hetzner with 48 AMD cores, 128G of RAM and AMD SMT disabled. It uses Ubuntu 22.04 and storage is ext4 using SW RAID 1 over 2 locally attached NVMe devices. More details on it are here. At list prices a similar server from Google Cloud costs 10X more than from Hetzner.

The configuration files for the large server are in the pg* subdirectories here with the name conf.diff.cx10a_c32r128.

The configuration files for the small server are in the pg* subdirectories here with the name conf.diff.cx10a_c8r32.

Benchmark

I used sysbench and my usage is explained here. There are 42 microbenchmarks and most test only 1 type of SQL statement. Benchmarks are run with the database cached by Postgres.

For the large server the tests run with 8 tables and 10M rows/table. There are 40 client threads, read-heavy microbenchmarks run for 180 seconds and write-heavy run for 300 seconds. The command line to run all tests was: bash r.sh 8 10000000 180 300 md2 1 1 40

For the small server the tests run with 1 table and 50M rows. There is 1 client thread, read-heavy microbenchmarks run for 180 seconds and write-heavy run for 300 seconds. The command line to run all tests was: bash r.sh 1 50000000 180 300 nvme0n1 1 1 1
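The two command lines can be lined up against the prose to see which argument is which. The sketch below is a guess from that comparison, not documentation for r.sh: the field names are mine, the fifth argument looks like a storage device name (md2 for the SW RAID, nvme0n1 for the single NVMe device), and the sixth and seventh arguments are not explained in this post so they are kept as opaque strings.

```python
from typing import NamedTuple


class RunArgs(NamedTuple):
    """Positional arguments to r.sh, with names inferred from the post (assumptions)."""
    tables: int
    rows_per_table: int
    read_secs: int       # run time for read-heavy microbenchmarks
    write_secs: int      # run time for write-heavy microbenchmarks
    device: str          # assumed to be a storage device name
    unexplained_6: str   # meaning not stated in the post
    unexplained_7: str   # meaning not stated in the post
    client_threads: int


def parse_run_args(command: str) -> RunArgs:
    """Split a 'bash r.sh ...' command line into labeled fields."""
    parts = command.split()
    args = parts[2:]  # drop "bash" and "r.sh"
    if len(args) != 8:
        raise ValueError(f"expected 8 arguments, got {len(args)}")
    return RunArgs(
        int(args[0]), int(args[1]), int(args[2]), int(args[3]),
        args[4], args[5], args[6], int(args[7]),
    )


large = parse_run_args("bash r.sh 8 10000000 180 300 md2 1 1 40")
small = parse_run_args("bash r.sh 1 50000000 180 300 nvme0n1 1 1 1")
```

With this mapping the large server run has 8 tables, 10M rows per table and 40 client threads, and the small server run has 1 table, 50M rows and 1 client thread, which matches the descriptions above.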

Results

For the results below I split the 42 microbenchmarks into 5 groups -- 2 for point queries, 2 for range queries, 1 for writes. For the range query microbenchmarks, part 1 has queries that don't do aggregation while part 2 has queries that do aggregation. The spreadsheet with all data is here.

Values from iostat and vmstat divided by QPS are here for the small server and the large server. This can help to explain why something is faster or slower because it shows how much HW is used per request.
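A minimal sketch of that normalization, in Python. The function name and the sample numbers are made up for illustration and this is not the script used to produce the linked results. It assumes you have a list of per-second samples for one counter (for example the cs column from vmstat, which is context switches per second) and the average QPS for the same interval.

```python
def per_query(samples_per_second, qps):
    """Average a per-second counter and divide by QPS to get usage per query."""
    if not samples_per_second:
        raise ValueError("need at least one sample")
    if qps <= 0:
        raise ValueError("QPS must be positive")
    average_rate = sum(samples_per_second) / len(samples_per_second)
    return average_rate / qps


# Hypothetical numbers: context switches per second sampled three times,
# during a test that averaged 100 queries per second.
cs_per_query = per_query([100.0, 200.0, 300.0], 100.0)
```

If one version has a lower value than another for the same microbenchmark, it uses less of that resource per query, which is the kind of difference the linked tables are meant to expose.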

The numbers in the spreadsheets are the relative QPS. The base case is Postgres 15.8. When the relative QPS is > 1 then $version is faster than Postgres 15.8. When it is 3.0 then $version is 3X faster than the base case.

The relative QPS is the following where $version is one of 16.4, 17beta3, 17rc1:

(QPS for $version) / (QPS for Postgres 15.8)
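As a worked example of the formula, here is a small Python sketch. The function name and the QPS numbers are made up for illustration; they are not measurements from these tests.

```python
def relative_qps(qps_by_version, base="15.8"):
    """Return (QPS for version) / (QPS for base) for every non-base version."""
    base_qps = qps_by_version[base]
    if base_qps <= 0:
        raise ValueError("base QPS must be positive")
    return {
        version: qps / base_qps
        for version, qps in qps_by_version.items()
        if version != base
    }


# Hypothetical QPS for one microbenchmark, keyed by Postgres version.
example = {"15.8": 1000.0, "16.4": 1010.0, "17beta3": 2950.0, "17rc1": 3000.0}
rel = relative_qps(example)
```

With these made-up inputs the relative QPS for 17rc1 is 3.0 and for 16.4 it is about 1.01, meaning 16.4 gets about 1% more QPS than 15.8.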

Results: charts

Notes on the charts

the y-axis shows the relative QPS
the y-axis starts at 0.80 to make it easier to see differences
in some cases the y-axis truncates the good outliers, cases where the relative QPS is greater than 1.5. I do this to improve readability for values near 1.0. Regardless, the improvements are nice.

Point queries, part 1

Small server

The relative QPS in hot-points exceeds 2.0 for Postgres 17 (beta3 & rc1). The y-axis truncates that excellent result and the result is explained by a reduction in the CPU overhead per-query (see cpu/o here).
Otherwise, 17rc1 gets between 5% less and 2% more QPS than Postgres 15.8.

Large server

The relative QPS in hot-points is almost 3.0 for Postgres 17 (beta3 and rc1). The y-axis truncates that excellent result. The CPU overhead per-query is greatly reduced (see cpu/o here).
Otherwise, 17rc1 gets between 3% less and 6% more QPS than Postgres 15.8.

Point queries, part 2

Small server

Postgres 17rc1 has similar QPS to 15.8 and 16.4.

Large server

Postgres 17rc1 has similar QPS to 15.8 and might be ~2% slower than 16.4.

Range queries, part 1

Small server

While it looks like there is a regression for scan performance, the scan microbenchmark seems to have more variance than I can explain so I ignore that for now.
Postgres 17rc1 has similar QPS to 15.8 and 16.4.