This has performance results for Postgres 18 beta3, beta2, beta1 and 17.6 using the sysbench benchmark and a small server. The working set is cached and the benchmark is run with low concurrency (1 connection). The goal is to search for CPU regressions. This work was done by Small Datum LLC and not sponsored
tl;dr
- There might be small regressions (~2%) for several range queries that don't do aggregation. This is similar to what I reported for 18 beta1.
- Vacuum continues to be a problem for me and I had to repeat the benchmark a few times to get a stable result. It appears to be a big source of non-deterministic behavior leading to false alarms for CPU regressions in read-heavy tests that run after vacuum. In some ways, RocksDB compaction causes similar problems. Fortunately, InnoDB MVCC GC (purge) does not cause such problems.
Builds, configuration and hardware
I compiled Postgres versions 17.6, 18 beta1, 18 beta2 and 18 beta3 from source.
The server is a Beelink SER7 with a Ryzen 7 7840HS CPU, 32G of RAM, 8 cores with AMD SMT disabled, Ubuntu 24.04 and an NVMe devices with discard enabled and ext4 for the database.
The config file for Postgres 17.6 is x10a_c8r32.
The config files for Postgres 18 are:
- x10b_c8r32 is functionally the same as x10a_c8r32 but adds io_method=sync
- x10b1_c8r32 starts with x10b_c8r32 and adds vacuum_max_eager_freeze_failure_rate =0
- x10b2_c8r32 starts with x10b_c8r32 and adds vacuum_max_eager_freeze_failure_rate =0.99
Benchmark
I used sysbench and my usage is explained here. To save time I only run 32 of the 42 microbenchmarks and most test only 1 type of SQL statement. Benchmarks are run with the database cached by Postgres.
The tests are run using 1 table with 50M rows. The read-heavy microbenchmarks run for 600 seconds and the write-heavy for 900 seconds.
Results
The microbenchmarks are split into 4 groups -- 1 for point queries, 2 for range queries, 1 for writes. For the range query microbenchmarks, part 1 has queries that don't do aggregation while part 2 has queries that do aggregation.
I provide charts below with relative QPS. The relative QPS is the following:
(QPS for some version) / (QPS for Postgres 17.6)
When the relative QPS is > 1 then some version is faster than PG 17.6. When it is < 1 then there might be a regression. Values from iostat and vmstat divided by QPS are also provided here. These can help to explain why something is faster or slower because it shows how much HW is used per request.
The numbers highlighted in yellow below might be from a small regression for range queries that don't do aggregation. But note that this does reproduce for the full table scan microbenchmark (scan). I am not certain it is a regression as this might be from non-deterministic CPU overheads for read-heavy workloads that are run after vacuum. I hope to look at CPU flamegraphs soon.
The numbers highlighted in yellow below might be from a small regression for range queries that don't do aggregation. But note that this does reproduce for the full table scan microbenchmark (scan). I am not certain it is a regression as this might be from non-deterministic CPU overheads for read-heavy workloads that are run after vacuum. I hope to look at CPU flamegraphs soon.
- the mapping from microbenchmark name to Lua script is here
- the range query without aggregation microbenchmarks use oltp_range_covered.lua with various flags set and the SQL statements it uses are here. All of these return 100 rows.
- the scan microbenchmark uses oltp_scan.lua which is a SELECT with a WHERE clause that filters all rows (empty result set)
Relative to: x.pg176_o2nofp.x10a_c8r32.pk1
col-1 : x.pg18beta1_o2nofp.x10b_c8r32.pk1
col-2 : x.pg18beta2_o2nofp.x10b_c8r32.pk1
col-3 : x.pg18beta3_o2nofp.x10b_c8r32.pk1
col-4 : x.pg18beta3_o2nofp.x10b1_c8r32.pk1
col-5 : x.pg18beta3_o2nofp.x10b2_c8r32.pk1
col-1 col-2 col-3 col-4 col-5 -> point queries
1.00 1.00 0.98 0.99 0.99 hot-points_range=100
1.00 1.01 1.00 1.00 0.99 point-query_range=100
1.00 1.02 1.01 1.01 1.01 points-covered-pk
1.00 1.00 1.00 1.00 1.00 points-covered-si
1.01 1.01 1.00 1.00 1.00 points-notcovered-pk
1.01 1.00 1.00 1.00 1.00 points-notcovered-si
0.99 1.00 0.99 1.00 1.00 random-points_range=1000
1.01 1.00 1.00 1.00 1.00 random-points_range=100
1.01 1.01 1.00 1.00 0.99 random-points_range=10
col-1 col-2 col-3 col-4 col-5 -> range queries w/o agg
0.98 0.99 0.97 0.98 0.96 range-covered-pk_range=100
0.98 0.99 0.96 0.98 0.97 range-covered-si_range=100
0.98 0.98 0.98 0.97 0.98 range-notcovered-pk
0.99 0.99 0.98 0.98 0.98 range-notcovered-si
1.01 1.02 1.00 1.00 1.00 scan
col-1 col-2 col-3 col-4 col-5 -> range queries with agg
1.02 1.01 1.02 1.01 0.98 read-only-count_range=1000
0.98 1.01 1.01 1.00 1.03 read-only-distinct
0.99 0.99 0.99 0.99 0.99 read-only-order_range=1000
1.00 1.00 1.01 1.00 1.01 read-only_range=10000
0.99 0.99 0.99 0.99 0.99 read-only_range=100
0.99 0.99 0.99 0.98 0.99 read-only_range=10
1.01 1.00 1.00 1.00 1.01 read-only-simple
1.01 1.00 1.01 1.00 1.00 read-only-sum_range=1000
col-1 col-2 col-3 col-4 col-5 -> writes
0.99 1.00 0.98 0.98 0.98 delete_range=100
0.99 0.98 0.98 1.00 0.98 insert_range=100
0.99 0.99 0.99 0.98 0.99 read-write_range=100
0.98 0.99 0.99 0.98 0.99 read-write_range=10
1.00 0.99 0.98 0.97 0.99 update-index_range=100
1.01 1.00 0.99 1.01 1.00 update-inlist_range=100
1.00 1.00 0.99 0.96 0.99 update-nonindex_range=100
1.01 1.01 0.99 0.97 0.99 update-one_range=100
1.00 1.00 0.99 0.98 0.99 update-zipf_range=100
1.00 0.99 0.98 0.98 1.00 write-only_range=10000
No comments:
Post a Comment