Monday, July 8, 2024

Postgres 17beta2 vs sysbench: looking good

This post has benchmark results for Postgres 17beta2 using sysbench on a medium server. By small, medium or large server I mean fewer than 10 cores for small, 10 to 19 cores for medium and 20+ cores for large.

Recent results for Postgres 17beta1 are here.

tl;dr

  • 17beta2 looks good
  • Write microbenchmarks are much faster in 17beta1 and 17beta2 vs 16.3
  • Read microbenchmarks have similar performance between 16.3, 17beta1 and 17beta2
Builds, configuration and hardware

I compiled Postgres versions 16.3, 17beta1 and 17beta2 from source.

The server is a c2d-highcpu-32 instance type on GCP (c2d high-CPU) with 32 vCPU, 64G RAM and SMT disabled so there are 16 cores. It uses Ubuntu 22.04 and storage is ext4 (data=writeback) using SW RAID 0 over 2 locally attached NVMe devices.

The configuration file is here.

Benchmark

I used sysbench and my usage is explained here. There are 42 microbenchmarks and most test only 1 type of SQL statement. Benchmarks are run with two workloads:
  • cached - database is cached by Postgres, 8 tables and 10M rows/table
  • IO-bound - database is larger than memory, 8 tables and 200M rows/table
In both cases, there are 12 client threads, read-heavy microbenchmarks run for 300 seconds and write-heavy run for 600 seconds.

The command lines for my helper scripts were:
# cached -> 10M rows/table
bash r.sh 8 10000000 300 600 md0 1 1 12
# IO-bound -> 200M rows/table
bash r.sh 8 200000000 300 600 md0 1 1 12
Results

For the results below I split the 42 microbenchmarks into 5 groups -- 2 for point queries, 2 for range queries, 1 for writes. For the range query microbenchmarks, part 1 has queries that don't do aggregation while part 2 has queries that do aggregation. The spreadsheet with all data is here. For each microbenchmark group there is a table with summary statistics. 

The numbers in the spreadsheets are the relative QPS for Postgres 17beta1 and 17beta2, which are:
(QPS for 17beta1) / (QPS for 16.3)

(QPS for 17beta2) / (QPS for 16.3) 

When the relative QPS is > 1 then 17beta1 or 17beta2 is faster than 16.3.

I use summary statistics per microbenchmark group rather than charts to save time. The numbers are the relative QPS. I focus on the median value per microbenchmark group.
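As a rough sketch of how the relative QPS and the per-group summary statistics described above could be computed: the function names, the microbenchmark names (write-a, write-b, write-c) and the QPS values below are hypothetical and only for illustration. They are not taken from the spreadsheet or from my helper scripts.

```python
from statistics import mean, median


def relative_qps(qps_new, qps_base):
    """Return (QPS for new version) / (QPS for base version) per microbenchmark."""
    return {name: qps_new[name] / qps_base[name] for name in qps_base}


def summarize(ratios):
    """Return min, max, avg and median of the relative QPS values, rounded to 2 digits."""
    values = list(ratios.values())
    return {
        "min": round(min(values), 2),
        "max": round(max(values), 2),
        "avg": round(mean(values), 2),
        "median": round(median(values), 2),
    }


# Hypothetical QPS values for one microbenchmark group (not measured results)
qps_163 = {"write-a": 1000.0, "write-b": 2000.0, "write-c": 4000.0}
qps_17beta2 = {"write-a": 1100.0, "write-b": 2000.0, "write-c": 5000.0}

ratios = relative_qps(qps_17beta2, qps_163)
summary = summarize(ratios)
print(summary)
```

With these made-up inputs the median is 1.1, which would mean the typical microbenchmark in the group gets about 10% more QPS than the base version.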

Cached
  • Many of the write microbenchmarks are significantly faster in 17beta1 and 17beta2
  • The min relative QPS in the range-1 group is 0.97 for 17beta1 and 0.93 for 17beta2, and that comes from the scan microbenchmark. I first thought this was a regression, but it might be variance from my test methods, Postgres and my test HW. I spent a lot of time trying to reproduce it but mostly saw variance.
17beta1   min    max    avg    median
point-1   0.97   1.80   1.06   0.98
point-2   0.98   0.99   0.98   0.98
range-1   0.97   1.02   1.00   1.01
range-2   0.98   1.02   1.00   1.00
writes    1.00   1.50   1.15   1.13

17beta2   min    max    avg    median
point-1   0.97   1.81   1.06   0.99
point-2   0.98   0.99   0.99   0.99
range-1   0.93   1.01   0.98   0.98
range-2   0.98   1.01   1.00   1.01
writes    0.99   1.48   1.15   1.11

IO-bound
  • The hot-points microbenchmark shows a big improvement here, as it does for the cached workload. The hot-points workload fits in the Postgres buffer pool so the result here matches the result above.
  • The min relative QPS in the range-1 group is 0.95 for 17beta2 and that comes from the scan microbenchmark. See the comments above.
17beta1   min    max    avg    median
point-1   0.96   2.14   1.09   1.00
point-2   1.00   1.01   1.00   1.00
range-1   0.99   1.02   1.00   1.00
range-2   1.00   1.01   1.00   1.00
writes    1.00   1.10   1.02   1.01

17beta2   min    max    avg    median
point-1   0.94   2.16   1.09   1.00
point-2   1.00   1.00   1.00   1.00
range-1   0.95   1.01   1.00   1.00
range-2   1.00   1.01   1.00   1.00
writes    1.00   1.10   1.02   1.00
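
One way to scan the tables above for possible regressions is to flag every group whose min relative QPS falls below a cutoff. The sketch below uses the min values copied from the four tables. The 0.96 cutoff and the function name are assumptions for illustration, not a rule I use, and a flagged group might only reflect variance.

```python
# Min relative QPS per microbenchmark group, copied from the summary tables above
MIN_RELATIVE_QPS = {
    ("cached", "17beta1"): {"point-1": 0.97, "point-2": 0.98, "range-1": 0.97, "range-2": 0.98, "writes": 1.00},
    ("cached", "17beta2"): {"point-1": 0.97, "point-2": 0.98, "range-1": 0.93, "range-2": 0.98, "writes": 0.99},
    ("IO-bound", "17beta1"): {"point-1": 0.96, "point-2": 1.00, "range-1": 0.99, "range-2": 1.00, "writes": 1.00},
    ("IO-bound", "17beta2"): {"point-1": 0.94, "point-2": 1.00, "range-1": 0.95, "range-2": 1.00, "writes": 1.00},
}

THRESHOLD = 0.96  # hypothetical cutoff, chosen only for this example


def flag_low_groups(mins, threshold):
    """Return sorted (workload, version, group, min) tuples with min below threshold."""
    flagged = []
    for (workload, version), groups in mins.items():
        for group, value in groups.items():
            if value < threshold:
                flagged.append((workload, version, group, value))
    return sorted(flagged)


flagged = flag_low_groups(MIN_RELATIVE_QPS, THRESHOLD)
for workload, version, group, value in flagged:
    print(f"{workload} {version} {group}: min relative QPS {value:.2f}")
```

With this cutoff only 17beta2 groups are flagged: range-1 for both workloads and point-1 for IO-bound.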

