Monday, April 24, 2023

Revisiting perf regressions in Postgres, a larger server and sysbench

The previous results I shared for Postgres and sysbench on a 30-core server are not to be trusted. I used huge pages with a too-big value for vm.nr_hugepages, which caused some odd behavior. So I reduced the Postgres buffer pool to 150G, set vm.nr_hugepages to 85000 and repeated the benchmarks.
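For reference, with the default 2MB huge page size a 150G buffer pool needs 76800 huge pages, so 85000 leaves some headroom for other shared memory. A quick sketch of the arithmetic (the 2MB page size is an assumption; confirm via Hugepagesize in /proc/meminfo):

```shell
# Sketch: size vm.nr_hugepages for a 150G buffer pool with 2MB huge pages.
buffer_pool_mb=$((150 * 1024))           # 150G expressed in MB
hugepage_mb=2                            # typical x86 huge page size
pages_needed=$((buffer_pool_mb / hugepage_mb))
echo "pages needed: $pages_needed"       # 76800; vm.nr_hugepages=85000 adds headroom
```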

Refer to the previous post for more details. The summary is that I ran in-memory sysbench for many versions of Postgres on a c2-standard-60 server. 

The benchmark tries to answer two questions:

  • How does perf change from Postgres version 11 to version 15?
  • What is the impact from compiler optimizations?

tl;dr
  • The o3_native_lto build has the best performance, as it did in the previous post. It provides ~5% more throughput than the def build.
  • For perf changes from 11.19 to 15.2
    • Point queries get about 3% more QPS in 15.2
    • Range queries get about 8% more QPS in 15.2
    • Writes get about 4% less QPS in 15.2
    • The range query and write microbenchmarks have more variance. See the Results for all versions section for more details.

Benchmark

A description of how I run sysbench is here. The sysbench microbenchmarks were run with 20 clients and 600 seconds per microbenchmark, using 4 tables with 50M rows per table. The test database was <= 64G and fits in memory.
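The exact invocation comes from my helper scripts (linked above), but a minimal sketch of one microbenchmark with these parameters might look like the following. The Lua script name, host and user are placeholders:

```shell
# Sketch: one sysbench microbenchmark with the parameters used here.
# Table count, rows, clients and runtime match the post; the rest are placeholders.
sysbench oltp_point_select \
    --db-driver=pgsql --pgsql-host=127.0.0.1 --pgsql-user=postgres \
    --tables=4 --table-size=50000000 \
    --threads=20 --time=600 \
    prepare   # then replace "prepare" with "run" to execute the workload
```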

The workload was configured to fit in the 150G database buffer pool. The config file, conf.diff.cx7a_gcp_c2s60, is here for Postgres 11, 12, 13, 14, 15.

The previous post explains the builds that I used, where each build uses different compiler optimizations. This post has results for Postgres 11.19, 12.14, 13.10, 14.7, 15.1 and 15.2. For 15.1 I used all of the builds (def, o2_nofp, o3, o3_native, o3_native_lto). For the other versions I only used the o3_native_lto build to save time.
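The exact flags are documented in the previous post; a sketch of what the o3_native_lto build likely looks like, inferred from the build name (-O3, -march=native, link-time optimization) with an assumed install prefix:

```shell
# Sketch: building Postgres in the style of the o3_native_lto build.
# Flags are an assumption based on the build name; see the previous post for the real ones.
./configure --prefix=$HOME/pg152_o3_native_lto \
    CFLAGS="-O3 -march=native -flto" LDFLAGS="-flto"
make -j$(nproc) && make install
```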

I use sysbench to run 42 microbenchmarks and each microbenchmark is put in one of three groups based on the dominant operation: point query, range query, writes.

Results for all versions

The result spreadsheet is here. See the pgall.redo tab. The chart doesn't show the full name for each benchmark. Consult the spreadsheet.

The graphs use relative throughput, which is (throughput for a given version) / (throughput for the base case). When the relative throughput is > 1 then my results are better than the base case. When it is 1.10 then my results are ~10% better than the base case. The base case is Postgres 11.19.
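As a concrete example of the metric (the QPS numbers below are invented for illustration, not taken from the spreadsheet):

```shell
# Sketch: relative throughput = QPS for the tested version / QPS for the base case.
base_qps=30000      # base case, e.g. Postgres 11.19
test_qps=33000      # tested version, e.g. Postgres 15.2
awk -v b="$base_qps" -v t="$test_qps" \
    'BEGIN { printf "relative throughput: %.2f\n", t / b }'
```

A result of 1.10 means the tested version is ~10% better than the base case.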

The microbenchmarks for range queries and writes have more variance than for point queries based on the relative throughput per microbenchmark. The bullet points list the relative throughput for version 15.2 versus 11.19. The microbenchmark names on the graphs below are cut off so refer to the spreadsheet:
  • range queries
    • range-covered-pk_range=100 - relative throughput is 1.26
    • range-covered-si_range=100 - relative throughput is 1.27
    • read-only.pre_range=10 - relative throughput is 1.14
    • read-only_range=10 - relative throughput is 1.17
    • scan_range=100 - relative throughput is 1.12
  • writes
    • read-write_range=100 - relative throughput is 0.88
    • update-inlist_range=100 - relative throughput is 1.24
    • update-one_range=100 - relative throughput is 0.80
    • write-only_range=100 - relative throughput is 0.89
Summary statistics:

(base: pg 11.19)  12.14   13.10   14.7    15.1    15.2
Point: avg        1.01    1.01    1.02    1.03    1.03
Point: median     1.00    1.01    1.03    1.04    1.04
Point: min        0.98    0.97    0.96    0.98    0.99
Point: max        1.05    1.05    1.07    1.10    1.11
Point: stddev     0.017   0.017   0.031   0.033   0.033
Range: avg        1.03    1.04    1.05    1.08    1.08
Range: median     1.00    1.02    1.02    1.03    1.03
Range: min        0.95    0.90    0.99    1.00    1.00
Range: max        1.20    1.25    1.24    1.28    1.27
Range: stddev     0.075   0.094   0.083   0.094   0.093
Write: avg        0.92    0.92    0.97    0.97    0.96
Write: median     0.92    0.91    0.95    0.96    0.95
Write: min        0.70    0.86    0.79    0.80    0.77
Write: max        1.07    1.00    1.24    1.24    1.22
Write: stddev     0.101   0.045   0.124   0.121   0.119
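The summary stats per group are computed over the per-microbenchmark relative throughput values in the spreadsheet. A sketch using four of the write-group values quoted above (the spreadsheet has more microbenchmarks per group, so these numbers won't match the table):

```shell
# Sketch: avg and (population) stddev over per-microbenchmark relative throughput.
# The four values are the write-group examples quoted above, not the full set.
printf '0.88\n1.24\n0.80\n0.89\n' | awk '
  { v[NR] = $1; sum += $1 }
  END {
    avg = sum / NR
    for (i = 1; i <= NR; i++) ss += (v[i] - avg) ^ 2
    printf "avg=%.2f stddev=%.2f\n", avg, sqrt(ss / NR)
  }'
```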


Results for Postgres 15.1

The result spreadsheet is here. See the pg151.redo tab. The chart doesn't show the full name for each benchmark. Consult the spreadsheet.

The graphs use relative throughput, which is (throughput for a given build) / (throughput for the base case). When the relative throughput is > 1 then my results are better than the base case. When it is 1.10 then my results are ~10% better than the base case. The base case is the def build that uses -O2.

Summary statistics:

(base: pg151 def)  o2_nofp  o3      o3_native  o3_native_lto
Point: avg         1.00     1.01    1.02       1.05
Point: median      1.00     1.01    1.02       1.05
Point: min         0.98     1.00    1.01       1.03
Point: max         1.01     1.04    1.04       1.06
Point: stddev      0.008    0.010   0.007      0.008
Range: avg         1.00     1.02    1.03       1.07
Range: median      0.99     1.02    1.03       1.07
Range: min         0.99     1.00    1.00       1.02
Range: max         1.02     1.03    1.05       1.11
Range: stddev      0.010    0.010   0.015      0.030
Write: avg         1.00     1.02    1.01       1.03
Write: median      1.00     1.01    1.01       1.02
Write: min         0.96     0.98    0.99       1.00
Write: max         1.03     1.20    1.02       1.08
Write: stddev      0.021    0.064   0.011      0.021



