I shared benchmark results for RocksDB a few weeks ago and there was a suggestion for me to repeat tests using different (older) values for format_version. Then while replacing a failed SSD, I also updated the OS and changed a few kernel-related config options. Thus, I ended up repeating all tests.
This post has results from a small server with leveled compaction. Results from a large server and from universal compaction are in progress.
tl;dr - on a small server with a low concurrency workload
- older values of format_version (2 thru 5) don't impact QPS
- auto hyperclock cache makes read-heavy tests up to 15% faster
- for a cached database
- QPS drops by 5% to 15% from RocksDB 6.0.2 to 9.7.2
- QPS has been stable since 8.0
- for an IO-bound database with buffered IO
- bug 12038 hurts QPS for overwrite (will be fixed soon in 9.7)
- QPS for fillseq has been stable
- QPS for read-heavy tests is 15% to 20% better in RocksDB 9.7.2 vs 6.0.2
- for an IO-bound database with O_DIRECT
- QPS for fillseq is ~11% less in 9.7.2 vs 6.0.2 but has been stable since 7.0. My vague memory is that the issue is new CPU overhead from better error checking.
- QPS for overwrite is stable
- QPS for read-heavy tests is 16% to 38% better in RocksDB 9.7.2 vs 6.0.2
Hardware
The small server is named SER7 and is a Beelink SER7 7840HS (see here) with an AMD Ryzen 7 7840HS CPU (8 cores, SMT disabled) running Ubuntu 22.04. Storage is one NVMe device using ext4 with data=writeback.
The storage device has max_hw_sectors_kb and max_sectors_kb set to 128 (KB). This is relevant for bug 12038, which will soon be fixed in a 9.7 patch release.
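These limits can be read from sysfs. A minimal sketch, assuming the device is nvme0n1 (the device name is an assumption):

# Per-device IO request size limits, in KB (device name is an assumption)
cat /sys/block/nvme0n1/queue/max_hw_sectors_kb
cat /sys/block/nvme0n1/queue/max_sectors_kb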
Builds
I compiled db_bench from source on all servers. I used versions:
- 6.x - 6.0.2, 6.10.4, 6.20.4, 6.29.5
- 7.x - 7.0.4, 7.3.2, 7.6.0, 7.10.2
- 8.x - 8.0.0, 8.3.3, 8.6.7, 8.9.2, 8.11.4
- 9.x - 9.0.1, 9.1.2, 9.2.2, 9.3.2, 9.4.1, 9.5.2, 9.6.1 and 9.7.2 at git sha b5cde68b8a
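For reference, a minimal sketch of one way to compile db_bench from a release tag. The tag and -j value are examples, this is not necessarily the exact build command I used, and for 9.7.2 I built at the git sha listed above rather than a tag:

# Build an optimized db_bench from a RocksDB source tree
git clone https://github.com/facebook/rocksdb.git
cd rocksdb
git checkout v6.29.5    # example tag
make DEBUG_LEVEL=0 -j8 db_bench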
Benchmark
All tests used the default value for compaction_readahead_size. For all versions tested I used the default values for the block cache (LRU) and format_version. For 9.6.1 I repeated tests using the hyperclock cache (default is LRU) and format_version =2, =3, =4 and =5 (default is =6).
I used my fork of the RocksDB benchmark scripts that are wrappers to run db_bench. These run db_bench tests in a special sequence -- load in key order, read-only, do some overwrites, read-write and then write-only. The benchmark was run using 1 thread for the small server and 8 threads for the large server. How I do benchmarks for RocksDB is explained here and here. A rough sketch of the underlying db_bench invocations follows the list of test names below. The command line to run the tests is:
# Small server, SER7: use 1 thread, 20M KV pairs for cached, 400M for IO-bound
bash x3.sh 1 no 1800 c8r32 20000000 400000000 byrx iobuf iodir

The tests on the charts are named as:
- fillseq -- load in key order with the WAL disabled
- revrangeww -- reverse range while writing, do short reverse range scans as fast as possible while another thread does writes (Put) at a fixed rate
- fwdrangeww -- like revrangeww except do short forward range scans
- readww -- like revrangeww except do point queries
- overwrite -- do overwrites (Put) as fast as possible
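The wrapper scripts eventually invoke db_bench once per test. A rough sketch of what a few of those invocations look like for a cached run; the flags are real db_bench options but the values and the database path here are illustrative, not the exact settings my scripts use:

# Hypothetical sketch of part of the db_bench test sequence (values are examples)
./db_bench --benchmarks=fillseq --num=20000000 --threads=1 --disable_wal=true --db=/data/m/rx
./db_bench --benchmarks=readwhilewriting --use_existing_db=1 --num=20000000 --threads=1 \
    --benchmark_write_rate_limit=2097152 --db=/data/m/rx
./db_bench --benchmarks=overwrite --use_existing_db=1 --num=20000000 --threads=1 --db=/data/m/rx

The readwhilewriting step uses --benchmark_write_rate_limit so the background writer runs at a fixed rate, which matches how the *ww tests are described above.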
Workloads
There are three workloads, all of which use one client (thread):
- byrx - the database is cached by RocksDB
- iobuf - the database is larger than memory and RocksDB uses buffered IO
- iodir - the database is larger than memory and RocksDB uses O_DIRECT
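The difference between iobuf and iodir is whether RocksDB is told to use O_DIRECT. A minimal sketch of the relevant db_bench flags; the flags are real options but the exact combination my scripts set is an assumption here:

# iobuf -- buffered IO (the defaults)
./db_bench ... --use_direct_reads=false --use_direct_io_for_flush_and_compaction=false
# iodir -- O_DIRECT for user reads and for flush/compaction IO
./db_bench ... --use_direct_reads=true --use_direct_io_for_flush_and_compaction=true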
A spreadsheet with all results is here and performance summaries with more details are linked below:
- byrx - all versions and 9.6 variations
- iobuf - all versions
- iodir - all versions
Relative QPS
The numbers in the spreadsheet and on the y-axis in the charts that follow are the relative QPS which is (QPS for $me) / (QPS for $base). When the value is greater than 1.0 then $me is faster than $base. When it is less than 1.0 then $base is faster (perf regression!).
The base version is RocksDB 6.0.2 for the all versions tests and 9.6.1 with my standard configuration for the 9.6 variations tests.
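As a worked example, if $base gets 100,000 QPS and $me gets 85,000 QPS then the relative QPS is 0.85, a 15% regression. A one-liner to compute it (the QPS values are made up):

# relative QPS = (QPS for $me) / (QPS for $base)
awk 'BEGIN { printf "%.2f\n", 85000 / 100000 }'   # prints 0.85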
Results: byrx with 9.6 variations
The byrx tests use a cached database. The performance summary is here. This has results for RocksDB 9.6.1 using my standard configuration and the variations are:
- fv2 - uses format_version=2 instead of the default (=6)
- fv3 - uses format_version=3
- fv4 - uses format_version=4
- fv5 - uses format_version=5
- ahcc - uses auto_hyper_clock_cache instead of the default (LRU)
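In db_bench terms these variations map to flags like the following. This is a sketch: --format_version and --cache_type are real db_bench options in recent releases, but the lines below are not copied from my scripts:

# fv2 thru fv5 -- change the SST format version (default is 6 in 9.6)
./db_bench ... --format_version=2
# ahcc -- use the auto hyperclock cache instead of LRU
./db_bench ... --cache_type=auto_hyper_clock_cache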
This chart shows the relative QPS for RocksDB 9.6.1 with a given configuration relative to 9.6.1 with my standard configuration. The y-axis doesn't start at 0 to improve readability.
Summary:
- Using different values of format_version doesn't have a large impact here
- Using auto hyperclock instead of LRU improves read-heavy QPS by up to 15%
Results: byrx with all versions
The byrx tests use a cached database. The performance summary is here.
This chart shows the relative QPS for a given version relative to RocksDB 6.0.2. The y-axis doesn't start at 0 to improve readability.
Summary:
- QPS drops by 5% to 15% from RocksDB 6.0.2 to 9.7.2
- Performance has been stable since 8.0
- For overwrite the excellent result in RocksDB 6.0.2 comes at the cost of bad write stalls (see pmax here)
Results: iobuf with all versions
The iobuf tests use a database larger than memory with buffered IO. The performance summary is here.
This chart shows the relative QPS for a given version relative to RocksDB 6.0.2. The y-axis doesn't start at 0 to improve readability.
Summary:
- bug 12038 explains the regression for overwrite (fixed soon in 9.7)
- QPS for fillseq has been stable
- QPS for read-heavy tests is 15% to 20% better in RocksDB 9.7.2 vs 6.0.2
Results: iodir with all versions
The iodir tests use a database larger than memory with O_DIRECT. The performance summary is here.
This chart shows the relative QPS for a given version relative to RocksDB 6.0.2. The y-axis doesn't start at 0 to improve readability.
Summary:
- QPS for fillseq is ~11% less in 9.7.2 vs 6.0.2 but has been stable since 7.0. My vague memory is that the issue is new CPU overhead from better error checking.
- QPS for overwrite is stable
- QPS for read-heavy tests is 16% to 38% better in RocksDB 9.7.2 vs 6.0.2