This post has results for universal compaction from the same large server for which I recently shared leveled compaction results. The results are boring (no large regressions) but a bit more exciting than the leveled compaction results because there is more variance. A somewhat educated guess is that variance is more likely with universal compaction.
tl;dr
- there are some small regressions for cached workloads (see byrx below)
- there are some small to medium improvements for IO-bound workloads (see iodir and iobuf)
- modern RocksDB would look better were I to use the Hyper Clock block cache, but I don't use it here so that similar code is tested across all versions
Hardware
The server is an ax162-s from Hetzner with an AMD EPYC 9454P processor, 48 cores, AMD SMT disabled and 128G RAM. The OS is Ubuntu 22.04. Storage is 2 NVMe devices with SW RAID 1 and ext4.
Builds
I compiled db_bench from source for each version. I used versions:
- 6.x - 6.0.2, 6.10.4, 6.20.4, 6.29.5
- 7.x - 7.0.4, 7.3.2, 7.6.0, 7.10.2
- 8.x - 8.0.0, 8.3.3, 8.6.7, 8.9.2, 8.11.4
- 9.x - 9.0.1, 9.1.2, 9.2.2, 9.3.2, 9.4.1, 9.5.2, 9.6.1 and 9.7.3
Benchmark
All tests used the default value for compaction_readahead_size and the block cache (LRU).
I used my fork of the RocksDB benchmark scripts, which are wrappers that run db_bench. These run db_bench tests in a special sequence -- load in key order, read-only, do some overwrites, read-write and then write-only. The benchmark was run using 40 threads.
How I do benchmarks for RocksDB is explained here and here. The command line to run the tests is: bash x3.sh 40 no 1800 c48r128 100000000 2000000000 byrx iobuf iodir
The tests on the charts are named as:
- fillseq -- load in key order with the WAL disabled
- revrangeww -- reverse range while writing: do short reverse range scans as fast as possible while another thread does writes (Put) at a fixed rate
- fwdrangeww -- like revrangeww except do short forward range scans
- readww -- like revrangeww except do point queries
- overwrite -- do overwrites (Put) as fast as possible
Workloads
There are three workloads, all of which use 40 threads:
- byrx - the database is cached by RocksDB (100M KV pairs)
- iobuf - the database is larger than memory and RocksDB uses buffered IO (2B KV pairs)
- iodir - the database is larger than memory and RocksDB uses O_DIRECT (2B KV pairs)
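To make the workload differences concrete, here is a sketch of how the three workloads might map to db_bench options. The flag values are illustrative assumptions, not the exact command lines used (those come from the x3.sh wrapper described above).

```shell
# Illustrative db_bench flags for the three workloads (assumptions, not the
# exact command lines -- those come from the x3.sh wrapper described above).

# byrx: 100M KV pairs, database cached by RocksDB (large block cache)
db_bench --benchmarks=readwhilewriting --num=100000000 --threads=40 \
         --compaction_style=1 --cache_size=$((96 * 1024 * 1024 * 1024))

# iobuf: 2B KV pairs, larger than memory, buffered IO (the default)
db_bench --benchmarks=readwhilewriting --num=2000000000 --threads=40 \
         --compaction_style=1

# iodir: 2B KV pairs, larger than memory, O_DIRECT for reads and compaction
db_bench --benchmarks=readwhilewriting --num=2000000000 --threads=40 \
         --compaction_style=1 --use_direct_reads=true \
         --use_direct_io_for_flush_and_compaction=true
```

Here --compaction_style=1 selects universal compaction; the cache size shown is just a plausible value for a 128G server.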
A spreadsheet with all results is here and performance summaries with more details are here for byrx, iobuf and iodir.
Relative QPS
The numbers in the spreadsheet and on the y-axis in the charts that follow are the relative QPS which is (QPS for $me) / (QPS for $base). When the value is greater than 1.0 then $me is faster than $base. When it is less than 1.0 then $base is faster (perf regression!).
The base version is RocksDB 6.0.2.
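As a small worked example, the relative QPS can be computed like this (the QPS values below are made up for illustration; the base plays the role of RocksDB 6.0.2):

```python
# Relative QPS = (QPS for $me) / (QPS for $base).
# Values > 1.0 mean $me is faster than $base; values < 1.0 mean a regression.

def relative_qps(qps_me: float, qps_base: float) -> float:
    return qps_me / qps_base

# Made-up QPS numbers for illustration only.
base = {"fillseq": 100000, "overwrite": 50000}  # e.g. RocksDB 6.0.2
me = {"fillseq": 90000, "overwrite": 60000}     # e.g. a newer version

for test in me:
    print(f"{test}: {relative_qps(me[test], base[test]):.2f}")
```

So a value of 0.90 would mean the newer version gets 90% of the QPS of 6.0.2 on that test, and 1.20 would mean it is 20% faster.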
Results: byrx
The byrx tests use a cached database. The performance summary is here.
The chart shows the relative QPS for a given version relative to RocksDB 6.0.2. There are two charts and the second narrows the range for the y-axis to make it easier to see regressions.
Summary:
- fillseq has new CPU overhead in 7.0 from code added for correctness checks and QPS has been stable since then
- QPS for other tests has been stable, with some variance, since late 6.x
Results: iobuf
The iobuf tests use an IO-bound database with buffered IO. The performance summary is here.
The chart shows the relative QPS for a given version relative to RocksDB 6.0.2. There are two charts and the second narrows the range for the y-axis to make it easier to see regressions.
Summary:
- fillseq has been stable since 7.6
- readww has always been stable
- overwrite improved in 7.6 and has been stable since then
- fwdrangeww and revrangeww improved in late 6.x and have been stable since then
Results: iodir
The iodir tests use an IO-bound database with O_DIRECT. The performance summary is here.
The chart shows the relative QPS for a given version relative to RocksDB 6.0.2. There are two charts and the second narrows the range for the y-axis to make it easier to see regressions.
Summary:
- fillseq has been stable since 7.6
- readww has always been stable
- overwrite improved in 7.6 and has been stable since then
- fwdrangeww and revrangeww have been stable but there is some variance