Monday, November 4, 2024

RocksDB benchmarks: small server, universal compaction

I shared benchmark results for RocksDB a few weeks ago using leveled compaction and a small server. Here I have results for universal compaction and the same small server.

tl;dr
  • in general there are some improvements and some small regressions, with one exception (see bug 12038)
  • for a cached database
    • From RocksDB 6.0.2 to 9.x QPS drops by ~10% for fillseq and ~15% for other tests
    • Performance has been stable since 7.x
  • for an IO-bound database with buffered IO
    • bug 12038 hurts QPS for overwrite (will be fixed soon in 9.7)
    • QPS is otherwise stable
  • for an IO-bound database with O_DIRECT
    • QPS for fillseq and overwrite is ~10% less in 9.7 vs 6.0.2 and has been stable since 7.0
    • QPS for read-heavy tests is ~5% better in RocksDB 9.7.2 vs 6.0.2
Hardware

The small server is named SER7: a Beelink SER7 7840HS (see here) with an 8-core AMD Ryzen 7 7840HS CPU, SMT disabled, and Ubuntu 22.04. Storage is one NVMe device using ext4 with data=writeback.

The storage device has 128 (KB) for both max_hw_sectors_kb and max_sectors_kb. This is relevant for bug 12038, which will be fixed real soon in a 9.7 patch release.
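
These limits are visible in sysfs. A quick way to check them, assuming the device is nvme0n1 (substitute your device name):

    # Request-size limits for the device, in KB. Values of 128 here
    # are what make bug 12038 relevant on this server.
    cat /sys/block/nvme0n1/queue/max_hw_sectors_kb
    cat /sys/block/nvme0n1/queue/max_sectors_kb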

Builds

I compiled db_bench from source on the server. I used these versions:
  • 6.x - 6.0.2, 6.10.4, 6.20.4, 6.29.5
  • 7.x - 7.0.4, 7.3.2, 7.6.0, 7.10.2
  • 8.x - 8.0.0, 8.3.3, 8.6.7, 8.9.2, 8.11.4
  • 9.x - 9.0.1, 9.1.2, 9.2.2, 9.3.2, 9.4.1, 9.5.2, 9.6.1, 9.6.2, 9.7.2, 9.7.4 and 9.8.1
Benchmark

All tests used the default value for compaction_readahead_size. For all versions tested I used the default values for the block cache (LRU) and format_version.
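
For anyone who wants to try something similar without the wrapper scripts, below is a minimal db_bench sketch. It is an illustration, not my exact command line: compaction_style=1 selects universal compaction, the other values are made up, and flags left unset (compaction_readahead_size, block cache type, format_version) keep their defaults as in these tests.

    # A minimal sketch (values illustrative): load in key order with
    # universal compaction and the WAL disabled. Unset flags keep defaults.
    ./db_bench --benchmarks=fillseq \
        --num=20000000 \
        --compaction_style=1 \
        --disable_wal=true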

I used my fork of the RocksDB benchmark scripts that are wrappers to run db_bench. These run db_bench tests in a special sequence -- load in key order, read-only, do some overwrites, read-write and then write-only. The benchmark was run using 1 thread on the small server. How I do benchmarks for RocksDB is explained here and here. The command line to run the tests is:

    # Small server, SER7: use 1 thread, 20M KV pairs for cached, 400M for IO-bound
    bash x3.sh 1 no 1800 c8r32 20000000 400000000 byrx iobuf iodir

The tests on the charts are named as:
  • fillseq -- load in key order with the WAL disabled
  • revrangeww -- reverse range while writing: do short reverse range scans as fast as possible while another thread does writes (Put) at a fixed rate (see the sketch after this list)
  • fwdrangeww -- like revrangeww except do short forward range scans
  • readww -- like revrangeww except do point queries
  • overwrite -- do overwrites (Put) as fast as possible
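
The while-writing tests pair readers with a writer running at a fixed rate. A hedged sketch of that pattern using db_bench directly, assuming its readwhilewriting benchmark (which adds one writer thread to the requested reader threads) and the benchmark_write_rate_limit flag; values are illustrative:

    # One reader thread doing point queries while an extra writer thread
    # does Puts, capped at ~2 MB/s via benchmark_write_rate_limit (bytes/s).
    ./db_bench --benchmarks=readwhilewriting \
        --num=20000000 \
        --threads=1 \
        --benchmark_write_rate_limit=$(( 2 * 1024 * 1024 ))
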
Workloads

There are three workloads, all of which use one client (thread):

  • byrx - the database is cached by RocksDB
  • iobuf - the database is larger than memory and RocksDB uses buffered IO
  • iodir - the database is larger than memory and RocksDB uses O_DIRECT
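
For iodir, the flags that enable O_DIRECT in db_bench are use_direct_reads (user reads) and use_direct_io_for_flush_and_compaction (flush and compaction IO). A minimal sketch with illustrative values:

    # O_DIRECT for user reads and for flush/compaction IO; the OS page
    # cache is bypassed so the RocksDB block cache does the caching.
    ./db_bench --benchmarks=readrandom \
        --num=400000000 \
        --use_direct_reads=true \
        --use_direct_io_for_flush_and_compaction=true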

A spreadsheet with all results is here and performance summaries with more details are here for byrx, for iobuf and for iodir.

Relative QPS

The numbers in the spreadsheet and on the y-axis in the charts that follow are the relative QPS which is (QPS for $me) / (QPS for $base). When the value is greater than 1.0 then $me is faster than $base. When it is less than 1.0 then $base is faster (perf regression!).
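
As a worked example, if $base gets 500000 QPS and $me gets 450000 then the relative QPS is 0.90, a 10% regression. The same arithmetic in shell:

    # relative QPS = (QPS for $me) / (QPS for $base)
    qps_base=500000
    qps_me=450000
    awk -v me="$qps_me" -v base="$qps_base" \
        'BEGIN { printf "%.2f\n", me / base }'   # prints 0.90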

The base version is RocksDB 6.0.2.

Results: byrx

The byrx tests use a cached database. The performance summary is here.

The charts show the QPS for a given version relative to RocksDB 6.0.2. There are two charts with the same data; the y-axis on the second doesn't start at 0 to improve readability.

Summary:
  • From RocksDB 6.0.2 to 9.x QPS drops by ~10% for fillseq and ~15% for other tests
  • Performance has been stable since 7.x
Results: iobuf

The iobuf tests use a database larger than memory with buffered IO. The performance summary is here.

The charts show the QPS for a given version relative to RocksDB 6.0.2. There are two charts with the same data; the y-axis on the second doesn't start at 0 to improve readability.

Summary:
  • bug 12038 explains the regression for overwrite (fixed soon in 9.7)
  • QPS for fillseq has been stable
  • QPS for revrangeww, fwdrangeww and readww is stable. I don't have an explanation for the variance in the 9.6 and 9.7 releases; the cause might be that universal (tiered) compaction is more prone to variance. I will revisit this when I run tests again in a few months.
Results: iodir

The iodir tests use a database larger than memory with O_DIRECT. The performance summary is here.

The charts show the QPS for a given version relative to RocksDB 6.0.2. There are two charts with the same data; the y-axis on the second doesn't start at 0 to improve readability.

Summary:
  • QPS for fillseq and overwrite is ~10% less in 9.7 vs 6.0.2 and has been stable since 7.0. My vague memory is that the issue is new CPU overhead from better error checking.
  • QPS for read-heavy tests is ~5% better in RocksDB 9.7.2 vs 6.0.2

