Small Datum: Checking RocksDB 8.x for performance regressions on a large server, part 2

Thursday, November 16, 2023

This post looks for performance regressions across all RocksDB 8.x versions on a large server. A previous post shared results for RocksDB 7.x and 8.x on the same hardware. This one adds results for the newer 8.7 and 8.8 releases.

tl;dr

Things mostly look good
There are a few known problems
There are a few possible regressions that will take more time to figure out

Builds

I compiled RocksDB 8.0.0, 8.1.1, 8.2.1, 8.3.3, 8.4.4, 8.5.4, 8.6.7, 8.7.3 and 8.8.0 with gcc. These are the latest patch releases for each minor version.

Benchmark

All tests used a server with 40 cores, 80 HW threads, 2 sockets, 256GB of RAM and many TB of fast NVMe SSD with Linux 5.1.2, XFS and SW RAID 0 across 6 devices.

Everything used the LRU block cache and the default value for compaction_readahead_size.

I used my fork of the RocksDB benchmark scripts, which are wrappers that run db_bench. They run db_bench tests in a fixed sequence: load in key order, read-only, do some overwrites, read-write and then write-only. The benchmark was repeated using 12 and 24 threads. How I do benchmarks for RocksDB is explained here and here.

The benchmark was repeated in three setups:

cached - database fits in the RocksDB block cache
iobuf - IO-bound, working set doesn't fit in memory, uses buffered IO
iodir - IO-bound, working set doesn't fit in memory, uses O_DIRECT
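
To make the difference between the three setups concrete, here is a minimal sketch of how they might map to db_bench flags. This is not taken from the benchmark scripts used for this post. The SETUPS table, the db_bench_args helper and the key counts are hypothetical placeholders, and the real scripts set many more options. The sketch assumes that the cached and IO-bound setups differ mainly in database size and that iodir turns on direct IO for both user reads and for flush and compaction.

```python
# Hypothetical illustration: how the three setups might differ as db_bench flags.
# The key counts are placeholders, not the values behind these results.
SETUPS = {
    "cached": {"num_keys": 40_000_000, "direct_io": False},
    "iobuf": {"num_keys": 4_000_000_000, "direct_io": False},
    "iodir": {"num_keys": 4_000_000_000, "direct_io": True},
}


def db_bench_args(setup, benchmark, threads):
    """Build an argv list for one db_bench step; nothing is executed here."""
    cfg = SETUPS[setup]
    direct = "true" if cfg["direct_io"] else "false"
    return [
        "db_bench",
        f"--benchmarks={benchmark}",
        f"--threads={threads}",
        f"--num={cfg['num_keys']}",
        f"--use_direct_reads={direct}",
        f"--use_direct_io_for_flush_and_compaction={direct}",
    ]


# Print the command line each setup would use for one read-only step.
for setup in SETUPS:
    print(setup, " ".join(db_bench_args(setup, "readrandom", 12)))
```

Building the argument list in one place keeps the only intended difference between iobuf and iodir (buffered IO versus O_DIRECT) easy to see.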

A spreadsheet with all results is here and performance summaries are here.

Results: cached

The charts use relative QPS which is: (QPS for my version / QPS for RocksDB 8.8.0). The y-axis usually doesn't start at zero to improve readability at the risk of improving hype-ability.
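
The relative QPS calculation can be sketched in a few lines. The QPS numbers below are made up for illustration and are not results from this post. The relative_qps helper is a hypothetical name, with the base version passed as a parameter.

```python
def relative_qps(qps_by_version, base_version):
    """Divide each version's QPS by the QPS of the base version."""
    base = qps_by_version[base_version]
    return {version: qps / base for version, qps in qps_by_version.items()}


# Made-up QPS values, only to show the arithmetic.
example_qps = {"8.0.0": 1000.0, "8.4.4": 980.0, "8.8.0": 950.0}

rel = relative_qps(example_qps, "8.8.0")
for version, value in rel.items():
    print(f"{version}: {value:.3f}")
```

The base version always has a relative QPS of 1.0. A value above 1.0 means that version had higher QPS than the base and a value below 1.0 means it had lower QPS.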

From 8.0 to 8.8

fillseq QPS is stable
fwdrange QPS varies a lot. This is a known issue with the LRU block cache on multi-socket servers (hello NUMA).
read QPS for readrandom and multireadrandom is down by ~5%. This might be a regression.
read QPS for *whilewriting is down by ~2%. This might be a regression.
overwrite QPS is stable to up by ~3%

Results: IO-bound with buffered IO

The charts use relative QPS which is: (QPS for my version / QPS for RocksDB 8.8.0). The y-axis usually doesn't start at zero to improve readability at the risk of improving hype-ability.

From 8.0 to 8.8

fillseq QPS is stable
fwdrange QPS varies a lot. This is a known issue with the LRU block cache on multi-socket servers (hello NUMA).
readrandom and multireadrandom QPS are stable
read QPS for *whilewriting might be down by ~4%. This might be a regression.
overwrite QPS is down by ~5% and might be the result of compaction_readahead_size being larger than max_sectors_kb as explained here.
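
The overwrite bullet above mentions compaction_readahead_size being larger than max_sectors_kb. Here is a minimal sketch of how to check for that on Linux, assuming the block device exposes max_sectors_kb under /sys/block/<device>/queue/. The helper names are hypothetical, and the device name and readahead value must be supplied by the reader because the values used for this benchmark are not stated here.

```python
from pathlib import Path


def read_max_sectors_kb(device):
    """Read max_sectors_kb from sysfs for a block device name such as 'nvme0n1'."""
    path = Path(f"/sys/block/{device}/queue/max_sectors_kb")
    return int(path.read_text().strip())


def readahead_exceeds_max_sectors(readahead_bytes, max_sectors_kb):
    """Return True when one readahead request is larger than max_sectors_kb allows."""
    return readahead_bytes > max_sectors_kb * 1024


# Example with assumed values: a 2 MB readahead against a 128 KB limit.
print(readahead_exceeds_max_sectors(2 * 1024 * 1024, 128))
```

On a real server the second argument would come from read_max_sectors_kb for the device, or devices, backing the database directory.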

Results: IO-bound with O_DIRECT

From 8.0 to 8.8

fillseq QPS is stable
fwdrange QPS varies a lot. This is a known issue with the LRU block cache on multi-socket servers (hello NUMA).
readrandom and multireadrandom QPS are stable
read QPS for *whilewriting is stable
overwrite QPS is up by 3%
