Saturday, November 9, 2024

RocksDB benchmarks: large server, leveled compaction

A few weeks ago I shared benchmark results for RocksDB with both leveled and universal compaction on a small server. This post has results from a large server with leveled compaction.

tl;dr

  • there are a few regressions from bug 12038
  • QPS for overwrite is ~1.5X to ~2X better in 9.x than 6.0 (ignoring bug 12038)
  • otherwise QPS in 9.x is similar to 6.x

Hardware

The server is an ax162-s from Hetzner with an AMD EPYC 9454P processor, 48 cores, AMD SMT disabled and 128G RAM. The OS is Ubuntu 22.04. Storage is 2 NVMe devices with SW RAID 1 and ext4.

Builds

I compiled db_bench from source for each of these versions (a build sketch follows the list):
  • 6.x - 6.0.2, 6.10.4, 6.20.4, 6.29.5
  • 7.x - 7.0.4, 7.3.2, 7.6.0, 7.10.2
  • 8.x - 8.0.0, 8.3.3, 8.6.7, 8.9.2, 8.11.4
  • 9.x - 9.0.1, 9.1.2, 9.2.2, 9.3.2, 9.4.1, 9.5.2, 9.6.1 and 9.7.3
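
As a minimal sketch of how one such build can be done (the exact commands and compiler options used for these tests aren't shown here), each version is a tag in the RocksDB repo and db_bench is a Makefile target:

    # A minimal sketch of building db_bench for one version; repeat per version.
    # The exact build options used for these tests are not shown in this post.
    git clone https://github.com/facebook/rocksdb.git
    cd rocksdb
    git checkout v9.7.3
    make DEBUG_LEVEL=0 db_bench -j48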

Benchmark

All tests used the default value for compaction_readahead_size and the default block cache implementation (LRU).

I used my fork of the RocksDB benchmark scripts, which are wrappers for db_bench. They run db_bench tests in a special sequence -- load in key order, read-only, some overwrites, read-write and then write-only (a sketch of this sequence follows the list of test names below). The benchmark was run with 40 threads. How I do benchmarks for RocksDB is explained here and here. The command line to run the tests is: bash x3.sh 40 no 1800 c48r128 100000000 2000000000 byrx iobuf iodir

The tests on the charts are named as:
  • fillseq -- load in key order with the WAL disabled
  • revrangeww -- reverse range while writing, do short reverse range scans as fast as possible while another thread does writes (Put) at a fixed rate
  • fwdrangeww -- like revrangeww except do short forward range scans
  • readww -- like revrangeww except do point queries
  • overwrite -- do overwrites (Put) as fast as possible
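
As a minimal sketch of that sequence, assuming a mapping from the chart names to db_bench benchmarks (the mapping, paths and flag values below are my assumptions; the wrapper scripts compute the real values and set many more options):

    # Illustrative only: one db_bench invocation per charted test, run in order.
    DB=/data/rocksdb
    COMMON="--db=$DB --use_existing_db=true --num=100000000 --threads=40 --duration=1800"

    ./db_bench --benchmarks=fillseq --disable_wal=true --db=$DB --num=100000000    # fillseq
    ./db_bench --benchmarks=seekrandomwhilewriting --reverse_iterator=true $COMMON # revrangeww
    ./db_bench --benchmarks=seekrandomwhilewriting $COMMON                         # fwdrangeww
    ./db_bench --benchmarks=readwhilewriting $COMMON                               # readww
    ./db_bench --benchmarks=overwrite $COMMON                                      # overwrite
    # The *whilewriting tests also take --benchmark_write_rate_limit (bytes/second)
    # to set the fixed rate for the background writer.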

Workloads

There are three workloads, all of which use 40 threads (a sketch of the db_bench options that distinguish them follows the list):

  • byrx - the database is cached by RocksDB (100M KV pairs)
  • iobuf - the database is larger than memory and RocksDB uses buffered IO (2B KV pairs)
  • iodir - the database is larger than memory and RocksDB uses O_DIRECT (2B KV pairs)
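
A minimal sketch of the db_bench options that (I assume) distinguish the three workloads; the cache size, paths and other values are illustrative:

    # Illustrative only: what changes between byrx, iobuf and iodir.
    # byrx: 100M KV pairs, block cache large enough that the database stays cached
    ./db_bench --benchmarks=overwrite --num=100000000 --threads=40 \
      --cache_size=107374182400 --db=/data/rocksdb
    # iobuf: 2B KV pairs, larger than memory, buffered IO (the default)
    ./db_bench --benchmarks=overwrite --num=2000000000 --threads=40 --db=/data/rocksdb
    # iodir: 2B KV pairs, larger than memory, O_DIRECT for reads and compaction
    ./db_bench --benchmarks=overwrite --num=2000000000 --threads=40 --db=/data/rocksdb \
      --use_direct_reads=true --use_direct_io_for_flush_and_compaction=true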

A spreadsheet with all results is here and performance summaries with more details are here for byrx, iobuf and iodir.

Relative QPS

The numbers in the spreadsheet and on the y-axis in the charts that follow are the relative QPS, which is (QPS for $me) / (QPS for $base). When the value is greater than 1.0 then $me is faster than $base. When it is less than 1.0 then $base is faster (perf regression!).

The base version is RocksDB 6.0.2.
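
For example, if overwrite gets 100,000 QPS with 6.0.2 and 180,000 QPS with some later version, then the relative QPS for that version is 180,000 / 100,000 = 1.8 (these numbers are made up just to show the arithmetic).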

Results: byrx

The byrx tests use a cached database. The performance summary is here.

These charts show the QPS for a given version relative to RocksDB 6.0.2. The y-axis doesn't start at 0 in the second chart to make some lines easier to read.

Summary:
  • fillseq is worse from 6.0 to 8.0 but stable since then
  • overwrite has large improvements late in 6.0 and small improvements since then
  • fwdrangeww has small improvements in early 7.0 and is stable since then
  • revrangeww and readww are stable from 6.0 through 9.x

Results: iobuf

The iobuf tests use an IO-bound database with buffered IO. The performance summary is here.

These charts show the QPS for a given version relative to RocksDB 6.0.2. The y-axis doesn't start at 0 in the second chart to make some lines easier to read.

Summary:
  • bug 12038 explains the drop in throughput for overwrite since 8.6.7
  • otherwise QPS in 9.x is similar to 6.0

Results: iodir

The iodir tests use an IO-bound database with O_DIRECT. The performance summary is here.

These charts show the QPS for a given version relative to RocksDB 6.0.2. The y-axis doesn't start at 0 in the second chart to make some lines easier to read.

Summary:
  • the QPS drop for overwrite in 8.6.7 occurs because the db_bench client wasn't updated to use the new default value for compaction readahead size (a sketch of setting it explicitly follows this list)
  • QPS for overwrite is ~2X better in 9.x relative to 6.0
  • otherwise QPS in 9.x is similar to 6.0
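
As a minimal sketch of what updating the client means, db_bench can be told to use a compaction readahead size that matches the newer server-side default; the 2MB value and the other flags below are assumptions, not the exact fix that was used:

    # Illustrative only: pass compaction_readahead_size explicitly so db_bench
    # matches the newer default; 2MB (2097152) is an assumed value.
    ./db_bench --benchmarks=overwrite --use_existing_db=true \
      --num=2000000000 --threads=40 --duration=1800 \
      --use_direct_reads=true --use_direct_io_for_flush_and_compaction=true \
      --compaction_readahead_size=2097152 --db=/data/rocksdb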