Wednesday, January 8, 2025

Sysbench performance over time for InnoDB and MyRocks: part 1

I spent much time in 2024 documenting performance regressions from old to new versions of MySQL with InnoDB and MyRocks. More posts will be published in 2025, including this sequence of posts, but my work on that is winding down. Most of the problems are from many small regressions rather than a few big ones and it is very expensive to find and fix regressions long after they arrive.

Hopefully someone else will emerge to do work like this for MySQL going forward, or upstream will find more resources to prevent new small regressions from arriving. The trend over the past decade has been great for Postgres. And if upstream wants to grow Heatwave, then avoiding regressions in the base product is one way to help with that.

The purpose of this post is to document the regressions from MySQL 5.6 through 8.0 for MyRocks and upstream InnoDB using sysbench with a cached workload.

tl;dr, v1

  • Modern InnoDB at low concurrency gets about 70% of the QPS relative to older InnoDB (5.6.51). The regressions aren't as bad at high concurrency.
  • Modern MyRocks gets between 80% and 95% of the QPS relative to older MyRocks (5.6.35) in most cases (high-concurrency writes are the exception, where QPS is similar for old and new).

tl;dr, v2

  • Both upstream MySQL and the FB MySQL project would benefit from changepoint detection (like Nyrkio) using sysbench microbenchmarks to detect regressions
  • Regressions from MySQL 5.6 to 8.0 for InnoDB are worse at low concurrency than at high. MySQL has become more efficient at high-concurrency workloads, although some of that benefit is lost to code bloat, which is more visible at low concurrency.
  • Regressions from MySQL 5.6 to 8.0 for MyRocks are smaller than for InnoDB at low concurrency, but worse at high concurrency. I suspect the regressions at high concurrency for MyRocks would be smaller than for InnoDB had I enabled the hyper clock cache.

Builds

I used InnoDB from upstream MySQL 5.6.51, 5.7.44, 8.0.28 and 8.0.32, and MyRocks from FB MySQL 5.6.35, 8.0.28 and 8.0.32. Everything was compiled with gcc, CMAKE_BUILD_TYPE=Release, -O2 and -fno-omit-frame-pointer.

For MyRocks, the results in this post use these builds:
  • fbmy5635_rel_o2nofp_210407_f896415f_6190
    • MyRocks 5.6.35 at git hash f896415f (2021/04/07) and RocksDB 6.19.0
  • fbmy5635_rel_o2nofp_231016_4f3a57a1_870
    • MyRocks 5.6.35 at git hash 4f3a57a1 (2023/10/16) and RocksDB 8.7.0
  • fbmy8028_rel_o2nofp_231202_4edf1eec_870
    • MyRocks 8.0.28 at git hash 4edf1eec (2023/12/02) and RocksDB 8.7.0
  • fbmy8032_rel_o2nofp_end_241023_ba9709c9_971
    • MyRocks 8.0.32 at git hash ba9709c9 (2024/10/23) and RocksDB 9.7.1
  • fbmy8032_rel_o2nofp_241023_ba9709c9_971_bug1473_1481_1482_1506
    • MyRocks 8.0.32 at git hash ba9709c9 (2024/10/23) and RocksDB 9.7.1 with fixes applied for bugs 1473, 1481, 1482 and 1506. There is also a workaround for bug 1505 via my.cnf changes. But note that bug 1506 is only for InnoDB.
The bugs mentioned above are specific to FB MySQL and have simple fixes:
  • 1473
    • is_thd_db_read_only_by_name accounts for ~2% of CPU time on CPU-bound and write-heavy microbenchmarks
  • 1481
    • ha_statistic_increment accounts for ~5% of CPU time on CPU-bound table scans
  • 1482
    • calls to clock_gettime in sql/optimizer.cc account for ~4% of CPU time on several microbenchmarks
  • 1505
    • the default for yield_check_frequency is zero, which makes MySQL waste much time in calls to thd_wait_yield. A better default is 10. Too many my.cnf options isn't a big deal; too many options with bad default values is a big deal. The workaround is to set yield_check_frequency=10 in my.cnf
  • 1506
    • this is limited to InnoDB in the FB MySQL tree. It is from changes in the concurrency ticket code and reduces CPU-bound table scan performance by ~20%.
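As an example, the workaround for bug 1505 is a one-line my.cnf change. The sketch below assumes the setting goes in the standard [mysqld] group; check your deployment's my.cnf layout before copying it:

```
[mysqld]
# workaround for FB MySQL bug 1505: the default of 0 wastes
# time in calls to thd_wait_yield; 10 is a better value
yield_check_frequency=10
```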
The my.cnf files are in the subdirectories here. For InnoDB with upstream MySQL I used my.cnf.cz11a_c32r128 and these are here for 5.6.51, 5.7.44, 8.0.28 and 8.0.32.

For MyRocks, none of the my.cnf files I used enable the hyper clock block cache, which would make modern MyRocks look much better at high concurrency.

Hardware

Tests were run on four different servers. Results in this post are only from c32r128 but the posts that follow have results from the other servers. The servers are:
  • c8r16
    • The c8r16 name stands for 8 CPU cores and 16G of RAM. This is a Beelink SER4 with an AMD Ryzen 7 4700 CPU with SMT disabled, 8 cores, 16G of RAM, Ubuntu 22.04 and ext4 on 1 NVMe device.
  • c8r32
    • The c8r32 name stands for 8 CPU cores and 32G of RAM. This is an ASUS ExpertCenter PN53 with AMD Ryzen 7 7735HS, with SMT disabled, 8 cores, 32G RAM, Ubuntu 22.04 and ext4 on 1 NVMe device.
  • c24r64
    • The c24r64 name stands for 24 CPU cores and 64G of RAM. This is a SuperMicro SuperWorkstation 7049A-T with 2 sockets, 12 cores/socket, 64G RAM, one m.2 SSD (2TB, ext4). The CPUs are Intel Xeon Silver 4214R CPU @ 2.40GHz.
  • c32r128
    • The c32r128 name stands for 32 CPU cores and 128G of RAM. This is a Dell Precision 7865 Tower Workstation with 1 socket, 128G RAM, a 32-core AMD Ryzen Threadripper PRO 5975WX and 2 m.2 SSDs (each 2TB, SW RAID 0, ext4).
Benchmark

I used sysbench and my usage is explained here. A full run has 42 microbenchmarks and most test only 1 type of SQL statement. Here I use abbreviated runs with 26 microbenchmarks to save time. The database is cached by InnoDB and MyRocks.

The benchmark is run with ...
  • c8r16 - 1 thread, 1 table, 30M rows
  • c8r32 - 1 thread, 1 table, 50M rows
  • c24r64 - 1 thread and then 16 threads, 8 tables, 10M rows/table
  • c32r128 - 1 thread and then 24 threads, 8 tables, 10M rows/table
Each microbenchmark runs for 300 seconds if read-only and 600 seconds otherwise. Prepared statements were enabled.
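My sysbench usage relies on helper scripts, but for illustration a generic sysbench 1.x run against a cached database looks roughly like this. The database name, user and the choice of the bundled oltp_point_select Lua script are placeholders, not my exact commands:

```
# load 8 tables with 10M rows each (the c32r128 shape)
sysbench oltp_point_select --mysql-user=test --mysql-db=test \
    --tables=8 --table-size=10000000 prepare

# run one read-only microbenchmark for 300 seconds at 24 threads
sysbench oltp_point_select --mysql-user=test --mysql-db=test \
    --tables=8 --table-size=10000000 --threads=24 --time=300 run
```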

Relative QPS

The charts and summary statistics that follow use a number that I call the relative QPS (rQPS) where:
  • rQPS is: (QPS for my version) / (QPS for base version)
  • base version for InnoDB is 5.6.51 and for MyRocks is 5.6.35 (fbmy5635_rel_o2nofp_210407_f896415f_6190)
  • my version is one of the other versions
Summary statistics

The microbenchmarks are split into three groups: point queries, range queries, writes. The tables below have summary statistics for InnoDB and MyRocks using the relative QPS of InnoDB from MySQL 8.0.32 and MyRocks from fbmy8032_rel_o2nofp_241023_ba9709c9_971_bug1473_1481_1482_1506.
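The summary statistics are simple to compute. A minimal sketch in Python, using made-up QPS numbers rather than my actual results (the microbenchmark names and values here are hypothetical):

```python
from statistics import mean, median

# QPS per microbenchmark as (base version, my version) pairs.
# These numbers are invented for illustration only.
point_qps = {
    "hot-points":       (1000, 650),
    "point-query":      (1000, 720),
    "random-points_10": (1000, 880),
}

# relative QPS: (QPS for my version) / (QPS for base version)
rqps = [new / old for (old, new) in point_qps.values()]

print(f"min={min(rqps):.2f} max={max(rqps):.2f} "
      f"avg={mean(rqps):.2f} median={median(rqps):.2f}")
# -> min=0.65 max=0.88 avg=0.75 median=0.72
```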

InnoDB with 1 thread
  • based on the median values, InnoDB in 8.0.32 gets ~72% of the QPS relative to 5.6.51

        min     max     avg     median
point   0.65    0.88    0.73    0.72
range   0.68    0.99    0.77    0.73
write   0.50    1.21    0.75    0.70

InnoDB with 24 threads
  • the results here are better than above. For point queries and writes, upstream made InnoDB better for high-concurrency workloads, which counters the performance loss from new CPU overheads. However, that isn't true for range queries. The issue is made worse by some of the refactoring in InnoDB that landed after 8.0.28.

        min     max     avg     median
point   0.68    1.54    1.09    1.29
range   0.63    1.13    0.84    0.77
write   1.05    2.02    1.47    1.45

MyRocks with 1 thread
  • the regressions here aren't as bad as they are above for InnoDB

        min     max     avg     median
point   0.79    1.15    0.93    0.93
range   0.63    0.92    0.84    0.86
write   0.63    0.87    0.79    0.80

MyRocks with 24 threads
  • Based on the median values, MyRocks at higher concurrency here does better than at low concurrency above. I am not sure why. Perhaps had I enabled the hyper clock block cache the results here would be better than they are above for InnoDB at 24 threads.

        min     max     avg     median
point   0.85    1.22    1.01    0.95
range   0.68    1.07    0.97    1.01
write   0.77    0.99    0.91    0.91

Results 

The microbenchmarks are split into three groups: point queries, range queries, writes. The charts that follow plot the relative QPS.

To save space on the charts for MyRocks, the version names on the charts are:
  • 5635.231016
    • fbmy5635_rel_o2nofp_231016_4f3a57a1_870
  • 8028.231202
    • fbmy8028_rel_o2nofp_231202_4edf1eec_870
  • 8032.241023.nofix
    • fbmy8032_rel_o2nofp_end_241023_ba9709c9_971
  • 8032.241023.fix
    • fbmy8032_rel_o2nofp_241023_ba9709c9_971_bug1473_1481_1482_1506
The spreadsheets and charts are here in the dell.fbin.1 and dell.fbin.24 tabs for InnoDB and the dell.fbrx.1 and dell.fbrx.24 tabs for MyRocks. I don't know how to prevent the microbenchmark names on the x-axis from getting truncated in the png files I use here but they are easier to read on the spreadsheet.

The chart with NT=1 in the legend has results for 1 thread tests. The chart with NT=24 has results for 24 threads.

Results: InnoDB and point queries

Notes for NT=1 (1 thread):
  • the worst regression is from hot-points where 5.6 gets almost 2X more QPS than 8.0
  • for all tests except one, InnoDB in 8.0 gets less than 75% of the throughput relative to 5.6
  • the regression for 8.0.28 in the third group from the right is from bug 102037
Notes for NT=24 (24 threads)
  • Upstream MySQL has improvements for concurrent workloads in 5.7 and 8.0 which offset the CPU regressions.
    • With 5.7.44, 5 tests are much faster than 5.6.51, 2 tests have similar QPS and 2 are slower
    • With 8.0.32, 5 tests are much faster than 5.6.51 and 4 are slower
Results: MyRocks and point queries

Notes for NT=1 (1 thread):
  • Regressions from 5.6 to 8.0 are smaller here for MyRocks than above for InnoDB
  • In 8.0.32, 7 tests are slower than 5.6.35 and 2 are slightly faster
  • The regression for 8.0.28 in the third group from the right is from bug 102037
Notes for NT=24 (24 threads):
  • The results here are about the same as the results above for MyRocks at 1 thread. Were I to enable the hyper clock cache they might be much better.
Results: InnoDB and range queries

Notes for NT=1 (1 thread):
  • The regressions are large for everything but long range scans. And for long range scans, while 5.7.44 is faster than 5.6, that benefit no longer exists in 8.0, especially in 8.0.30+.
Notes for NT=24 (24 threads):
  • Results here are much better than above for InnoDB at 1 thread. But about half of the tests still have large regressions.
Results: MyRocks and range queries

Notes for NT=1 (1 thread):
  • Results here are similar to what I see above for InnoDB at 1 thread. I will explain more elsewhere but it has been hard to track down the root causes to anything other than code bloat, as in more cache and TLB activity.
Notes for NT=24 (24 threads):
  • Results here at high concurrency are much better than above at low concurrency, with the exception of the results for the full scan (the last group on the right).
Results: InnoDB and writes

Notes for NT=1 (1 thread):
  • All of the tests have large regressions in 8.0.32 except for update-index
Notes for NT=24 (24 threads):
  • Results here are much better than above at 1 thread. Modern InnoDB is much better at high-concurrency workloads.
Results: MyRocks and writes

Notes for NT=1 (1 thread):
  • MyRocks has large regressions at low-concurrency.
Notes for NT=24 (24 threads):
  • The regressions here at high-concurrency are about 10% better than above at low-concurrency.