In a previous post I reported performance regressions on a large server with the Insert Benchmark when I compared builds from 2022 with a current build. Those builds were done using a complicated build script (special production compiler toolchains make builds complex). In that post I claimed there was a ~15% regression for write-heavy benchmark steps and ~5% for read-heavy steps.
Ugh, the results below are bogus because I made mistakes in building MyRocks. The non-bogus results are here.
The rest of this post is not truthy.
The tests for these builds have to be repeated -- fbmy5635_rel_202104072149, fbmy5635_rel_202203072101, fbmy5635_rel_202205192101. Those builds were bad in part because writing C++ that can be compiled across multiple versions of g++ is non-trivial.
The good news is that I can't reproduce this problem using a much simpler build script and the small Beelink servers I have at home. The regression on the home servers is between 0% and 2%.
The builds tested are:
- fbmy5635_rel_202104072149 - from 20210407 at git hash (f896415fa0 MySQL, 0f8c041ea RocksDB), RocksDB 6.19
- fbmy5635_rel_202203072101 - from 20220307 at git hash (e7d976ee MySQL, df4d3cf6fd RocksDB), RocksDB 6.28.2
- fbmy5635_rel_202205192101 - from 20220519 at git hash (d503bd77 MySQL, f2f26b15 RocksDB), RocksDB 7.2.2
- fbmy5635_rel_202208092101 - from 20220809 at git hash (877a0e585 MySQL, 8e0f4952 RocksDB), RocksDB 7.3.1
- fbmy5635_rel_202210112144 - from 20221011 at git hash (c691c7160 MySQL, 8e0f4952 RocksDB), RocksDB 7.3.1
- fbmy5635_rel_202302162102 - from 20230216 at git hash (21a2b0aa MySQL, e5dcebf7 RocksDB), RocksDB 7.10.0
- fbmy5635_rel_202304122154 - from 20230412 at git hash (205c31dd MySQL, 3258b5c3 RocksDB), RocksDB 7.10.2
- fbmy5635_rel_202305292102 - from 20230529 at git hash (b739eac1 MySQL, 03057204 RocksDB), RocksDB 8.2.1
- fbmy5635_rel_jun23_7e40af677 - from 20230608 at git hash (7e40af67 MySQL, 03057204 RocksDB), RocksDB 8.2.1
The Insert Benchmark was run in two setups:
- cached by RocksDB - all tables fit in the RocksDB block cache
- IO-bound - the database is larger than memory
This benchmark used the Beelink server explained here that has 8 cores, 16G RAM and 1TB of NVMe SSD with XFS and Ubuntu 22.04.
The benchmark is run with 1 client. The benchmark is a sequence of steps.
- l.i0 - insert X million rows across all tables without secondary indexes where X is 20 for cached and 800 for IO-bound
- l.x - create 3 secondary indexes. I usually ignore performance from this step.
- l.i1 - insert and delete another 100 million rows per table with secondary index maintenance. The number of rows/table at the end of the benchmark step matches the number at the start with inserts done to the table head and the deletes done from the tail.
- q100 - do queries as fast as possible with 100 inserts/s/client and the same rate for deletes/s done in the background. Run for 3600 seconds.
- q500 - do queries as fast as possible with 500 inserts/s/client and the same rate for deletes/s done in the background. Run for 3600 seconds.
- q1000 - do queries as fast as possible with 1000 inserts/s/client and the same rate for deletes/s done in the background. Run for 3600 seconds.
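The sequence of steps can be written down as data, which makes it easier to see what changes between the cached and IO-bound setups. This is only a sketch: the `Step` class, its field names and `benchmark_steps` are hypothetical names made up for illustration, not part of the Insert Benchmark scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    description: str
    # Background inserts/s/client; deletes run at the same rate. 0 = no rate limit applies.
    background_writes_per_sec: int = 0
    # Fixed run time in seconds. 0 = the step runs until its work is done.
    duration_sec: int = 0


def benchmark_steps(io_bound: bool) -> list:
    """Return the benchmark steps in the order they are run."""
    # X is 20 million rows for the cached setup and 800 million for IO-bound.
    load_rows_millions = 800 if io_bound else 20
    steps = [
        Step("l.i0", f"insert {load_rows_millions} million rows without secondary indexes"),
        Step("l.x", "create 3 secondary indexes"),
        Step("l.i1", "insert and delete another 100 million rows per table "
                     "with secondary index maintenance"),
    ]
    # The three query steps differ only in the background write rate.
    for rate in (100, 500, 1000):
        steps.append(Step(f"q{rate}", "do queries as fast as possible",
                          background_writes_per_sec=rate, duration_sec=3600))
    return steps


if __name__ == "__main__":
    for step in benchmark_steps(io_bound=False):
        print(step.name, "-", step.description)
```

Only the l.i0 row count depends on the setup. The other steps are the same for cached and IO-bound.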
Configurations
The configuration (my.cnf) files are here and I use abbreviated names for them in this post. For each variant there are two files -- one with a 1G block cache, one with a larger block cache. The larger block cache size is 8G when LRU is used and 6G when hyper clock cache is used (see tl;dr).
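The block cache sizing rule above is small enough to state as a function. This is a sketch of the rule as described in the text. The function name and its parameters are hypothetical and do not correspond to my.cnf option names.

```python
def block_cache_gb(larger_cache: bool, hyper_clock_cache: bool) -> int:
    """Return the block cache size in GB for one configuration file.

    Each variant has two files: one with a 1G block cache and one with a
    larger cache. The larger size depends on the cache implementation.
    """
    if not larger_cache:
        return 1
    # 6G with hyper clock cache, 8G with LRU.
    return 6 if hyper_clock_cache else 8
```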
- From the summaries that list average throughput for cached and for IO-bound the regression is at most 2%
- From the metrics that list HW overhead per operation for cached and for IO-bound (see the cpupq column) the CPU overhead per operation is similar across versions
- From compaction stats at the end of the last benchmark step (q1000) the cumulative stats for compaction are similar (see here, results are in date order)
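For reference, this is one way to compute a regression percentage like the ones quoted above from average throughput. It is a sketch that assumes the regression is measured relative to the oldest build. The function names are hypothetical, and no numbers from the benchmark are reproduced here.

```python
def regression_pct(base_throughput: float, new_throughput: float) -> float:
    """Percent drop in throughput relative to the base build.

    Positive values are regressions, negative values are improvements.
    """
    if base_throughput <= 0:
        raise ValueError("base throughput must be positive")
    return (base_throughput - new_throughput) / base_throughput * 100.0


def within_budget(throughputs, budget_pct: float = 2.0) -> bool:
    """True if every later build is within budget_pct of the first build.

    Assumption: throughputs is ordered by build date, oldest first.
    """
    base = throughputs[0]
    return all(regression_pct(base, t) <= budget_pct for t in throughputs[1:])
```

A check like `within_budget(per_build_throughput)` for each benchmark step would express the "at most 2%" claim, given the per-build averages from the summaries.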