Wednesday, September 6, 2023

Chasing a performance regression in MyRocks vs the Insert Benchmark

I found a few performance regressions with the Insert Benchmark and MyRocks on a large server, and this explains the start of my search for the root causes. The regressions are described in an earlier blog post (see the summaries here and here).

Update - the builds I used were bad so the results here are bogus. I fixed the builds, repeated the tests, and shared the results here. There are no regressions, and the initial load of the benchmark is ~10% faster in modern MyRocks.

The mystery

Note that l.i1, q100, q500, q1000 are benchmark steps explained here.

With MyRocks from FB MySQL 5.6.35

  1. Throughput drops by ~5% for l.i1 and ~5% for q100, q500, q1000 between fbmy5635_202203072101 and fbmy5635_202205192101. The former release is from 2022/03/07 with RocksDB 6.28.2 and the latter is from 2022/05/19 with RocksDB 7.2.2.
  2. After fbmy5635_202205192101.cy9c5_u the throughput for the read-write benchmark steps (q100, q500, q1000) gradually declines by another ~5%.
  3. Throughput drops by ~5% for l.i1 between fbmy5635_202210112144 and fbmy5635_202302162102. The former release is from 2022/10/11 with RocksDB 7.3.1 and the latter is from 2023/02/16 with RocksDB 7.10.0.
I will start with problem 3 -- the l.i1 regression between fbmy5635_202210112144 and fbmy5635_202302162102.

This is an interesting problem

  • I assume the problem is more likely from changes to RocksDB than from changes to MyRocks 5.6.35 because the RocksDB source gets more changes.
  • The pure-RocksDB benchmarks might not detect these regressions, but I will revisit that.
  • Running the Insert Benchmark on a small or medium server does not detect this regression. I have only seen it this bad on a large server, which not only has more cores but also has 2 sockets.
Builds

First there is the base build:

  • fbmy5635_rel_202210112144 - from 20221011 at git hash (c691c7160 MySQL, 8e0f4952 RocksDB), RocksDB 7.3.1
Then there are variants that use the same MyRocks source as the base but with a more recent version of RocksDB:
  • fbmy5635_20221011_732 - base with RocksDB 7.3.2
  • fbmy5635_20221011_743 - base with RocksDB 7.4.3
  • fbmy5635_20221011_745 - base with RocksDB 7.4.5
  • fbmy5635_20221011_750 - base with RocksDB 7.5.0
  • fbmy5635_20221011_751 - base with RocksDB 7.5.1
  • fbmy5635_20221011_752 - base with RocksDB 7.5.2
  • fbmy5635_20221011_753 - base with RocksDB 7.5.3
  • fbmy5635_20221011_754 - base with RocksDB 7.5.4
  • fbmy5635_20221011_760 - base with RocksDB 7.6.0
  • fbmy5635_20221011_778 - base with RocksDB 7.7.8
  • fbmy5635_20221011_783 - base with RocksDB 7.8.3
  • fbmy5635_20221011_793 - base with RocksDB 7.9.3
  • fbmy5635_20221011_7102 - base with RocksDB 7.10.2

Benchmark

The Insert Benchmark was run in one setup -- cached by RocksDB, meaning all tables fit in the RocksDB block cache. The server has 80 HW threads, 40 cores, 256G of RAM and fast NVMe storage with XFS. The configuration files (my.cnf) are here: base and c5. The difference between them is that c5 adds rocksdb_max_subcompactions=4.
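The delta between the two configurations can be sketched as the fragment below. This is an illustrative sketch, not the actual linked my.cnf files; the only difference stated above is the one option.

```ini
# Hypothetical sketch of the c5 delta relative to the base my.cnf.
# rocksdb_max_subcompactions lets a single compaction job use up to
# this many threads, which can reduce compaction latency on big servers.
[mysqld]
rocksdb_max_subcompactions=4
```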

The benchmark is run with 24 clients, 24 tables and a client per table. The benchmark is a sequence of steps.

  • l.i0
    • insert X million rows across all tables without secondary indexes where X is 20 for cached and 500 for IO-bound
  • l.x
    • create 3 secondary indexes. I usually ignore performance from this step.
  • l.i1
    • insert and delete another 50 million rows per table with secondary index maintenance. The number of rows per table at the end of the benchmark step matches the number at the start: inserts are done at the table head and deletes from the tail.
  • q100, q500, q1000
    • do queries as fast as possible with 100, 500 and 1000 inserts/s/client and the same rate for deletes/s done in the background. Run for 7200 seconds.
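The l.i1 step's insert-at-head, delete-from-tail pattern can be sketched as below. This is a simplified illustration of the step's semantics, not the real benchmark client (which issues SQL against MySQL); the function name and deque-based model are my own.

```python
from collections import deque

def run_li1(table, num_rows):
    """Hypothetical sketch of the l.i1 step: for every row inserted at the
    table head (largest PK), one row is deleted from the tail (smallest PK),
    so the row count at the end matches the row count at the start."""
    next_pk = table[-1] + 1 if table else 0
    for _ in range(num_rows):
        table.append(next_pk)   # insert at the head
        next_pk += 1
        table.popleft()         # delete from the tail
    return table

# A table preloaded with PKs 0..99 ends the step with the same row count,
# but the PK range has shifted forward.
t = run_li1(deque(range(100)), 50)
print(len(t), t[0], t[-1])  # 100 50 149
```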
Results

Performance reports are here for Cached by RocksDB (base config and c5 config). The regressions are easy to see in the second table in the Summary section (for base config and c5 config).

For l.i1 the regressions occur at:
  • From RocksDB 7.4.5 to 7.5.0 (fbmy5635_20221011_745 to fbmy5635_20221011_750) the relative throughput drops from 1.00 to 0.93 for the base config and from 1.00 to 0.94 for the c5 config.
  • There is a transient drop from 7.5.0 to 7.5.4 for the base config. I ignore that for now.
  • From RocksDB 7.8.3 to 7.9.3 (fbmy5635_20221011_783 to fbmy5635_20221011_793) the relative throughput drops from 0.93 to 0.90 for the base config and 0.95 to 0.90 for the c5 config.
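The relative throughput numbers above are each build's rate divided by the base build's rate. A minimal sketch, with made-up placeholder QPS values (these are not measured results):

```python
# Illustrative only: the QPS values below are hypothetical placeholders.
# Relative throughput = a build's rate / the base build's rate.
base_qps = 100000          # hypothetical l.i1 rate for the base build
version_qps = {
    "fbmy5635_20221011_750": 93000,   # hypothetical
    "fbmy5635_20221011_793": 90000,   # hypothetical
}
relative = {name: round(qps / base_qps, 2) for name, qps in version_qps.items()}
print(relative)  # {'fbmy5635_20221011_750': 0.93, 'fbmy5635_20221011_793': 0.9}
```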
What comes next?
  • For the regression between 7.4.5 and 7.5.0 I will look at the diffs added to 7.5.0. There are no releases in between them. I might do a custom build that gets some of the 7.5.0 diffs. I checked the RocksDB LOG and confirmed the RocksDB 7.5.0 build is at git hash 7506c1a4ca. I will use a build from b283f041f58de in the 7.5.fb branch, which has a change that is also in 7.4.5, and then move forward to test more diffs in 7.5.fb.
  • For the regression between 7.8.3 and 7.9.3 I will repeat tests for 7.9.0, 7.9.1 and 7.9.2.
  • Eventually I will look at flamegraphs. Perhaps differential flamegraphs will be useful if the code hasn't changed too much.
  • Struggle to remember whether I previously reported this as a perf issue and then forgot


