Thursday, November 30, 2017

In-memory linkbench and a fast server: MyRocks, InnoDB and TokuDB

This post explains MySQL performance for Linkbench on a fast server. This used a low-concurrency workload to measure response time, IO and CPU efficiency. Tests were run for MyRocks, InnoDB and TokuDB. I wrote a similar report a few months ago. The difference here is that I used an updated compiler toolchain, a more recent version of MyRocks and MySQL 8.0.3. The results didn't change much from the previous blog post.
tl;dr:
  • InnoDB from MySQL 5.6 had the best throughput
  • CPU efficiency is similar for MyRocks and InnoDB. But to be fair, MyRocks uses ~20% more CPU per transaction than InnoDB in MySQL 5.6.35
  • There is a CPU regression from MySQL 5.6 to 5.7 to 8.x. From 5.6.35 to 8.0.3, about 30% of throughput is lost on the load and about 18% on transactions. I assume most of this is code above the storage engine layer.
  • InnoDB writes more than 10X to storage per transaction compared to MyRocks. An SSD will last longer with MyRocks. 
  • Uncompressed InnoDB uses ~1.6X more space than uncompressed MyRocks

Configuration

I used my Linkbench repo and helper scripts to run linkbench with maxid1=10M, loaders=1 and requestors=1 so there will be 2 concurrent connections doing the load and 1 connection running transactions after the load finishes. My linkbench repo has a recent commit that changes the Linkbench workload and this test included that commit. The test pattern is 1) load and 2) transactions. The transactions were run in 12 1-hour loops and I share results from the last hour. The test server has 48 HW threads, fast SSD and 256gb of RAM.

Tests were run for MyRocks, InnoDB from upstream MySQL, InnoDB from FB MySQL and TokuDB. The binlog was enabled but sync on commit was disabled for the binlog and database log. All engines used jemalloc. Mostly accurate my.cnf files are here but the database cache was made large enough to cache the ~10gb database.
  • MyRocks was compiled on October 16 with git hash 1d0132. Compression was not used.
  • Upstream 5.6.35, 5.7.17, 8.0.1, 8.0.2 and 8.0.3 were used with InnoDB. SSL was disabled and 8.x used the same charset/collation as previous releases.
  • InnoDB from FB MySQL 5.6.35 was compiled on June 16 with git hash 52e058. The results for it aren't interesting here but will be interesting for IO-bound linkbench.
  • TokuDB was from Percona Server 5.7.17. Compression was not used.
The performance schema was enabled for upstream InnoDB and TokuDB. It was disabled at compile time for MyRocks and InnoDB from FB MySQL because FB MySQL still has user & table statistics for monitoring.

Graphs

The first two graphs show the load and transaction rates relative to InnoDB from upstream MySQL 5.6.35. For this test it has the best rates for load and transactions. There is a big drop in throughput for InnoDB from 5.6.35 to 8.0.3 for both the load and transaction tests.
The chart below has the KB written to storage per transaction. The rate for InnoDB is more than 10X the rate for MyRocks. An SSD will last longer with MyRocks. The rate for MyRocks is also much better than TokuDB. The rate here for TokuDB is worse than what I measured in September and I have yet to debug it.
All engines use a similar amount of space after the load, ~15gb. But MyRocks does much better after 12 hours of transactions -- InnoDB is ~1.6X larger and TokuDB is ~1.19X larger. The problem for InnoDB is B-Tree fragmentation. The advantage for MyRocks is leveled compaction which limits garbage to ~10% of the database size.
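The relative rates in the first two graphs come straight from the ips and tps columns in the result tables further down, so they can be recomputed in a few lines of Python. This is a minimal sketch and not the script used to make the graphs: the dictionary and function names are illustrative, and the numbers are copied from the Load Results and Transaction Results tables.

```python
# Throughput copied from the result tables in this post.
# load_ips = inserts/second during the load, txn_tps = transactions/second
# during the 12th 1-hour loop.
load_ips = {
    "MyRocks.Oct16": 49986, "FbInno.Jun16": 62224, "Inno.5635": 63891,
    "Inno.5717": 56383, "Inno.801": 55173, "Inno.802": 41815,
    "Inno.803": 43590, "Toku.5717": 23664,
}
txn_tps = {
    "MyRocks.Oct16": 5753, "FbInno.Jun16": 7065, "Inno.5635": 7420,
    "Inno.5717": 6616, "Inno.801": 6313, "Inno.802": 5978,
    "Inno.803": 6070, "Toku.5717": 4234,
}

def relative_to(rates, base="Inno.5635"):
    """Return each engine's rate divided by the base engine's rate."""
    return {engine: round(rate / rates[base], 2) for engine, rate in rates.items()}

rel_load = relative_to(load_ips)
rel_txn = relative_to(txn_tps)

for engine in load_ips:
    print(f"{engine:14s} load={rel_load[engine]:.2f} txn={rel_txn[engine]:.2f}")
```

A value of 1.00 means the same rate as InnoDB in upstream 5.6.35, and smaller values mean lower throughput.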

Load Results

All of the data is here. I adjusted iostat metrics for MyRocks because iostat currently counts bytes trimmed as bytes written, which is an issue for RocksDB, but my adjustment is not exact. The table below has a subset of the results.
  • InnoDB 5.6 has the best insert rate but there is a regression from 5.6.35 to 5.7.17 to 8.0.3. I assume most of that is from code above the storage engine.
  • Write efficiency (wkb/i) is similar for all engines
  • CPU efficiency (Mcpu/i) is similar for MyRocks and InnoDB

ips     wkb/i   Mcpu/i  size    wMB/s   cpu     engine
 49986  0.80     98     14      40.1     4.9    MyRocks.Oct16
 62224  0.98     72     15      61.1     4.5    FbInno.Jun16
 63891  1.03     74     16      65.7     4.7    Inno.5635
 56383  1.03     85     16      58.3     4.8    Inno.5717
 55173  1.04     78     16      57.6     4.3    Inno.801
 41815  1.05    103     16      44.0     4.3    Inno.802
 43590  1.06    101     16      46.4     4.4    Inno.803
 23664  1.34    160     14      31.7     3.8    Toku.5717

legend:
* ips - inserts/second
* wkb/i - iostat KB written per insert
* Mcpu/i - normalized CPU time per insert
* size - database size in GB at test end
* wMB/s - iostat write MB/s, average
* cpu - average value of vmstat us + sy columns
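
The per-insert columns look like simple ratios of the other columns: wkb/i matches wMB/s * 1000 / ips and Mcpu/i matches cpu * 1,000,000 / ips, both to within rounding. That is an inference from the table, not a definition taken from the helper scripts, so treat the Python sketch below as a consistency check under that assumption. The function and variable names are illustrative.

```python
# Rows copied from the load table:
# (engine, ips, wMB/s, cpu, reported wkb/i, reported Mcpu/i)
load_rows = [
    ("MyRocks.Oct16", 49986, 40.1, 4.9, 0.80,  98),
    ("FbInno.Jun16",  62224, 61.1, 4.5, 0.98,  72),
    ("Inno.5635",     63891, 65.7, 4.7, 1.03,  74),
    ("Inno.5717",     56383, 58.3, 4.8, 1.03,  85),
    ("Inno.801",      55173, 57.6, 4.3, 1.04,  78),
    ("Inno.802",      41815, 44.0, 4.3, 1.05, 103),
    ("Inno.803",      43590, 46.4, 4.4, 1.06, 101),
    ("Toku.5717",     23664, 31.7, 3.8, 1.34, 160),
]

def per_insert(ips, wmb_per_sec, cpu):
    """Assumed normalization: divide the per-second rates by inserts/second."""
    wkb_per_insert = wmb_per_sec * 1000 / ips      # KB written per insert
    mcpu_per_insert = cpu * 1_000_000 / ips        # CPU (us + sy) per insert, scaled by 1M
    return wkb_per_insert, mcpu_per_insert

for engine, ips, wmb, cpu, rep_wkb, rep_mcpu in load_rows:
    wkb, mcpu = per_insert(ips, wmb, cpu)
    print(f"{engine:14s} wkb/i={wkb:.2f} (table {rep_wkb:.2f})  "
          f"Mcpu/i={mcpu:.0f} (table {rep_mcpu})")
```

The small differences that remain are expected because the cpu and wMB/s columns in the table are themselves rounded.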

Transaction Results

These are results from the 12th 1-hour loop of the transaction phase. All of the data is here. I adjusted iostat metrics for MyRocks because iostat currently counts bytes trimmed as bytes written, which is an issue for RocksDB, but my adjustment is not exact.
  • InnoDB 5.6 has the best transaction rate but there is a regression from 5.6.35 to 5.7.17 to 8.0.3. I assume most of that is from code above the storage engine.
  • Write efficiency (wkb/t) was better for MyRocks. InnoDB writes more than 10X to storage per transaction compared to MyRocks.
  • CPU efficiency (Mcpu/t) is similar for MyRocks and InnoDB
  • Response times are similar for MyRocks and InnoDB
  • Space efficiency is better for MyRocks. InnoDB is ~1.6X larger.

tps   wkb/t  Mcpu/t  size  un   gn   ul   gl    wMB/s  engine
5753  0.44   677     16    0.3  0.1  0.5  0.5    2.5   MyRocks.Oct16
7065  5.11   624     23    0.3  0.1  0.4  0.3   36.1   FbInno.Jun16
7420  5.17   562     26    0.3  0.1  0.4  0.2   38.4   Inno.5635
6616  5.20   628     26    0.3  0.1  0.5  0.3   34.4   Inno.5717
6313  5.16   654     25    0.3  0.1  0.5  0.3   32.6   Inno.801
5978  5.38   682     25    0.3  0.1  0.6  0.3   32.2   Inno.802
6070  5.39   669     25    0.3  0.1  0.6  0.3   32.7   Inno.803
4234  2.92   814     19    0.5  0.2  1    0.6   12.4   Toku.5717

legend:
* tps - transactions/second
* wkb/t - iostat KB written per transaction
* Mcpu/t - normalized CPU time per transaction
* size - database size in GB at test end
* un, gn, ul, gl - 99th percentile response time in millisecs for
      UpdateNode, GetNode, UpdateList and GetLinkedList transactions
* wMB/s - iostat write MB/s, average
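
The write, CPU and space comparisons above are ratios of columns in the transaction table. The Python sketch below recomputes them with MyRocks as the baseline. The names are illustrative and the numbers are copied from the table; nothing here is measured independently.

```python
# Rows copied from the transaction table: engine -> (tps, wkb/t, Mcpu/t, size in GB)
txn = {
    "MyRocks.Oct16": (5753, 0.44, 677, 16),
    "FbInno.Jun16":  (7065, 5.11, 624, 23),
    "Inno.5635":     (7420, 5.17, 562, 26),
    "Inno.5717":     (6616, 5.20, 628, 26),
    "Inno.801":      (6313, 5.16, 654, 25),
    "Inno.802":      (5978, 5.38, 682, 25),
    "Inno.803":      (6070, 5.39, 669, 25),
    "Toku.5717":     (4234, 2.92, 814, 19),
}

BASE = "MyRocks.Oct16"

def ratios_vs_myrocks(engine):
    """Return write, CPU and size ratios for an engine relative to MyRocks."""
    _, wkb_t, mcpu_t, size = txn[engine]
    _, base_wkb_t, base_mcpu_t, base_size = txn[BASE]
    return {
        "write": wkb_t / base_wkb_t,    # > 1 means more KB written per transaction
        "cpu": mcpu_t / base_mcpu_t,    # < 1 means less CPU per transaction
        "size": size / base_size,       # > 1 means a larger database
    }

for engine in txn:
    r = ratios_vs_myrocks(engine)
    print(f"{engine:14s} write={r['write']:.2f}X cpu={r['cpu']:.2f}X size={r['size']:.2f}X")
```

For Inno.5635 this gives a write ratio of about 11.75, a size ratio of about 1.6 and a CPU ratio of about 0.83, which is the same as saying MyRocks uses about 20% more CPU per transaction.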
