Tuesday, December 5, 2017

tpcc-mysql, IO-bound, high-concurrency: MyRocks, InnoDB and TokuDB

This post has results for tpcc-mysql with a high-concurrency workload and a database larger than memory. Tests were run for MyRocks, InnoDB and TokuDB. I previously shared results for an in-memory workload. While the database is larger than RAM, there are few reads from storage per transaction, unlike many of the other IO-bound benchmarks I run.

tl;dr:
  • InnoDB from upstream 5.7 has the best throughput and gets 1.18X more TPMC than upstream InnoDB 5.6. MyRocks with and without compression does slightly better than upstream InnoDB 5.6.
  • InnoDB is more CPU efficient. MyRocks uses 1.1X to 1.2X more CPU/transaction than upstream InnoDB 5.6.
  • MyRocks is more write efficient. InnoDB writes ~3X more to storage per transaction.
  • MyRocks is more space efficient. Uncompressed InnoDB uses 1.8X more space than uncompressed MyRocks and 2.9X more space than compressed MyRocks.
  • InnoDB has a regression from 5.7 to 8.0: less throughput and more CPU overhead. I assume the problem is from new code above the storage engine.
  • InnoDB from FB MySQL 5.6 is almost as fast as upstream InnoDB 5.7. I assume several changes account for that result.

Disclaimer

For several reasons this isn't TPC-C, but it is tpcc-mysql. Read committed was used for MyRocks, while the official benchmark requires repeatable read for some of the transaction types. When MyRocks supports gap locks this test can be repeated using repeatable read for it. Repeatable read was used for InnoDB and TokuDB.
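
One way to pin the isolation levels described above is in my.cnf. This is a minimal sketch, assuming the benchmark client does not set the level itself per session:

    # MyRocks server (gap locks are not supported)
    transaction-isolation = READ-COMMITTED

    # InnoDB and TokuDB servers
    transaction-isolation = REPEATABLE-READ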

Configuration

I used tpcc-mysql from Percona with my helper scripts. The test was run with 1000 warehouses and 20 clients. The database is larger than memory. The test server has 48 HW threads, a fast SSD and 50GB of RAM. The database block cache (buffer pool) was 10GB for MyRocks, 10GB for TokuDB and 35GB for InnoDB.

The test pattern is to load the database and then run 12 hours of transactions in 1-hour loops. Results are reported for the last hour of transactions. I used 1000 milliseconds as the response-time SLA.
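
For reference, the commands have roughly this shape. This is a sketch, not the exact invocation: flag spellings vary across tpcc-mysql versions, and the host, credentials, database name and ramp-up time here are placeholders:

    # load 1000 warehouses
    tpcc_load -h 127.0.0.1 -d tpcc1000 -u root -p "" -w 1000
    # one 1-hour loop: 1000 warehouses, 20 clients, 3600-second run
    tpcc_start -h 127.0.0.1 -d tpcc1000 -u root -p "" -w 1000 -c 20 -r 60 -l 3600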

Tests were run for MyRocks, InnoDB from upstream MySQL, InnoDB from FB MySQL and TokuDB. The binlog was enabled but sync on commit was disabled for the binlog and database log. All engines used jemalloc. Mostly accurate my.cnf files are here.
  • MyRocks was compiled on October 16 with git hash 1d0132. Tests were repeated with and without compression. The configuration without compression is called MyRocks.none in the rest of this post. The configuration with compression is called MyRocks.zstd and used zstandard for the max level, no compression for L0/L1/L2 and lz4 for the other levels (sketched below, after this list).
  • Upstream 5.6.35, 5.7.17, 8.0.1, 8.0.2 and 8.0.3 were used with InnoDB. SSL was disabled and 8.x used the same charset/collation as previous releases.
  • InnoDB from FB MySQL 5.6.35 was compiled on June 16 with git hash 52e058.
  • TokuDB was from Percona Server 5.7.17. Tests were done without compression and then with zlib compression.
The performance schema was enabled for upstream InnoDB and TokuDB. It was disabled at compile time for MyRocks and InnoDB from FB MySQL because FB MySQL 5.6 has user & table statistics for monitoring.
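
For the MyRocks.zstd configuration, the per-level compression is expressed through the column family options in my.cnf. This is a sketch under my assumptions about the number of levels, not a copy of the actual file; the linked my.cnf files are authoritative:

    # no compression for L0/L1/L2, lz4 for the middle levels, zstandard for the max level
    rocksdb_default_cf_options = compression_per_level=kNoCompression:kNoCompression:kNoCompression:kLZ4Compression:kLZ4Compression:kLZ4Compression:kLZ4Compression;bottommost_compression=kZSTD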

Graphs

The graphs are from the 12th hour of the transaction test. All of the data is here. I adjusted the iostat bytes-written metrics for MyRocks because RocksDB currently counts bytes trimmed as bytes written, which is an issue for RocksDB, and my adjustment is not exact.

The first graph has the TPMC for an engine and the values are relative to the TPMC for InnoDB from upstream MySQL 5.6. TPMC is the transaction throughput (new-order transactions per minute).
  • InnoDB from 5.7.17 does the best.
  • MyRocks is slightly better than InnoDB from upstream 5.6.35.
  • MyRocks doesn't lose much throughput from using compression.
  • InnoDB from FB MySQL 5.6.35 has the second best throughput which is 1.16X more than upstream 5.6.35. I am not sure which changes explain that.
  • There is a regression from 5.7 to 8.x for upstream MySQL. I assume the cause is new code above the storage engine.

The next graph has the CPU overhead per TPMC and the values are relative to upstream InnoDB 5.6. A value > 1 means that the engine uses more CPU/transaction than the base case. A worked example follows the list.
  • MyRocks uses 1.12X to 1.22X more CPU than InnoDB 5.6.
  • InnoDB from FB MySQL uses less CPU than InnoDB from upstream 5.6.
  • There is a CPU regression from upstream 5.7 to 8.x.
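
To make the relative metric concrete, here is a worked example with invented numbers: suppose the base case (upstream InnoDB 5.6) used 1000 CPU seconds to get 50,000 TPMC and another engine used 1320 CPU seconds to get 55,000 TPMC.

    # (CPU/transaction for the engine) / (CPU/transaction for the base case)
    echo "scale=2; (1320 * 50000) / (1000 * 55000)" | bc
    # prints 1.20 -> the engine uses 1.2X more CPU per transaction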

This graph has the KB written to storage per TPMC and the values are relative to upstream InnoDB 5.6. MyRocks is more write efficient: InnoDB writes ~3X more to storage per transaction.

The next graph has iostat read operations per transaction. The values are relative to upstream InnoDB 5.6. The absolute values are small; most are less than 0.10. While the test database is larger than memory, the working set almost fits in memory. I tried using 2X more warehouses, but that takes longer to load and didn't change the reads per transaction enough to justify the extra time.

This graph shows the size of the database after the load and after 12 hours of transactions. Rows are added during the transaction test, so the database is expected to grow, and the growth is a function of the transaction rate. Fortunately MyRocks and InnoDB have similar transaction rates, so it is mostly fair to compare database sizes.

MyRocks is more space efficient than InnoDB. Uncompressed InnoDB uses ~1.8X more space than uncompressed MyRocks and 2.9X more space than compressed MyRocks.

2 comments:

  1. I always admire your work! Is there any reason why you chose to give a 35GB buffer pool to InnoDB and 10GB to MyRocks and TokuDB?

    1. MyRocks and TokuDB used buffered IO, so the OS page cache helps them. InnoDB used innodb_flush_method = O_DIRECT, so it didn't get extra memory from the OS page cache.
