Thursday, December 7, 2017

Insert benchmark: IO-bound, high-concurrency, fast server

This post explains the insert benchmark with an IO-bound and high-concurrency workload for MyRocks, InnoDB and TokuDB. The goal is to understand throughput and efficiency for different storage engines and different versions of MySQL. The previous tests used an in-memory workload.

tl;dr - for an IO-bound, high-concurrency workload
  • MyRocks gets ~3X more inserts/s vs InnoDB 5.6 and ~2X more vs InnoDB 5.7/8.0
  • MyRocks has the best QPS. It gets ~10X more QPS than InnoDB on the first read-write test and ~4X more on the second because it does less IO per read and per write.
  • MyRocks PK scans are ~2X slower than InnoDB but secondary index scans are almost as fast as InnoDB. Alas, scans get faster in InnoDB 5.7. Readahead helps MyRocks.
  • MyRocks is more space efficient. InnoDB is almost 2X larger than uncompressed MyRocks and almost 4X larger than compressed MyRocks.
  • MyRocks is more write efficient. InnoDB writes ~5X more to storage per insert on the load and ~15X more on the read-write test.

Configuration

The insert benchmark is described here. The test server has 48 HW threads, fast SSD and 50gb of RAM. The database block cache (buffer pool) was set to 10gb for MyRocks and TokuDB and to 35gb for InnoDB. The database was much larger than 50gb. The test was run with 16 tables and 16 query clients. For the read-write tests there is an additional writer client for each query client. The insert benchmark loaded the tables with 2b rows (125M rows/table), then did a full scan of each index (PK, 3 secondary, PK again), then two read-write tests. The first read-write test tries to insert 1000 rows/second per writer client while the other client does short range scans as fast as possible. The second read-write test is similar except the insert rate limit is 100/second per writer client. The scan and read-write tests use a client per table. With 16 tables there are 16 concurrent clients.
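The writer clients in the read-write tests are paced to a target rate (1000 or 100 inserts/second). The pacing can be sketched as a simple fixed-interval rate limiter; this is a minimal illustration, not the actual insert benchmark client, and `insert_batch` is a hypothetical placeholder:

```python
import time

class RateLimiter:
    """Paces a writer client to a target rate, e.g. 1000 inserts/s.
    A minimal sketch, not the real insert benchmark code."""
    def __init__(self, per_second):
        self.interval = 1.0 / per_second
        self.next_at = time.monotonic()

    def wait(self):
        # Sleep until the next scheduled slot, then advance the schedule.
        now = time.monotonic()
        if now < self.next_at:
            time.sleep(self.next_at - now)
        self.next_at += self.interval

# usage: a writer loop limited to 1000 inserts/second
limiter = RateLimiter(1000)
for _ in range(5):
    limiter.wait()
    # insert_batch(...)  # hypothetical insert call per paced slot
```

Scheduling against `next_at` rather than sleeping a fixed interval after each insert keeps the average rate on target even when individual inserts are slow.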

Tests were run for MyRocks, InnoDB from upstream MySQL, InnoDB from FB MySQL and TokuDB. The binlog was enabled but sync on commit was disabled for the binlog and database log. All engines used jemalloc. Mostly accurate my.cnf files are here.
  • MyRocks was compiled on October 16 with git hash 1d0132. Tests were repeated with and without compression. The configuration without compression is called MyRocks.none in the rest of this post. The configuration with compression is called MyRocks.zstd and used zstandard for the max level, no compression for L0/L1/L2 and lz4 for the other levels.
  • Upstream 5.6.35, 5.7.17, 8.0.1, 8.0.2 and 8.0.3 were used with InnoDB. SSL was disabled and 8.x used the same charset/collation as previous releases.
  • InnoDB from FB MySQL 5.6.35 was compiled on June 16 with git hash 52e058.
  • TokuDB was from Percona Server 5.7.17. Tests were done without compression and then with zlib compression.
The performance schema was enabled for upstream InnoDB and TokuDB. It was disabled at compile time for MyRocks and InnoDB from FB MySQL because FB MySQL 5.6 has user & table statistics for monitoring.
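The per-level compression layout for the zstd configuration (no compression for L0/L1/L2, lz4 for the middle levels, zstandard for the max level) would look roughly like this in my.cnf. This is a hypothetical fragment using the RocksDB column family options `compression_per_level` and `bottommost_compression`, not the exact options used for these tests:

```
# hypothetical my.cnf fragment for the MyRocks.zstd configuration
rocksdb_default_cf_options=compression_per_level=kNoCompression:kNoCompression:kNoCompression:kLZ4Compression:kLZ4Compression:kLZ4Compression;bottommost_compression=kZSTD
```

Leaving the upper levels uncompressed keeps CPU overhead low on the write path, while zstandard on the max level compresses the bulk of the data.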

Results

All of the data for the tests is here. I adjusted the iostat bytes-written metrics for MyRocks because it currently counts bytes trimmed as bytes written, which is an open issue for RocksDB; my adjustment is not exact.

Load

The load is concurrent and there are 16 clients, each loading a separate table. The graphs below have rates for each configuration relative to the rate for InnoDB 5.6.35. The graphs show the insert rate, the CPU overhead per insert, the storage write rate per insert and the storage read rate per insert.
  • MyRocks is ~3X faster than InnoDB 5.6 and more than 2X faster than InnoDB 5.7/8.0
  • MyRocks has the best CPU efficiency. InnoDB uses ~3X more CPU/insert.
  • InnoDB from FB MySQL does much better than upstream 5.6.

MyRocks and TokuDB have better write efficiency. InnoDB 5.6 writes ~5X more to storage per insert than uncompressed MyRocks and ~7X more than compressed MyRocks. I don't understand why the write rate is larger for InnoDB 5.7/8.0 than for 5.6.
MyRocks and TokuDB read less from storage per insert. There are two possible reasons for this. First, non-unique secondary index maintenance is read free. Second, their indexes might be smaller and remain in cache. I am not certain which of these explains it.
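The per-insert write and read efficiency metrics above can be computed from iostat-style cumulative counters over the measurement interval. A minimal sketch (the function and its arguments are my own, not the benchmark's actual reporting code):

```python
def kb_per_insert(kb_start, kb_end, inserts):
    """KB transferred to/from storage per insert over an interval,
    from cumulative iostat-style KB counters. A sketch, not the
    benchmark's actual reporting code."""
    if inserts <= 0:
        raise ValueError("need a positive insert count")
    return (kb_end - kb_start) / inserts

# example: an engine that writes 500,000 KB for 100,000 inserts
# has 5X the per-insert write rate of one that writes 100,000 KB
a = kb_per_insert(0, 500_000, 100_000)  # 5.0 KB/insert
b = kb_per_insert(0, 100_000, 100_000)  # 1.0 KB/insert
print(a / b)  # relative write rate: 5.0
```

The relative values in the graphs are just this ratio with InnoDB 5.6.35 in the denominator.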

Size

This is the database size at the end of the load. The values are absolute. MyRocks is more space efficient than InnoDB. InnoDB uses almost 2X more space than uncompressed MyRocks and almost 4X more space than compressed MyRocks.


Scan

This graph shows the scan time for the PK index and all of the secondary indexes. The value is relative to the time for InnoDB 5.6.35. The absolute value in seconds is here in the scan section. The number for the PK scan is from the second scan of it. The graph excludes FbInno5635 for readability; that version has a perf bug for concurrent scans (since fixed). The scans are concurrent and there are 16 clients, each scanning indexes from separate tables.

The graph has two extra configurations: MyRocks.none.ra, MyRocks.zstd.ra. Both of these enabled filesystem readahead in RocksDB. For this workload InnoDB suffers from b-tree fragmentation for the secondary indexes but probably not for the PK. That might explain the MyRocks perf results for PK vs secondary scans.
  • MyRocks PK scans are ~2X slower than InnoDB 5.6
  • MyRocks secondary index scans are slightly faster than InnoDB 5.6
  • Readahead is a big deal for MyRocks index scans
  • Scans are faster for InnoDB starting in 5.7


Read-write, 1000 inserts/second

This section has results for the read-write test when the writer is limited to 1000 inserts/second, there are 16 tables and a reader/writer pair per table. The first graph has the QPS for short range queries. The second graph has the CPU/query. Both use values relative to InnoDB 5.6.35. All storage engines sustained the target insert rate of 16,000 rows/second.
  • MyRocks gets more than 6X the QPS compared to InnoDB
  • MyRocks uses less CPU and gets more QPS than InnoDB because it is more read and write efficient
This graph shows iostat read operations per query. The values are relative to InnoDB 5.6.35. The rate for InnoDB is ~10X the rate for uncompressed MyRocks and ~100X the rate for compressed MyRocks. I think the MyRocks indexes are larger than cache so I don't understand why this difference is so large. But I am happy about it.

This graph shows KB written to storage per insert. The values are relative to InnoDB 5.6.35. InnoDB writes ~15X more to storage per insert compared to MyRocks.


Read-write, 100 inserts/second

This section has results for the read-write test when the writer is limited to 100 inserts/second, there are 16 tables and a reader/writer pair per table. The graph has the QPS for short range queries with values relative to the value for InnoDB 5.6.35. MyRocks gets ~4X more QPS than InnoDB here while it got 10X or more in the previous section. The reason is that the write rate is lower on this test, so InnoDB uses less write IO and has more capacity for reads.


