tl;dr - both are expected
- MyRocks insert performance doesn't degrade when the database is larger than memory
- InnoDB in MySQL 5.7 scales much better than in MySQL 5.6 for in-memory workloads
- InnoDB in MySQL 5.7 wins for in-memory, MyRocks wins for io-bound
Configuration
This test does inserts in PK order into a table with 3 secondary indexes. Without the secondary indexes the workload is friendly to a b-tree. But secondary index for a b-tree is read-modify-write and the reads from and writes to storage will make InnoDB slower when the working set doesn't fit into RAM. InnoDB is likely to do better than other b-trees because of the insert buffer. With MyRocks, non-unique secondary index maintenance is read-free so performance doesn't drop much when the working set no longer fits in RAM.
I ran the insert benchmark using this script with two configurations. Both configurations used 16 clients, 16 tables and a client per table. For the io-bound configuration there was 50gb of RAM for the database and OS and 2B rows were inserted. For the in-memory configuration there was 256gb of RAM for the database and OS and all data was cached by the database cache.
The test server has 48 cores with HT enabled and fast NVMe flash. For MyRocks I used the FB MySQL repo. For InnoDB I used MySQL 5.6.26 and 5.7.10. I tried to use a good my.cnf for all engines and the MyRocks settings are similar to what we suggest. For the io-bound configurations I used a 10gb RocksDB block cache and 35gb InnoDB buffer pool. For the in-memory configurations all database caches were set to 190gb. The binlog was enabled for all tests, but sync on commit was disabled for the database redo log & binlog.
I disabled compression for all engines in this test. InnoDB still uses much more space on disk, probably because of fragmented leaf pages.
Average insert rates
The average insert rates for each configuration. It is odd that the insert rate for io-bound MyRocks is better than for in-memory and I am not sure I will debug it. There isn't much difference between InnoDB from MySQL 5.6 and 5.7 for the io-bound configuration. There is a huge difference between them for the in-memory configuration. I assume the difference is all of the performance work done upstream (thank you). MyRocks is much faster than InnoDB for the io-bound configuration. InnoDB in MySQL 5.7 is much faster than MyRocks for the in-memory configuration. I expect MyRocks to do better on the in-memory setup in the future.
Legend for the table:
- io-bound - average inserts/second for the io-bound configuration
- in-memory - average inserts/second for the in-memory configuration
- size - database size in GB at test end
- engine - database engine
115234 103088 226 MyRocks
65604 267523 362 InnoDB-5.7.10
52812 111259 362 InnoDB-5.6.26
IO-bound results
I have two graphs to show the insert rates over time. This is for data from one of the 16 clients rather than the global rate because my test script forgot to collect that. The first graph uses log scale for the x-axis. The second graph does not. The first graph makes it easier to see how throughput drops for InnoDB as the database gets larger. While it is odd that the throughput rate for InnoDB 5.7 picks up at test end, that can occur if the thread I monitored ran longer than other threads and had less competition for HW resources near test end. The graph for MyRocks stops earlier than the InnoDB graphs because it finishes the insert of 2B rows much earlier.
As a bonus for MyRocks, it writes about 1/4 the amount to storage per inserted row when compared to InnoDB. That makes SSD last longer and using less IO for writes saves more IO for reads.
Mark, how big were the data disks in size? The shape of the curve looks somewhat like certain ssd benchmarks, though 20k seconds in is quite far for ssd performance to jump up again.
ReplyDeleteMyRocks seems to have less consistent performance, is that something that's a big worry?
Not sure what you mean by "data disks in size".
DeleteThe spike at the end for InnoDB with MySQL 5.7 is probably from having some clients (there are 16) run longer than others. The ones that run longer will get better throughput because fewer clients are running at that point.
MyRocks variance can be reduced via write throttling options, but I didn't use them here. This is something we will improve.
Ah, good point about multiple clients. I didn't connect the two.
DeleteI need to fix my test scripts to collect the global insert rate and avoid confusion in future graphs. Have been too busy or lazy to do that. I use mstat.py for that - https://github.com/mdcallag/mytools/tree/master/mstat
Delete