Thursday, February 23, 2017

Using modern sysbench to compare MyRocks and InnoDB on a small server

I used sysbench to compare MyRocks and InnoDB on a small server. I ran tests for an IO-bound and in-memory configuration and share results for the IO-bound setup here.

tl;dr
  • On some workloads InnoDB is faster than MyRocks but this does not reproduce on the faster/newer CPUs that I use at work. I wasn't able to explain this but it looks like the memory-latency bound functions in RocksDB account for more of the CPU overhead on my home servers. Maybe I need to try Intel vTune, I just wish it didn't cost $899.
  • Performance with the default my.cnf is very good for all of the engines. The optimized my.cnf is more useful on the larger servers I use at work. One change in the optimized my.cnf increases the RocksDB page size from 4kb to 16kb which can hurt throughput on IO-bound workloads.
  • MyRocks did much better on inserts, updates and deletes.
  • InnoDB did much better on range scans and better on point selects. There are some changes in MySQL 5.7 that makes range scans faster. The goal for MyRocks is much better space & write efficiency than InnoDB (use less SSD, SSD lasts longer) with similar or good-enough read efficiency. We achieved the space and write efficiency goals. There is more work to be done for read efficiency.
  • There isn't much of a difference for InnoDB between MySQL 5.6.35 and 5.7.10. The difference is more obvious on workloads with more concurrency using larger servers.
  • There is little overhead from compression with MyRocks. There is a lot with InnoDB, especially on write-heavy tests where uncompressed InnoDB is much faster than compressed InnoDB. 
Details

The small server is an Intel NUC (NUC5i3ryh) with a 5th generation core i3 that runs at 2.1GHz. I use these at home because they are small, quiet, efficient and affordable. The server has 2 CPU cores, 4 with HT enabled, 8G of RAM, a 110gb Samsung 850 EVO m.2 SSD and a 7200 RPM disk. The OS is Ubuntu 16.04.

Tests are run for three storage engines. The first is myrocks-5635 which is MyRocks from FB MySQL merged to upstream MySQL 5.6.35. The second is innodb-5635 which is InnoDB from upstream MySQL 5.6.35. The third is innodb-5710 which is InnoDB from upstream MySQL 5.7.10. Command lines for cmake to build each binary are here.

The test was repeated three times for each engine: with the default my.cnf, with an optimized my.cnf and no compression and with an optimized my.cnf and zlib compression. This test is the IO-bound configuration with 4 tables and 40M rows per table. The database is larger than RAM but my test scripts didn't record the database size (need to fix that).
  • myrocks-5635 - the my.cnf files for MyRocks are here. Snappy compression is used for all levels in the default my.cnf. For the optimized my.cnf with zlib compression there is no compression for L0-L2, LZ4 compression for L3 to the next to last level and then zlib for the last level of the LSM tree. Prior to the insert-only tests the database was ~34gb without compression and ~18gb with zlib compression. Compaction IO stats are here, but suffers from a bug that is fixed but not yet in my build.
  • innodb-5635 - the my.cnf files for InnoDB in MySQL 5.6.35 are here
  • innodb-5710 - the my.cnf files for InnoDB in MySQL 5.7.10 are here.
  • innodb-5717 - the my.cnf file for InnoDB in MySQL 5.7.17 is here. In this post I don't share results for 5.7.17 but future posts use it.
  • innodb-801 - the my.cnf file for InnoDB in MySQL 8.0.1 is here. In this post I don't share results for 8.0.1 but future posts use it.
The benchmark client is modern sysbench although I have begun to use my fork. This is run by a helper script that uses a sequence of  workloads via bundled lua scripts. All tests use the uniform distribution except for update-nonindex-special. The default in sysbench is a skewed distribution. But skew means that IO-bound tests (database >> RAM) are less IO-bound because there are more cache hits. So I switched.

Tests are run in this order:
  • prepare - this isn't a test, it creates and loads the tables. For the IO-bound test there were 4 tables with 40M rows per table. The sysbench table has one secondary index.'
  • update-inlist - each update statement updates 100 rows selected by exact match on the PK via an in-list
  • update-one - each update statement updates one row. The same row in the database gets all updates.
  • update-index - uses oltp_update_index.lua to run an update-only workload and secondary index maintenance is required. MyRocks does better on this test because non-unique secondary index maintenance is read-free for it.
  • update-nonindex - uses oltp_update_non_index.lua to run an update-only workload and secondary index maintenance is not required
  • update-nonindex-special - like update-nonindex but uses the special distribution
  • delete - uses oltp_delete.lua to run a sequence of delete row, re-insert row operations. Upstream doesn't use transactions (yet) but the diff to fix that is small.
  • write-only - uses oltp_write_only.lua to get of the writes but none of the reads from oltp_read_write.lua. I have stopped running this test.
  • read-write.rangeX - uses oltp_read_write.lua --range-size=X. I set X too 100 and then 10,000.
  • read-only.rangeX - uses oltp_read_only.lua with --range-size=X. I set X to 10, 100, 1000 and then 10,000. I have changed to only run this for X set to 100 and 10,000. The read-only.range10000 is also run before the update tests so the performance can be measured before fragmentation. The test run before updates is called read-only.pre.
  • point-query - uses oltp_point_select.lua. The workload is to fetch all columns in one row by primary key. This test is run before and after the update tests so the performance can be measured before fragmentation. The test run before updates is called point-query.pre.
  • random-points - uses oltp_inlist_select.lua. Each select statement fetches 100 rows found by exact match on the PK using an in-list. This test is run before and after the update tests so the performance can be measured before fragmentation. The test run before updates is called random-points.pre.
  • scan - this does a full scan of the primary index and is done before and after the update tests to determine the impact from fragmentation.
  • hot-points - uses oltp_inlist_select.lua. Each select statement fetches 100 rows found by exact match on the PK. This is similar to random-points except this fetches the same 100 rows for all queries. The working set for this test is always cached.
  • insert - uses oltp_insert.lua for an insert-only workload
Several of the read-heavy tests are run before and after the write-heavy tests. That is done to understand the impact of fragmentation on query efficiency as the database is fragmented by the write-heavy tests. Some of the read-heavy tests can be slower if the database is still doing B-Tree write back or LSM compaction, and that can happen for tests run before and after the write-heavy tests. While the read-heavy tests are run before the write-heavy tests, they are still run after the load which is also write-heavy and the script above doesn't do much to wait for writes to stop after the load.

Sample command lines:
# myrocks
bash all_small.sh 4 40000000 600 600 300 rocksdb 1 0 ~/b/myrocks/bin/mysql \
    none ~/b/sysbench/share/sysbench

# InnoDB in MySQL 5.6.35 without compression
bash all_small.sh 4 40000000 600 600 300 innodb 1 0 ~/b/orig5635/bin/mysql \
    none ~/b/sysbench/share/sysbench

# InnoDB in MySQL 5.6.35 with zlib compression
bash all_small.sh 4 40000000 600 600 300 innodb 1 0 ~/b/orig5635/bin/mysql \
    key_block_size=8 ~/b/sysbench/share/sysbench

Update-only with secondary index maintenance


The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
4162    6655    8585    myrocks-5635
 543     712     810    innodb-5635
 610     714     816    innodb-5710
- optimized my.cnf without compression
4236    6762    8394    myrocks-5635
 552     687     810    innodb-5635
 579     699     791    innodb-5710
- optimized my.cnf with zlib compression
3540    5665    7148    myrocks-5635
 218     403     533    innodb-5635
 237     389     469    innodb-5710

Summary:
  • MyRocks is best because non-unique secondary index maintenance is write-only (read-free)
  • The optimized my.cnf didn't help
  • Compressed InnoDB is worse than uncompressed InnoDB. The per-index mutex might be the largest problem. This isn't explained by (de)compression latency.
  • InnoDB in MySQL 5.6 and 5.7 have similar performance

Update-only without secondary index maintenance

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
4596    7726    10100   myrocks-5635
2224    2892     3295   innodb-5635
2327    2886     3176   innodb-5710
- optimized my.cnf without compression
4578    7202     9616   myrocks-5635
2887    3835     4364   innodb-5635
2694    3419     4008   innodb-5710
- optimized my.cnf with zlib compression
4031    6320     7998   myrocks-5635
 484     768      888   innodb-5635
 500     687      815   innodb-5710

Summary:
  • MyRocks is best but the difference with InnoDB is smaller because no engines have to do secondary index maintenance for this test.
  • The optimized my.cnf helped InnoDB but hurt MyRocks
  • Compressed InnoDB is much worse than uncompressed InnoDB. The per-index mutex might be the largest problem.
  • InnoDB in MySQL 5.6 is slightly faster than in 5.7

Delete

The numbers in the table are the QPS for 1 thread. I can't use more threads until I update the test script to use transactions for each delete/reinsert pair.

1       concurrency
- default my.cnf
6294    myrocks-5635
 963    innodb-5635
1010    innodb-5710
- optimized my.cnf without compression
5821    myrocks-5635
1077    innodb-5635
1090    innodb-5710
- optimized my.cnf with zlib compression
5453    myrocks-5635
 644    innodb-5635
 588    innodb-5710

Summary:
  • MyRocks is best. That might be because secondary index maintenance is write-only (read-free).
  • The optimized my.cnf didn't make a big difference
  • Compressed InnoDB is much worse than uncompressed InnoDB. The per-index mutex might be the largest problem.
  • InnoDB in MySQL 5.6 and 5.7 have similar performance

Write-only

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
8813    12937   16351   myrocks-5635
1808     2151    2344   innodb-5635
1895     2172    2340   innodb-5710
- optimized my.cnf without compression
8062    12565   16010   myrocks-5635
1886     2158    2644   innodb-5635
1830     2160    2493   innodb-5710
- optimized my.cnf with zlib compression
7250    10977   13965   myrocks-5635
 791     1030    1214   innodb-5635
 724      972    1088   innodb-5710

Summary:
  • MyRocks is best. That might be because secondary index maintenance is write-only (read-free).
  • The optimized my.cnf didn't make a big difference
  • Compressed InnoDB is much worse than uncompressed InnoDB. The per-index mutex might be the largest problem.
  • In some cases InnoDB in MySQL 5.6 is slightly faster than 5.7

Read-write with --range-size=100

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
4740    8490    11516   myrocks-5635
2800    3649     4445   innodb-5635
3392    4424     5036   innodb-5710
- optimized my.cnf without compression
5480    9508    13005   myrocks-5635
3370    4729     5555   innodb-5635
3573    4814     5614   innodb-5710
- optimized my.cnf with zlib compression
5157    8525    11883   myrocks-5635
1756    2415     2961   innodb-5635
1637    2324     2769   innodb-5710

Summary:
  • MyRocks is best because it processes writes faster than InnoDB and the writes dominate the workload
  • The optimized my.cnf helped
  • Compressed InnoDB is much worse than uncompressed InnoDB, perhaps because of the per-index mutex.
  • InnoDB in MySQL 5.6 and 5.7 have similar performance

Read-write with --range-size=10000

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
174     305     416     myrocks-5635
208     348     484     innodb-5635
346     550     723     innodb-5710
- optimized my.cnf without compression
207     370     500     myrocks-5635
241     425     596     innodb-5635
374     607     792     innodb-5710
- optimized my.cnf with zlib compression
194     363     469     myrocks-5635
168     298     416     innodb-5635
213     369     484     innodb-5710

Summary:
  • Compared to oltp_read_write.lua with --range-size=100, the bottleneck here switches from writes to reads and InnoDB is faster than MyRocks for long range scans.
  • The optimized my.cnf helps MyRocks
  • Compressed InnoDB is much worse than uncompressed InnoDB. The largest problem might be the per-index mutex but another problem is (de)compression latency.
  • InnoDB in MySQL 5.7 is much faster than in 5.6.

Read-only with --range-size=10

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
 8407   13967   19378   myrocks-5635
 7725   14452   21220   innodb-5635
 9962   17029   24023   innodb-5710
- optimized my.cnf without compression
 8089   13189   16243   myrocks-5635
11219   19683   27638   innodb-5635
11536   19198   26926   innodb-5710
- optimized my.cnf with zlib compression
 8076   12884   18263   myrocks-5635
 7549   12009   17570   innodb-5635
 7380   11700   17173   innodb-5710

Summary:
  • InnoDB is faster than MyRocks because of the read overhead from an LSM.
  • MyRocks QPS on the read-only tests has extra variance because the state of the memtable (number of keys) and L0 (number of files) has more variance when the read-only test follows write heavy tests.
  • On this test InnoDB does better than MyRocks on my older/slower CPU (5th generation core i3) that I have at home while MyRocks does better on newer Intel Xeon CPUs I get at work.
  • Compressed InnoDB is worse than uncompressed InnoDB. I assume this is decompression latency.
  • InnoDB in MySQL 5.6 and 5.7 has similar performance

Read-only with --range-size=100

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
5130     8834   12320   myrocks-5635
5818    10236   15274   innodb-5635
7682    13208   18709   innodb-5710
- optimized my.cnf without compression
5683     9477   12781   myrocks-5635
7455    13236   19261   innodb-5635
8514    14517   20507   innodb-5710
- optimized my.cnf with zlib compression
5445     8872   12927   myrocks-5635
5140     8150   12456   innodb-5635
5432     8463   12911   innodb-5710

Summary:
  • See summary for oltp_read_only.lua with --range-size=10
  • The difference between InnoDB and MyRocks gets larger as the range scan gets larger

Read-only with --range-size=10000

The numbers in the table are the QPS for 1, 2 and 4 threads.
1       2       4       concurrency
- default my.cnf
138     240     325     myrocks-5635
175     314     445     innodb-5635
306     519     643     innodb-5710
- optimized my.cnf without compression
161     285     363     myrocks-5635
208     378     524     innodb-5635
324     520     685     innodb-5710
- optimized my.cnf with zlib compression
156     290     377     myrocks-5635
146     270     365     in-5636.zlib.opt
191     349     430     innodb-5710

Summary:
  • See summary for oltp_read_only.lua with --range-size=10
  • The difference between InnoDB and MyRocks gets larger as the range scan gets larger
  • InnoDB in MySQL 5.7 is faster than in 5.6. Something was done to make range scans faster.

Point-select

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
10839   18506   25871   myrocks-5635
 8992   16318   24090   innodb-5635
11332   19621   27239   innodb-5710
- optimized my.cnf without compression
10157   16694   22271   myrocks-5635
12944   23440   33489   innodb-5635
13123   22321   31288   innodb-5710
- optimized my.cnf with zlib compression
 9790   16486   22853   myrocks-5635
 7878   13071   18917   innodb-5635
 7484   12358   17911   innodb-5710

Summary:
  • InnoDB is faster than MyRocks
  • The optimized my.cnf helps InnoDB but hurts MyRocks
  • Compressed InnoDB is slower than uncompressed InnoDB. I assume this is decompression latency.
  • InnoDB in MySql 5.6 and 5.7 has similar performance

Insert-only

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
7351    10654   12527   myrocks-5635
2120     6343    8264   innodb-5635
2694     6337    7784   innodb-5710
- optimized my.cnf without compression
7556    11341   12556   myrocks-5635
2356     4437    4540   innodb-5635
2227     4615    4807   innodb-5710
- optimized my.cnf with zlib compression
7210    10816   11273   myrocks-5635
3837     4732    5370   innodb-5635
2979     3885    4528   innodb-5710

Summary:
  • MyRocks is faster perhaps because secondary index maintenance is write-only (read-free).
  • The optimized my.cnf hurts InnoDB
  • Compressed InnoDB is slower than uncompressed InnoDB. I assume this is from both the per-index mutex and (de)compression latency.
  • InnoDB in MySQL 5.6 and 5.7 has similar performance

No comments:

Post a Comment

RocksDB on a big server: LRU vs hyperclock, v2

This post show that RocksDB has gotten much faster over time for the read-heavy benchmarks that I use. I recently shared results from a lar...