Wednesday, November 29, 2017

Sysbench, IO-bound, small server: MyRocks and InnoDB

In this post I compare MyRocks and InnoDB using IO-bound sysbench and a small server. The goal is to understand where MyRocks differs from InnoDB.  I previously published more results for many versions of MyRocks and InnoDB. Here I use MyRocks from June 2017 and InnoDB from upstream 5.6.35 and 5.7.17.

tl;dr
  • There is more variance in QPS on IO-bound sysbench than on in-memory sysbench
  • Not much QPS is lost when compression is used with MyRocks
  • Two things look better in MySQL 5.7 -- InnoDB range scans and optimization of queries with large in-lists.
  • For many of the workloads InnoDB writes between 5X and 20X more to storage per transaction. An SSD will last longer with MyRocks.
  • Full-scan perf from MyRocks without compression matches InnoDB-5.7 and is much better than InnoDB-5.6 when filesystem readahead is enabled. Now we need to make that feature work for real.
  • MyRocks QPS was >= InnoDB QPS on most of the write-heavy tests. Read-free index maintenance makes a big difference for MyRocks on some of them.
  • InnoDB QPS was >= MyRocks on most of the range-scan tests (read-only, read-write)
  • InnoDB QPS was > MyRocks on the point-query tests. The 5.7 optimizer might help here.
  • InnoDB QPS was > MyRocks on most of the inlist-query tests

Configuration

The tests used MyRocks from FB MySQL which is currently based on MySQL 5.6.35. The build is from June 16 with git hash 52e058 for FB MySQL and 7e5fac for RocksDB. Upstream 5.6.35 and 5.7.17 were used for InnoDB.
All tests used jemalloc with mysqld. My use of sysbench is described here. The my.cnf files are here for the i3 NUC and i5 NUC. I tried to tune my.cnf for all engines. For all tests the binlog was enabled but fsync was disabled for the binlog and database redo log.
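The actual my.cnf files are linked above. A minimal sketch of the durability-related settings described here (binlog enabled, fsync disabled for both the binlog and the InnoDB redo log) might look like the following; this is an illustration, not a copy of the files I used:

```ini
[mysqld]
# enable the binlog but do not fsync it on commit
log-bin
sync_binlog=0
# write the redo log to the OS page cache on commit, fsync ~once per second
innodb_flush_log_at_trx_commit=2
```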

Sysbench is run with 2 tables, 80M rows/table on the i3 NUC and 160M rows/table on the i5 NUC. Each test is repeated for 1 and 2 clients. Each test runs for 600 seconds except for the insert-only test which runs for 300 seconds. The database is much larger than RAM.
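My sysbench helper scripts are linked above. Purely as an illustration of the shape of a run (the script name and options below are assumptions, not the ones I used), an IO-bound read-write run on the i3 NUC could be invoked like:

```shell
# hypothetical sysbench 1.0 invocation: 2 tables, 80M rows each, 1 client, 600s
sysbench oltp_read_write --mysql-user=root --mysql-db=test \
  --tables=2 --table-size=80000000 --range-size=100 \
  --threads=1 --time=600 run
```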

I repeat tests on an i3 NUC and an i5 NUC. The i5 NUC has more RAM, a faster SSD and a faster CPU than the i3 NUC. But I disabled turbo boost on the i5 NUC many months ago to reduce variance in performance, so the difference in CPU performance between these servers is now smaller. InnoDB depends more than MyRocks on IO performance, and in some tests below MyRocks does much better relative to InnoDB on the i3 NUC, which has the slower SSD, than on the i5 NUC.

Tests are repeated for MyRocks without and with compression. The compression configuration is none for L0/L1/L2 and then with LZ4 for the middle levels of the LSM tree and then zstandard for the max level. In the rest of this post that is described as zstandard compression.
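In MyRocks, per-level compression is set through the RocksDB column family options. A hypothetical my.cnf fragment for the layout described above (no compression for L0-L2, LZ4 in the middle, zstandard at the max level of a 7-level tree) could look like this; the real files I used are linked above:

```ini
# hypothetical fragment: per-level compression for the default column family
rocksdb_default_cf_options=compression_per_level=kNoCompression:kNoCompression:kNoCompression:kLZ4Compression:kLZ4Compression:kLZ4Compression:kZSTD
```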

Results

All of the data for the tests is on github for the i3 NUC and the i5 NUC. Results for each test are listed separately below. The graphs have the relative QPS where that is the QPS for a configuration relative to the base case. The base case is InnoDB from upstream 5.6.35. The base case is faster when the relative QPS is less than 1. The tables that follow have the absolute and relative QPS.
Graphs

The graphs have the QPS relative to the InnoDB from MySQL 5.6.35. The name MyRocks.none is for MyRocks without compression and MyRocks.zstd is for MyRocks with zstandard compression.
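The relative QPS is computed the same way everywhere: each configuration's result divided by the InnoDB-5.6 result. A small sketch using the i3 NUC full-scan rates from the before write-heavy table below:

```python
# relative QPS = result for a configuration / result for the base case (InnoDB-5.6)
base = "InnoDB-5.6"

# full-scan Mrps on the i3 NUC, before the write-heavy tests
mrps = {
    "InnoDB-5.6": 1.019,
    "InnoDB-5.7": 0.320,
    "MyRocks.none": 0.879,
    "MyRocks.none.ra": 1.927,
}

relative = {engine: round(v / mrps[base], 2) for engine, v in mrps.items()}
print(relative)
```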

There are 4 types of tests and graphs for each type: write-heavy, scan-heavy, point-query, inlist-query. The results within each group are not as similar as for the in-memory tests, so there are more graphs here. The tests are explained here.

The write-heavy group includes update-inlist, update-one, update-index, update-nonindex, delete and insert. The graphs are the relative QPS for update-index, update-nonindex and insert. MyRocks does 3X to 10X better than InnoDB on update-index because non-unique secondary index maintenance is read-free for it.
The scan-heavy group includes a full scan of the PK index, read-write with range-size set to 100 and 10,000 and then read-only with range-size set to 100 and 10,000. The graphs are the relative QPS for read-write with range-size=100, read-only with range-size=10,000 and then full-scan. The results for read-only and full-scan are for the test run after the write-heavy tests. The results for full-scan include two extra configurations that both enable filesystem readahead during the scan: MyRocks.none.ra, MyRocks.zstd.ra.
The point-query group includes the point-query test run before and then after the write-heavy tests. The graph is the relative QPS for the test run after the write-heavy tests.
The inlist-query group includes the hot-points test and the random-points tests. The random-points result is from the test run after the write-heavy tests. The graph is the relative QPS.
full-scan

The sections that follow have the QPS and relative QPS. The relative QPS is the QPS for the test with 1 client relative to the QPS for InnoDB from MySQL 5.6.35 (InnoDB-5.6). Values are provided for the i3 and i5 NUC. 

For full-scan, results are provided for two extra configurations that both enable filesystem readahead during the scan: MyRocks.none.ra and MyRocks.zstd.ra. As seen below, readahead is great for full-scan. Unfortunately that feature is not ready for production. I don't know whether filesystem readahead is the right solution in this case, but it was easy to use for a benchmark. For a range scan there is an iterator open on each level of the LSM tree and RocksDB does page-at-a-time reads. For a long range scan those reads could be much larger than one page at a time. Issue 723 is open for this.
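Filesystem readahead here was enabled outside of MySQL. One OS-level mechanism a process can use to ask Linux for more aggressive readahead is posix_fadvise; the sketch below is just an illustration of that mechanism, not what MyRocks or my benchmark scripts do:

```python
import os

def read_sequential(path, chunk=1 << 20):
    """Scan a file after hinting the kernel that access will be sequential."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # POSIX_FADV_SEQUENTIAL asks the kernel to enlarge the readahead window
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        total = 0
        while True:
            buf = os.read(fd, chunk)
            if not buf:
                break
            total += len(buf)
        return total
    finally:
        os.close(fd)
```

For block devices the readahead window can also be set globally, e.g. with blockdev; the right setting depends on the device and workload.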

The full scan of the PK index is done before and after the write-heavy tests. Note that the tests run before the write-heavy tests still run immediately after the initial load, so in both cases there can be page write-back activity for InnoDB and compaction for MyRocks. But from data I haven't shared, there were no writes in progress on the i5 NUC.

Summary of full-scan throughput:

  • The best perf is from MyRocks without compression when filesystem readahead is used
  • InnoDB-5.7 is better than InnoDB-5.6 on the i5 NUC but worse than it on the i3 NUC
Legend:
* Mrps - scan rate in millions of rows per second
* ratio - ratio of Mrps for this engine vs InnoDB-5.6

before write-heavy
i3 NUC          i5 NUC
Mrps    ratio   Mrps    ratio   engine
1.019   1.00    1.766   1.00    InnoDB-5.6
0.320   0.31    2.424   1.37    InnoDB-5.7
0.879   0.86    1.194   0.68    MyRocks.none
1.927   1.89    2.318   1.31    MyRocks.none.ra
0.714   0.70    0.860   0.49    MyRocks.zstd
1.006   0.99    1.280   0.72    MyRocks.zstd.ra

after write-heavy
i3 NUC          i5 NUC
Mrps    ratio   Mrps    ratio   engine
0.914   1.00    1.786   1.00    InnoDB-5.6
0.829   0.91    2.406   1.35    InnoDB-5.7
0.610   0.67    1.126   0.63    MyRocks.none
0.969   1.06    2.133   1.19    MyRocks.none.ra
0.477   0.52    0.816   0.46    MyRocks.zstd
0.963   1.05    1.212   0.68    MyRocks.zstd.ra

Data from iostat and vmstat helps to understand the performance differences and the benefit from readahead for MyRocks. We need to make that work for real. On the i5 NUC:
  • InnoDB-5.6 and MyRocks use ~2X more CPU per row than InnoDB-5.7
  • InnoDB-5.7 gets more IO throughput: ~2.1X more vs MyRocks, ~1.4X more vs InnoDB-5.6
  • Filesystem readahead reduces the CPU overhead and increases the IO read rate for MyRocks

Legend:
* CPU.avg - average CPU utilization
* CPU/row - cost per row scanned
* rGB - GB read from storage
* rMB/s - average read IO rate from storage

before write-heavy
CPU.avg CPU/row rGB     rMB/s   engine
28.0    48.96   57      219.6   MyRocks.none
50.5    22.73   58      437.6   MyRocks.none.ra
45.4    53.91   59      337.4   InnoDB-5.6
27.7    24.26   59      460.0   InnoDB-5.7

after write-heavy
CPU.avg CPU/row rGB     rMB/s   engine
27.2    50.30   60      219.1   MyRocks.none
49.4    24.06   62      427.2   MyRocks.none.ra
46.0    54.01   59      338.8   InnoDB-5.6
29.0    25.36   59      457.6   InnoDB-5.7

update-inlist

MyRocks does better than InnoDB and the difference here is larger than for update-nonindex. Both tests do updates that don't require secondary index maintenance, but this test updates 100 rows/statement versus 1 row/statement for update-nonindex. So less of the time per statement is spent above the storage engine here, and a more efficient engine has more of an impact. On the i5 NUC:
  • CPU/update is ~1.5X larger for InnoDB-5.6 and ~1.1X larger for InnoDB-5.7 than MyRocks
  • KB written to storage per update is ~10X larger for InnoDB than MyRocks

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
212     1.00    307     1.00    InnoDB-5.6
213     1.00    392     1.28    InnoDB-5.7
445     2.10    430     1.40    MyRocks.none
363     1.71    458     1.49    MyRocks.zstd

update-one

InnoDB does better than MyRocks. This test updates the same row repeatedly. There are no reads from storage from processing the update, but MyRocks compaction might read from storage. The update statement doesn't require secondary index maintenance. On the i5 NUC:
  • CPU/update is ~1.1X larger for InnoDB-5.7 and MyRocks than InnoDB-5.6
  • KB written to storage per update is similar for MyRocks and InnoDB

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
9120    1.00    10988   1.00    InnoDB-5.6
7839    0.86     9570   0.87    InnoDB-5.7
7656    0.84     8508   0.77    MyRocks.none
7774    0.85     8749   0.80    MyRocks.zstd

update-index

MyRocks is 3X to 10X faster than InnoDB because secondary index maintenance is read-free. On the i5 NUC:
  • CPU/update is ~4X larger for InnoDB than MyRocks
  • KB written to storage per update is ~30X larger for InnoDB than MyRocks
  • Storage read operations per update are ~2X larger for InnoDB than MyRocks

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
 220     1.00    850    1.00    InnoDB-5.6
 312     1.42    924    1.09    InnoDB-5.7
2477    11.26   3004    3.53    MyRocks.none
2296    10.44   2778    3.27    MyRocks.zstd

update-nonindex

MyRocks and InnoDB have similar performance here. Secondary index maintenance isn't done for this test, so MyRocks doesn't gain from read-free index maintenance. See the comment for update-inlist. On the i5 NUC:
  • Storage read operations per update are ~0.9 for InnoDB and MyRocks
  • CPU/update is similar for InnoDB and MyRocks
  • KB written to storage per update is ~5X larger for InnoDB than MyRocks

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
1526    1.00    2683    1.00    InnoDB-5.6
1556    1.02    2773    1.03    InnoDB-5.7
1391    0.91    2888    1.08    MyRocks.none
2237    1.47    2715    1.01    MyRocks.zstd

delete

MyRocks does better. On the i5 NUC:
  • CPU/delete is ~2X larger for InnoDB than MyRocks
  • KB written to storage per delete is ~13X larger for InnoDB than MyRocks

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
2729    1.00    5068    1.00    InnoDB-5.6
2763    1.01    5115    1.01    InnoDB-5.7
7666    2.81    8790    1.73    MyRocks.none
7281    2.67    8268    1.63    MyRocks.zstd

read-write with range-size=100

MyRocks is slightly better than InnoDB. I was surprised by this and my guess is that the MyRocks advantage on insert/update/delete is larger than the InnoDB advantage on range scans. Much of the write IO here can be for rows modified by the previous tests. On the i5 NUC:
  • CPU/query is ~1.2X larger for InnoDB than MyRocks
  • KB written to storage per query is ~20X larger for InnoDB than MyRocks

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
2028    1.00    2799    1.00    InnoDB-5.6
2208    1.09    3077    1.10    InnoDB-5.7
2565    1.26    3126    1.12    MyRocks.none
2517    1.24    2901    1.04    MyRocks.zstd

read-write with range-size=10000

InnoDB-5.7 is the best here. Unlike the previous test, it is faster than MyRocks because the range scan here is longer (10,000 vs 100 rows). Someone improved range scan performance for InnoDB in 5.7. On the i5 NUC:
  • CPU/query is ~2X larger for MyRocks and InnoDB-5.6 than InnoDB-5.7
  • KB written to storage per query is ~12X larger for InnoDB than MyRocks

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
169     1.00    217     1.00    InnoDB-5.6
257     1.52    335     1.54    InnoDB-5.7
146     0.86    202     0.93    MyRocks.none
132     0.78    163     0.75    MyRocks.zstd

read-only with range-size=100

InnoDB is faster here while MyRocks did better on read-write with range-size=100. I assume this is explained by MyRocks benefiting from faster insert, update, delete on the read-write test. InnoDB in 5.7 continues to benefit from improvements to range-scan performance. On the i5 NUC:
  • CPU/query was ~1.4X larger for InnoDB-5.6 and ~1.8X larger for MyRocks than InnoDB-5.7

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
2885    1.00    3687    1.00    InnoDB-5.6
3272    1.13    4954    1.34    InnoDB-5.7
2464    0.85    3011    0.82    MyRocks.none
2406    0.83    2779    0.75    MyRocks.zstd

read-only.pre with range-size=10000

InnoDB in 5.7 continues to benefit from improvements to range-scan performance. MyRocks.none did better than I expected for this test, perhaps because it is run before the write-heavy tests. On the i5 NUC:
  • CPU/query was ~1.7X larger for InnoDB-5.6 and ~1.5X larger for MyRocks than InnoDB-5.7
  • Storage read operations and read KB per query were ~1.2X larger for InnoDB than MyRocks

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
139     1.00    173     1.00    InnoDB-5.6
212     1.53    273     1.58    InnoDB-5.7
137     0.99    174     1.01    MyRocks.none
112     0.81    138     0.80    MyRocks.zstd

read-only with range-size=10000

InnoDB in MySQL 5.7 continues to have the best range-scan performance. The MyRocks QPS here is lower than for the same test in the previous section. The tests in the previous section run before the write-heavy tests while the tests here run after them, and it costs more to search the LSM structures after random updates. I have written more about mistakes to avoid when doing a benchmark with an LSM. On the i5 NUC:
  • CPU/query is 1.06X larger here for MyRocks.none compared to the previous test.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
140     1.00    178     1.00    InnoDB-5.6
210     1.50    275     1.54    InnoDB-5.7
102     0.73    166     0.93    MyRocks.none
105     0.75    132     0.74    MyRocks.zstd

point-query.pre

InnoDB continues to get the best QPS but the difference between MySQL 5.6 and 5.7 is smaller than it was for range queries. On the i5 NUC:
  • CPU/query is ~1.2X larger for InnoDB-5.6 and ~1.4X larger for MyRocks than InnoDB-5.7
  • Storage read operations per query are ~1.06X larger for InnoDB than MyRocks.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
3879    1.00    5250    1.00    InnoDB-5.6
4264    1.10    6198    1.18    InnoDB-5.7
4177    1.08    4370    0.83    MyRocks.none
3354    0.86    3969    0.76    MyRocks.zstd

point-query

InnoDB continues to get the best QPS but the difference between MySQL 5.6 and 5.7 is smaller than it was for range queries. The MyRocks QPS here is lower than for the same test in the previous section, which is expected for read-heavy tests that follow write-heavy tests. On the i5 NUC the iostat and vmstat metrics are similar to the result above for point-query.pre with two exceptions:
  • CPU/query is ~1.5X larger for MyRocks than InnoDB-5.7. This is larger than above.
  • Storage read operations/query are similar rather than 6% larger for InnoDB

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
3896    1.00    5310    1.00    InnoDB-5.6
4332    1.11    6155    1.16    InnoDB-5.7
2361    0.61    3966    0.75    MyRocks.none
2742    0.70    3707    0.70    MyRocks.zstd

random-points.pre

InnoDB-5.7 does the best. It is odd that MyRocks does better than InnoDB-5.6 here but not above on point-query. Both do point lookups but this test does 100 lookups per SELECT while the point-query tests do 1 per SELECT. On the i5 NUC:
  • CPU/query is ~4.8X larger for InnoDB-5.6 and ~3.4X larger for MyRocks than InnoDB-5.7. I wonder if a change in the optimizer explains this because this looks like a difference between MySQL 5.6 and 5.7 rather than between engines.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
56      1.00     50     1.00    InnoDB-5.6
69      1.23    105     2.10    InnoDB-5.7
65      1.16     57     1.14    MyRocks.none
50      0.89     61     1.22    MyRocks.zstd

random-points

Results are similar to random-points.pre: InnoDB-5.7 does the best and MySQL 5.6 uses more CPU/query than 5.7. On the i5 NUC:
  • CPU/query is ~3.5X larger for InnoDB-5.6 and ~2.6X larger for MyRocks than InnoDB-5.7. See the comment in the previous section.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
56      1.00     64     1.00    InnoDB-5.6
51      0.91    100     1.56    InnoDB-5.7
36      0.64     62     0.97    MyRocks.none
41      0.73     57     0.89    MyRocks.zstd

hot-points

InnoDB gets at least 3X more QPS than MyRocks. This test is always in-memory and the QPS here is similar to the QPS from in-memory sysbench. On the i5 NUC:
  • CPU/query is ~3X larger for MyRocks than InnoDB.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
3609    1.00    4167    1.00    InnoDB-5.6
3455    0.96    3771    0.90    InnoDB-5.7
1000    0.28    1125    0.27    MyRocks.none
1117    0.31    1088    0.26    MyRocks.zstd

insert

MyRocks gets a higher insert rate. On the i5 NUC:
  • Storage read operations per insert are 6X to 8X larger for InnoDB than MyRocks
  • KB written to storage per insert are ~3X larger for InnoDB than MyRocks
  • CPU/insert is ~1.1X larger for InnoDB-5.7 and MyRocks than InnoDB-5.6

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
5097    1.00    7991    1.00    InnoDB-5.6
5730    1.12    7234    0.91    InnoDB-5.7
7867    1.54    8652    1.08    MyRocks.none
7828    1.54    8299    1.04    MyRocks.zstd
