Friday, September 1, 2017

In-memory sysbench, a larger server and contention - part 2

In this post I document performance problems in MyRocks, InnoDB and TokuDB using in-memory sysbench on a large server. I previously shared results for in-memory sysbench with less and more contention. In this post I explain the tests where QPS drops significantly when moving from a test with 8 tables to 1 table. In a future post I will repeat the analysis for IO-bound sysbench. Note that while I don't include InnoDB from MySQL 8.0.2 in this analysis, it is similar to 5.7.17.

Which engines lose QPS because of contention on the sysbench 1 table tests at high concurrency?
  • all engines lose QPS on the update-one test
  • InnoDB and TokuDB lose QPS on the random-points test. MyRocks does not.
  • all engines lose QPS on the hot-points test
  • InnoDB and TokuDB lose QPS on the insert-only test. MyRocks does not.
While I previously wrote that in-memory sysbench is the worst-case for MyRocks, it is interesting to find a few cases where MyRocks does better than InnoDB.

Configuration

I compare results from in-memory sysbench using 8 tables and 1 table. There is more contention on internal data structures and rows when sysbench uses 1 table rather than 8 for tests run at mid and high concurrency. I explained these tests in previous posts on sysbench with 8 tables and 1 table. I repeated tests using 1 to 64 clients on a server with 48 HW threads, and I consider 32 or more clients to be high concurrency, 8 clients to be mid concurrency and 1 client to be low concurrency.

I run many (~10) sysbench tests (microbenchmarks) because modern sysbench makes that easy with Lua (thanks, Alexey). Here I show the tests where QPS at high concurrency suffers with 1 table, because fewer tables mean more contention on internal data structures, database pages and rows. The tests for which contention is a problem are update-one, random-points, hot-points and insert-only. My usage of sysbench is explained here, but I will briefly describe these tests:
  • update-one - all updates are to the same row in each table (the row with id=1). For the test with 1 table there is only one row that gets all updates which becomes a hot spot.
  • random-points - each query is a SELECT statement with an in-list that matches 100 rows by equality on the PK. The test used uniform distribution to generate the keys to find so there are no row hot spots, but there is a table hot spot when sysbench is run with one table.
  • hot-points - like random-points, but every query searches for the same 100 keys. So this has row hot spots.
  • insert-only - inserts are done in PK order. Secondary index maintenance is required, and values for the secondary index column (k) are inserted in random order. There can be hot spots on the right-hand side of the PK index.
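The four tests above map to simple SQL patterns. Below is a minimal Python sketch of the statements each one issues; the sbtest1 table and column names (id, k, c, pad) follow sysbench conventions, but the exact statements generated by sysbench's Lua scripts may differ:

```python
import random

# Hedged sketch of the SQL each microbenchmark issues. The sbtest1
# schema (id PK, k secondary index, c, pad) follows sysbench conventions.

def update_one():
    # Every client updates the row with id=1 -> a single row hot spot.
    return "UPDATE sbtest1 SET c = 'x' WHERE id = 1"

def random_points(nrows=8000000, inlist=100):
    # 100 uniform-random PK values -> no row hot spot, but a table
    # hot spot when there is only one table.
    ids = ", ".join(str(random.randint(1, nrows)) for _ in range(inlist))
    return "SELECT c FROM sbtest1 WHERE id IN (%s)" % ids

def hot_points(inlist=100):
    # The same 100 PK values in every query -> row hot spots.
    ids = ", ".join(str(i) for i in range(1, inlist + 1))
    return "SELECT c FROM sbtest1 WHERE id IN (%s)" % ids

def insert_only(next_id, nrows=8000000):
    # PK values ascend while the secondary index column k gets random
    # values -> possible hot spot at the right edge of the PK index.
    k = random.randint(1, nrows)
    return ("INSERT INTO sbtest1 (id, k, c, pad) "
            "VALUES (%d, %d, '', '')" % (next_id, k))
```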

Guide to results

Below I share QPS for each test at low, mid and high concurrency where low is 1 connection, mid is 8 connections and high is 48 connections. The database is cached and sysbench shares the server with mysqld. There is no think time in the sysbench client when running a test, there are no stalls for reads from storage because all data fits in the database cache, and there are only a few chances for stalls on writes.

For each test I list the QPS at 1, 8 and 48 connections twice - first for sysbench run with 8 tables and then for it run with 1 table. With 8 tables there are 1M rows per table and with 1 table there are 8M rows in that table. I used MyRocks based on MySQL 5.6.35, InnoDB from upstream MySQL 5.6.35 and 5.7.17, and TokuDB from Percona Server 5.7.17.

After the QPS results there is a section that lists QPS ratios where I highlight how QPS drops when moving from 8 tables to 1 table. When the QPS ratio is less than 1.0 there might be a performance problem.

update-one

For this test the QPS ratio section has the QPS for the engine at 1 table divided by the QPS for the engine at 8 tables. For this test all engines have a problem at mid and high concurrency as the QPS ratios are less than 0.5. Can I be happy that MyRocks suffers the least? This is a hard problem to fix because updates to one row must be serialized. For all tests the binlog was enabled and sync-on-commit was disabled for the binlog and database log. I hope that commutative updates are eventually supported in MyRocks to improve QPS for concurrent updates to a few rows.

QPS
1       8       48      concurrency/engine
- 8 tables
 8672   43342   39902   myrocks
10472   49717   52468   inno5635
 9670   51181   62626   inno5717
 2912   13736   19551   toku5717
- 1 table
 9072   17348   13055   myrocks
10521   17092   13288   inno5635
 9535   14411   13019   inno5717
 2926    3254    3077   toku5717

QPS ratio
rocks   inno56  inno57  toku
1.046   1.004   0.986   1.004   1 connection - low concurrency
0.400   0.343   0.281   0.236   8 connections - mid concurrency
0.327   0.253   0.207   0.157   48 connections - high concurrency
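As a check on the arithmetic, the MyRocks row of the ratio table above can be reproduced from the QPS tables: each ratio is the 1-table QPS divided by the 8-table QPS at the same concurrency level.

```python
# Recompute the update-one QPS ratios for MyRocks from the QPS tables
# above: 1-table QPS divided by 8-table QPS at each concurrency level.
qps_8tab = {1: 8672, 8: 43342, 48: 39902}   # update-one, 8 tables
qps_1tab = {1: 9072, 8: 17348, 48: 13055}   # update-one, 1 table

ratios = {c: round(qps_1tab[c] / qps_8tab[c], 3) for c in qps_8tab}
# ratios -> {1: 1.046, 8: 0.4, 48: 0.327}, matching the table above
```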

random-points

For this test the QPS ratio section has the QPS for the engine at 1 table divided by the QPS for the engine at 8 tables. For this test MyRocks does not have a problem for 1 table while InnoDB and TokuDB have a small problem at mid concurrency and a big problem at high concurrency. PMP output for TokuDB with 1 table & 48 connections is here and shows mutex contention. PMP output for InnoDB with 1 table & 48 connections is here and shows contention on rw-locks.

QPS
1       8       48      concurrency/engine
- 8 tables
 897     6871   23189   myrocks
2028    12693   16358   inno5635
1872    13925   47773   inno5717
1529    11824   36786   toku5717
- 1 table
 972     7411   25003   myrocks
1910    10313   12239   inno5635
1764    11931   17690   inno5717
1400     8669    8401   toku5717

QPS ratio
rocks   inno56  inno57  toku
1.083   0.941   0.942   0.915   1 connection - low concurrency
1.078   0.812   0.856   0.733   8 connections - mid concurrency
1.078   0.748   0.370   0.228   48 connections - high concurrency

hot-points

For this test the QPS ratio section is different from the ones above. It has two sets of numbers -- one for 8 tables and one for 1 table. The values are the QPS for the test divided by the QPS for the random-points test at 8 tables. When the value is less than one the engine gets less QPS than expected for this test.

For both 8 tables and 1 table all engines get less QPS on the hot-points test than on the random-points test. The loss is much greater for the 1 table test than the 8 table test. I filed issue 674 for MyRocks to make this better, but it really is an issue with RocksDB and mutex contention in the sharded LRU. PMP output for TokuDB with 1 table and 48 connections is here and it looks like the same problem as for random-points. PMP output for InnoDB with 1 table and 48 connections is here and the problem might be the same as in random-points.

QPS
1       8       48      concurrency/engine
- 8 tables
1376    10256   28762   myrocks
2863    13588   15630   inno5635
2579    17899   50430   inno5717
1989    14091   36737   toku5717
- 1 table
1577     8489    8691   myrocks
2845     8787   10947   inno5635
2574    11904   16505   inno5717
1802     7318    7788   toku5717

QPS ratio for 8 tables
rocks   inno56  inno57  toku
1.534   1.411   1.377   1.300   1 connection - low concurrency
1.492   1.070   1.285   1.191   8 connections - mid concurrency
1.240   0.955   1.055   0.998   48 connections - high concurrency

QPS ratio for 1 table
rocks   inno56  inno57  toku
1.758   1.402   1.375   1.178   1 connection - low concurrency
1.235   0.692   0.854   0.618   8 connections - mid concurrency
0.374   0.669   0.345   0.211   48 connections - high concurrency
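To make the denominator concrete, the MyRocks entries at 48 connections can be recomputed from the QPS tables above; both sets divide by the random-points QPS at 8 tables. The published ratios appear to be truncated rather than rounded to three decimals, which is an assumption here.

```python
# Recompute the MyRocks hot-points ratios at 48 connections. The
# denominator for BOTH ratio tables is random-points QPS at 8 tables.
rp_8tab_48 = 23189   # random-points, 8 tables, 48 connections
hp_8tab_48 = 28762   # hot-points, 8 tables, 48 connections
hp_1tab_48 = 8691    # hot-points, 1 table, 48 connections

# Truncate (not round) to 3 decimals to match the published tables.
trunc3 = lambda x: int(x * 1000) / 1000

ratio_8tab = trunc3(hp_8tab_48 / rp_8tab_48)   # 1.240 in the table
ratio_1tab = trunc3(hp_1tab_48 / rp_8tab_48)   # 0.374 in the table
```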

insert-only

For this test the QPS ratio section has the QPS for the engine at 1 table divided by the QPS for the engine at 8 tables. For this test MyRocks does not lose QPS while InnoDB and TokuDB do. For all tests the binlog was enabled and sync-on-commit was disabled for the binlog and database log. While I used PMP to explain the performance problems above, I won't do that here for TokuDB and InnoDB.

QPS
1       8       48      concurrency/engine
- 8 tables
 9144   46466    65777  myrocks
12317   59811    59971  inno5635
10539   61522   115598  inno5717
 3199   17164    34043  toku5717
- 1 table
 9329   47629    67704  myrocks
12273   55445    37180  inno5635
10529   61235    59690  inno5717
 3156   17193    25754  toku5717

QPS ratio
rocks   inno56  inno57  toku
1.020   0.996   0.999   0.986   1 connection - low concurrency
1.025   0.927   0.995   1.001   8 connections - mid concurrency
1.029   0.619   0.516   0.756   48 connections - high concurrency
