Which engines lose QPS because of contention on the sysbench 1 table tests at high concurrency?
- all engines lose QPS on the update-one test
- InnoDB and TokuDB lose QPS on the random-points test. MyRocks does not.
- all engines lose QPS on the hot-points test
- InnoDB and TokuDB lose QPS on the insert-only test. MyRocks does not.
While I previously wrote that in-memory sysbench is the worst case for MyRocks, it is interesting to find a few cases where MyRocks does better than InnoDB.
Configuration
I compare results from in-memory sysbench using 8 tables and 1 table. There is more contention on internal data structures and rows when sysbench uses 1 table rather than 8 for tests run at mid and high concurrency. I explained these tests in previous posts on sysbench with 8 tables and 1 table. I repeated tests using 1 to 64 clients on a server with 48 HW threads. I consider 32 or more clients to be high concurrency, 8 clients to be mid concurrency and 1 client to be low concurrency.
I run many (~10) sysbench tests (microbenchmarks) because modern sysbench makes that easy with Lua (thanks Alexey). Here I show tests where QPS at high concurrency suffers with 1 table because with fewer tables there is more contention on internal data structures, database pages and rows. The tests for which contention is a problem are update-one, random-points, hot-points and insert-only. My usage of sysbench is explained here, but I will briefly describe these tests (a sketch of the statement shapes follows the list):
- update-one - all updates are to the same row in each table (the row with id=1). For the test with 1 table there is only one row that gets all updates, and it becomes a hot spot.
- random-points - each query is a SELECT statement with an in-list that matches 100 rows by equality on the PK. The test uses a uniform distribution to generate the keys to find, so there are no row hot spots, but there is a table hot spot when sysbench is run with one table.
- hot-points - like random-points, but every query searches for the same 100 keys. So this has row hot spots.
- insert-only - inserts are done in PK order. Secondary index maintenance is required and values for the indexed column (k) are inserted in random order. There can be hot spots on the right-hand side of the PK index.
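
The sketch below shows those statement shapes in the style of a sysbench 1.0 Lua script. This is a minimal sketch, not my actual test scripts: the table name (sbtest1), the column updated by update-one and the 8M-row key range for the 1 table case are assumptions taken from the descriptions above, and a real script would expose exactly one of these functions as event().

-- Minimal sketch of the microbenchmark statement shapes (assumptions:
-- the sbtest1 table, an 8M row PK range for the 1 table case, and the
-- column updated by update-one)
function thread_init()
   drv = sysbench.sql.driver()
   con = drv:connect()
end

-- update-one: every update hits the row with id=1, so all clients
-- serialize on one row when there is 1 table
function update_one_event()
   con:query("UPDATE sbtest1 SET k = k + 1 WHERE id = 1")
end

-- random-points: an in-list that matches 100 rows by equality on the
-- PK, with keys drawn from a uniform distribution (no row hot spots)
function random_points_event()
   local ids = {}
   for i = 1, 100 do
      ids[i] = sysbench.rand.uniform(1, 8000000)
   end
   con:query("SELECT id, k FROM sbtest1 WHERE id IN (" ..
             table.concat(ids, ",") .. ")")
end

-- hot-points: like random-points but the same 100 keys in every
-- query, which creates row hot spots
function hot_points_event()
   local ids = {}
   for i = 1, 100 do
      ids[i] = i
   end
   con:query("SELECT id, k FROM sbtest1 WHERE id IN (" ..
             table.concat(ids, ",") .. ")")
end

-- insert-only: auto-increment assigns ascending PK values while the
-- secondary index column (k) gets random values
function insert_only_event()
   con:query(string.format(
      "INSERT INTO sbtest1 (k, c, pad) VALUES (%d, '', '')",
      sysbench.rand.uniform(1, 8000000)))
end

A script like this would run as any other sysbench 1.0 Lua test, for example sysbench update-one.lua --threads=48 run once the tables are loaded.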
Guide to results
Below I share QPS for each test at low, mid and high concurrency, where low is 1 connection, mid is 8 connections and high is 48 connections. The database is cached and sysbench shares the server with mysqld. There is no think time in the sysbench client when running a test. There are no stalls for reads from storage because all data fits in the database cache, but there are still chances for stalls on writes.
For each test I list the QPS at 1, 8 and 48 connections twice - first for sysbench run with 8 tables and then for it run with 1 table. With 8 tables there are 1M rows/table and with 1 table there are 8M rows in that table. I used MyRocks based on MySQL 5.6.35, InnoDB from upstream MySQL 5.6.35 and 5.7.17, and TokuDB from Percona Server 5.7.17.
After the QPS results there is a section that lists QPS ratios where I highlight how QPS drops when moving from 8 tables to 1 table. When the QPS ratio is less than 1.0 there might be a performance problem.
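For example, in the update-one results below MyRocks gets 13055 QPS at 48 connections with 1 table and 39902 QPS with 8 tables, so its QPS ratio is 13055 / 39902 = 0.327.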
update-one
For this test the QPS ratio section has the QPS for the engine at 1 table divided by the QPS for the engine at 8 tables. For this test all engines have a problem at mid and high concurrency as the QPS ratios are less than 0.5. Can I be happy that MyRocks suffers the least? This is a hard problem to fix because updates to one row must be serialized. For all tests the binlog was enabled and sync-on-commit was disabled for the binlog and database log. I hope that commutative updates are eventually supported in MyRocks to improve QPS for concurrent updates to a few rows.
QPS
1 8 48 concurrency/engine
- 8 tables
8672 43342 39902 myrocks
10472 49717 52468 inno5635
9670 51181 62626 inno5717
2912 13736 19551 toku5717
- 1 table
9072 17348 13055 myrocks
10521 17092 13288 inno5635
9535 14411 13019 inno5717
2926 3254 3077 toku5717
QPS ratio
rocks inno56 inno57 toku
1.046 1.004 0.986 1.004 1 connection - low concurrency
0.400 0.343 0.281 0.236 8 connections - mid concurrency
0.327 0.253 0.207 0.157 48 connections - high concurrency
random-points
For this test the QPS ratio section has the QPS for the engine at 1 table divided by the QPS for the engine at 8 tables. For this test MyRocks does not have a problem for 1 table while InnoDB and TokuDB have a small problem at mid concurrency and a big problem at high concurrency. PMP output for TokuDB with 1 table & 48 connections is here and shows mutex contention. PMP output for InnoDB with 1 table & 48 connections is here and shows contention on rw-locks.
QPS
1 8 48 concurrency/engine
- 8 tables
897 6871 23189 myrocks
2028 12693 16358 inno5635
1872 13925 47773 inno5717
1529 11824 36786 toku5717
- 1 table
972 7411 25003 myrocks
1910 10313 12239 inno5635
1764 11931 17690 inno5717
1400 8669 8401 toku5717
QPS ratio
rocks inno56 inno57 toku
1.083 0.941 0.942 0.915 1 connection - low concurrency
1.078 0.812 0.856 0.733 8 connections - mid concurrency
1.078 0.748 0.370 0.228 48 connections - high concurrency
hot-points
For this test the QPS ratio section is different from the ones above. It has two sets of numbers - one for 8 tables and one for 1 table. The values are the QPS for the hot-points test divided by the QPS for the random-points test at 8 tables. When the value is less than one the engine gets less QPS than expected for this test.
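For example, at 1 connection with 1 table MyRocks gets 1577 QPS on hot-points while random-points with 8 tables got 897 QPS, so the value is 1577 / 897 = 1.758.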
At high concurrency all engines get less QPS on the hot-points test than on the random-points test when sysbench uses 1 table, and the loss is much greater with 1 table than with 8 tables. I filed issue 674 for MyRocks to make this better, but it really is an issue with RocksDB and mutex contention in the sharded LRU. PMP output for TokuDB with 1 table and 48 connections is here and it looks like the same problem as for random-points. PMP output for InnoDB with 1 table and 48 connections is here and the problem might be the same as in random-points.
QPS
1 8 48 concurrency/engine
- 8 tables
1376 10256 28762 myrocks
2863 13588 15630 inno5635
2579 17899 50430 inno5717
1989 14091 36737 toku5717
- 1 table
1577 8489 8691 myrocks
2845 8787 10947 inno5635
2574 11904 16505 inno5717
1802 7318 7788 toku5717
QPS ratio for 8 tables
rocks inno56 inno57 toku
1.534 1.411 1.377 1.300 1 connection - low concurrency
1.492 1.070 1.285 1.191 8 connections - mid concurrency
1.240 0.955 1.055 0.998 48 connections - high concurrency
QPS ratio for 1 table
rocks inno56 inno57 toku
1.758 1.402 1.375 1.178 1 connection - low concurrency
1.235 0.692 0.854 0.618 8 connections - mid concurrency
0.374 0.669 0.345 0.211 48 connections - high concurrency
insert-only
For this test the QPS ratio section has the QPS for the engine at 1 table divided by the QPS for the engine at 8 tables. For this test MyRocks does not lose QPS while InnoDB and TokuDB do. For all tests the binlog was enabled and sync-on-commit was disabled for the binlog and database log. While I used PMP to explain the performance problems above, I did not do that here for TokuDB and InnoDB.
QPS
1 8 48 concurrency/engine
- 8 tables
9144 46466 65777 myrocks
12317 59811 59971 inno5635
10539 61522 115598 inno5717
3199 17164 34043 toku5717
- 1 table
9329 47629 67704 myrocks
12273 55445 37180 inno5635
10529 61235 59690 inno5717
3156 17193 25754 toku5717
QPS ratio
rocks inno56 inno57 toku
1.020 0.996 0.999 0.986 1 connection - low concurrency
1.025 0.927 0.995 1.001 8 connections - mid concurrency
1.029 0.619 0.516 0.756 48 connections - high concurrency