Friday, March 27, 2020

kernel lockdown vs blktrace

I am trying to use blktrace to determine which files are the source of disk reads for a database that uses buffered IO. The server runs Ubuntu 18.04 on bare metal and boots with UEFI, not legacy BIOS.

blktrace doesn't work, time to debug. Things that don't fix it include upgrading from 4.15 to 5.3 HWE kernel and disabling apparmor. Next up is disabling kernel lockdown via mokutil --disable-validation. Alas, blktrace still fails at startup.

After running mokutil and rebooting there are still a few messages about lockdown in the dmesg output, so I wonder whether it was fully disabled.
Lockdown: Hibernation is restricted; see man kernel_lockdown.7
Lockdown: /dev/mem,kmem,port is restricted; see man kernel_lockdown.7
OK, let's read the man page. Great, it doesn't exist -- not for Ubuntu nor for other distros. There is a draft but I am starting to get the impression that lockdown wasn't ready for prime time. And Linus had a strong opinion about it in 2018.

Next up is a strong opinion from Brendan Gregg.
Many distros are enabling lockdown, breaking BPF. This is the worst OS change I've ever seen.
OK, maybe my problem is lockdown and mokutil wasn't sufficient. Time to try:
echo 1 > /proc/sys/kernel/sysrq; echo x > /proc/sysrq-trigger 
And now blktrace works. Well, until I reboot. I already have a script to run after reboot to reduce security so that PMP can run. That script just got larger:
# allow PMP to use perf and gdb
sudo sh -c "echo -1 > /proc/sys/kernel/perf_event_paranoid"
sudo sh -c "echo 0 > /proc/sys/kernel/yama/ptrace_scope"
sudo sh -c "echo 0 > /proc/sys/kernel/kptr_restrict"
# lift kernel lockdown so blktrace and BPF work
sudo sh -c "echo 1 > /proc/sys/kernel/sysrq"
sudo sh -c "echo x > /proc/sysrq-trigger"

Monday, March 16, 2020

Insert Benchmark v3

I expect to replace the insert benchmark later this year. The insert benchmark I have been using might be ibench v2 as ibench v1 came from Tokutek. So the replacement I write would be ibench v3.

The reasons for a replacement include:
  • Switch from Python to something faster (Java or Go) because it is difficult to sustain high QPS rates (100k/second or more) with Python, and a Java or Go client will consume less CPU. I assume the DBMS I care about (MongoDB, MySQL, Postgres) all have good golang clients.
  • Make the benchmark less synthetic. Early ibench was insert-only for a table with a PK and 3 secondary indexes. Eventually I added support for short range queries. It has always used uniform distribution. The plan for ibench v3 is to model a monitoring workload -- many inserts, some updates, many queries. It will use a PK and 1 or 2 secondary indexes.
  • Continue to support multiple DBMS -- MongoDB, MySQL and Postgres.
  • As a bonus, don't ignore coordinated omission
I still need to confirm whether this is worth implementing or whether I have a case of NIH and should be using TSBS. I browsed TSBS and the indexes I describe here might match the indexes that TSBS would use. However, TSBS has more query diversity. My focus is on the storage engine and a smaller set of queries is sufficient. Regardless, I need to spend more time reading about TSBS.

Logical Schema

The logical schema describes the data in one insert operation. The physical schema describes how that is stored in the DBMS. I focus on the logical schema here.

The logical schema is: timestamp, deviceID, [metricID, metricValue]+ (a generator sketch follows this list)
  • This is the data from one insert for a given device. It has data for 1+ metrics.
  • timestamp is a 64-bit integer that will be mostly increasing on insert. Clocks are not synchronized across devices, some devices will have bad/stuck clocks and some devices will have clock drift.
  • deviceID is a 64-bit integer. The number of devices will be used to scale the workload. There can be millions of devices for large scale factors.
  • Each (metricID, metricValue) pair represents the value for a metricID from a given device at a given point in time. metricID is a 32-bit integer. The number of metrics will also be used to scale the workload. There will be fewer metrics than devices. metricValue is a 64-bit integer.
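To make that concrete, here is a minimal Python sketch of a generator for one logical insert. The constants and the skew_ms knob are illustrative choices of mine, not part of the spec:

import random
import time

NUM_DEVICES = 1_000_000   # |d|, used to scale the workload
NUM_METRICS = 100         # |m|, fewer metrics than devices
PAIRS_PER_INSERT = 10     # (metricID, metricValue) pairs per insert

def next_insert(device_id, skew_ms=0):
    # skew_ms simulates a device clock that is ahead of or behind real time
    t = int(time.time() * 1000) + skew_ms
    pairs = [(m, random.getrandbits(63))  # metricValue is a 64-bit integer
             for m in sorted(random.sample(range(NUM_METRICS), PAIRS_PER_INSERT))]
    return (t, device_id, pairs)

print(next_insert(device_id=42, skew_ms=-500))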


Fields are abbreviated when describing the workload: t=timestamp, d=deviceID, m=metricID, v=metricValue. Operations are described in SQL, except for insert.

The workload is a combination of:
  • Get: select v, t from C where m=M and d=D and t between T1 and T2 order by t
    • Query variants are: t >= T1 order by t limit N, t <= T1 order by t desc limit N
  • Max, Min: select v, t, d from C where m=M and t between T1 and T2 order by v asc/desc limit N
  • Compare: call Get for m in (...) and d=D and t between T1 and T2
  • Insert: insert into C values (t, d, [(m1, v1),(m2, v2),...])
    • SQL above is not standard and assumes a schema that might not be used.
    • (t,d,m) is unique. I am still deciding whether the benchmark client can generate intermittent duplicate inserts and how that should be handled.
  • Rollup: Uses Compare to get all data for d in (...) in time range, then replaces that data with values aggregated over coarser granularity. Assume that base data is rolled up into 5 minute, then 1 hour and then 1 day granularity.
  • Backfill: inserts missing data that might be hours or days old

The value of N in Get, Max and Min above will vary. For Max and Min it is likely to be <= 100 and frequently <= 10. For Get in some cases the user wants all data in a time range, but in other cases they will set N to be <= 10,000 and not get all data in the time range.
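For reference, here is a sketch of these operations as parameterized SQL strings, assuming the one-row-per-metric schema described in the physical schema sections below (table C with columns t, d, m, v). The placeholder style is DB-API and nothing here is tuned for a specific DBMS:

# Get and its two limit variants
GET        = ("SELECT v, t FROM C WHERE m = %s AND d = %s "
              "AND t BETWEEN %s AND %s ORDER BY t")
GET_NEXT_N = ("SELECT v, t FROM C WHERE m = %s AND d = %s "
              "AND t >= %s ORDER BY t LIMIT %s")
GET_PREV_N = ("SELECT v, t FROM C WHERE m = %s AND d = %s "
              "AND t <= %s ORDER BY t DESC LIMIT %s")
# Max and Min differ only in the sort order on v
MAX_N      = ("SELECT v, t, d FROM C WHERE m = %s "
              "AND t BETWEEN %s AND %s ORDER BY v DESC LIMIT %s")
MIN_N      = ("SELECT v, t, d FROM C WHERE m = %s "
              "AND t BETWEEN %s AND %s ORDER BY v ASC LIMIT %s")

Compare is then a loop that runs GET once per metricID in the list.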

I decided to not have a Get query variant, called GetD, that uses where d=D and t between … (it lacks a predicate on m) because this data is provided via the Compare query assuming that the metricIDs are known. If that assumption isn’t true and GetD were needed then an extra secondary index on (d,t) would also be required to make that query efficient unless the DBMS supported index skip scan via an index on (d,m,t,...).

Open Issues
  • Spammy data - is it worth simulating devices that misbehave? This includes sending data too frequently, sending incorrect values and timestamps and not sending data for some time periods.
  • Is the aggregated data stored inline with the base data or in a separate table? If stored inline do entries need a tag to indicate granularity like base, hourly, daily, etc?
  • Should the benchmark include queries and writes to correct bad data?
  • Can there be a way for the benchmark to run faster than wall clock time so that per-hour and per-day rollup is done more frequently than per real hour and per real day? This avoids the need to run the benchmark for too long.
  • How does a scale factor affect the size of the database, the write load and the query load?

Physical Schema - Write Optimized

The goal here is to make writes fast. That is done by using a unique index on (t,d). Inserts go to the right end of the index assuming they usually arrive in t order. While this makes inserts fast, it makes queries too slow. The (t,d) index over-fetches because only the range predicate on t can be used for the index access path:
  • For the Get and Compare queries it over-fetches by a factor of |m| * |d|. If there are 1M devices and 10 metrics per insert it fetches 10M times more index entries than needed.
  • For the Max/Min queries it over-fetches by a factor of |m|.
The large over-fetching for Get and Compare makes it infeasible to run them with the write optimized schema. The Max and Min queries could still be run -- it would be painful but not infeasible given that |m| is likely to be <= 100.

Implementation notes:
  • MongoDB - There are 3 choices for the unique index. The first choice is to use the subdocument {t, d} as the value for _id and make sure that all clients encode it in the same way. The second choice is to concatenate t and d into a string for the value of _id. This avoids the need to worry about client encoding as in the first choice, at the cost of storing t and d twice per doc. The third choice is to use ObjectId as the value of _id and then have a unique compound index on {t:1, d:1}. The cost for this is an extra index, but inserts are in index order for both. A sketch of the third choice follows this list.
  • SQL with document datatypes - Use one row per insert and document datatype support in MySQL or Postgres to store all of the (m,v) pairs in that row.
  • SQL without document datatypes - Use one row per metric per insert. The unique index is defined on (t,d,m) rather than (t,d). The columns in the row are t, d, m, v.
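Here is a sketch of the third MongoDB choice using pymongo; the database, collection and field names are mine:

from pymongo import ASCENDING, MongoClient

client = MongoClient()
coll = client.ibench.metrics

# Choice 3: MongoDB assigns an ObjectId for _id and a separate compound
# index enforces uniqueness of (t, d). When t is mostly increasing,
# inserts land at the right end of both indexes.
coll.create_index([("t", ASCENDING), ("d", ASCENDING)], unique=True)

coll.insert_one({"t": 1585267200000, "d": 42,
                 "metrics": [{"m": 1, "v": 100}, {"m": 2, "v": 200}]})

A second insert_one with the same (t, d) raises DuplicateKeyError, which is one way to surface the intermittent duplicate inserts mentioned earlier.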

Physical Schema - Read Optimized

These indexes are important to make queries efficient.
  • Get and Compare need one of (d,m,t) or (m,d,t).
  • Max/Min need one of (m,t,v,d) or (m,v,t,d)
From the perspective of one Get query there isn’t a difference between the index choices because there are equality predicates on d and m. But from a cache perspective one might be better than the other as they cluster on the first indexed field -- by d or m. If some metrics are less likely to be queried then clustering by m can be a better choice.

The unique constraint is on (d,m,t) or (m,d,t). If v is also in that index then uniqueness cannot be enforced. If v is not in the index then it might not be covering. With Postgres I can use the INCLUDE clause to get v in the index. With InnoDB and MyRocks the table is clustered on the PK, so it is sufficient to make (d,m,t) or (m,d,t) the PK. With MongoDB if I want a covering index then I have to create two indexes -- unique on (m,d,t) and then non-unique but covering with (m,d,t,v) or I can skip the covering index to reduce the write cost but make queries slower. Finally, by unique for MongoDB I mean that this is the value for _id, and then we are back to the question of using a subdocument for _id.
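A sketch of that DDL; table and column names are hypothetical and the INCLUDE clause needs Postgres 11 or later:

# Postgres: unique index on (m,d,t) that also covers v via INCLUDE,
# so uniqueness can be enforced and the Get query is still covered.
PG_DDL = """
CREATE TABLE c (t bigint, d bigint, m int, v bigint);
CREATE UNIQUE INDEX c_mdt ON c (m, d, t) INCLUDE (v);
"""

# InnoDB/MyRocks: the table is clustered on the PK, so making (m,d,t)
# the PK is sufficient -- v comes along for free.
MYSQL_DDL = """
CREATE TABLE c (t bigint, d bigint, m int, v bigint,
                PRIMARY KEY (m, d, t)) ENGINE=InnoDB;
"""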

For Max and Min:
  • The (m,t,v,d) index uses index predicates for m and t and then does a sort. So this has an extra cost from the sort and can't benefit from stopping the index scan once the limit N has been satisfied. Some DBMS have optimizations to make top-N sort fast, but there is no way to avoid the cost of scanning the full time range in the index.
  • The (m,v,t,d) index uses index predicates only for m but avoids the sort and can stop the index scan once the limit N has been satisfied. The range predicate on t would be a filter predicate - evaluated for each entry read from the index.
There are cases where (m,t,v,d) is a better choice, but I think that (m,v,t,d) will be better more of the time. The index on (m,t,v,d) would be better when the selectivity from using t as an index predicate offsets the cost of the sort and the data is skewed so that limit N isn’t satisfied until most of the index has been scanned.

Write efficiency in the read optimized schema

Below I use streams of writes to mean the number of leaf pages that will be subject to read-modify-write from index maintenance during inserts at any point in time. If all inserts are to the right end of the index then there is 1 stream. If the index is on (client-ID, time), there are 10 clients active concurrently and each client inserts in time order then there are 10 streams of writes. When there is one leaf page in cache for each stream then IO efficiency is better. Otherwise each insert is more likely to force a page write back followed by a page read.

For the indexes used by Get and Compare -- (d,m,t,v) or (m,d,t,v) -- there will be |d|*|m| streams of writes. The cache demand from this is |d|*|m| pages. A small workload might use |d|=1M and |m|=10, and 10M 8kb pages need 80G of RAM. Both indexes have a similar amount of cache amplification.
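The arithmetic for that estimate, as a quick check:

devices = 1_000_000     # |d|
metrics = 10            # |m| pairs per insert
page_bytes = 8 * 1024   # 8kb leaf pages

streams = devices * metrics             # one write stream per (d, m) pair
cache_gb = streams * page_bytes / 1e9   # one cached leaf page per stream
print(f"{streams:,} streams, ~{cache_gb:.0f}GB")  # 10,000,000 streams, ~80G in round numbers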

For the indexes used by Max and Min, the (m,t,v,d) index has |m| streams of writes vs |m|*|v| for the (m,v,t,d) index. Thus (m,v,t,d) provides better query performance at the cost of more random IO and much more cache amplification.

For the non-unique secondary index used for Max and Min, MyRocks can use read-free index maintenance and the number of write streams isn't an issue for it. But it will be an issue for other DBMS.


Partitioning the tables and collections by time can help with write and read performance as well as making other tasks easier -- removing and rolling up old data. For example there could be a partition per hour for the last N hours. However if a DBMS doesn’t support partitions then there is a burden on the application developer to support cross-partition queries. As a user I expect to run queries across hourly boundaries.
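As a sketch, partition-per-hour with Postgres declarative partitioning might look like the following; the names and epoch-millisecond bounds are illustrative:

# Postgres 11+: the PK on a partitioned table must include the
# partition key, which (d, m, t) does.
PARTITIONED_DDL = """
CREATE TABLE c (t bigint, d bigint, m int, v bigint,
                PRIMARY KEY (d, m, t))
PARTITION BY RANGE (t);

-- partition for the hour starting 2020-03-27 00:00 UTC
CREATE TABLE c_2020032700 PARTITION OF c
    FOR VALUES FROM (1585267200000) TO (1585270800000);
"""

Dropping an old hour is then a DROP TABLE on that partition instead of a large delete.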

Thursday, March 12, 2020

Tuning space and write amplification to minimize cost

This uses math to show how to tune space and write amplification to minimize storage costs for an index structure that uses index+log. The result, minimal cost, is true assuming my model is true. But at this point I will only claim that the model is truthy. I look forward to more results in this area from the smart people at DASlab and elsewhere. I wonder if database economics is a good name for this topic.

I explained the index+log index structure here and here and my truthy model assumes that write and space amplification are functions of PctUsed - the percent of available storage that is used by the index structure. The model is:
  • Space amplification = 100 / PctUsed
  • Write amplification = 100 / (100 - PctUsed)
In what follows I use SpaceAmp and WriteAmp for space and write amplification. When PctUsed is X then the remaining space on the storage device (100-X) is free, not used by anything. The formulas mean that a deployment can trade between SpaceAmp and WriteAmp by adjusting the value of PctUsed. When PctUsed is 80 the values of SpaceAmp and WriteAmp are 1.25 and 5. When PctUsed is 20 the values of SpaceAmp and WriteAmp are 5 and 1.25.


Back in the day hardware was scarce and there was much math in systems papers. While there isn't as much math today the rush to ML means that more people are learning applied math (me included) which is a good thing. I regret not learning enough applied math while in college.

Here I derive a formula for the cost of storage in terms of the database size and the IO required to do compaction (GC, garbage collection) for an index structure that uses the index+log approach. The cost is a function of PctUsed.

P = PctUsed
S = index structure size in GB
G = Cost/GB
N = Write rate to index structure in MB/s
f = value between 1 and 2, 1= no storage reads by GC, 2= all GC writes do storage reads first
I = Cost/IOPs for 4KB operations

Cost = Cost_space + Cost_io

# Formulas for the cost of space and IOPs
# 256 converts MB/s into 4KB IOPs

Cost_space = S * G * 100 * P^-1
Cost_io = N * 256 * f * I * 100 * (100-P)^-1

# Determine where Cost' = 0 to find the minimal cost
Cost' = Cost_space' + Cost_io'

Cost_space' = -1 * S * G * 100 * P^-2
# the two -1 factors below come from the power rule and the chain rule on (100-P)
Cost_io' = N * 256 * f * I * 100 * (100-P)^-2 * -1 * -1
Cost' = (-1 * S * G * 100 * P^-2) + (N * 256 * f * I * 100 * (100-P)^-2)

# And Cost' = 0 when
S * G * 100 * P^-2 = N * 256 * f * I * 100 * (100-P)^-2

# Skipping a few steps this reduces to
P^2 * ((256NfI/SG) - 1) + 200P - 10,000 = 0

# This can be solved by the quadratic formula with a = ((256NfI/SG) - 1), b = 200, c = -10,000
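Here is a Python sketch of that solution via the quadratic formula. The S, G, f and I values below are placeholders I picked for illustration, not the prices from the spreadsheet:

import math

def min_cost_pct_used(S, G, N, f, I):
    # S = size in GB, G = cost/GB, N = write rate in MB/s,
    # f in [1, 2] for GC reads, I = cost/IOPs for 4KB operations.
    # Assumes a != 0, i.e. 256*N*f*I != S*G.
    a = (256 * N * f * I) / (S * G) - 1
    b, c = 200, -10_000
    disc = math.sqrt(b * b - 4 * a * c)
    # return the root that is a valid percentage
    for p in ((-b + disc) / (2 * a), (-b - disc) / (2 * a)):
        if 0 < p < 100:
            return p

for n in (1, 10, 100):
    print(n, round(min_cost_pct_used(S=1024, G=0.125, N=n, f=2, I=0.065)))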


Now I solve the equation above to determine the value of PctUsed that minimizes cost with prices from EBS provisioned IOPs. A Google Sheets spreadsheet with the solution is here. For the spreadsheet:
  • The graph uses log-scale for the y-axis and the y-axis doesn't start at 0. This makes it easier to see the impact of changing PctUsed, but can also be misleading. 
  • The solutions from the quadratic equation are quad1 and quad2
  • Cost is computed for PctUsed in (5, 10, 15, ..., 85, 90, 95)
  • The minimal value for Cost (quad1) is likely to be between these values
I then solve for 3 cases: N=1, 10 and 100 where N is the write rate to the index structure in MB/s. The minimal cost occurs at PctUsed = 67, 39 and 17 for N = 1, 10 and 100.

For N=1, a low write rate, the minimal cost is at PctUsed=67

For N=10, a moderate write rate, the minimal cost is at PctUsed=39

For N=100, an extreme write rate, the minimal cost is at PctUsed=17

Review of Benchmarking RocksDB Key-Value Workloads at Facebook

At some level all database engines have a key-value API even if it isn't exposed to the user. One name for this in a SQL DBMS is the data layer. The data layer is RocksDB for MyRocks and MongoRocks, and WiredTiger for MongoDB.

FAST 20 has a paper on characterizing RocksDB key-value workloads at Facebook. One of the workloads is UDB and UDB uses MyRocks, a SQL DBMS.

Benchmarks are useful to me but there is always room for improvement. Two common problems are the lack of load variance and the lack of complexity in access patterns. By the lack of load variance I mean that the workload doesn't change over time which fails to capture the daily spikes that occur for web-scale workloads as different parts of the world wake and sleep. They also fail to include daily and intermittent operational tasks like backup, schema change, data migration and archiving old data.

This paper explains the complexity in access patterns for three RocksDB use cases and shows how to reproduce them in benchmarks. UDB was one of the workloads so this work includes an update of the analysis shared in the Linkbench paper from 2013.

The primary contributions of the paper are the analysis of production workloads and the discovery that hot key ranges, rather than hot keys, are required to model production workloads. I appreciate that some of the work behind this paper has already been shared via RocksDB and that more might be shared as enhancements to YCSB.

Things I liked about the paper:
  • When generating skew in a benchmark use the concept of hot key ranges rather than hot keys. Linkbench and YCSB use hot keys distributed across the key space.
  • Explains the UDB workload which was last explained in the Linkbench paper
  • Explains two other production workloads
  • Provides some detail on the diurnal workload spikes in UDB
  • Explains the distribution of key sizes. While it states that UDB keys are small, small means a typical key is 16 to 30 bytes. Of course, the covering secondary index used by UDB is an outlier at ~64 bytes.
  • Explains that values are usually small, <= 128 bytes
  • Documents temporal access patterns
  • Demonstrates that using hot key ranges rather than hot keys can more accurately model production workloads

Wednesday, March 11, 2020

Building MySQL 5.6 from source on Ubuntu 18.04

This explains the steps I used to build MySQL 5.6.35 from source on Ubuntu 18.04. I build with the perf schema disabled. If you do that, don't use Connector/J 5.1.48 (5.1.47 is OK) or you won't be able to connect, thanks to bug 98919. The new Connector/J dependency on perf schema variables can be a problem for a DBMS that implements the MySQL API -- see this bug report for MemSQL.
  1. Install an older version of gcc and g++: sudo apt install gcc-5 g++-5
  2. Unpack source. I used 5.6.35
  3. Remove the connection_control plugin because that does not compile when the perf schema is disabled at compile time -> rm -rf $SRC_ROOT/plugin/connection_control
  4. Run cmake. See below. Note that I disable the perf schema.
My CMake script:

# ${prefix} is the install target directory; set it before running this.
# The perf schema flag matches the note above about disabling it.
CC=gcc-5 CXX=g++-5 \
cmake .. \
      -DBUILD_CONFIG=mysql_release \
      -DCMAKE_BUILD_TYPE=RelWithDebInfo \
      -DWITH_SSL="bundled" \
      -DWITH_ZLIB="bundled" \
      -DWITH_PERFSCHEMA_STORAGE_ENGINE=0 \
      -DCMAKE_INSTALL_PREFIX="${prefix}" \
      -DMYSQL_DATADIR="${prefix}/data" \
      -DMYSQL_UNIX_ADDR="${prefix}/var/mysql.sock"

Wednesday, March 4, 2020


We use RDBMS as another name for SQL DBMS but SQL isn't relational. That isn't news, see this web site and book. SQL allows for but doesn't require relational, and 1NF or 3NF are optional. JSON is in the SQL:2016 spec. What would Codd think?

Take Oracle as a SQL DBMS example. First there was support for collection data types, then there was XML and eventually JSON arrived. These let you violate 1NF. I won't argue whether these should be used. I only claim they can be used.

Have there been surveys to document how often the relational approach is used with a SQL DBMS? I assume it is better to think of a distribution of approaches (a value between 0 and 1 where 0 is SQL and 1 is relational) rather than a binary approach of relational vs SQL (not relational). I might call the SQL endpoint the pragmatic approach, but that introduces bias. While I have spent a long time working on SQL DBMS I am usually working under the hood and don't design applications.

Tuesday, February 18, 2020

New school vs old school DBMS

There are two approaches to DBMS deployment:
  1. Old school - pay whatever it takes to keep that one instance running
  2. New school - allow for failure and make failover reliable
At what point do you stop paying to increase availability and durability of one DBMS instance and instead spend that money and energy elsewhere? The old school approach was more popular in the days before cloud and web-scale. Of course early in the cloud and web-scale days we allowed for failure without making failover reliable. Those weren't fun times.

That you must design for failures doesn't mean you must tolerate lousy hardware. Just because failover should be fast, reliable and lossless doesn't mean you want it running too frequently. There is a difference between commodity and cheap hardware. On the other hand, asking for too much special hardware won't win you friends on the datacenter HW team. If the DBMS is the only thing asking for redundant power, redundant network, special cooling or HW RAID with battery-backed write cache then the DBMS is a problem for web-scale. I know this from experience.

One reason for the old school approach was the large cost of the DBMS licenses and the SMP server on which it ran. Users are motivated to not buy 3X the licenses when the cost is sufficiently large and instead invest in an environment that helps that DBMS keep running (more HA hardware). Any comparison between new school and old school should consider whether the goal is durability or availability because availability costs more than durability. You can try to provide one or both. In this post I am writing about something that needs both.

Storage is one place where you can spend a lot of money to keep that one instance running. That can be done via RAID-10 where the impact is buying twice more storage devices. Or it can be done via a HA storage solution that provides impressive levels of performance, availability and durability.

One thing I am curious about is whether SSD devices need RAID-10. All storage devices have some chance of failure so the question isn't whether RAID-10 is useful. The question is whether the usefulness outweighs the cost. I assume the answer will depend on the device. Samsung advertises fail in place (FIP) as a new feature. But chip fail protection and wear leveling have been here for a long time. For which devices is that sufficient so that RAID-10 isn't recommended? Do vendors make this clear in their docs?

New School

The new school approach must do two things: make durable commits fast, and make failover fast and lossless.

Fast durable commits are usually done via sync replication. I think of sync log shipping as a variant of sync replication where the end point is a log archive rather than a replica. But some sync replication solutions already support some of that via a witness.

MongoDB provides fast durable commits via async replication. I was impressed when I first learned of the implementation. The property needed for durable commits is to avoid making writes visible before they are durable. Sync replication is an implementation detail. With MongoDB and the majority read concern a snapshot on the primary is advanced to track the point-in-time at which commits are durable (applied on enough replicas).

Whether durable commits are fast depends on where you place the replicas. Speed of light matters. When replicas are far apart then there will be more commit latency. When replicas are not far apart there might be a larger HW bill. Witnesses (log only replicas) can help here.

Lossless failover assumes a solution for durable commit. Once durable commit has been provided then fast failover is a matter of detecting failures, electing a new primary, promoting the replica to be a primary and then directing traffic to the new primary. There are many details here and plenty of opportunities for mistakes. I know from having helped make some of them. Fortunately the rise of web-scale DBMS means that we get solutions that work.

Things that I ignore in this post:
  • Fast failure detection is a challenging problem, depending on how fast you want it to be.
  • Systems that allow multiple replicas to initiate writes might not expose failover to a client, but many of the problems described here are still solved under the covers by such systems.
  • Even for systems that don't have explicit failover there is still an impact on clients from failed in-progress transactions, although Comdb2 hides that from clients.