Wednesday, December 30, 2020

Sysbench: IO-bound InnoDB in MySQL 5.6, 5.7 and 8.0

This post has results for InnoDB in MySQL 5.6, 5.7 and 8.0 using an IO-bound & low-concurrency workload via sysbench. The setup is the same as I described for MyRocks and I wrote about in-memory sysbench + InnoDB in a previous post. The goal is to understand where performance changes across releases.

I will repeat a disclaimer from my previous post, these results are from microbenchmarks. They are useful but be careful to assume these results will reproduce for your workload, especially if your queries are more complex than those used by sysbench.

Summary:

  • The regressions here are not as bad as in the in-memory workload. In many tests below the QPS in 8.0 is better than 5.6 because the new CPU overheads are less significant for IO-bound.
  • I hope to repeat some of these tests to explain the odd results.
  • Covered point queries with a secondary index use too much CPU and IO in 8.0. Covered range queries also have regressions in 8.0. But versions of those queries that don't use a covering index don't have a problem in 8.0 for IO-bound, while they did for in-memory.
  • Some of the write-heavy tests are significantly faster in 8.0 

Overview

I tested MySQL versions 5.6.49, 5.7.31, and 8.0.17 through 8.0.22. As much as possible, I used the same my.cnf options for all versions. The my.cnf files are in the per-version directories in github and here is my.cnf for 8.0.22. I used the my.cnf options that were good for the insert benchmark (the cx6d configuration). For 8.0 ssl is disabled and the same charset/collation is used for all versions. Fsync is not done on commit to avoid becoming an fsync latency benchmark.

The in-memory workload used 1 table with 10M rows and each test ran for 90 seconds at 1, 2, 3 and 4 threads. The IO-bound workload used 1 table with 200M rows and each test ran for 300 seconds.

Results

The tests are in 4 groups based on the sequence in which they are run: read-only run before write-heavy, write-heavy, read-only run after write-heavy and insert/delete. 

I have scripts that generate 3 summaries -- absolute throughput, relative throughput and HW efficiency. Absolute throughput is the QPS/TPS that a test gets for 1, 2, 3 and 4 clients. Relative throughput is the QPS/TPS relative to the base case (MySQL 5.6.49). The HW efficiency report has absolute and relative results (base case is MySQL 5.6.49) for CPU and IO per operation.

I use ratios to explain performance. In this case MySQL 5.6.49 is the denominator and the numerator is the value for 5.7 or 8.0. A QPS ratio < 1 means the new version is slower. For HW efficiency I use CPU/operation and IO/operation (read & write). For CPU and IO per operation a ratio > 1 means the new version uses more CPU or IO per query. Below I only mention IO/operation ratios when 5.6 and 8.0 don't have similar values.

The results are in github for absolute throughputrelative throughput and HW efficiency. I annotate the relative throughput and HW efficiency results below. Unless called out, I am explaining the results for 8.0.22 relative to 5.6.49. 

Read-only before write-heavy:
  • QPS ratios are 1.02, 1.04, 1.10, 1.14 for the first 4 tests (up to here)
    • These do point queries
    • CPU/query ratios are: 0.96, 0.93, 0.79, 0.76. Up to here.
    • The regressions from bug 102037 for the in-memory workload are less of an issue here but the QPS ratio for random-points with range 1000 drops from 1.29 in 8.0.21 to 1.14 here.
  • QPS ratios are 1.14, 1.18, 1.36 for the next 3 tests (here to here
    • These have the range scans from oltp_read_write.lua with ranges of size 10, 100 & 10,000
    • CPU/query ratios are 0.89, 0.83, 0.67 (here to here). Long scans are better in 8.0 vs 5.6.
  • QPS ratios are 1.09, 1.07 for the next 2 tests (here to here)
    • These do point queries via in-lists that are covering and then not covering for the primary index. 
    • CPU/query ratios are 0.85, 0.88 (here to here).
  • QPS ratios are 0.50, 1.00 for the next 2 tests (here to here)
    • These are similar to the previous test, but use the secondary index. 
    • CPU/query ratios are 1.95, 0.97 while read IO/query ratios are 2.01, 1.00 (here to here). For the in-memory result the regression was ~2X for both covered and non-covered while here it only occurs for covered. The regression is in 5.7 and 8.0 but only for the covered query while it occurs for both the covered and non-covered queries with the in-memory workload.
  • QPS ratios are 1.12, 1.25 for the next 2 tests (here to here)
    • These do range queries that are covering and then not covering for the primary index
    • CPU/query ratios are 0.88, 0.78 (here to here)
  • QPS ratios are 0.79, 1.00 for the next 2 tests (here to here)
    • These are similar to the previous test, but use the secondary index. 
    • CPU/query ratios are 1.29, 0.95 while read IO/query ratios are 1.27 and 1.00 (here to here)
    • The in-memory workload has regression for both covered and non-covered, while only covered has a regression here. This is similar to the result above for point-queries using covered and non-covered queries.
Write-heavy
  • QPS ratios are 1.07, 9.50, 2.04, 0.91, 1.19 for the next 5 tests (here to here)
    • These are update-only
    • CPU/statement ratios are 0.87, 0.25, 0.79, 1.94, 0.83. Read IO/query ratios are 1.00, 0.44, 0.95, 1.00, 0.97 (here to here). 
    • The improvement for the 2nd and 3rd tests (update-index, update-nonindex) started in 5.7 and improved more in 8.0.
  • QPS ratio is 2.37 for the next test, write-only. See here.
    • This has the writes from oltp_read_write.lua. 
    • CPU/transaction ratio is 0.45 and read IO/query ratio is 0.76. See here.
  • QPS ratios are 1.18, 0.91 for the next two tests, read-write (here to here)
    • These are the traditional sysbench tests (oltp_read_write.lua) with ranges of size 10 and 100
    • CPU/transaction ratios are 0.91, 1.16 (here to here)
Read-only after write-heavy includes tests that were run before write-heavy:
  • QPS ratio is 0.92, 14.12, 1.36 for the next 3 tests, read-only (here to here)
    • These have the queries (range scans) from oltp_read_write.lua with ranges of size 10, 100 and 10,000. Results here are similar to Read-only before write-heavy and 8.0 is better than 5.6 at long range scans.
    • CPU/transaction ratios are 0.79, 0.17, 0.67 (here to here)
    • The huge speedup (14.12 QPS ratio) for the second test is confusing. It only happens for the 1-thread test and the QPS ratio for 2, 3, and 4 threads is ~2. The problem appears to be CPU/query (5.6 does ~5X more) while IO/query is OK.
  • QPS ratios are 2.04, 1.71, 4.62, 1.14, 0.50 for the next 5 tests (here to here)
    • These do a variety of point queries. The first 4 were run in Read-only before write heavy, and results here are similar.
    • CPU/query ratios are 0.51, 0.70, 0.32, 0.90, 1.89 (here to here) and that explains most of the differences.
    • The hot-points test doesn't do IO reads as the working set is small. So it still suffers from bug 102037 which is new in 8.0.22 
  • QPS ratios are 1.06, 1.05 for the next 2 tests (here to here)
    • These do point queries via in-lists that are covering and then not covering for the primary index. Results here are similar to Read-only before write-heavy.
    • CPU/query ratios are 0.87, 0.90 (here to here)
  • QPS ratios are 0.50, 1.00 for the next 2 tests (here to here)
    • These are similar to the previous test, but use the secondary index. 
    • CPU/query ratios are 1.93, 0.97 while read IO/query ratios are 2.01, 1.00 (here to here). Results here are similar to Read-only before write-heavy.
  • QPS ratios are 1.16, 1.24 for the next 2 tests (here to here)
    • These do range queries that are covering and then not covering for the primary index
    • Results here are similar to Read-only before write-heavy.
    • CPU/query ratios are 0.84, 0.80 (here to here)
  • QPS ratios are 0.78, 1.03 for the next 2 tests (here to here)
    • These are similar to the previous test, but use the secondary index. Results here are similar to Read-only before write-heavy.
    • CPU/query ratios are 1.31, 0.94 and read IO/query ratios are 1.29 and 1.00 (here to here)
    Insert/delete

    • QPS ratio is 1.80 for the delete test and 0.83 for the insert test
    • CPU/statement ratio is 0.86 for delete and 2.62 for insert
    • Read IO/statement ratio is 0.88 for delete and 234.69 for insert. For insert, the read IO ratio was ~2 for 5.7, then grew to ~5 in 8.0.19, then to ~16 for 8.0.20, then to ~200 for 8.0.21 & 8.0.22.














    No comments:

    Post a Comment

    RocksDB on a big server: LRU vs hyperclock, v2

    This post show that RocksDB has gotten much faster over time for the read-heavy benchmarks that I use. I recently shared results from a lar...