Wednesday, December 1, 2021

Sysbench: MySQL 5.6, 5.7 & 8.0 on a small server

This has results for sysbench with MySQL versions 5.6.49, 5.7.35 and 8.0.2x using the same setup as previously used for the insert benchmark. The versions for 8.0.2x are 8.0.20, 8.0.22, 8.0.23, 8.0.26 and 8.0.27.

Executive summary:

  • For CPU-bound setups, MySQL 8.0.27 is slower than 5.6.49 on 39 of the 42 microbenchmarks. For the microbenchmarks when it was slower, on average it gets 72% of the throughput vs 5.6.49.
  • For IO-bound setups, MySQL 8.0.27 is slower than 5.6.49 on 27 of the 42 microbenchmarks, faster on 13 and had the same throughput on 2.
  • In most cases more CPU overhead in 8.0 is the reason it is slower than 5.6.

Details

See the Postgres post for more details on how I run sysbench. The summary is there are 43 invocations of sysbench and each is a microbenchmark. But a typo meant I only have results for 42 of the 43.

Three configurations were tested: 10M rows without prepared statements, 10M rows with prepared statements, 400M rows without prepared statements. The 10M row tests are CPU-bound as the database fits in memory. The 400M row test is usually IO-bound as the database is larger than memory. Below I call the test configurations 10m.prep0, 10m.prep1 and 400m.prep0. The database sizes are 2.9G at 10M rows and 91G at 400M rows after the initial load.

One of my goals was to understand the impact of using prepared statements with MySQL especially with respect to CPU overhead. They didn't make a big difference with the exception of the one test for which there is a performance bug with prepared statements (random-points.pre.range1000.pk1).

How to read the results

The presentation for these results is just text files pasted into gists. There are two types of files: qps and metrics. 

The qps file has the relative QPS for each test where the relative QPS is the ratio: QPS-for-me / QPS-for-base.

  • QPS-for-me is the QPS for one of MySQL 5.6.49, 5.7.35 or 8.0.2x
  • QPS-for-base is the QPS for MySQL 5.6.49
  • Thus each line in the qps file has seven numeric columns and the first one has the value 1.00 (QPS-for-5.6.49 / QPS-for-5.6.49). The second through seventh columns have the values for (QPS-for-me / QPS-for-5.6.49) where me is 5.7.35, 8.0.20, 8.0.22, 8.0.23, 8.0.26 and 8.0.27 in that order. When the value in a column is less than 1.0 then that version is slower than 5.6.49.
The metrics file has absolute and relative (to results for 5.6.49) values for each of the 42 tests. This file is long so I start with the qps file and then consult the metrics file to understand why one result is better than another.
  • cpu/o - CPU per operation, measured by iostat. While there is a unit for this, I don't worry about that as it is most useful for comparing numbers between different test configurations so the units drop out.
  • r/o - storage reads per operation, measured by iostat
  • rKB/o - KB read from storage per operation, measured by iostat
  • wKB/o - KB written to storage per operation,  measured by iostat
  • o/s - operations/second (QPS, inserts/s, etc). I tend to use QPS below for everything.
  • dbms - the MySQL version and configuration file
Results

Here are the qps and metrics files for the 10m.prep0 (CPU-bound, no prepared statements):
  • 8.0.27 is faster than 5.6.49 on 3 of 42 tests (read-only.pre.range10000,pk1, read-only.range10000.pk1, update-index.range100,pk1). The first two do long range scans and the relative QPS is 1.25. For the update-index test the relative QPS is 1.48.
  • The relative QPS for 8.0.27 is less than 0.8 for 35 of the 42 tests. This means that 8.0.27 gets less than 80% of the throughput relative to 5.6.49 on 35 of the tests. The minimum relative QPS is 0.61. For the 39 tests on which 8.0.27 is slower than 5.6.49 the average of the relative QPS is 0.72.
  • For the tests in which the relative QPS in 8.0.27 is much less than one the cause is using more CPU/query. See the cpu/o column in the metrics file.
  • The speedup for update-index is harder to explain. 8.0.2x uses less CPU, but the difference isn't large enough to explain the improvement. The speedup for the read-only tests is mostly from using less CPU/query in 8.0.2x. See the cpu/o column in the metrics file.

Here are the qps and metrics files for the 10m.prep1 (CPU-bound, prepared statements):
  • Results are similar to 10m.prep0
  • 8.0.27 is faster than 5.6.49 3 of 42 tests. The relative QPS is 1.24 for the read-only tests that do long range scans and 1.54 for the update-index test.
  • The relative QPS for 8.0.27 is less than 0.8 for 35 of the 42 tests. 
  • The minimum relative QPS is 0.30 for random-points.pre.range1000.pk1 (AFAIK there is a known perf bug in the optimizer). MySQL 8.0.2x uses more than 3X CPU/query compared to 5.6.49. See the cpu/o column in the metrics file. 
  • For the 39 tests on which 8.0.27 is slower than 5.6.49 the average of the relative QPS is 0.70.

Here are the qps and metrics files for the 400m.prep0 (IO-bound, no prepared statements):
  • 8.0.27 is faster than 5.6.49 for 13 of the 42 tests (tests are listed below)
  • 8.0.27 matched 5.6.49 QPS for 2 of the 42 tests (random-points.pre.range1000.pk1, random-points.range1000.pk1)
  • The relative QPS for 8.0.27 is less than 0.8 for 8 of the 42 tests (tests are listed below)
  • For the 27 tests on which 8.0.27 is slower than 5.6.49 the average of the relative QPS is 0.84

Relative QPS for 8.0.27 vs 5.6.49 when 8.0.27 is faster for the IO-bound tests:

1.22    point-query.pre.range100.pk1
1.02    random-points.pre.range10.pk1
1.05    read-only.pre.range100.pk1
1.13    read-only.pre.range10000.pk1
1.03    range-covered-pk.pre.range100.pk1
1.02    range-notcovered-pk.pre.range100.pk1
1.12    update-inlist.range100.pk1
1.17    read-only.range100.pk1
1.14    read-only.range10000.pk1
1.03    point-query.range100.pk1
1.05    random-points.range10.pk1
1.05    range-covered-pk.range100.pk1
1.06    range-notcovered-pk.range100.pk1

Relative QPS for 8.0.27 vs 5.6.49 when 8.0.27 is much slower for the IO-bound tests:

0.75    range-covered-si.pre.range100.pk1
0.68    update-one.range100.pk1
0.73    update-zipf.range100.pk1
0.55    read-write.range10.pk1
0.65    hot-points.range100.pk1
0.75    range-covered-si.range100.pk1
0.79    scan.range100.pk1
0.60    insert.range100.pk1









No comments:

Post a Comment

RocksDB on a big server: LRU vs hyperclock, v2

This post show that RocksDB has gotten much faster over time for the read-heavy benchmarks that I use. I recently shared results from a lar...