Tuesday, September 12, 2023

Perf regressions in MySQL from 5.6.21 to 8.0.34 using sysbench and a small server

This has results for sysbench vs upstream MySQL on a small server. I have results for every release in 5.7 and 8.0 and some releases in 5.6. The goal is to document where things get faster or slower over time for a low-concurrency and CPU-bound workload. The focus is on CPU regressions. 

My results here aren't universal. 

  • Things won't look the same with an IO-bound workload. If nothing else that will make many of the CPU regressions less significant.
  • Things won't look the same with a workload that has more concurrency. While MySQL tends to get slower over time from more CPU overhead it also gets faster over time on concurrent workloads from improvements to synchronization code. Results from a few months ago on a larger server are here and the regressions are much smaller.
  • Things won't look the same with a workload that has complex queries. Most of the queries used by sysbench are simple and short running. This amplifies the impact of perf regressions in parse, semantic analysis and query optimization. 

tl;dr

  • Upstream MySQL would benefit from changepoint detection.
  • MySQL 8.0 is the worst for perf regressions, while 5.7 and 5.6 are better at avoiding them. In theory this is good news because newer regressions are easier to fix than older ones.
  • For most of the benchmark steps MySQL 8.0.34 gets between 20% and 50% less QPS than 5.6.21
  • There were large regressions for point query benchmark steps in 8.0.2x, Fortunately bug 102037 has been fixed in 8.0.3x (I found that bug via past usage of sysbench).
What about Postgres?
  • On the same setup the regressions from Postgres 11 to Postgres 16 are small while the regressions here are big. 
  • On a medium server they are also small.

Builds

It isn't easy to build older code on newer systems, compilers, etc. Notes on that are here for 5.6, for 5.6 and 5.7, for 5.7 and for 8.0. A note on using cmake is here. The rel builds were used -- everything was compiled using CMAKE_BUILD_TYPE=Release.

Tests were done for:
  • 5.6 - 5.6.21, 5.6.31, 5.6.41, 5.6.51
  • 5.7 - all point releases from 5.7.10 to 5.7.43
  • 8.0 - all point releases from 8.0.13 to 8.0.34
I used the cy10a_bee config and it is here for 5.6, 5.7 and 8.0. For 8.0 releases older than 8.0.19 I changed innodb_idle_flush_pct=1 to loose_innodb_idle_flush_pct=1.

Benchmarks

I used sysbench and my usage is explained here. Tests were run on a small server I have at home (see here). The test tables are cached by InnoDB.

Results

For the results below I split the benchmark steps into 5 groups -- 2 for point queries, 2 for range queries, 1 for writes. For the range query benchmark steps part 1 has queries that don't do aggregation while part 2 has queries that do aggregation. Unfortunately, I included the full scan benchmark step (scan_range=100) in part 2 but it doesn't do aggregation. I struggled with the presentation because there are a large number of results per chart in the sections that test all releases from 5.7 and 8.0. The spreadsheet with all data and charts is here and is easier to read.

All of the charts have relative throughput on the y-axis where that is (QPS for $me) / (QPS for $base), $me is a version (for example 5.7.24) and $base is the base version. The base version is the oldest version being compared and the goal is to see whether QPS is better or worse over time.

The legend on the RHS of each chart truncates the names I use for the benchmark steps and I don't know how to fix that other than sharing the link to the Google Sheet I used.

From 5.6.21 to 8.0.34

This section uses 5.6.21 as the base version and then compares that with 5.6.51, 5.7.10, 5.7.28, 5.7.43, 8.0.13, 8.0.23 and 8.0.34 to show how performance has changed from oldest tested (5.6.21) to newest tested (8.0.34).

  • For point queries, 8.0.3x gets 20% to 35% less QPS vs 5.6.21. There were larger regressions in 8.0.2x but most were fixed in 8.0.3x.
  • For range queries that don't do aggregation (part 1), 8.0.3x gets 30% to 40% less QPS vs 5.6.21. For range queries that do aggregation, 8.0.3x gets 0% to 35% less QPS vs 5.6.21. The difference depends on the length of the range scan (shorter scan == larger regression). Also, full scan (scan_range=100) has a significant regression.
  • For most writes, 8.0.3x gets 25% to 50% less QPS vs 5.6.21
It isn't easy to see but some of the lines are on top of each other.

MySQL 8.0: all point releases

This section uses 8.0.13 as the base version and compares every other 8.0 release with it.

  • For point queries, 8.0.34 gets 5% to 12% less QPS than 8.0.13. For some benchmark steps large regressions in 8.0.2x were fixed in 8.0.3x - that might have been bug 102037, found by my past usage of sysbench and fixed in 8.0.31. For the point-query* benchmark steps there are slow but steady regressions from 8.0.2x to 8.0.3x. 
  • For range queries, 8.0.34 gets 5% to 20% less QPS than 8.0.13. There are slow but steady regressions from 8.0.2x to 8.0.3x and the impact is becoming significant. Full scan (scan_range=100) has the largest regression.
  • For writes, 8.0.34 does better than 8.0.13 on the update-index benchmark step but otherwise gets 15% to 25% less QPS than 8.0.13. The regressions are slow but steady.

MySQL 5.7: all point releases

This section uses 5.7.10 as the base version and compares every other 5.7 release with it.

  • For point queries, 5.7.43 gets 2% to 6% less QPS than 5.7.10
  • For range queries that don't do aggreation (part 1), 5.7.43 gets 5% to 12% less QPS than 5.7.10. For range queries that do aggregation, 5.7.43 gets about 5% less QPS than 5.7.10. Full scan is 15% slower in 5.7.43 vs 5.7.10.
  • For writes, 5.7.43 gets more QPS than 5.7.10 for update-index and update-inlist. Otherwise it gets between 3% and 8% less QPS.

MySQL 5.6: some point releases

This section uses 5.6.21 as the base version and compares 5.6.31, 5.6.41 and 5.6.51 with it.

  • For point queries, 5.6.51 and 5.6.21 have similar results.
  • For range queries, excluding full scan, 5.6.51 gets 2% to 4% less QPS than 5.6.21. For full scan (scan_range=100) it gets 16% less QPS.
  • For writes, 5.6.51 gets from 5% less to 2% more QPS than 5.6.21 except for update-list where it gets 20% more.





No comments:

Post a Comment

How efficient is RocksDB for IO-bound, point-query workloads?

How efficient is RocksDB for workloads that are IO-bound and read-only? One way to answer this is to measure the CPU overhead from RocksDB a...