I have been claiming that I don't find significant performance regressions in MySQL 8.4 and 9.x when I use sysbench. I need to change that claim. There are regressions for write-heavy tests. They are larger for tests with more concurrency and larger when gtid support is enabled.
By "gtid support is enabled" I mean that these options are set to ON:
- gtid_mode
- enforce_gtid_consistency
Both of these are ON by default in MySQL 9.5.0 and were OFF by default in earlier releases. I just learned about the performance impact from these and in future tests I will probably repeat tests with them set to ON and OFF.
This blog post has results from the write-heavy tests with sysbench for MySQL 8.0, 8.4, 9.4 and 9.5 to explain my claims above.
tl;dr
- Regressions are most likely and larger on the insert test
- There are regressions for write-heavy workloads in MySQL 8.4 and 9.x
- Throughput is typically 15% less in MySQL 9.5 than in 8.0 for tests with 16 clients on the 24-core/2-socket server
- Throughput is typically 5% less in MySQL 9.5 than 8.0 for tests with 40 clients on the 48-core server
- The regressions are larger when gtid_mode and enforce_gtid_consistency are set to ON
- Throughput is typically 5% to 10% less with the -gtid configs vs the -nogtid configs with 40 clients on the 48-core server. But this is less of an issue on other servers.
- There are significant increases in CPU, context switch rates and KB written to storage for the -gtid configs relative to the same MySQL version using the -nogtid configs
- Regressions might be larger for the insert and update-inlist tests because they have larger transactions relative to other write-heavy tests. Performance regressions are correlated with increases in CPU, context switches and KB written to storage per transaction.
What changed?
I use diff to compare the output from SHOW GLOBAL VARIABLES when I build new releases and from that it is obvious that the default value for gtid_mode and enforce_gtid_consistency changed in MySQL 9.5 but I didn't appreciate the impact from that change.
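The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the script I use: it assumes the SHOW GLOBAL VARIABLES output was saved as tab-separated name/value lines (for example via mysql -N -B -e "SHOW GLOBAL VARIABLES"), and the function names and sample text are hypothetical.

```python
def parse_variables(text):
    """Parse tab-separated 'name<TAB>value' lines into a dict."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, value = line.partition("\t")
        result[name.strip()] = value.strip()
    return result


def changed_variables(old_text, new_text):
    """Return {name: (old_value, new_value)} for variables that differ.

    A variable that exists in only one release shows up with None
    on the side where it is missing.
    """
    old, new = parse_variables(old_text), parse_variables(new_text)
    changes = {}
    for name in sorted(old.keys() | new.keys()):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes


# Hypothetical sample text standing in for the saved output of two releases.
OLD_SAMPLE = "enforce_gtid_consistency\tOFF\ngtid_mode\tOFF\nmax_connections\t151\n"
NEW_SAMPLE = "enforce_gtid_consistency\tON\ngtid_mode\tON\nmax_connections\t151\n"

if __name__ == "__main__":
    for name, (before, after) in changed_variables(OLD_SAMPLE, NEW_SAMPLE).items():
        print(f"{name}: {before} -> {after}")
```

A changed default is easy to see in this output. Knowing whether it matters for performance still takes a benchmark.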
Builds, configuration and hardware
I compiled MySQL from source for versions 8.0.44, 8.4.6, 8.4.7, 9.4.0 and 9.5.0.
The versions that I tested are named:
- 8.0.44-nogtid
- MySQL 8.0.44 with gtid_mode and enforce_gtid_consistency =OFF
- 8.0.44-gtid
- MySQL 8.0.44 with gtid_mode and enforce_gtid_consistency =ON
- 8.4.7-nogtid
- MySQL 8.4.7 with gtid_mode and enforce_gtid_consistency =OFF
- 8.4.7-gtid
- MySQL 8.4.7 with gtid_mode and enforce_gtid_consistency =ON
- 9.4.0-nogtid
- MySQL 9.4.0 with gtid_mode and enforce_gtid_consistency =OFF
- 9.4.0-gtid
- MySQL 9.4.0 with gtid_mode and enforce_gtid_consistency =ON
- 9.5.0-nogtid
- MySQL 9.5.0 with gtid_mode and enforce_gtid_consistency =OFF
- 9.5.0-gtid
- MySQL 9.5.0 with gtid_mode and enforce_gtid_consistency =ON
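The naming scheme above maps directly to two my.cnf settings. The following is a minimal sketch of that mapping. The helper names are hypothetical and this is not how my my.cnf files are generated, it only spells out what the -gtid and -nogtid suffixes mean.

```python
def gtid_settings(config_name):
    """Split a name like '9.5.0-gtid' into (version, settings dict)."""
    version, _, suffix = config_name.rpartition("-")
    if suffix == "gtid":
        value = "ON"
    elif suffix == "nogtid":
        value = "OFF"
    else:
        raise ValueError(f"unexpected config name: {config_name}")
    settings = {"gtid_mode": value, "enforce_gtid_consistency": value}
    return version, settings


def mycnf_lines(config_name):
    """Return the two option lines that differ between -gtid and -nogtid."""
    _, settings = gtid_settings(config_name)
    return [f"{name}={value}" for name, value in settings.items()]


if __name__ == "__main__":
    for name in ("8.0.44-nogtid", "9.5.0-gtid"):
        print(name, mycnf_lines(name))
```

Setting both options explicitly in every config means the results don't depend on which default a release happens to use.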
The servers are:
- 8-core
- The server is an ASUS ExpertCenter PN53 with an AMD Ryzen 7 7735HS CPU, 8 cores, SMT disabled, 32G of RAM. Storage is one NVMe device for the database using ext4 with discard enabled. The OS is Ubuntu 24.04.
- my.cnf for the -nogtid configs are here for 8.0, 8.4, 9.4, 9.5
- my.cnf for the -gtid configs are here for 8.0, 8.4, 9.4, 9.5
- The benchmark is run with 1 thread, 1 table and 50M rows per table
- 24-core
- The server is a SuperMicro SuperWorkstation 7049A-T with 2 sockets, 12 cores/socket, 64G RAM, one m.2 SSD (2TB, ext4 with discard enabled). The OS is Ubuntu 24.04. The CPUs are Intel Xeon Silver 4214R CPU @ 2.40GHz.
- my.cnf for the -nogtid configs are here for 8.0, 8.4, 9.4, 9.5
- my.cnf for the -gtid configs are here for 8.0, 8.4, 9.4, 9.5
- The benchmark is run with 16 threads, 8 tables and 10M rows per table
- 48-core
- The server is ax162s from Hetzner with an AMD EPYC 9454P 48-Core Processor with SMT disabled and 128G of RAM. Storage is 2 Intel D7-P5520 NVMe devices with RAID 1 (3.8T each) using ext4. The OS is Ubuntu 22.04 running the non-HWE kernel (5.5.0-118-generic).
- my.cnf for the -nogtid configs are here for 8.0, 8.4, 9.4, 9.5
- my.cnf for the -gtid configs are here for 8.0, 8.4, 9.4, 9.5
- The benchmark is run with 40 threads, 8 tables and 10M rows per table
Benchmark
I used sysbench and my usage is explained here. I now run 32 of the 42 microbenchmarks listed in that blog post. Most test only one type of SQL statement. Benchmarks are run with the database cached by InnoDB. While I ran all of the tests, I only share results from a subset of the write-heavy tests.
The read-heavy microbenchmarks are run for 600 seconds and the write-heavy for 900 seconds.
The purpose is to search for regressions from new CPU overhead and mutex contention. The workload is cached -- there should be no read IO but there will be some write IO.
Results
The microbenchmarks are split into 4 groups -- 1 for point queries, 2 for range queries, 1 for writes. Here I only share results from a subset of the write-heavy tests.
I provide charts below with relative QPS. The relative QPS is the following:
(QPS for some version) / (QPS for MySQL 8.0.44)
When the relative QPS is > 1 then some version is faster than MySQL 8.0.44. When it is < 1 then there might be a regression. When the relative QPS is 1.2 then some version is about 20% faster than MySQL 8.0.44.
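The relative QPS formula is simple enough to show as code. This is a sketch with made-up QPS values, not measured results. The dictionary keys and the function name are hypothetical.

```python
def relative_qps(qps_by_version, base="8.0.44"):
    """Divide each version's QPS by the QPS of the base version."""
    base_qps = qps_by_version[base]
    if base_qps <= 0:
        raise ValueError("base QPS must be positive")
    return {version: qps / base_qps for version, qps in qps_by_version.items()}


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    sample = {"8.0.44": 10000.0, "8.4.7": 9500.0, "9.5.0": 8500.0}
    for version, rel in relative_qps(sample).items():
        print(f"{version}: {rel:.2f}")
```

With these made-up inputs the value for 9.5.0 is below 1, which is how a possible regression shows up in the charts.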
Values from iostat and vmstat divided by QPS are here for the 8-core, 24-core and 48-core servers. These can help to explain why something is faster or slower because they show how much HW is used per request, including CPU overhead per operation (cpu/o) and context switches per operation (cs/o) which are often a proxy for mutex contention.
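The per-operation metrics are rates divided by QPS. The sketch below assumes the inputs are averages over the test interval: CPU utilization from vmstat, context switches per second from vmstat and KB written per second from iostat. The exact columns and units are an assumption here, and all names and numbers are hypothetical.

```python
def per_operation(avg_cpu, avg_cs_per_sec, avg_wkb_per_sec, qps):
    """Normalize average hardware rates by throughput."""
    if qps <= 0:
        raise ValueError("qps must be positive")
    return {
        "cpu/o": avg_cpu / qps,          # CPU overhead per operation
        "cs/o": avg_cs_per_sec / qps,    # context switches per operation
        "wKB/o": avg_wkb_per_sec / qps,  # KB written to storage per operation
    }


def ratio(a, b):
    """Per-metric ratio a/b, e.g. a -gtid config relative to -nogtid."""
    return {name: a[name] / b[name] for name in a}


if __name__ == "__main__":
    # Made-up inputs for illustration only.
    gtid = per_operation(50.0, 200000.0, 100000.0, 10000.0)
    nogtid = per_operation(40.0, 160000.0, 80000.0, 10000.0)
    for name, value in ratio(gtid, nogtid).items():
        print(f"{name}: {value:.2f}X")
```

Dividing by QPS matters because a config with lower throughput can show lower absolute rates while still using more hardware per request.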
The spreadsheet and charts are here and in some cases are easier to read than the charts below. The y-axis doesn't start at 0 to improve readability.
Results: 8-core
Summary
- For many tests there are small regressions from 8.0 to 8.4 and 8.4 to 9.x
- There are small improvements (~5%) for the -gtid configs vs the -nogtid result for update-index
- There is a small regression (~5%) for the -gtid configs vs the -nogtid result for insert
- There are small regressions (~1%) for the -gtid configs vs the -nogtid result for other tests
From vmstat metrics for the insert test where perf decreases with the 9.5.0-gtid result
- CPU per operation (cpu/o) increases by 1.10X with the -gtid config
- Context switches per operation (cs/o) increases by 1.45X with the -gtid config
- KB written to storage per commit (wKB/o) increases by 1.16X with the -gtid config
From vmstat metrics for the update-index test where perf increases with the 9.5.0-gtid result
- CPU per operation (cpu/o) decreases by ~3% with the -gtid config
- Context switches per operation (cs/o) decrease by ~2% with the -gtid config
- KB written to storage per commit (wKB/o) decreases by ~3% with the -gtid config
- This result is odd. I might try to reproduce it in the future
Results: 24-core
Summary
- For many tests there are regressions from 8.0 to 8.4 and 8.4 to 9.x and throughput is typically 15% less in 9.5.0 than 8.0.44
- There are large regressions in 9.4 and 9.5 for update-inlist
- There is usually a small regression (~5%) for the -gtid configs vs the -nogtid result
From vmstat metrics for the insert test comparing 9.5.0-gtid with 9.5.0-nogtid
- Throughput is 1.15X larger in 9.5.0-nogtid
- CPU per operation (cpu/o) is 1.15X larger in 9.5.0-gtid
- Context switches per operation (cs/o) are 1.23X larger in 9.5.0-gtid
- KB written to storage per commit (wKB/o) is 1.24X larger in 9.5.0-gtid
From vmstat metrics for the update-inlist test comparing both 9.5.0-gtid and 9.5.0-nogtid with 8.0.44-nogtid
- The problems here look different than most other tests as the regressions in 9.4 and 9.5 are similar for the -gtid and -nogtid configs. If I have time I will get flamegraphs and PMP output. The server here has two sockets and can suffer more from false-sharing and real contention on cache lines.
- Throughput is 1.43X larger in 8.0.44-nogtid
- CPU per operation (cpu/o) is 1.05X larger in 8.0.44-nogtid
- Context switches per operation (cs/o) are 1.18X larger in 8.0.44-nogtid
- KB written to storage per commit (wKB/o) is ~1.12X larger in 9.5.0
Results: 48-core
Summary
- For many tests there are regressions from 8.0 to 8.4
- For some tests there are regressions from 8.4 to 9.x
- There is usually a large regression for the -gtid configs vs the -nogtid result and the worst case occurs on the insert test
From vmstat metrics for the insert test comparing 9.5.0-gtid with 9.5.0-nogtid
- Throughput is 1.17X larger in 9.5.0-nogtid
- CPU per operation (cpu/o) is 1.13X larger in 9.5.0-gtid
- Context switches per operation (cs/o) are 1.26X larger in 9.5.0-gtid
- KB written to storage per commit (wKB/o) is 1.24X larger in 9.5.0-gtid
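A statement like "throughput is 1.17X larger in 9.5.0-nogtid" can be turned into "percent less in 9.5.0-gtid" by inverting the ratio. The helper below is a hypothetical sketch of that arithmetic, using ratios quoted in this post.

```python
def percent_less(ratio_larger):
    """Convert 'A is ratio_larger X larger than B' into 'B is N% less than A'."""
    if ratio_larger <= 0:
        raise ValueError("ratio must be positive")
    return (1.0 - 1.0 / ratio_larger) * 100.0


if __name__ == "__main__":
    for r in (1.15, 1.17, 1.43):
        print(f"{r:.2f}X larger -> {percent_less(r):.1f}% less")
```

The two forms are not interchangeable: 1.17X larger in one direction is roughly 14.5% less in the other, not 17% less.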


