Friday, November 1, 2024

Too many performance regressions for InnoDB in MySQL 8.0.29 and 8.0.30

There are many changes to InnoDB in MySQL 8.0.29 and 8.0.30. And many changes increases the chance of performance regressions. A recent report on this from me is here. In the worst-case the throughput drops almost in half for the update-index microbenchmark. This blog post identifies the diff that causes the regression.

tl;dr
  • this diff causes the update-index regression. The diff adds support for instant add/drop column to InnoDB and landed in 8.0.29, then 8.0.29 was retracted and the diff is new to most of us in 8.0.30.
  • MySQL 8.0.40 fixes regressions for many of the read-only microbenchmarks. I failed to acknowledge that in my previous post. But the large regression for long range scans remains, especially at lower concurrency levels. And the regression for update-index also remains in 8.0.40.
  • I remain confused about why this problem is so much worse on one of my servers (dell32) than on my other large servers
The problems that arrived in InnoDB after 8.0.28 include:
  • bug 111538 - compile-time inlining was changed leading to less-inlining for performance critical code and more CPU overhead. This bug has been closed as of 8.0.40 but the perf regression is still here.
  • bug 116463 - InnoDB busy-wait loops wait ~20% longer on average. This change wasn't intended but arrived as part of rewriting hash functions used by InnoDB. The impact from this is more CPU overhead and more context switches when there is contention. I filed this bug recently and expect it to be fixed soon.
  • new hash functions - the hash functions used by InnoDB were rewritten and while the new algorithms are likely better at hashing uniformly, they changes also brought a few bugs and bug fixes
  • I filed bug 116531 for the update-index regression
My list could have been longer but I always disable the InnoDB adaptive hash index and missed the bugs that arrived from all of the improvements to it.

Release notes

A few relevant comments from the release notes.


  • InnoDB: To improve code quality and facilitate debugging, instances of #define in the InnoDB sources were replaced by constexpr specifiers or inline functions. (WL #14680)

  • InnoDB: InnoDB now supports ALTER TABLE ... DROP COLUMN operations using ALGORITHM=INSTANT.

    Operations that support ALGORITHM=INSTANT only modify metadata in the data dictionary. Table data is unaffected, making the operations instantaneous. If not specified explicitly, ALGORITHM=INSTANT is used by default by DDL operations that support it.

    Prior to MySQL 8.0.29, an instantly added column could only be added as the last column of the table. From MySQL 8.0.29, an instantly added column can be added to any position in the table.

    Instantly added or dropped columns create a new row version. Up to 64 row versions are permitted. A new TOTAL_ROW_VERSIONS column was added to the INFORMATION_SCHEMA.INNODB_TABLES table to track the number of row versions.

    For more information about DDL operations that support ALGORITHM=INSTANT, see Online DDL Operations. (WL #13899)


  • InnoDB: Hash and random generator functions in the InnoDB sources were improved. (Bug #16739204, Bug #23584861)


  • InnoDB: Changes in hashing functions made in MySQL 8.0.30 had an adverse effect on performance. (Bug #34870256)


  • Performance; InnoDB: Several functions internal to InnoDB, which were defined as inline in MySQL 8.0.28, were found to be no longer inline in MySQL 8.0.33, due in part to refactoring which accompanied improvements made in MySQL 8.0.30 to improve the InnoDB adaptive hash index. This had an adverse effect on queries using joins on InnoDB tables. (Bug #111538, Bug #35531293)

    References: This issue is a regression of: Bug #81814, Bug #16739204, Bug #23584861.

Builds

I compiled everything from source. For MySQL 8.0.28 and 8.0.40 I used the community download. For MySQL 8.0.29 and the the two mid-release builds I used the git repo. The 8.0.29 release was retracted because of a bug. The builds are:
  • my8028_rel_o2nofp
    • MySQL 8.0.28 community release
  • my8029_rel_o2nofp
    • MySQL 8.0.29 from the git repo
  • my8040_rel_o2nofp
    • MySQL 8.0.40 community release
  • my8029_rel_o2nofp_preidrop_155cf3d902
    • this is after 8.0.28 but prior to 8.0.29 and is the diff immediately prior to the diff that adds support for instant add/drop column. It is here in github.
  • my8029_rel_o2nofp_drop_e8f422de
    • this is after 8.0.28 but prior to 8.0.29 and is the diff that adds support for instant add/drop column. It is here in github.
The my.cnf files are here for 8.0.28/8.0.29 and for 8.0.40. I use different my.cnf files because innodb_redo_log_capacity arrived in 8.0.30.

Hardware

The servers are
  • dell32
    • Dell Precision 7865 Tower Workstation with 1 socket, 128G RAM, AMD Ryzen Threadripper PRO 5975WX with 32-Cores, 2 m.2 SSD (each 2TB, RAID SW 0, ext4). 
  • ax162-s
    • AMD EPYC 9454P 48-Core Processor with SMT disabled, 128G RAM, Ubuntu 22.04 and ext4 on 2 NVMe devices with SW RAID 1. This is in the Hetzner cloud.
Benchmark

I used sysbench and my usage is explained here. A full run has 42 microbenchmarks and most test only 1 type of SQL statement. The database is cached by InnoDB.

The benchmark is run with 8 tables and 10M rows per table. I used 24 threads for the dell32 server and 40 threads for the ax162-s server. Each microbenchmark runs for 300 seconds if read-only and 600 seconds otherwise. Prepared statements were enabled.

The command lines for my helper script were:
    bash r.sh 8 10000000 300 600 nvm 1 1 24 # for dell32
    bash r.sh 8 10000000 300 600 md2 1 1 40 # for ax162-s

Results: overview

All of the results use relative QPS (rQPS) where:
  • rQPS is: (QPS for my version / QPS for base version)
  • base version is the QPS from MySQL 8.0.28
  • my version is one of the other versions
The problem arrives in the diff that adds instant add/drop column support, which is in the my8029_rel_o2nofp_drop_e8f422de build.

Throughput relative to: my8028_rel_o2nofp
col-1 : my8029_rel_o2nofp_preidrop_155cf3d902
col-2 : my8029_rel_o2nofp_drop_e8f422de
col-3 : my8029_rel_o2nofp
col-4 : my8040_rel_o2nofp

1.01    0.55    0.56    0.56    update-index_range=100 -> dell32
0.99 0.78 0.80 0.90 update-index_range=100 -> ax162-s

Results: dell32

Summary
  • Throughput on update-index drops almost in half starting in 8.0.29. The problem arrives in the diff that adds instant add/drop column support. The problem remains in 8.0.40.
  • Metrics from vmstat and iostat are here and the results for update-index show a ~1.4X increase in both CPU overhead (cpu/o) and context switch rates (cs/o) per update statement.
  • Throughput per microbenchmark is here for relative (csv, tsv) and absolute (csv, tsv)
Throughput relative to: my8028_rel_o2nofp
col-1 : my8029_rel_o2nofp_preidrop_155cf3d902
col-2 : my8029_rel_o2nofp_drop_e8f422de
col-3 : my8029_rel_o2nofp
col-4 : my8040_rel_o2nofp

col-1   col-2   col-3   col-4
1.02    1.00    1.02    1.06    hot-points_range=100
0.98    1.00    0.98    0.95    point-query.pre_range=100
0.99    1.00    0.99    0.95    point-query_range=100
1.00    0.96    0.97    1.05    points-covered-pk.pre_range=100
0.99    0.96    0.97    1.05    points-covered-pk_range=100
0.99    0.94    0.96    1.04    points-covered-si.pre_range=100
1.00    0.95    0.97    1.05    points-covered-si_range=100
1.00    0.96    0.98    1.05    points-notcovered-pk.pre_range=100
1.00    0.96    0.97    1.05    points-notcovered-pk_range=100
1.00    0.96    0.97    1.02    points-notcovered-si.pre_range=100
1.00    0.94    0.96    1.01    points-notcovered-si_range=100
0.97    0.96    0.97    1.89    random-points.pre_range=1000
0.99    0.96    0.97    1.05    random-points.pre_range=100
1.00    0.98    0.98    0.96    random-points.pre_range=10
0.98    0.97    0.97    1.91    random-points_range=1000
0.99    0.96    0.97    1.05    random-points_range=100
1.00    0.98    0.98    0.96    random-points_range=10
0.98    0.97    0.98    0.99    range-covered-pk.pre_range=100
0.98    0.97    0.98    0.99    range-covered-pk_range=100
0.98    0.98    0.97    0.99    range-covered-si.pre_range=100
0.99    0.98    0.98    0.99    range-covered-si_range=100
0.98    0.99    0.98    0.96    range-notcovered-pk.pre_range=100
0.98    0.98    0.98    0.96    range-notcovered-pk_range=100
1.00    0.96    0.97    0.97    range-notcovered-si.pre_range=100
1.00    0.96    0.97    0.97    range-notcovered-si_range=100
0.99    0.99    0.99    1.01    read-only.pre_range=10000
0.99    0.99    0.98    0.95    read-only.pre_range=100
0.98    0.99    0.99    0.95    read-only.pre_range=10
1.00    1.00    1.00    1.01    read-only_range=10000
0.99    0.98    0.99    0.95    read-only_range=100
0.99    0.99    0.98    0.95    read-only_range=10
0.99    0.93    0.93    0.91    scan_range=100
1.00    0.99    0.99    0.93    delete_range=100
1.00    1.00    1.00    0.94    insert_range=100
0.99    0.99    0.99    0.95    read-write_range=100
0.99    0.99    0.99    0.95    read-write_range=10
1.01    0.55    0.56    0.56    update-index_range=100
1.00    0.99    0.99    1.03    update-inlist_range=100
1.00    1.00    1.00    0.95    update-nonindex_range=100
1.00    0.99    1.00    0.94    update-one_range=100
1.00    1.00    1.00    0.96    update-zipf_range=100
1.00    0.98    0.97    0.94    write-only_range=10000

Results: ax162-s

Summary
  • Throughput on update-index drops by 20% starting in 8.0.29. The problem arrives in the diff that adds instant add/drop column support. The regression isn't as bad in 8.0.40 where the drop is only ~10% relative to 8.0.28.
  • Metrics from vmstat and iostat are here and the results for update-index show a ~1.15X increase in both CPU overhead (cpu/o) and context switch rates (cs/o) per update statement in 8.0.29 and smaller (1.09X for CPU, 1.03X for context switches) in 8.0.40.
  • Throughput per microbenchmark is here for relative (csvtsv) and absolute (csvtsv)
Throughput relative to: x.my8028_rel_o2nofp
col-1 : x.my8029_rel_o2nofp_preidrop_155cf3d902
col-2 : x.my8029_rel_o2nofp_drop_e8f422de
col-3 : x.my8029_rel_o2nofp
col-4 : x.my8040_rel_o2nofp

col-1 col-2 col-3 col-4
0.97 0.99 1.00 0.99 hot-points_range=100
1.00 0.99 0.98 0.94 point-query.pre_range=100
0.98 0.99 0.98 0.94 point-query_range=100
1.00 1.01 1.00 1.02 points-covered-pk.pre_range=100
0.99 1.01 1.00 1.02 points-covered-pk_range=100
1.00 1.00 0.98 0.98 points-covered-si.pre_range=100
1.00 1.00 0.98 0.99 points-covered-si_range=100
1.00 1.02 1.00 1.02 points-notcovered-pk.pre_range=100
0.99 1.01 1.00 1.02 points-notcovered-pk_range=100
0.99 0.99 0.96 0.96 points-notcovered-si.pre_range=100
0.99 0.99 0.97 0.96 points-notcovered-si_range=100
1.00 1.02 0.99 1.72 random-points.pre_range=1000
0.99 1.01 1.00 1.02 random-points.pre_range=100
1.01 1.01 0.99 0.94 random-points.pre_range=10
0.99 1.01 0.98 1.72 random-points_range=1000
0.99 1.01 1.00 1.02 random-points_range=100
1.01 1.00 0.99 0.94 random-points_range=10
1.01 0.98 0.95 0.95 range-covered-pk.pre_range=100
1.00 0.97 0.95 0.95 range-covered-pk_range=100
1.00 0.99 0.96 0.96 range-covered-si.pre_range=100
1.00 0.99 0.97 0.96 range-covered-si_range=100
1.00 0.99 0.96 0.94 range-notcovered-pk.pre_range=100
1.00 0.99 0.97 0.94 range-notcovered-pk_range=100
1.00 1.00 0.98 0.93 range-notcovered-si.pre_range=100
0.99 0.99 0.98 0.93 range-notcovered-si_range=100
1.00 0.99 0.99 1.01 read-only.pre_range=10000
0.99 0.97 0.98 0.94 read-only.pre_range=100
1.00 0.98 0.98 0.94 read-only.pre_range=10
1.00 0.99 0.99 1.00 read-only_range=10000
0.99 0.97 0.97 0.93 read-only_range=100
1.00 0.99 0.98 0.94 read-only_range=10
0.93 0.81 0.86 0.83 scan_range=100
1.00 0.99 1.02 0.93 delete_range=100
1.00 1.00 1.01 0.94 insert_range=100
0.99 0.98 0.98 0.95 read-write_range=100
0.99 0.98 0.98 0.95 read-write_range=10
0.99 0.78 0.80 0.90 update-index_range=100
0.96 0.95 0.98 1.02 update-inlist_range=100
1.01 1.01 1.02 0.96 update-nonindex_range=100
1.00 0.99 1.00 0.95 update-one_range=100
1.00 1.00 1.01 0.95 update-zipf_range=100
0.99 0.97 0.98 0.92 write-only_range=10000



No comments:

Post a Comment

Speedb vs RocksDB on a large server

I am happy to read about storage engines that claim to be faster than RocksDB. Sometimes the claims are true and might lead to ideas for mak...