Monday, October 23, 2023

Perf regressions for MySQL/InnoDB vs the Insert Benchmark: 8.0.28 to 8.0.30

This post tries to explain a ~10% drop in the insert rate from MySQL 8.0.28 to 8.0.30 during the initial load done by the Insert Benchmark. It is a follow up to my previous post that shared results for all MySQL 8.0 releases.

tl;dr

  • MySQL needs to start using changepoint detection to find perf regressions
  • The largest threat to MySQL's future is perf regressions - Postgres doesn't have this problem
  • Most of the perf regressions during the initial load are from changes to code in the record layer (rec_get_offsets, rec_init_offsets, rec_get_nth_field). The problem is new CPU overheads.
  • On the bright side most of the regression is from InnoDB and the workaround is to use MyRocks

Updates

From my sysbench results, there are significant performance regressions in MySQL 8.0.28 and 8.0.30.

Percona has bug PS-8822 open for this.

The problem

The previous post has links to reports, including this one that has results for all 8.0 releases. The problem is visible in the Summary and the l.i0 benchmark step. The insert rate drops from 61896/s for 8.0.28 to 56726/s for 8.0.30 and then continues dropping to 54866/s for 8.0.34. The l.i0 benchmark step does the initial load of benchmark tables by inserting rows in PK order before secondary indexes are created.
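To make the size of the drop concrete, here is a minimal sketch that turns the three insert rates quoted above into relative drops. The helper name pct_drop is mine and is not part of the benchmark scripts; the first rate is treated as the baseline.

```python
def pct_drop(baseline: float, current: float) -> float:
    """Relative drop from baseline to current, as a percentage of baseline."""
    return 100.0 * (baseline - current) / baseline


# Insert rates (rows/s) for the l.i0 step, copied from the paragraph above.
baseline_rate = 61896
later_rates = {"8.0.30": 56726, "8.0.34": 54866}

for release, rate in later_rates.items():
    print(f"{release}: {pct_drop(baseline_rate, rate):.1f}% below the baseline")
```

The same helper works for any pair of throughput numbers from the reports.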

From the vmstat and iostat metrics there is a small increase in CPU overhead (cpupq is CPU/insert) and context switches (cspq is context switches/insert).

I started by looking at the flamegraphs (see here) and then wrote down the percentage of samples for parts of each flamegraph. My hard-to-decipher notes on those differences are here. At a high level, 8.0.30 spends more time in btr_cur_optimistic_insert and btr_cur_search_to_nth_level and their callees.
  • btr_cur_optimistic_insert accounts for ~11.8% of samples in 8.0.28 vs ~14.1% in 8.0.30
  • btr_cur_search_to_nth_level accounts for ~13.5% of samples in 8.0.28 vs ~15% in 8.0.30
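The comparison in the two bullets above can be sketched as a small script that ranks functions by how much their share of samples grew. This is only an illustration of the bookkeeping, not the tooling I used; the helper name share_changes is hypothetical and the percentages are the approximate values listed above.

```python
# Approximate share of flamegraph samples (%) per function: (8.0.28, 8.0.30).
samples_pct = {
    "btr_cur_optimistic_insert": (11.8, 14.1),
    "btr_cur_search_to_nth_level": (13.5, 15.0),
}


def share_changes(samples):
    """Return (function, change in percentage points, relative change in %),
    sorted so the largest increase in percentage points comes first."""
    rows = []
    for name, (old, new) in samples.items():
        rows.append((name, new - old, 100.0 * (new - old) / old))
    return sorted(rows, key=lambda row: row[1], reverse=True)


for name, points, rel in share_changes(samples_pct):
    print(f"{name}: {points:+.1f} points ({rel:+.1f}% relative)")
```

Sorting by percentage points rather than relative change keeps the focus on functions that account for the most samples.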
At a low level, most of that difference appears to come from changes to record layer code, especially functions and/or macros like rec_get_offsets, rec_init_offsets, rec_init_offsets_comp_ordinary, rec_get_nth_field and rec_get_converted_size_comp.
  1. A few macros were changed to functions, and might be harder to inline
  2. More code was added to a few macros and functions

The solution

Use changepoint detection to spot these problems before shipping a release.
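As a rough illustration of the idea, here is a minimal sketch of one simple form of changepoint detection: find the single split of a throughput series that minimizes the within-segment squared error, then report it only if the mean shifts by more than a threshold. This is a toy least-squares mean-shift detector, not what any particular project uses. The function names, the 3% threshold and the sample series are all assumptions made up for the example, not benchmark results.

```python
from statistics import mean


def sse(xs):
    """Sum of squared deviations from the segment mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)


def best_split(series, min_size=2):
    """Return the index that starts the right-hand segment of the split with
    the lowest total squared error, or None if the series is too short."""
    best_index, best_cost = None, None
    for i in range(min_size, len(series) - min_size + 1):
        cost = sse(series[:i]) + sse(series[i:])
        if best_cost is None or cost < best_cost:
            best_index, best_cost = i, cost
    return best_index


def detect_changepoint(series, min_rel_shift=0.03, min_size=2):
    """Return (index, relative shift in the mean) when the best split changes
    the mean by at least min_rel_shift, otherwise None."""
    i = best_split(series, min_size)
    if i is None:
        return None
    before, after = mean(series[:i]), mean(series[i:])
    rel = (after - before) / before
    return (i, rel) if abs(rel) >= min_rel_shift else None


# Hypothetical insert rates (rows/s) from consecutive builds, made up for illustration.
rates = [62000, 61800, 62100, 61900, 56800, 56600, 56900, 56700]
print(detect_changepoint(rates))
```

Run against results from nightly or per-commit builds, a check like this would flag the build where throughput shifted rather than waiting for a comparison across releases. Production systems usually need something more robust to noise and to multiple changepoints.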
