This has results for the insert benchmark on a small server with a low-concurrency workload using InnoDB in MySQL 5.6.49, 5.7.31 and 8.0.22. My previous posts are for the same workload with MyRocks and Postgres.
The test is run with 1 client and 1 table. In this case, client means there is at most 1 connection for writes and 1 for reads. For some tests the write and read connections do things concurrently. My test server has 4 CPU cores and hyperthreads are disabled.
The test is run for 3 table sizes: 20M, 100M and 500M rows. The test table fits in memory for the 20M and 100M sizes and is larger than memory for the 500M size. For the read+write test steps the working set is not cached for the 500M size with InnoDB while it is cached for MyRocks and almost cached for Postgres.
The insert benchmark has several steps (explained here and here): load data, create secondary indexes, load more data, read+write and the read+write step is repeated using increasing rate limits for the writer (100/s, 200/s, 400/s, 600/s, 800/s and 1000/s).
Summary for InnoDB:
- 8.0.22 gets ~35% less throughput on insert-heavy tests vs 5.6.49
- 8.0.22 uses between 1.5X and 2X more CPU/insert vs 5.6.49
- 8.0.22 and 5.7.31 are faster on create index vs 5.6.49
- 8.0.22 gets ~10% less throughput on in-memory (20M, 100M rows) read-heavy tests vs 5.6.49
- 8.0.22 uses ~20% more CPU/query vs 5.6.49
- QPS on the read-heavy tests for the 500M row table in 5.6, 5.7 and 8.0 is much worse than for Postgres and MyRocks because InnoDB does more storage reads/query. I have yet to explain this. IO determines throughput in this case and the extra CPU overhead in 8.0 is less of an issue. I have yet to explain this but am not sure I want to spend more time on it.
For now I am sharing text summaries and in a few days I will share a longer report with graphs and tables. The text summaries are dense, which is good and bad. Dense means I can do comparisons for many DBMS configurations with all the data I need on one page. But dense also imposes a burden on the reader. The guide is here if you want to understand the output.
The report summaries are in github for the tests with 20M, 100M and 500M rows. The summary files per test step are listed below. The mrg.* files have absolute values for throughput and HW efficiency metrics. The rel.* files have the values relative to the base case and here the base case is InnoDB in 5.6.49, so the values from 5.6.49 are the denominator while values from 5.7.31 and 8.0.22 are the numerator. I share ratios for 5.7.31 and 8.0.22. For relative throughput a value < 1 means that InnoDB in 8.0 or 5.7 is slower than in 5.6. For relative HW efficiency a value > 1 means that InnoDB in 8.0 or 5.7 uses more HW per operation than 5.6.
The files are:
- load data (l.i0) - mrg.l.i0, rel.l.i0
- create secondary indexes (l.x) - mrg.l.x, rel.l.x
- load more data (l.i1) - mrg.l.i1, rel.l.i1
- read+write with 100 inserts/s - mrg.q100.2, rel.q100.2
- read+write with 200 inserts/s - mrg.q200.2, rel.q200.2
- read+write with 400 inserts/s - mrg.q400.2, rel.q400.2
- read+write with 600 inserts/s - mrg.q600.2, rel.q600.2
- read+write with 800 inserts/s - mrg.q800.2, rel.q800.2
- read+write with 1000 inserts/s - mrg.q1000.2, rel.q1000.2
As explained above I use ratios to compare throughput and HW efficiency. The numerator is the value from InnoDB in 5.7.31 and 8.0.22 while the denominator is the value in 5.6.49. For throughput a ratio < 1 means that 8.0 or 5.7 are slower. For HW efficiency a ratio > 1 means that 8.0 or 5.7 uses more HW per operation.
The ratios are triples -- the results for the test with 20M, 100M and 500M rows.
- This loads in PK order to a table that has no secondary indexes
- Insert/s (ips) ratios were 0.91, 0.90, 0.96 for 5.7 and 0.61, 0.62, 0.65 for 8.0
- CPU/insert (cpupq) ratios were 1.08, 1.15, 1.07 for 5.7 and 1.46, 1.46, 1.36 for 8.0
- Context switch/insert ratios were 0.94, 0.95, 0.94 for 5.7 and 1.67, 1.56, 1.57 for 8.0
- This creates 3 secondary indexes
- Indexed rows/s (ips) ratios were 1.50, 1.77, 1.64 for 5.7 and 1.42, 1.19, 1.54 for 8.0
- CPU/indexed row (cpupq) ratios were 0.75, 0.67, 0.73 for 5.7 and 0.75, 1.00, 0.73 for 8.0
- Write KB/indexed row (wkbpi) ratios were 0.88, 0.90, 0.90 for 5.7 and 0.78, 0.84, 0.90 for 8.0
- Context switch/indexed row ratios were 1.79, 1.70, 1.61 for 5.7 and 1.55, 1.57, 1.68 for 8.0
- This loads more data in PK order after the secondary indexes are created
- Insert/s (ips) ratios were 0.91, 0.90, 1.12 for 5.7 and 0.68, 0.56, 0.69 for 8.0
- CPU/insert (cpupq) ratios were 1.16, 1.21, 1.17 for 5.7 and 1.52, 1.89, 1.98 for 8.0
- Context switch/indexed row ratios were 1.13, 1.07, 1.07 for 5.7 and 1.81, 2.64, 1.87 for 8.0
- This has one connection doing queries and another doing 100 inserts/s
- Queries/s (qps) ratios were 1.05, 0.89, 0.31 for 5.7 and 0.90, 0.77, 0.63 for 8.0
- CPU/insert (cpupq) ratios were 0.99, 1.16, 1.32 for 5.7 and 1.18, 1.38, 1.28 for 8.0
- This has one connection doing queries and another doing 200 inserts/s
- Queries/s (qps) ratios were 1.04, 0.90, 1.05 for 5.7 and 0.89, 0.79, 1.04 for 8.0
- CPU/insert (cpupq) ratios were 1.01, 1.15, 0.96 for 5.7 and 1.19, 1.34, 1.21 for 8.0
- This has one connection doing queries and another doing 800 inserts/s
- Queries/s (qps) ratios were 1.01, 0.89, 1.03 for 5.7 and 0.88, 0.78, 1.00 for 8.0
- CPU/insert (cpupq) ratios were 1.03, 1.12, 1.00 for 5.7 and 1.21, 1.33, 1.30 for 8.0
- This has one connection doing queries and another doing 1000 inserts/s
- Queries/s (qps) ratios were 0.98, 0.86, 1.03 for 5.7 and 0.84, 0.76, 1.00 for 8.0
- CPU/insert (cpupq) ratios were 1.06, 1.20, 1.02 for 5.7 and 1.26, 1.41, 1.28 for 8.0
The facts and/or symptoms are:
- Storage reads/query (rpq) increases from ~0.01 at 100M rows to ~4 at 500M rows for InnoDB
- Table+index size at 500M rows
- Is ~95G for Postgres and ~103G for InnoDB. See dbgb1 in mrg.some, measured via "ls"
- Is ~32G for the PK index and ~35G for the secondary indexes for InnoDB in 5.7 per SHOW TABLE STATUS. Alas the Index_length column is zero in 8.0.22, see bug 80453.
- Insert buffer size (via Ibuf: size in SHOW ENGINE INNODB STATUS) is between 50k and 100k at 500M rows but less than 500 at 100M rows. The docs don't explain whether Ibuf: size counts entries or pages, but the comment column in IS.Innodb_Metrics states this is the number of pages, so between 10%. and 20% of buffer pool memory is used by the insert buffer.
- History list length is < 100 for InnoDB at 500M rows
- The read IO is from queries that are logically but not physically index-only for the secondary indexes. By logically index only I mean the queries only reference columns in the secondary index. By not physically index-only I mean that InnoDB has to read the base row in the clustered PK index as I explain on slide 7 in my talk on MVCC GC. I am not aware of counters to show when this happens. I added them to a patch long ago, but lost track of that change.
- The read IO is from pending work (purge, insert buffer merges). I write above that the history list length is small (< 100) so I doubt that purge is an issue. But the insert buffer is large (> 50,000) so I won't rule out insert buffer merges.
- InnoDB memory devoted to the insert buffer means that fewer secondary index pages stay in cache and the read IO comes from cache misses.
- The large redo logs (40G via 40 1G files, see my.cnf) don't fit in the OS page cache and too much IO is done to re-read them when redo is written. This was a problem years ago when InnoDB was doing 512 byte writes via buffered IO, as Linux would read 4096 byte filesystem pages before the first write to an uncached page, but I think that InnoDB has made changed to avoid that.