Saturday, August 12, 2023

Tuning InnoDB for the Insert Benchmark on a small server

I used the Insert Benchmark to find a good my.cnf for InnoDB on a small server with MySQL 8.0, 5.7 and 5.6. Results for InnoDB on a large server are here.

tl;dr

  • Throughput in 8.0.33 for a cached workload is about half of what it was in 5.6.51. The reason is new CPU overhead. The difference here with a small server and low concurrency workloads is larger than what I measure on larger servers with high concurrency.
  • Single-page flushing causes bad (multi-second) stalls in 8.0 for reads and writes. This makes it hard to get good QoS from InnoDB for a write-heavy workload.
  • The database size growth isn't unreasonable during the l.i1 benchmark step -- and it was unreasonable on the medium and large servers. This is probably explained by the usage of low-concurrency here (1 client) vs 8 on the medium and 24 on the large server.
  • Disabling the doublewrite (not safe for production) reduces or avoids performance problems where single-page flush stalls are part of the root cause. Hopefully this code gets revisited by upstream.
  • The base configuration is one of the best. While I tested up to 22 variations on it, the variations don't improve much for a small server with a low concurrency workload.
  • Queries can be slowed by MVCC GC debt unless the change buffer is disabled, but that comes at a huge cost to the write rate. The low query rates here also suffer from bug 109595.

Builds

I used upstream MySQL 8.0.33, 5.7.40 and 5.6.51. All used CMAKE_BUILD_TYPE=Release and then the 5.7 and 8.0 builds also used -flto (link time optimization).  The builds are named my8033_rel_lto, my5740_rel_lto and my5651_rel.

Benchmark

The insert benchmark was run in three setups.

  • cached by InnoDB - all tables are cached by InnoDB
  • cached by OS - all tables are cached by the OS page cache but not by InnoDB
  • IO-bound - the database is larger than memory

For cached by InnoDB and IO-bound the setups have a 10G InnoDB buffer pool and don't use buffered IO (innodb_flush_method = O_DIRECT_NO_FSYNC), with one exception -- the a4 config uses innodb_flush_method = fsync.

For cached by OS the configurations have a 1G InnoDB buffer pool and use buffered IO (innodb_flush_method = fsync). 

This benchmark used the Beelink server explained here that has 8 cores, 16G RAM and 1TB of NVMe SSD with XFS and Ubuntu 22.04. 

The benchmark is run with 1 client and 1 table. The benchmark is a sequence of steps.

  • l.i0
    • insert X million rows per table where X is 20 for cached and 500 for IO-bound
  • l.x
    • create 3 secondary indexes. I usually ignore performance from this step.
  • l.i1
    • insert and delete another X million rows per table with secondary index maintenance where X is 40 for cached by InnoDB, 5 for cached by OS and 5 for IO-bound. The number of rows/table at the end of the benchmark step matches the number at the start with inserts done to the table head and the deletes done from the tail. 
  • q100
    • do queries as fast as possible with 100 inserts/s/client and the same rate for deletes/s done in the background. Run for 3600 seconds.
  • q500
    • do queries as fast as possible with 500 inserts/s/client and the same rate for deletes/s done in the background. Run for 3600 seconds.
  • q1000
    • do queries as fast as possible with 1000 inserts/s/client and the same rate for deletes/s done in the background. Run for 3600 seconds.

Configurations

All of the my.cnf files are here.

The base configurations are here:

  • for cached by InnoDB and IO-bound: 5.6.515.7.408.0.33. The naming pattern is my.cnf.cy10$x_bee where x is a for the base config and a1 ... a22 otherwise.
  • for cached by OS: 5.6.515.7.408.0.33. The naming pattern is my.cnf.cy10$x_bee where x is a for the base config and a1 ... a22 otherwise.

The configurations tested are:

  • base - see links above
  • a2 - increases innodb_io_capacity[_max] to 20k/40k
  • a3 - adds innodb_flush_sync=off and innodb_io_capacity[_max]=20k/40k
  • a4 - adds innodb_flush_method=fsync
  • a5 - adds innodb_change_buffering=none
  • a7 - adds innodb_lru_scan_depth=2048
  • a10 - adds innodb_purge_threads=8, innodb_page_cleaners=8
  • a12 - reduces innodb_redo_log_capacity to 32G
  • a13 - adds innodb_max_purge_lag=50k, innodb_max_purge_lag_delay=1M, innodb_purge_threads=8
  • a14 - adds innodb_change_buffering=inserts
  • a15 - changes innodb_idle_pct to default
  • a16 - reduces innodb_io_capacity to 2000 and innodb_io_capacity_max to 4000
  • a17 - adds innodb_doublewrite=OFF (not safe for production)
  • a18 - adds innodb_buffer_pool_instances=4, innodb_page_cleaners=4
  • a19 - adds innodb_max_dirty_pages_pct_lwm=20, innodb_max_dirty_pages_pct=50
  • a20 - like a13 but with innodb_max_purge_lag=200k
  • a21 - like a13 but with innodb_max_purge_lag=500k
  • a22 - combines a10 and a20
Most of the configurations were used for MySQL 8.0.33. Only the first three or four (a1 to a4) were tested for 5.6.51 and 5.7.40.

Performance

For each of the workloads below I analyze the performance in four parts:
  • average throughput - I look at the average queries and inserts per second to see if these are much better or worse than the base config
  • response time - I look at max response time and the response time distributions to see if a config makes them much better or much worse than the base config
  • charts - there are charts for the per-second query, insert and delete rates. I look at the graphs to see if the rates degrade over time or have too many stalls.
  • SLA - there is a target rate for the background inserts and deletes which is one of 100, 500 or 1000 per second per client. There is an SLA failure when the actual rate is too much smaller than the target.
  • get_min queries - select min(transactionid) from $table is run at the start of some benchmark steps and this query should be fast because transactionid is the PK. But in some cases it is slow when there is too much MVCC GC debt.
  • size - How much does the database size increase from the create index benchmark step (l.x) to the write-only step (l.i1).  Anything larger than 1.5X for a b-tree can be too much.
Reports are here:
Results: cached by InnoDB

Summary:
  • average throughput
    • For 5.6 the base, a1 and a2 configs are similar. The l.x and l.i1 benchmark steps are slower with the a4 config.
    • For 5.7 the base, a1, a2 and a3 configs are similar. The l.i0, l.x and l.i1 benchmarks steps are slower with the a4 config.
    • For 8.0 all configs are similar on the l.i0, l.i1, q100, q500 and q1000 benchmark steps. There is more variance on the l.x (create index) step where the a18 config gets the best perf.
    • For 5.6 vs 5.7 vs 8.0
      • For the l.i0 step the insert rate in 8.0 drops to ~58% of the 5.6 throughput
      • For the l.x step creating indexes is ~1.3X faster in 8.0 vs 5.6
      • For the l.i1 step the insert rate in 8.0 is ~15% less vs 5.6 but 5.7 was faster than 5.6
      • For the q100, q500 and q1000 steps the query rate in 8.0 drops to ~56% of what it was in 5.6
      • For all of the benchmark steps there is a big increase in CPU overhead (see cpupq here) from 5.6 to 8.0
  • response time
    • For 5.65.7 and 8.0 the response time histograms were similar for all configs
  • charts
    • I am skipping the charts as this post is long enough.
  • SLA
    • There were no SLA failures and the target insert+delete rates were sustained.
  • get_min
    • All select min(transactionid) queries were fast
  • size
    • For 5.6 the database size grows by 1.41X which is OK
    • For 5.7 the database size grows by 1.47X which is OK
    • For 8.0 the database size grows by 1.43X which is OK
Results: cached by OS

Summary:
  • average throughput
    • For 5.6 the base config was best and the a2 config was bad
    • For 5.7 the base and a1 config were the best, the a2 and a3 configs were bad
    • For 8.0 the a17 config (disables doublewrite, not safe for production) and otherwise InnoDB falls over (it looks like the problem is single-page flushing)
    • For 5.6 vs 5.7 vs 8.0
      • For the l.i0 step the insert rate in 8.0 drops to ~60% of the 5.6 throughput
      • For the l.i1 step throughput improves from 5.6 to 5.7, but in 8.0 drops to ~63% of the 5.6 throughput
      • For q100, q500 and q1000 the query rate drops from 5.6 to 5.7 and the results for 8.0 are lousy
  • response time
    • For 5.6 all configs have ~2 second outliers for l.i1. The a2 config is worse for q100, q500 and q1000.
    • For 5.7 all configs have ~1 second outliers for l.i1. The a2 and a3 configs are worse for q100, q500 and q1000.
    • For 8.0 all configs except a17 have ~4 second outliers for l.i1. For q100, q500 and q1000 all configs have ~4 second outliers for queries, inserts and deletes. This looks like a problem with too-frequent single-page flush stalls. InnoDB in 8.0 does much worse than 5.7 or 5.6.
  • charts
    • I am skipping the charts as this post is long enough.
  • SLA
    • For 5.6 the a2 config did not sustain the target insert rates for q100, q500 and q1000
    • For 5.7 the a2 and a3 configs did not sustain the target insert rates
    • For 8.0 only the a17 config sustains the target insert rates
  • get_min
    • For 5.6 the query is slow (55+ seconds) for the a2 config
    • For 5.7 the query is slow (22+ seconds) for all configs
    • For 8.0 the query is slow (30 to 500 seconds) for all configs except a17 and the root cause is probably single-page flush stalls which are avoided by a17 because that disables doublewrite
  • size
    • For 5.6 the database size grows by 1.41X for the a0 and a1 configs which is OK but by 1.70X for a2 which is a problem
    • For 5.7 the database size grows by 1.57X which is too much
    • For 8.0 the database size grows by 1.53X to 1.70X which is too much
Results: IO-bound

Summary:
  • average throughput
    • For 5.6 the base, a1 and a2 configs are similar. The a4 config is bad.
    • For 5.7 the base, a1, a3 and a3 configs are similar. The a4 config is bad.
    • For 8.0 some configs did worse on the l.i1 benchmark step (a5, a13, a14, a18) which might explain why they did better on q100 (because there was less writeback debt). But the base config was one of the best.
    • For 5.6 vs 5.7 vs 8.0
      • For the l.i0 step the insert rate in 8.0 drops to ~59% of the 5.6 throughput
      • For the l.x step 5.7 and 8.0 are ~1.3X faster than 5.6
      • For the l.i1 step throughput in 5.7 and 8.0 is ~1.35X larger
      • For q100 and q500 the query rate in 8.0 is ~6% less than 5.6 while for q1000 it is ~2% larger than 5.6
  • response time
    • For 5.6 config a4 does much worse than base, a1 or a2.
    • For 5.7 config a4 does much worse than base, a1, a2 or a3
    • For 8.0 config a5 does worse on l.i1 and for inserts + deletes on q100, q500 and q1000
  • charts
    • I am skipping the charts as this post is long enough.
  • SLA
    • For 5.6 the a4 config does not sustain the target insert rates for q100, q500 and q1000
    • For 5.7 the a4 config does not sustain the target insert rates
    • For 8.0 all configs sustain the target insert rates
  • get_min
    • For 5.6 the query is slow (~1 second) for the base, a1 and a2 config and ~105 seconds for the a4 config
    • For 5.7 the the query is slow (1 to 3 seconds) for the base, a1, a2 and a3 configs and ~311 seconds for the a4 config
    • For 8.0 the query is slow (~1 second) for all configs except a5 which disabled the change buffer
  • size
    • For 5.6 the database size grows by ~1.4X which is OK
    • For 5.7 the database size grows by ~1.3X which is OK
    • For 8.0 the database size grows by ~1.3X which is OK












No comments:

Post a Comment

RocksDB on a big server: LRU vs hyperclock, v2

This post show that RocksDB has gotten much faster over time for the read-heavy benchmarks that I use. I recently shared results from a lar...