Wednesday, February 8, 2023

The value of the InnoDB change buffer

The InnoDB change buffer, formerly known as the insert buffer, is a clever feature that can make some workloads (IO-bound, write-heavy) much faster. It defers IO from secondary index maintenance for pages that are not in the buffer pool.

The insert benchmark (see here and here) is a great workload to demonstrate the benefit from the change buffer. Here I share results from it on a small server (Intel NUC, 4 cores, 16G RAM) with an IO-bound workload.

I ran these tests because I recently learned that the InnoDB change buffer has been removed from MariaDB 11 (see here). I assume there are some MariaDB workloads that benefit from the change buffer similar to the benefit it provides for the insert benchmark.

tl;dr

  • The insert rate is more than 3X larger when change buffering is enabled and inserts require secondary index maintenance. On slower storage (disks or even EBS) this difference would be even larger.
  • The number of storage reads per insert is more than 10X larger when change buffering is disabled and inserts require secondary index maintenance.
  • The change buffer also helps for read+write workloads because writes that consume less IO make more IO available for reads.

Benchmark

The insert benchmark was run with 6 steps listed below. The change buffer helps during l.i1, q100, q500 and q1000 because all require secondary index maintenance. But the largest benefit is during l.i1.

  • l.i0 - insert 400M rows without secondary indexes
  • l.x - create 3 secondary indexes
  • l.i1 - insert 100M rows with 3 secondary indexes in place. 
  • q100 - range queries with 100 inserts/s in the background, runs for 1 hour
  • q500 - range queries with 500 inserts/s in the background, runs for 1 hour
  • q1000 - range queries with 1000 inserts/s in the background, runs for 1 hour
The benchmark was repeated using three configurations:
  • cy9 - change buffer enabled, adaptive hash index disabled
  • cy9a - change buffer enabled, adaptive hash index enabled
  • cy9b - change buffer disabled, adaptive hash index disabled
Results

The results are here. The benefit is largest for the l.i1 benchmark step where inserts are done with three secondary indexes that all require index maintenance. The summary shows that the insert rate for l.i1 drops from ~10k/s with the insert buffer (results for cy9 and cy9a) to 2852/s without it (results for cy9b). The metrics section helps to explain the performance difference:
  • The rpq column (storage reads/insert) is 0.134 for cy9, 0.135 for cy9a and jumps to 1.563 for cy9b. Disabling the change buffer increases the reads/insert rate by more than 10X.
  • The cpupq column (CPU usecs/insert) is 119 for cy9, 117 for cy9a and 224 for cy9b. Disabling the change buffer increases CPU/insert by 1.9X.
The change buffer also helps the read+write benchmark steps as the background insert rate increased. The benefit is apparent in the results for q1000 (see the summary). The reason is that the change buffer reduces the IO consumed per insert which leaves more IO capacity for queries.

2 comments:

  1. And now it defaults to off for MySQL 8.4.
    Storage is not that fast yet for small and medium deployments, and big deployments with fast storage should know how to tune their MySQL's...
    Seems like a strange decision.

    ReplyDelete
    Replies
    1. I assume this is step in removing the feature. But I hope it remains.

      Delete

Evaluating vector indexes in MariaDB and pgvector: part 2

This post has results from the ann-benchmarks with the   fashion-mnist-784-euclidean  dataset for MariaDB and Postgres (pgvector) with conc...