Thursday, June 16, 2022

Insert Benchmark for Postgres 12, 13, 14 and 15: part 2

This post has graphs of throughput vs time for two of the Insert Benchmark steps. The goal is to determine whether there is too much variance. A common source of variance is checkpoint stalls when using a B-Tree. This is a follow-up to my blog post on the Insert Benchmark for Postgres versions 12.11, 13.7, 14.3 and 15b1.

The benchmark steps for which graphs are provided are:

  • l.i0 - load in PK order without secondary indexes
  • l.i1 - load in PK order with 3 secondary indexes

The benchmark is repeated for two workloads -- cached and IO-bound.

Cached

The database fits in memory for the cached workload.

There isn't much variance for the l.i0 workload.
The graph for the l.i1 workload is more exciting, which is expected. For the l.i0 workload the inserts are in PK order and there are no secondary indexes, so each insert makes about R/P pages dirty, where R is the row size, P is the page size and R/P is much less than 1. But for l.i1 each insert is likely to make about (3 + R/P) pages dirty, so there is much more stress on page writeback and storage. The "3" in the estimate comes from doing index maintenance for each of the 3 secondary indexes. The drops in the graph are usually stalls that occur when writeback falls behind.
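
To make that estimate concrete, here is a quick back-of-the-envelope sketch. The row size below is an assumption (a ~100 byte row), not a measurement from this benchmark; the page size is the Postgres default.

    # Back-of-the-envelope estimate of pages dirtied per insert.
    # Assumptions, not measurements: R is ~100 bytes, P is the
    # Postgres default page size (8192 bytes).
    R = 100       # row size in bytes (hypothetical)
    P = 8192      # page size in bytes (Postgres default)

    l_i0 = R / P      # PK-order inserts, no secondary indexes
    l_i1 = 3 + R / P  # adds leaf-page writes for 3 secondary indexes

    print("l.i0: ~%.4f pages dirtied per insert" % l_i0)  # ~0.0122
    print("l.i1: ~%.4f pages dirtied per insert" % l_i1)  # ~3.0122

With those assumptions l.i1 dirties roughly 250x more pages per insert than l.i0, which is why writeback has a much harder time keeping up.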

IO-bound

The database is much larger than memory for the IO-bound workload.

The graph for l.i0 shows occasional drops that I assume are from writeback falling behind. However, they might also be caused by the SSD that I use.

The graph for l.i1 is more interesting, which is expected because it does more IO to storage per insert. The pattern is very regular: the insert rate gradually rises from ~2000/s to ~2800/s, and this repeats every ~300 seconds. Were I to bet, I would guess this is caused by Postgres rather than my SSD. Perhaps one day I will do experiments to determine whether tuning the Postgres config can reduce the variance.
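
Were I to run those experiments, the checkpoint and background writer settings are a reasonable place to start. This is a sketch, not a tested recommendation, and the values below are guesses rather than anything I have benchmarked:

    # postgresql.conf sketch -- untested values, for illustration only
    checkpoint_timeout = 1800s           # spread checkpoints farther apart
    max_wal_size = 64GB                  # avoid checkpoints triggered by WAL volume
    checkpoint_completion_target = 0.9   # spread checkpoint writes across the interval
    bgwriter_delay = 10ms                # wake the background writer more often
    bgwriter_lru_maxpages = 1000         # let it clean more pages per round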

