Tuesday, May 30, 2023

Updates to the insert benchmark

The insert benchmark can now be called the insert+delete benchmark.

While I continue to work on the replacement for the insert benchmark (see insert benchmark v3), progress on that is slow. I recently enhanced the insert benchmark so that the write-heavy steps can run while keeping the working set in memory, which lets me run the write-heavy steps for much longer than a few minutes and search for problems in MVCC GC implementations.

This is done via the --delete_per_insert option, which works as advertised: a delete is done per insert to avoid growing the number of rows in a benchmark table. Note that the size of the database files might still grow if MVCC GC gets unhappy, but that is the point of the insert benchmark -- it exists to make MVCC GC unhappy.
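
To make the idea concrete, here is a minimal sketch of what a delete per insert means at the SQL level, assuming a DB-API connection and a table whose PK is on the transactionid column. The function name and the columns other than transactionid are made up for this sketch -- this is not the code in iibench.py, and in the real benchmark the deletes are done by a separate connection (see the implementation notes below).

  def insert_and_trim(conn, table, rows):
      # rows is a list of (transactionid, price, data) tuples; only the
      # transactionid PK comes from the benchmark description, the other
      # column names are illustrative.
      cur = conn.cursor()
      # New rows get the largest transactionid values (inserts at the head).
      cur.executemany(
          f"INSERT INTO {table} (transactionid, price, data) VALUES (%s, %s, %s)",
          rows)
      # Delete the same number of the oldest rows (the tail) so the row
      # count stays constant. MySQL allows DELETE ... ORDER BY ... LIMIT.
      cur.execute(
          f"DELETE FROM {table} ORDER BY transactionid ASC LIMIT {int(len(rows))}")
      conn.commit()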

More information on the insert benchmark is here and here.

The steps in the insert benchmark used to be:

  1. l.i0 - Inserts in PK order. The PK is on the transactionid column. The benchmark table has a PK index but no secondary indexes. This should be fast for a (right-growing) b-tree regardless of the buffer pool size. Of course it is faster for an LSM.
  2. l.x - Create 3 secondary indexes on each benchmark table.
  3. l.i1 - More inserts in PK order. These are slower than l.i0 because there are 3 secondary indexes that require maintenance. The drop in performance is much larger for a b-tree than an LSM, especially when the working set is larger than memory because secondary index maintenance is read-modify-write for a b-tree but read-free for MyRocks.
  4. q100 - Do queries as fast as possible and for each query client there is another client that does 100 inserts/s.
  5. q500 - Do queries as fast as possible and for each query client there is another client that does 500 inserts/s.
  6. q1000 - Do queries as fast as possible and for each query client there is another client that does 1000 inserts/s.

The steps l.i1, q100, q500 and q1000 have been modified for the insert+delete benchmark. In each case there is a connection that does deletes in addition to the connection that does inserts. The delete rate (deletes/s) is <= the insert rate, but the goal is for them to be equal.
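
The q100, q500 and q1000 steps pair each query client with a client that inserts at a fixed rate. Here is a minimal sketch of that pacing, assuming a do_batch callback that inserts batch_size rows -- the names are illustrative and this is not the code in iibench.py.

  import time

  def paced_writer(do_batch, rows_per_second, batch_size=100):
      # Call do_batch(batch_size) often enough to sustain roughly
      # rows_per_second inserts/s. Illustrative pacing loop only.
      interval = batch_size / float(rows_per_second)
      next_due = time.monotonic()
      while True:
          do_batch(batch_size)               # insert batch_size rows
          next_due += interval               # when the next batch is due
          delay = next_due - time.monotonic()
          if delay > 0:
              time.sleep(delay)              # we are ahead, sleep it off
          # if delay <= 0 we are behind and start the next batch at once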

Implementation notes:
  • Each connection runs in a separate thread (really a separate process, because I use Python multiprocessing).
  • The default way to run this is with a client per table, but sometimes I run it so that all clients share the same table. With a client per table I encounter fewer perf problems from contention.
  • The --delete_per_insert option only works with a client per table config.
  • With --delete_per_insert the number of inserts/s done by a client is shared with the delete connection so that deletes run as fast as inserts, but no faster. They can run slower, but that would be a performance problem. A sketch of this coordination follows the list.
  • Deletes are done from the tail of the table. When N rows are to be deleted, the delete client deletes the N rows with the smallest transactionid values. So the benchmark table is like a queue -- inserts at the head, deletes from the tail.
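
Here is a minimal sketch of that coordination, assuming one insert process and one delete process per table that share a counter via Python multiprocessing. The names and batch sizes are illustrative and the SQL is elided -- this is not the implementation in iibench.py.

  import multiprocessing as mp
  import time

  def insert_process(inserted, batches, batch_size):
      for _ in range(batches):
          # ... insert batch_size rows with increasing transactionid values ...
          with inserted.get_lock():
              inserted.value += batch_size   # publish how many rows exist

  def delete_process(inserted, target):
      deleted = 0
      while deleted < target:
          with inserted.get_lock():
              available = inserted.value - deleted
          if available <= 0:
              time.sleep(0.001)              # deletes never outrun inserts
              continue
          n = min(available, 100)
          # ... delete the n rows with the smallest transactionid values ...
          deleted += n

  if __name__ == "__main__":
      inserted = mp.Value("q", 0)            # rows inserted so far, shared
      batches, batch_size = 100, 100
      p_ins = mp.Process(target=insert_process, args=(inserted, batches, batch_size))
      p_del = mp.Process(target=delete_process, args=(inserted, batches * batch_size))
      p_ins.start(); p_del.start()
      p_ins.join(); p_del.join()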

2 comments:

  1. Do you expect a Global Variable for delete_per_insert in the future?

    1. I don't understand. It is a flag to iibench.py so it gets set per benchmark step -- it isn't set for l.i0 (initial load), it is set for l.i1 (random inserts) and the read+write steps (q100, q500, q1000).

      https://github.com/mdcallag/mytools/blob/master/bench/ibench/iibench.py#L132

