Friday, December 4, 2020

Tuning for the Insert Benchmark: Postgres, InnoDB & MyRocks

I used the Insert Benchmark and a small server to compare performance across many (11, 18 & 17) different configurations for Postgres, InnoDB and MyRocks. The goal is to determine which options matter for this scenario (workload + HW). The goal isn't to compare Postgres vs InnoDB vs MyRocks.

Results are here for Postgres, InnoDB and MyRocks. Conclusions are below. Some of these are specific to the workload + HW, so be careful about applying my findings. 

  • For InnoDB
    • It is hard to improve on innodb_dedicated_server. This is great news for deployments as there is less to tune. I hope the perf test community does more to confirm that across more workloads and HW.
    • Performance with the minimal (default + a few things) configuration is lousy. The solution is to set innodb_dedicated_server.
    • Performance with 4kb and 8kb pages is lousy except for QPS with more IO-bound database size where it is better. I don't understand the CPU overhead when it is lousy.
    • QPS with the more IO-bound database is much worse than MyRocks and Postgres. I don't know why.
  • For MyRocks
    • For the in-memory and less IO-bound databases the configurations didn't matter as perf was good for all. For the more IO-bound database all configurations except the minimal were good.
    • Tuning MyRocks was easy
  • For Postgres
    • Vacuum is needed, even for insert-only workloads, to set visibility map bits
    • wal_compression=on was bad for perf in many cases. Is PGLZ overhead too large?
    • Using a larger WAL helps by reducing checkpoint IO. This is standard advice.

A next step is to repeat the experiment with bigger HW and/or more concurrency. I want to try running iibench.py with PyPy and probably should upgrade the Python connector for MySQL.

No comments:

Post a Comment

Evaluating vector indexes in MariaDB and pgvector: part 2

This post has results from the ann-benchmarks with the   fashion-mnist-784-euclidean  dataset for MariaDB and Postgres (pgvector) with conc...