Thursday, October 27, 2016

Benchmarketing MyRocks

I have been spending time understanding MyRocks performance for new workloads including benchmarks that potential MyRocks users run. One of those benchmarks is sysbench and I wrote a script to make it easier for me to run.

sysbench

Like most synthetic benchmarks, sysbench is valuable but has its flaws. It helps to understand the flaws when looking at results. Most uses of sysbench are for very small databases. A typical run for me is 8 tables with 1M rows per table. That uses about 2G of space with uncompressed InnoDB tables. A typical MyRocks configuration will store that in a 3-level LSM tree with data in levels 0, 1 and 2, and I usually disable compression for those levels. And if you are running performance tests for a 2G database that fits in cache, I wouldn't use compression. Small databases save time when running benchmarks because the load happens real fast. But you might miss the real overheads that occur with a larger database.

Another possible problem with sysbench is that several of the test configurations are for read-only workloads. If your real workload isn't read-only, then you might miss real overheads. For example, the RocksDB memtable might be empty for a read-only workload. That avoids the cost of checking the memtable on a query and can overstate the QPS you will measure.
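A toy model makes that cost concrete. The sketch below is not RocksDB code. It counts how many places a point lookup might have to consult in the worst case for a leveled LSM, ignoring bloom filters, the block cache and immutable memtables. The function name and the example file counts are made up for illustration.

```python
def sorted_runs_to_check(memtable_entries, l0_files, nonempty_deeper_levels):
    """Worst-case number of places a point lookup may consult.

    Simplified model of a leveled LSM: no bloom filters, no caches,
    no immutable memtables.
    """
    runs = 0
    if memtable_entries > 0:
        runs += 1                    # a non-empty memtable is searched first
    runs += l0_files                 # L0 files can overlap, so each may be checked
    runs += nonempty_deeper_levels   # L1 and below: at most one file per level
    return runs


# Hypothetical read-only case: empty memtable, nothing in L0, data in L1 and L2.
read_only = sorted_runs_to_check(memtable_entries=0, l0_files=0,
                                 nonempty_deeper_levels=2)

# Hypothetical read-write case: busy memtable and a few L0 files.
read_write = sorted_runs_to_check(memtable_entries=1000, l0_files=4,
                                  nonempty_deeper_levels=2)

print(read_only, read_write)
```

With these made-up inputs the read-only case checks fewer places per lookup than the read-write case, which is the kind of gap that can inflate QPS on a read-only test.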

I spent a day explaining unexpected performance variance on a read-only sysbench test. I took too long to notice that the LSM on the slower server had data in levels 0, 1 and 2 while the LSM on the faster server only used levels 1 and 2. With no data in level 0 there was less work to do per query, so the faster server got more QPS. This was visible in the compaction IO statistics displayed by SHOW ENGINE ROCKSDB STATUS. Had this been a read-write workload the LSM would have been in a steadier state with data (usually) in the memtable and level 0. But in this case the memtable was empty and compaction had stopped because there were no writes and the compaction scores for all levels were below 1. I wonder whether we can add a feature to RocksDB to trigger compaction during read-only workloads when the LSM tree can be made more performant for queries?
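To show why compaction can stay idle with data still sitting in level 0, here is a simplified sketch of how leveled compaction scores work. It assumes the usual scheme where the level 0 score is the file count divided by a trigger and the score for deeper levels is the level size divided by a target size, with compaction scheduled once a score reaches 1. The trigger of 4 files and the 256MB base with a 10x multiplier are assumed values for illustration, not the settings from my servers, and the helper names are hypothetical.

```python
MB = 1024 * 1024


def level0_score(num_l0_files, l0_trigger=4):
    # Level 0 is scored by file count relative to the compaction trigger.
    return num_l0_files / l0_trigger


def level_score(level_bytes, target_bytes):
    # Deeper levels are scored by size relative to the level's target size.
    return level_bytes / target_bytes


def needs_compaction(scores):
    # In this model compaction is scheduled when any score reaches 1.
    return any(score >= 1.0 for score in scores)


# Hypothetical idle LSM after a load: 3 files in L0, L1 and L2 under target.
scores = [
    level0_score(3),                     # 3 of 4 files
    level_score(200 * MB, 256 * MB),     # L1 below its assumed 256MB target
    level_score(1500 * MB, 2560 * MB),   # L2 below its assumed 2560MB target
]

print(scores, needs_compaction(scores))
```

In this example every score is below 1, so nothing gets compacted and the 3 files in level 0 stay there for the rest of a read-only test.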

configuration

The best settings for the MyRocks my.cnf file are also a source of confusion. I almost always enable the concurrent memtable. See the comments for the options allow_concurrent_memtable_write and enable_write_thread_adaptive_yield. I explained the benefits of these options in a previous post. Alas the options are disabled by default and not mentioned in the suggested my.cnf options. They are enabled by adding this to my.cnf:
rocksdb_allow_concurrent_memtable_write=1
rocksdb_enable_write_thread_adaptive_yield=1
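
Because these options are easy to forget, a small check before a benchmark run can help. This is a minimal sketch that parses my.cnf-style text with Python's configparser and reports which of the two options are missing or not set to 1. It assumes the options live in the [mysqld] section, and it will not cope with directives such as !includedir. The sample text and the function name are hypothetical.

```python
import configparser

REQUIRED = (
    "rocksdb_allow_concurrent_memtable_write",
    "rocksdb_enable_write_thread_adaptive_yield",
)

# Hypothetical my.cnf fragment, assuming the options sit under [mysqld].
SAMPLE = """
[mysqld]
rocksdb_allow_concurrent_memtable_write=1
rocksdb_enable_write_thread_adaptive_yield=1
"""


def missing_options(text, section="mysqld"):
    """Return the required options that are absent or not enabled."""
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    parser.read_string(text)
    if not parser.has_section(section):
        return list(REQUIRED)
    # my.cnf accepts dashes or underscores in option names, so normalise.
    opts = {key.replace("-", "_"): value for key, value in parser.items(section)}
    return [name for name in REQUIRED if opts.get(name) not in ("1", "ON", "on")]


print(missing_options(SAMPLE))
```

An empty list means both options are enabled in the text that was checked. It says nothing about what the running server uses.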

When MyRocks arrives in MariaDB Server and Percona Server I wonder whether other users will enable the concurrent memtable too. For read-write workloads it can be a big deal.
