Thursday, February 11, 2016

RocksDB vs the world for loading Linkbench tables

I like RocksDB, MyRocks and MongoRocks because they are IO efficient thanks to excellent compression and low write-amplification. It is a bonus when I get better throughput with the RocksDB family. Here I show that RocksDB has excellent load performance and to avoid benchmarketing the context is loading Linkbench tables when all data fits in cache. This workload should be CPU bound and can be mutex bound. I compare engines for MySQL (RocksDB, TokuDB, InnoDB) and MongoDB (RocksDB, WiredTiger) using the same hardware.


I used Linkbench for MySQL and LinkbenchX for MongoDB. The load test was run with generate_nodes=false to limit the load to the count and link tables/collections. The load was repeated with 1 and 8 users (loaders=1 & maxid1=1M, loaders=8 & maxid1=4M). The test server has 144G RAM, 2 sockets, 12 CPU cores (24 HW-threads) and a 400G Intel s3700 SSD.

The oplog, binlog and sync-on-commit were disabled. I did that to remove bottlenecks that might hide stalls in the database engine.

I tested the following binaries:


This shows the average insert rate during the single-threaded load. The performance summary is:
  • MySQL gets more throughput than MongoDB for the same workload because MongoDB uses more CPU. MongoDB also suffers from using more indexes on the Linkbench tables (the extra indexes are internal) as explained previously. I hope they fix this.
  • For MongoDB, data is loaded faster with WiredTiger than RocksDB because RocksDB uses more CPU.
  • MySQL+RocksDB and MySQL+TokuDB were by far the fastest for single-threaded loads if you ignore uncompressed InnoDB and I ignore it because I must have compression. Compressed InnoDB suffers from the overhead of doing some (de)compression operations in the foreground.
  • Uncompressed InnoDB is about 10% slower in MySQL 5.7 compared to 5.6 for the single-thread load. Low-concurrency performance regressions in MySQL is a known problem.

This shows the value of: (CPU utilization * 1000) / insert_rate. That includes CPU consumed in the foreground by the thread processing the users requests and in the background. The value is inversely correlated with the insert rate. The value is larger for MongoDB than for MySQL which explains why the insert rate is larger for MySQL.


This shows the average insert rate during the load with 8 user threads. The performance summary is:
  • MySQL+RocksDB is a lot faster than everything else except uncompressed InnoDB and I ignore that because I require compression.
  • Compressed InnoDB suffers from mutex contention on the pessimistic code path. See this.
  • TokuDB suffers from mutex contention and is slower here than at 1 thread. See this.
  • Mongo+RocksDB suffers from mutex contention and the difference between it and WiredTiger is larger here than for the single-threaded load.

This is the rate of context switches per insert. The context switch rate was measured by vmstat. Several of the engines suffer from too much mutex contention.

No comments:

Post a Comment