Wednesday, July 8, 2015

Linkbench for MySQL and MongoDB with a large database

This has results for Linkbench where the database is much larger than RAM. I previously shared results for a database that can be cached in RAM.

I used Linkbench for MySQL and LinkbenchX for MongoDB as described in the Configuration section in a previous post. The value of maxid1 was set to 1,000,000,001 in the FBWorkload.properties file. For the cached database test that was set to 20,000,001. The test server has 144G of RAM, 40 hyperthread cores and PCIe flash. The tests were configured to use 10 threads for loading and 20 threads for queries. I disabled the binlog during the load for MySQL and enabled it during the query tests. The oplog was always enabled for MongoDB.
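
For reference, the sketch below shows how the workload size was changed between the cached and uncached tests. It is a minimal sketch assuming the stock Linkbench config layout; the file path and the exact format of the maxid1 line are assumptions, not copied from my setup.

# Sketch: set maxid1 in FBWorkload.properties to the value used for this test.
# Assumes the stock Linkbench config directory and a "maxid1 = <value>" line.
import re
from pathlib import Path

workload = Path("config/FBWorkload.properties")  # path is an assumption
text = workload.read_text()
# cached test: maxid1 = 20,000,001; this test: maxid1 = 1,000,000,001
text = re.sub(r"^maxid1\s*=.*$", "maxid1 = 1000000001", text, flags=re.M)
workload.write_text(text)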

I don't report results for mmapv1 at the full database size because the load would take too long and I might not have enough storage (see the update below for a smaller database). I used WebScaleSQL for MySQL and the v3.0-fb MongoDB partners branch for MongoDB. The following configurations were tested:
  • mysql.rocks.log - MySQL with RocksDB and the binary log enabled
  • mysql.inno.log - MySQL with InnoDB and the binary log enabled
  • mongo.rocks.log - MongoDB with RocksDB and the oplog enabled
  • mongo.wiredtiger.log - MongoDB with WiredTiger and the oplog enabled
  • mongo.mmap.log - MongoDB with mmapv1 and the oplog enabled

Results

The test pattern is to load the data then do 24 1-hour runs of the query test. The results below include:
  • load time - the number of seconds for the load test
  • load rate - the average insert rate during the load test
  • load size - the database size in GB when the load ended
  • 2h qps - the average QPS during the 2nd 1-hour query test
  • 2h size - the database size in GB after the 2nd 1-hour query test
  • 12h qps - the average QPS during the 12th 1-hour query test
  • 12h size - the database size in GB after the 12th 1-hour query test
  • 24h qps - the average QPS during the 24th 1-hour query test
  • 24h size - the database size in GB after the 24th 1-hour query test
load    load    load    2h      2h      12h     12h     24h     24h
time    rate    size    qps     size    qps     size    qps     size    server
43820   99759   707g    22039   748g    25696   805g    25947   831g    mysql.inno.log
24261   180184  368g    32469   372g    30180   383g    29321   394g    mysql.rocks.log
251688  17368   630g    9670    633g    6762    644g    6768    656g    mongo.rocks.log
175740  24874   695g    8533    766g    8019    791g    7718    806g    mongo.wiredtiger.log

Update - I ran a test for mmapv1 with maxid1=200M rather than 1B, so it has about 20% of the data compared to the others. I didn't have enough time or disk space to use the larger value with mmap. The data is below:
load    load    load    2h      2h      12h     12h     24h     24h
time    rate    size    qps     size    qps     size    qps     size    server
276350  3165    553g    3222    555g    3084    568g    2969    573g    mongo.mmap.log

Some conclusions from the results:
  • the load rates are similar to the rates for the cached database, so doing IO when the database is larger than RAM doesn't have a big impact on load performance. But this might depend on fast storage, as these results are for a server with PCIe flash.
  • relative to the query rates for the cached database, RocksDB suffers the least and WiredTiger suffers the most when the database is larger than RAM.
  • the size for RocksDB was similar between MySQL and MongoDB for the cached database. That is not the case here, as MongoRocks uses almost 1.7X more space than MyRocks (see the quick check after this list). I have yet to debug this. For MongoRocks all data is stored in one column family. For MyRocks I used one column family per index, and if nothing else this makes it easy to determine how much space and IO is consumed per index. There are also more optimizations in MyRocks to remove tombstones earlier, but the space difference shows up immediately during the load and there shouldn't be tombstones at that point. MongoDB also uses more indexes than MySQL (9 vs 4) for Linkbench and that might be the reason, but this is still a mystery.
  • I restarted MySQL after the load test to enable the binlog prior to the query tests. InnoDB was configured to use direct IO so I lost the contents of the buffer pool during the restart. Performance suffered for the early query tests until the buffer pool warmed up and that includes the 2h test I reported above. This explains why QPS was better for the 12h and 24h results with InnoDB. Buffered IO is used for MySQL+RocksDB so less data is lost during a restart (block cache must be warmed up but OS filesystem cache is not wiped).
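
The 1.7X space difference mentioned above is simple arithmetic on the sizes in the results table. A quick check in Python, using the GB values reported above:

# Space used by RocksDB under MongoDB vs MySQL, taken from the results table.
sizes_gb = {
    "mysql.rocks.log": {"load": 368, "24h": 394},
    "mongo.rocks.log": {"load": 630, "24h": 656},
}

for point in ("load", "24h"):
    ratio = sizes_gb["mongo.rocks.log"][point] / sizes_gb["mysql.rocks.log"][point]
    print(f"{point}: MongoRocks uses {ratio:.2f}X the space of MyRocks")
# load: MongoRocks uses 1.71X the space of MyRocks
# 24h: MongoRocks uses 1.66X the space of MyRocks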

Comparison with the cached database results

The table below shows the ratio of the rates from this test to the rates for the cached database. The goal is to determine how much performance degrades when the database is larger than RAM. The ratio is expressed as a percentage, and 50% means that the rate for the uncached database was 50% of the rate for the cached database. Note that the load rates are close to 100% for RocksDB, while InnoDB degrades a bit and WiredTiger degrades more. This is expected for a write-optimized algorithm like RocksDB compared to b-trees like InnoDB and WiredTiger.
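
The percentages in the table below were computed as the uncached rate divided by the cached rate. A minimal sketch of that calculation, with hypothetical numbers rather than measurements from this test:

def ratio_pct(uncached_rate: float, cached_rate: float) -> float:
    """Return the uncached rate as a percentage of the cached rate."""
    return 100.0 * uncached_rate / cached_rate

# Hypothetical example: an uncached QPS of 30000 against a cached QPS of
# 60000 is reported as 50%, meaning the rate dropped by half.
print(f"{ratio_pct(30000, 60000):.1f}%")  # prints 50.0%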

The results for 2h for InnoDB are less accurate because mysqld was restarted as described in the previous section. For the 24h results MySQL degrades less than MongoDB and RocksDB degrades less than b-trees (WiredTiger & InnoDB).

load    2h      24h
rate    qps     qps     server
97.6%   67.6%   63.0%   mysql.rocks.log
95.0%   42.9%   52.0%   mysql.inno.log
98.8%   66.9%   50.1%   mongo.rocks.log
88.7%   33.4%   34.8%   mongo.wiredtiger.log


And the same data in a chart...

