Tuesday, January 19, 2016

MyRocks vs InnoDB via Linkbench with a disk array

Previously I evaluated MyRocks and InnoDB for an IO-bound workload using a server with fast storage. Here I evaluate them for an IO-bound workload using a server with a disk array.

MyRocks sustains higher load and query rates than InnoDB on a disk array because it does less random IO on writes, which leaves more random IO for reads. This was the original motivation for the LSM algorithm. MyRocks also does better on SSD because it writes less data to disk per commit, and it compresses data about 2X better than InnoDB, which helps on both disk and SSD by improving the cache hit ratio. While the LSM algorithm was designed for disk arrays, it also works great on SSD thanks to the better compression rate and better write efficiency. The LSM algorithm has aged well.

Configuration


This test used a server with two sockets, 8 cores (16 HW threads) per socket, 40GB of RAM and a disk array with 15 disks and SW RAID 0 using a 2MB RAID stripe.
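The post only states 15 disks, SW RAID 0 and a 2MB stripe, so a setup along these lines is an assumption (device names are hypothetical):

```shell
# Hypothetical: build a 15-disk SW RAID 0 array with a 2MB stripe.
# mdadm takes the chunk (stripe) size in KB by default: 2048 = 2MB.
mdadm --create /dev/md0 --level=0 --raid-devices=15 --chunk=2048 \
    /dev/sd[b-p]
```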

Compared to the previous result, I used maxid1=200M in the Linkbench configuration to create a database about 1/5 the size of the previous test. These tests still used loaders=20 and requesters=20 to get 20 client threads for the load and query steps.

The pattern in which I ran the test changed. In the previous result I did the load and then ran 24 1-hour query steps, all with 20 concurrent clients. In this test I have results for 1, 4, 8, 12, 16, 20, 24, 28 and 32 concurrent clients. For each level of concurrency the data was loaded, 4 1-hour query steps were run, and the results reported here are from the 4th hour. Note that it takes about 24 hours on the SSD server for InnoDB QPS to stabilize as the index becomes fragmented. I did not run the query steps for 24 hours, so the results here might understate InnoDB performance. The results also show the challenge of doing database benchmarks -- you have to get the engine into a steady state.
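The per-concurrency cycle above can be sketched with the standard LinkBench driver; the exact config file names and property edits below are assumptions based on the LinkBench distribution, not something stated in the post:

```shell
# Assumed edits before each run (20-client case shown):
#   workload config:        maxid1 = 200000001
#   LinkConfigMysql.properties: loaders = 20, requesters = 20,
#                               maxtime = 3600   # 1-hour query step
./bin/linkbench -c config/LinkConfigMysql.properties -l   # load step
for hour in 1 2 3 4; do
  ./bin/linkbench -c config/LinkConfigMysql.properties -r # query step
done
```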

Compression


This shows the database size after the load for uncompressed InnoDB (257 GB), compressed InnoDB (168 GB) and MyRocks (85 GB). Note that the server has 40GB of RAM. MyRocks can cache a much larger fraction of the database than InnoDB, but even for MyRocks at least half of the database is not in cache.
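A quick check of the compression ratios implied by those sizes (the numbers come from the text above; this is just arithmetic, not new measurements):

```python
# Database sizes in GB after the load, from the results above.
sizes = {"InnoDB": 257, "InnoDB-compressed": 168, "MyRocks": 85}

# How much smaller each engine is than uncompressed InnoDB.
for engine, gb in sizes.items():
    print(f"{engine}: {sizes['InnoDB'] / gb:.2f}x vs uncompressed InnoDB")

# MyRocks vs compressed InnoDB -- the ~2X claim from the summary.
ratio = sizes["InnoDB-compressed"] / sizes["MyRocks"]
print(f"MyRocks vs InnoDB-compressed: {ratio:.2f}x")
```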

Load


In the previous result the load for uncompressed InnoDB was the fastest, but that server had fast storage, which hides the random IO penalty that InnoDB pays during page writeback. MyRocks has the fastest load here because the disk array is much more limited on random IO and MyRocks does much less random IO on writes. One day I hope to explain why the load rate degrades for MyRocks beyond 8 threads. The data for the graph is here.

Query


MyRocks does much better on QPS as concurrency increases while InnoDB quickly saturates. MyRocks uses less random IO on writes which saves more random IO for reads. It also does better because it keeps a much larger fraction of the database in RAM. The data for the graph is here.

Efficiency


Alas, I don't have efficiency results because there was a bug in my test scripts.
