Small Datum: MyRocks: use less IO on writes to have more IO for reads

Wednesday, November 23, 2016

MyRocks: use less IO on writes to have more IO for reads

Holiday is almost here and I wrote a long blog post on write-efficiency yesterday so this one will be short. A longer version of this is in progress because this is an interesting result for me to explain. We assume that an LSM is less efficient for reads because it is more efficient for writes and it is hard to be optimal for all of read, write & space efficiency.

For real workloads it is complicated and for now I include benchmarks in "real workloads". Here is one interesting result from my IO-bound tests of Linkbench. The summary is that when you spend less on IO to write back changes then you can spend more on IO to handle user queries. That benefit is more apparent on slower storage (disk array) than on faster storage (MLC NAND flash) because slower storage is more likely to be the bottleneck.

IO-bound Linkbench means that I used a server with 50G of RAM and ran Linkbench with maxid1=1B (1B nodes). The MyRocks database was ~400G and the InnoDB database was ~1.6T. Both MyRocks and InnoDB used MySQL 5.6.26. The workload is IO-heavy and the database working set is not cached.

The interesting result is that the difference between MyRocks and InnoDB becomes larger as storage gets slower. Another way to describe this is that InnoDB loses more performance than MyRocks when moving from faster to slower storage. I assume this is because MyRocks uses less IO capacity for writing back database changes so it has more IO capacity for handling user queries.

Transactions per second

MyRocks InnoDB MyRocks/InnoDB

Disk array 2195 414 5.3

Slow SSD 23484 10143 2.3

Fast SSD 28965 21414 1.4

The random operations per second provided by the storage devices above is approximately 1k for the disk array, 10k for the slow SSD and more than 100k for the fast SSD.

8 comments:

PeterNovember 23, 2016 at 3:31 PM
Mark,

I'm not surprised by results at all. It is not very clear how the logical IO distributed over the data stored but with 50GB you have much more cache fit for RocksDB than for Innodb. Even with LSM I would expect data access is not uniform.

The less cache misses we get naturally the more CPU time vs number of IOs you will get and as such the slower the storage the larger is impact.

For example if for System1 you take 10ms of CPU time plus 100 disk IOs per operation and for System2 it is 100ms and 10 disk IO operations than the total time would be a lot difference if you have 10ms for IO or 0.01ms

From what you reported before RocksDB uses more CPU time per operation compared to uncompressed Innodb.
ReplyDelete
Replies
Mark LeithNovember 24, 2016 at 1:30 AM
Continuing to compare InnoDB from 5.6 seems ... a little disingenuous.

You have complained about this from others in the past, now I complain about it to you.. ;)
ReplyDelete
Replies
PeterNovember 24, 2016 at 8:11 AM
This is a good point. I still find plenty of issues with sysbench but it only covers very small piece of functionality and more benchmarks are needed. I would welcome seeing LinkBench results as well but also others.
ReplyDelete
Replies

Add comment

Wednesday, November 23, 2016

MyRocks: use less IO on writes to have more IO for reads

8 comments:

Postgres 18 beta1: small server, IO-bound Insert Benchmark (v2)