Saturday, February 14, 2015

Storage efficiency

Storage efficiency is a big deal. I have learned a lot about read, write and space amplification over the past few years. Tiered storage has been part of my work life via flashcache, but tiered storage for an LSM doesn't require flashcache. This is going to get interesting. We have more choices for SSD at a variety of price points based on write endurance and performance. We have write-optimized database engines (RocksDB, Tokutek, WiredTiger) arriving for OLTP workloads. We can use these with commodity hardware and open-source DBMS solutions like MySQL and MongoDB. A lot of interesting work remains to put the pieces together while matching the quality of service we get from MySQL+InnoDB, and I hope to use RocksDB as part of MySQL and MongoDB.

We are beginning to engage with the community. Yoshinori Matsunobu has a talk at the MySQL UC. I have a talk at the CloudDM workshop at ICDE. I expect more talks and hopefully a few conference papers, and if you are local there are RocksDB meetups.

Two benefits of RocksDB compared to a b-tree are much better compression and much less write amplification. I have observed both via linkbench and real workloads. There are many other ways that an LSM can reduce the IO demand from an application. But I don't want to give away too many details until Yoshi and I have done our talks.
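
For readers new to the terms, read, write and space amplification can be sketched as simple ratios. This is a minimal illustration, not how RocksDB or any other engine reports them: the `IoCounters` class, its field names and the numbers in the usage example are all hypothetical, and read amplification in particular is sometimes counted in storage reads per query rather than bytes.

```python
from dataclasses import dataclass


@dataclass
class IoCounters:
    """Hypothetical counters collected over some measurement interval."""
    user_bytes_written: int     # bytes the application asked to write
    storage_bytes_written: int  # bytes actually written to the device
    user_bytes_read: int        # bytes returned to the application
    storage_bytes_read: int     # bytes actually read from the device
    logical_db_bytes: int       # size of the live data
    physical_db_bytes: int      # size of the database on the device


def write_amplification(c: IoCounters) -> float:
    # Device writes per byte of application writes.
    return c.storage_bytes_written / c.user_bytes_written


def read_amplification(c: IoCounters) -> float:
    # Device reads per byte returned to the application.
    return c.storage_bytes_read / c.user_bytes_read


def space_amplification(c: IoCounters) -> float:
    # Space used on the device per byte of live data.
    return c.physical_db_bytes / c.logical_db_bytes


# Made-up numbers, only to show how the ratios are computed.
example = IoCounters(
    user_bytes_written=100,
    storage_bytes_written=1000,
    user_bytes_read=100,
    storage_bytes_read=400,
    logical_db_bytes=100,
    physical_db_bytes=150,
)
print(write_amplification(example))
print(read_amplification(example))
print(space_amplification(example))
```

Lower is better for all three ratios. Better compression shows up as lower space amplification, and fewer device writes per application write shows up as lower write amplification.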
