- RocksDB merge operator logically does read-modify-write but physically does a Put into the memtable and the read is deferred until user queries or compaction. I hope that MyRocks eventually takes advantage of the merge operator.
- Read free replication in TokuDB and MyRocks.
- Fast Updates in TokuDB do something similar to the merge operator for update and insert
- Non-unique secondary index maintenance is read free in MyRocks
- Deferred secondary index update in Tarantool is a new approach to avoiding reads for write-heavy workloads. It is worth reading. I wish it had been published by VLDB.
- MyRocks also avoids some storage reads when validating unique constraints (for PK and unique secondary indexes) on inserts and some updates because the Get() that it does can use a bloom filter. This benefit is larger when there is a bloom filter on the max level of the LSM tree, but that has other costs is is frequently not done.
- MyRocks uses the SingleDelete feature in RocksDB to allow tombstones to be removed earlier which decreases the CPU overhead for reads after a batch of writes.
- A b-tree can also reduce index maintenance related storage read IO. See the InnoDB change buffer.
Updates
An incomplete list of relevant research:
- DiffIndex - this is more about distributed systems, but still interesting
- Techniques for reducing index maintenance overhead in an LSM by Luo & Carey (VLDB 2019)
- Deferred Lightweight Indexing (DELI)
The algorithm described by Tarantool is the nearly identical as the previously published EDBT paper at https://tristartom.github.io/docs/edbt14.pdf. We also had a paper at VLDB 2019 that uses a similar idea but exploits a primary key index to cleanup secondary indexes more efficiently (http://www.vldb.org/pvldb/vol12/p531-luo.pdf).
ReplyDeleteThanks for interesting links. I only glanced quickly, but isn't the EDBT14 paper a tradeoff: You can choose synchronously updated secondary indexes (requires a read) or you can avoid the read and use eventually consistent secondary indexes?
DeleteDo we need a RUM Conjecture for dist sys?
DeleteSorry that I posted the wrong link for the first paper. The correct one should be Deferred Lightweight Indexing for Log-Structured Key-Value Stores (https://ieeexplore.ieee.org/document/7152467)
ReplyDeleteThank you for the references
Delete