Monday, August 17, 2015

Transactions with RocksDB

RocksDB has work in progress to support transactions with both optimistic and pessimistic concurrency control (CC). The features need more documentation, but we have shared the API, additional code for pessimistic and optimistic, and examples for pessimistic and optimistic. Concurrency control is a complex topic (see these posts) and is becoming popular again in academic research. An awesome PhD thesis on serializable snapshot isolation by Michael Cahill led to an implementation in PostgreSQL.
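To make the difference between the two approaches concrete, here is a toy sketch in Python. It is not the RocksDB API or implementation: `ToyStore`, `OptimisticTxn`, `PessimisticTxn` and `Conflict` are hypothetical names made up for illustration. The optimistic transaction buffers writes and checks for conflicts only at commit, while the pessimistic one takes a per-key lock on every write and fails right away if another transaction holds it (a real engine would more likely wait or time out).

```python
import threading


class Conflict(Exception):
    """Raised when a transaction collides with another writer (hypothetical)."""


class ToyStore:
    """In-memory key-value store shared by the toy transactions below."""

    def __init__(self):
        self.data = {}       # key -> committed value
        self.version = {}    # key -> sequence number of the last committed write
        self.seq = 0         # global commit sequence number
        self.locks = {}      # key -> pessimistic transaction that owns the lock
        self.mutex = threading.Lock()


class OptimisticTxn:
    """Takes no locks on write; validates the write set at commit time."""

    def __init__(self, store):
        self.store = store
        self.start_seq = store.seq   # what had been committed when we began
        self.writes = {}

    def put(self, key, value):
        self.writes[key] = value     # buffered only, invisible to others

    def commit(self):
        with self.store.mutex:
            # Fail if any key we wrote was committed by someone else
            # after this transaction started.
            for key in self.writes:
                if self.store.version.get(key, 0) > self.start_seq:
                    raise Conflict(key)
            self.store.seq += 1
            for key, value in self.writes.items():
                self.store.data[key] = value
                self.store.version[key] = self.store.seq


class PessimisticTxn:
    """Locks each key on write, so conflicts show up before commit."""

    def __init__(self, store):
        self.store = store
        self.writes = {}

    def put(self, key, value):
        with self.store.mutex:
            owner = self.store.locks.setdefault(key, self)
            if owner is not self:
                raise Conflict(key)  # assumption: fail fast instead of waiting
        self.writes[key] = value

    def commit(self):
        with self.store.mutex:
            self.store.seq += 1
            for key, value in self.writes.items():
                self.store.data[key] = value
                self.store.version[key] = self.store.seq
                del self.store.locks[key]   # release the lock at commit
```

With two optimistic transactions writing the same key, the first commit succeeds and the second raises `Conflict` at commit. With two pessimistic transactions, the second one fails at `put`, before doing any more work.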

We intend to use the pessimistic CC code for MyRocks, the RocksDB storage engine for MySQL. We had many discussions about the repeatable read semantics in InnoDB and PostgreSQL and decided on the PostgreSQL style. That is my preference because the gap locking required by InnoDB is more complex.

MongoRocks uses a simpler implementation of optimistic CC today, and a brief discussion on CC semantics for MongoDB is here. AFAIK, write-write conflicts can be raised, but many are caught and retried internally. I think we need more details on when each happens. This is a recent example of confusion about the current behavior.
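The "caught and retried internally" pattern can be sketched as a small retry loop. This is a generic illustration in Python, not code from MongoDB or MongoRocks: `WriteConflict`, `run_with_retry` and `make_flaky` are hypothetical names, and the bounded attempt count is my assumption.

```python
class WriteConflict(Exception):
    """Hypothetical error raised by an engine on a write-write conflict."""


def run_with_retry(txn_body, max_attempts=5):
    """Call txn_body() until it succeeds or attempts run out.

    Returns (result, attempts_used). If every attempt conflicts, the last
    WriteConflict is re-raised so the caller sees it.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_body(), attempt
        except WriteConflict:
            if attempt == max_attempts:
                raise   # give up and surface the conflict


def make_flaky(conflicts_before_success):
    """Build a fake transaction body that conflicts a fixed number of times."""
    remaining = [conflicts_before_success]

    def body():
        if remaining[0] > 0:
            remaining[0] -= 1
            raise WriteConflict("simulated conflict")
        return "committed"

    return body
```

Calling `run_with_retry(make_flaky(2))` hides the two simulated conflicts from the caller, while a body that conflicts more often than `max_attempts` allows ends with `WriteConflict` being raised. Which conflicts fall into which bucket is exactly the detail that I think needs documenting.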

Thanks go to Anthony for doing the hard work on this.

2 comments:

  1. Thanks, do you know when it will be supported in RocksDB Java?

  2. I do not. I use neither but I am more interested in bindings for Go.

