Wednesday, September 14, 2022

Using SPIN to validate the InnoDB read-write lock

I hope to being learning how to use TLA+ and that reminded me of prior work I did with SPIN.

I have written two blog posts (here and here) about my usage of SPIN to validate concurrent code. The most important usage was for the rewrite of the InnoDB read-write lock that my team did at Google. It was important because that model helped to convince the InnoDB team to accept the change. I hope to being learning about TLA+ 

Within Google the project was mostly done by Ben Handy and I helped. This was ~15 years ago so my memory is vague. I remember reading the SPIN book to learn how to write a model for the algorithm. I also remember finding a machine with a lot of RAM and then doing what I could to reduce the memory requirements to run the model. Reducing the amount of concurrency tested was one such change. Finally, I remember enjoying SPIN.

I didn't share the model before but a friend, and InnoDB/TiDB expert, provided me with a copy. That model can now be read here.

2 comments:

  1. I remember that. It was fun to see a model checker being used in the real world. I had used SPIN some 5 to 10 years earlier at the university. On the Innobase/Oracle side, I was not involved with the synchronization primitives. A couple of years ago, I finally bit the bullet and replaced the InnoDB mutexes, "events" (mutex+condition variable), and rw-locks with something simpler.

    For buffer page latches, the simplest that I could come up was two futex-backed std::atomic (the same I used for the simplest rw-locks) and a recursion counter. Recursive write latches are needed for InnoDB BLOB operations, where a "main" mini-transaction retains an exclusive latch on an index leaf page while sub-mini-transactions write the linked list of BLOB pages, one page at a time.

    A nice consequence of this effort was that I was able to shrink the buffer page descriptor to about a half of the size it had in MariaDB 10.2 or MySQL 5.7.

    I suspect that there is something wrong with InnoDB synchronization primitives before MariaDB 10.6. Every now and then, we would see mysterious hangs for older versions than MariaDB 10.6. I am not talking about latches being used in a deadlock-prone fashion (such as MDEV-29883), but about an unexplained hang within a synchronization object.

    ReplyDelete
    Replies
    1. Long ago we had a long existing intermittent hang in InnoDB eventually found & fixed by Yasufumi Kinoshita. But my memory is that the watchdog timer in InnoDB would resolve it.

      Maybe it was for buffer pages, but long ago in the FB patch we had a change to share sync objects, maybe rw-lock. Allocating them per-page started to use a lot of memory. But then the question I have is whether you maintain debug-ability and observability when changing to something like futex.

      Delete

RocksDB on a big server: LRU vs hyperclock, v2

This post show that RocksDB has gotten much faster over time for the read-heavy benchmarks that I use. I recently shared results from a lar...