Monday, July 13, 2015

Summary of Linkbench for MongoDB & MySQL

I published many benchmark reports last week for Linkbench using MongoDB and MySQL with several storage engines: RocksDB, InnoDB, WiredTiger and mmapv1. My goal in running tests like this is to find things we can make better. From these tests the thing to make better is load performance for MongoDB. MySQL did better on both the load and query tests, but the difference on the load tests was much larger. I suspect that much of the problem can be blamed on the need for 9 indexes in MongoDB versus 4 indexes in MySQL to support the Linkbench workload, because more indexes mean more index maintenance during the load. If we get support for clustered primary key indexes into a future MongoDB release then we can reduce this from 9 to 6 indexes. The extra two indexes might be here forever because there is no composite PK index in MongoDB. The results from last week are:

MySQL schema

From the MySQL Setup section in the Linkbench README file there are 4 indexes created across 3 tables:
  • linktable has a primary key index on (link_type, id1, id2) and a secondary index on (id1, link_type, visibility, time, id2, version, data). The secondary index is covering for the most frequent query. While it almost doubles the size of linktable, it also greatly reduces the worst-case random IO required for that query from a few thousand reads to a few. That can be a good tradeoff even when using PCIe flash.
  • counttable has a primary key index on (id, link_type)
  • nodetable has a primary key index on (id)
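To make the covering-index tradeoff concrete, here is a minimal Python sketch (with made-up rows, not real Linkbench data) of how a range scan over the (id1, link_type, visibility, time, id2, version, data) index can answer the most frequent query without reading the base table:

```python
import bisect

# A minimal model of a covering index: entries are whole tuples kept in
# key order, so a range scan over the index alone answers the query.
# Field order matches the linktable secondary index described above:
# (id1, link_type, visibility, time, id2, version, data)
index = sorted([
    (1, 2, 1, 100, 7, 0, b'a'),
    (1, 2, 1, 120, 8, 0, b'b'),
    (1, 2, 0, 130, 9, 0, b'c'),
    (1, 3, 1,  90, 5, 0, b'd'),
    (2, 2, 1,  80, 6, 0, b'e'),
])

def get_link_list(id1, link_type, limit=10):
    # Range scan on the (id1, link_type) prefix: a shorter tuple sorts
    # before any longer tuple with the same prefix, so these two bounds
    # cover exactly the matching entries.
    lo = bisect.bisect_left(index, (id1, link_type))
    hi = bisect.bisect_left(index, (id1, link_type + 1))
    rows = [r for r in index[lo:hi] if r[2] == 1]  # keep visible links
    rows.sort(key=lambda r: -r[3])                 # newest first
    return rows[:limit]
```

Without the covering index, each matching link would need a separate base-table fetch, which is where the few thousand random reads come from.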

Why 5 extra indexes?

Why does MongoDB need 5 more indexes than MySQL? From the MongoDB driver in LinkbenchX there are 3 indexes created by the user and then 3 implicit primary key indexes on _id:
  • the link collection has a primary key index on _id, a unique secondary index on (link_type, id1, id2) and a non-unique secondary index on (id1, link_type, visibility, time, id2, version, data)
  • the count collection has a primary key index on _id and a unique secondary index on (id, link_type)
  • the node collection has a primary key index on _id
I have described 6 indexes but claimed that 9 were used. The other 3 are used by WiredTiger and RocksDB, but not by mmapv1, and are indexes on DiskLoc (key is DiskLoc, value is the document), one per collection. While WiredTiger and RocksDB support a clustered primary key, that is not supported (yet) in MongoDB because the storage engine API is still evolving from what was needed for mmapv1. With mmapv1 the base table is heap organized and the value in a primary key index is a DiskLoc, which I assume is a filename and offset where the document resides. WiredTiger and RocksDB are copy-on-write, so the location of a document will change one or more times after an update. They can't use DiskLoc to point to the one place on disk where the document exists, thus they need the extra index. SERVER-14569 and SERVER-15354 are open for this.
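The index arithmetic above can be sketched as follows; the collection and engine names are just labels for the counts described in this post:

```python
# Tally the Linkbench index counts described above.
user_visible = {
    'link':  ['_id', '(link_type, id1, id2)', '(id1, link_type, ...)'],
    'count': ['_id', '(id, link_type)'],
    'node':  ['_id'],
}

def total_indexes(engine):
    n = sum(len(ix) for ix in user_visible.values())  # the 6 visible indexes
    if engine in ('wiredtiger', 'rocksdb'):
        # plus one hidden DiskLoc-keyed index per collection, because these
        # engines cannot (yet) cluster the collection on the primary key
        n += len(user_visible)
    return n
```

So mmapv1 maintains 6 indexes for Linkbench while WiredTiger and RocksDB maintain 9.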

If we get proper support for a clustered primary key index, then MongoDB would use 6 indexes for Linkbench versus 4 for MySQL. How do we reduce that from 6 to 4? The unique secondary indexes for the link and count collections are not needed in MySQL because it supports a composite PK, and MongoDB does not. One workaround is to concatenate the fields (link_type | id1 | id2 -> _id) for the link collection and (id | link_type -> _id) for the count collection. I don't recommend that because it must be done by the client, and done correctly, on every insert and update. It is even more work to make the concatenated value order preserving, which is easy in code but makes it harder to construct the _id value when manually querying the database. This also means that some fields will be stored twice: once in the _id field and once as a separate field.
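For illustration only (again, I don't recommend this workaround), here is a minimal sketch of an order-preserving concatenation, assuming non-negative integer keys of bounded width:

```python
def encode_link_id(link_type, id1, id2, width=20):
    # Zero-pad each field to a fixed width so the lexicographic order of
    # the resulting string matches the numeric order of the
    # (link_type, id1, id2) tuple. Assumes non-negative integers smaller
    # than 10**width; width and the ':' separator are arbitrary choices.
    return f"{link_type:0{width}d}:{id1:0{width}d}:{id2:0{width}d}"
```

The fixed width is what makes the encoding order preserving; without it, "10" would sort before "9". It also shows why manual queries get harder: you must reproduce the exact padding to construct an _id by hand.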
