Wednesday, November 18, 2015

How does MongoDB do on Linkbench with concurrency?

I started to study performance math with a focus on queueing theory and the USL (Universal Scalability Law). To get data for the USL I ran Linkbench for MongoDB at different levels of concurrency. I tested WiredTiger with MongoDB 3.2.0rc0 and 3.0.7, and RocksDB with MongoDB 3.2.0. The performance summary is:
  • WiredTiger load scales much better in 3.2 compared to 3.0
  • RocksDB throughput is stable at high concurrency
  • WiredTiger 3.2 throughput is stable at high concurrency. For WiredTiger 3.0 the load rate drops significantly at high concurrency and the query rate also has an odd drop.
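For reference, the USL models throughput at N concurrent clients as X(N) = λN / (1 + σ(N-1) + κN(N-1)), where σ is the contention penalty and κ is the coherency penalty. Below is a minimal sketch of fitting that curve to per-concurrency throughput; the numbers in it are placeholders, not results from these tests.

# Minimal sketch: fit the Universal Scalability Law (USL) to throughput
# measured at different client counts. The data below is a placeholder;
# substitute the ops/sec reported by Linkbench at each concurrency level.
import numpy as np
from scipy.optimize import curve_fit

def usl(n, lam, sigma, kappa):
    # lam   = throughput of one client (ops/sec)
    # sigma = contention (serial fraction)
    # kappa = coherency (crosstalk) penalty
    return lam * n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1))

# Placeholder measurements: (clients, ops/sec)
clients = np.array([1, 2, 4, 8, 16, 24, 32, 48], dtype=float)
ops = np.array([1000, 1950, 3700, 6800, 11000, 13500, 14000, 13800], dtype=float)

(lam, sigma, kappa), _ = curve_fit(usl, clients, ops,
                                   p0=[ops[0], 0.01, 0.001],
                                   bounds=(0, np.inf))
print(f"lambda={lam:.1f} ops/sec, sigma={sigma:.4f}, kappa={kappa:.6f}")

# Peak concurrency predicted by the USL: N* = sqrt((1 - sigma) / kappa)
print("predicted peak at N =", np.sqrt((1 - sigma) / kappa))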

Configuration


The test server has 2 sockets with 6 CPU cores per socket and hyperthreading enabled, for 24 HW threads. It also has 6 SAS disks behind HW RAID 0 and 1 SSD. I intended to use the disk array for all tests but ended up using the SSD for the WiredTiger 3.0.7 test. The server has 144G of RAM and the test database was cached by mongod for all tests. The oplog was enabled but sync-on-commit was not done. For Linkbench I set maxid1 to 10M, and the test pattern was load, then two 30-minute query runs; the query rate reported below is from the second 30-minute run. Snappy compression was used for all tests. The test was run for 1 to 48 concurrent clients with MongoDB 3.2.0 and stopped at 30 concurrent clients for MongoDB 3.0.
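To make the test pattern concrete, here is a sketch of the driver loop for such a sweep. The linkbench binary path, the config file name, and the use of -D property overrides are assumptions about the setup rather than the exact commands used for these tests.

# Sketch of a concurrency sweep: load once per client count, then run two
# 30-minute query phases and keep the result from the second one. Paths,
# the config file name, and the -D overrides are assumptions about the setup.
import subprocess

LINKBENCH = "./bin/linkbench"                   # hypothetical install location
CONFIG = "config/LinkConfigMongoDb.properties"  # hypothetical config file
MAXID1 = 10_000_000

def run(args):
    print("running:", " ".join(args))
    subprocess.run(args, check=True)

for clients in [1, 2, 4, 8, 12, 16, 20, 24, 30, 36, 48]:
    # Load phase with one loader thread per client (database is recreated
    # between iterations, which this sketch does not show).
    run([LINKBENCH, "-c", CONFIG, "-l",
         f"-Dmaxid1={MAXID1}", f"-Dloaders={clients}"])
    # Two 30-minute (1800 s) query phases; the second is the measured one.
    for phase in ("warmup", "measured"):
        print(f"{phase} query phase with {clients} clients")
        run([LINKBENCH, "-c", CONFIG, "-r",
             f"-Dmaxid1={MAXID1}", f"-Drequesters={clients}",
             "-Dmaxtime=1800"])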

Data


Results are on gist.github.com for the load and query tests.

Load

WiredTiger throughput for write-heavy workloads is much better in 3.2 than in 3.0, as I previously reported. From PMP output it looks like WiredTiger in MongoDB 3.0 suffers from too-frequent plan-cache invalidation, which I wrote about previously. I think RocksDB insert performance saturates earlier because of contention on the memtable writer mutex.
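PMP here is the poor man's profiler: take a few rounds of full-thread stack traces from mongod with gdb and count how many threads share the same stack prefix. A minimal sketch of that aggregation is below; the mongod pid, the number of samples, and the grouping depth are arbitrary choices, not the exact script used here.

# Sketch of poor-man's-profiler style aggregation: take full-thread stack
# traces from a running mongod with gdb, then count identical stack
# prefixes to see where most threads are waiting (e.g. on a mutex).
# The parsing is rough and the grouping depth is an arbitrary choice.
import subprocess
from collections import Counter

def sample_stacks(pid, frames=6):
    out = subprocess.run(
        ["gdb", "-batch", "-ex", "thread apply all bt", "-p", str(pid)],
        capture_output=True, text=True).stdout
    stacks, current = [], []
    for line in out.splitlines():
        if line.startswith("Thread "):            # start of a new thread section
            if current:
                stacks.append(tuple(current[:frames]))
            current = []
        elif line.startswith("#"):                # a stack frame
            # keep roughly the function name, drop addresses and arguments
            current.append(line.split(" in ")[-1].split("(")[0].strip())
    if current:
        stacks.append(tuple(current[:frames]))
    return stacks

counts = Counter()
for _ in range(5):                                # a few rounds of sampling
    counts.update(sample_stacks(pid=12345))       # hypothetical mongod pid
for stack, n in counts.most_common(10):
    print(n, " <- ".join(stack))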

Query


WiredTiger performance is the same for 3.0 and 3.2 until 21 concurrent clients. At that point throughput for 3.0 drops and response time for all operations, reads and writes, gets slower. This is odd and I won't try to debug it. For RocksDB I have an educated guess at why its throughput is worse than WiredTiger's: MongoRocks needs to use the SingleDelete optimization to reduce overhead from tombstones.
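For background on that guess: a regular RocksDB Delete writes a tombstone that must survive compaction until the bottommost level, because older versions of the key may still exist on lower levels. SingleDelete is meant for keys that are written once and then deleted, and its tombstone can be dropped as soon as it meets the matching Put during compaction. The toy model below is my own simplification, not RocksDB code, to show why that leaves fewer live tombstones for queries to skip over.

# Toy model (not RocksDB code) of why SingleDelete leaves fewer live
# tombstones than Delete. Entries are (key, seqno, op). With Delete, the
# tombstone must be kept unless this compaction reaches the bottommost
# level; with SingleDelete, tombstone + matching Put annihilate as soon
# as they meet.

def compact(entries, bottommost=False):
    """Merge entries for one key range; newest seqno first per key."""
    out = []
    by_key = {}
    for key, seqno, op in sorted(entries, key=lambda e: (e[0], -e[1])):
        by_key.setdefault(key, []).append(op)
    for key, ops in by_key.items():
        newest = ops[0]
        if newest == "SD" and len(ops) > 1 and ops[1] == "PUT":
            continue                      # SingleDelete meets its Put: both dropped
        if newest == "DEL":
            if not bottommost:
                out.append((key, "DEL"))  # must keep tombstone for lower levels
            continue
        out.append((key, newest))
    return out

# One key deleted with Delete, one with SingleDelete, both written once.
entries = [("a", 2, "DEL"), ("a", 1, "PUT"),
           ("b", 2, "SD"),  ("b", 1, "PUT")]
print(compact(entries, bottommost=False))  # -> [('a', 'DEL')]  tombstone survives
print(compact(entries, bottommost=True))   # -> []              both gone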


2 comments:

  1. I wonder if the insert performance was a regression in 3.0 as compared to 2.6 and before? We are extremely excited by the scalability with concurrency in 3.2, so much so that I think the official release will see our fastest push-to-production cycle.

    And I suppose by 2.6 I mean mmapv1 as very few people were running 2.6 WT

    Replies
    1. To look for a regression from 2.6 to 3.0 I need to use the mmap engine. My lack of interest in it, combined with not having enough test hardware, means that isn't likely to get done. In this case the test database is cached, so mmap performance is probably not awful. Most of my tests are run with a database larger than RAM; mmap performance is lousy there and the tests take too much time.

