Monday, June 29, 2015

Examining performance for MongoDB and the insert benchmark

My previous post has results for the insert benchmark when the database fits in RAM. In this post I look at MongoDB performance as the database gets larger than RAM. I ran these tests while preparing for a talk and going on vacation, so I am vague on some of the configuration details. The summary is that RocksDB does much better than mmapv1 and the WiredTiger B-Tree when the database is larger than RAM because it is more IO efficient. RocksDB doesn't read index pages during non-unique secondary index maintenance. It also does fewer but larger writes rather than the many smaller, random writes required by a B-Tree. This matters more for servers that use disks rather than SSDs, because random IO is more expensive there.

Average performance

I present performance results using a variety of metrics. The first is average throughput during the test. RocksDB is much faster than WiredTiger and WiredTiger is much faster than mmapv1. But you should be careful about benchmark reports that only include the average. Read on to learn more.

Cumulative average

This displays the cumulative average. That is the average from test start to the current point in time. At test end the value is the same as the average performance. This metric is a bit more useful than the average performance because it can show some variance. In the result below RocksDB quickly reaches a steady rate while WiredTiger and mmapv1 degrade over time as the database gets larger than RAM. However this can still hide intermittent variance.
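
As a rough illustration of how the cumulative average is derived, here is a minimal Python sketch. It assumes the benchmark client reports the number of inserts completed in each fixed-length interval; the function name, the 10-second default and the sample counts are all hypothetical and are not taken from the actual test.

```python
from itertools import accumulate


def cumulative_average(inserts_per_interval, interval_secs=10):
    """Return inserts/second averaged from test start to the end of each interval.

    inserts_per_interval: inserts completed in each interval (hypothetical input).
    interval_secs: length of each reporting interval in seconds (assumed fixed).
    """
    running_totals = accumulate(inserts_per_interval)
    return [
        total / (interval_secs * (i + 1))
        for i, total in enumerate(running_totals)
    ]


# Made-up counts: the rate drops once the database no longer fits in RAM.
samples = [1000, 1000, 400, 200]
rates = cumulative_average(samples)

# The last cumulative value equals the overall average for the test.
overall = sum(samples) / (10 * len(samples))
```

Because each point averages over everything that came before it, a short stall late in a long test barely moves the line, which is why this metric can still hide intermittent variance.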

Variance

This displays throughput per 10-second interval for a subset of the test. mmapv1 has the least variance while WiredTiger and RocksDB have much more. The variance is a problem and was not visible in previous graphs.
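
To make the point about hidden variance concrete, the sketch below compares two made-up series of per-interval insert counts that share the same average rate. The helper name and the sample data are hypothetical, and the coefficient of variation is just one possible way to summarize variance; it is not necessarily what was used to produce the graphs.

```python
from statistics import mean, pstdev


def interval_stats(inserts_per_interval, interval_secs=10):
    """Summarize per-interval throughput (inserts/second) for one test run."""
    rates = [n / interval_secs for n in inserts_per_interval]
    avg = mean(rates)
    return {
        "mean": avg,
        "min": min(rates),
        "max": max(rates),
        # Coefficient of variation: population standard deviation / mean.
        "cv": pstdev(rates) / avg if avg else 0.0,
    }


# Two hypothetical engines with the same average rate.
steady = interval_stats([500, 500, 500, 500])
bursty = interval_stats([900, 100, 900, 100])

print(steady)
print(bursty)
```

Both runs average the same number of inserts per second, so a report that only includes the average cannot tell them apart. The minimum, maximum and coefficient of variation show the stalls.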

Variance, part 2

The final two graphs show the per-interval throughput for all engines and then only for WiredTiger and RocksDB. The second graph was added to avoid compressing the results for RocksDB to the left hand side of the graph. The lines for RocksDB and WiredTiger are very wide because of the large variance in throughput.


2 comments:

  1. Great post.

    Hi Mark, I have read most of your posts on Small Datum and learned a lot.

    I tested MongoDB 3.4 with the RocksDB and WiredTiger engines using sysbench with a single thread and found that WiredTiger is 20% faster than RocksDB. The result is reproducible.

    1. I am sure that WT is faster than MongoRocks for some workloads. But I have a few suggestions:
      1) I take a lot of time to explain the workloads that I use. You have provided almost no information about the workload that you used.
      2) There are many dimensions by which something can be better. These include database size, write efficiency and response time. When you write that WT is faster I assume you are describing response time, but you make no mention of space and write efficiency. Space efficiency determines how much SSD you need to buy to store your database. Write efficiency determines how long that SSD will last. While WT frequently wins on response time, MongoRocks usually wins on space and write efficiency.

      Regardless, I like both WT and MongoRocks.

