Here I report results for much lower-end hardware. I have two Intel NUC systems, each with 8GB of RAM, 1 disk and 1 SSD. They are small, quiet and were easy to set up. There is an active NUC community hosted by Intel with useful answers to many questions. I managed to get the systems running without asking Domas for help. That is rare.
I ran LinkbenchX as described in the previous post with a few minor changes because this hardware is smaller. First, I used 4 load threads and 4 request threads compared to 10 load threads and 20 request threads on the larger hardware (loaders=4 and requesters=4 in LinkConfigMongoDBV2.properties). Then I ran tests for smaller databases, using maxid1=2M for the cached database and maxid1=20M for the uncached database (set in FBWorkload.properties). Finally, I added one option to storage.rocksdb.configString to put SST index and bloom filter data in the RocksDB block cache so that it is subject to the block cache limit. While this can hurt performance, it also gets RocksDB to respect storage.rocksdb.cacheSizeGB. Without this option the SST index and filter data is always in memory while an SST file is open, and as the database grows this can use a lot of memory, especially on a server with 8GB of RAM. Without this option the memory consumed for index and filter data also looks like a memory leak in RocksDB until you realize what is going on (yes, I have wasted too much time on this problem). The extra option is:
block_based_table_factory={cache_index_and_filter_blocks=1}
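For reference, this is roughly where those settings live. Treat it as a sketch based on the option names above rather than a copy of my files, and note that the mongod.conf fragment assumes the YAML form of the MongoRocks options:

# LinkConfigMongoDBV2.properties
loaders = 4
requesters = 4

# FBWorkload.properties
maxid1 = 2000000   # 2M for the cached test, 20M for the uncached test

# mongod.conf (MongoRocks)
storage:
  rocksdb:
    configString: "block_based_table_factory={cache_index_and_filter_blocks=1}"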
Results
The tests here were run with the database on the single disk. The oplog was enabled for the test but sync-on-commit was disabled. Tests were done with maxid1=2M for the cached database and maxid1=20M for the uncached database. Unfortunately, the database was never cached for mmapv1 because it uses so much more space for the same data compared to WiredTiger and RocksDB.
The results below include:
- load time - the number of seconds for the load test
- load rate - the average insert rate during the load test
- load size - the database size in GB when the load ended
- 2h qps - the average QPS during the 2nd 1-hour query test
- 2h size - the database size in GB after the 2nd 1-hour query test
- 12h qps - the average QPS during the 12th 1-hour query test
- 12h size - the database size in GB after the 12th 1-hour query test
The QPS for RocksDB is a lot better than for WiredTiger in the cached database test. After looking at the iostat data from the test I see that WiredTiger didn't keep the database cached for the 12h result below. The WiredTiger database was 5G, the test server has 8GB of RAM and the WiredTiger block cache gets 4G of that. Assuming the database compresses by 2X, the uncompressed database is 10G, the 4G block cache can store 40% of it and the OS filesystem cache gets at most 4G. From vmstat data I see that the memory.cache column grows to ~4G.
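The arithmetic above is easy to script if you want to try other cache sizes. This is only a sketch of the estimate; the 2X compression ratio and the 50%-of-RAM block cache are the assumptions from the previous paragraph:

# back-of-envelope cache coverage for WiredTiger on the 8GB NUC (sketch)
ram_gb = 8
compressed_db_gb = 5            # database size after the load
compression_ratio = 2           # assumed
block_cache_gb = 0.5 * ram_gb   # default block cache, holds uncompressed pages

uncompressed_db_gb = compressed_db_gb * compression_ratio    # ~10G
block_cache_coverage = block_cache_gb / uncompressed_db_gb   # ~40%
fs_cache_gb = ram_gb - block_cache_gb                        # at most ~4G for compressed pages

print("block cache covers %.0f%% of the uncompressed database" % (100 * block_cache_coverage))
print("filesystem cache gets at most %.0fG for the %dG compressed database" % (fs_cache_gb, compressed_db_gb))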
Sizing a cache for InnoDB with direct IO is easy: give it as much memory as possible and hope that the background tasks sharing the hardware don't use too much memory. But then InnoDB added compression and we had a new problem, figuring out how to share the InnoDB buffer pool between compressed and uncompressed pages. There is some clever code in InnoDB that tries to figure this out based on whether a workload is CPU-bound or IO-bound. We have the same problem with WiredTiger and RocksDB. Because they use buffered IO, the OS filesystem cache is the cache for compressed pages and the WiredTiger/RocksDB block cache is the cache for uncompressed pages. Neither WiredTiger nor RocksDB has code yet to dynamically adjust the amount of memory used for compressed versus uncompressed pages, but I am certain that it is easier to dynamically resize the block cache in them than in InnoDB.
For now RocksDB and WiredTiger default to using 50% of system RAM for the block cache. I suspect that in many cases, like when the database is larger than RAM, it is better to use much less than 50% of system RAM for the block cache. I will save my hand-waving math for another post and will leave myself a note to repeat the tests below with the cache set to use 20% of RAM.
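On this hardware that would be a block cache of about 1.6G instead of 4G. A sketch of how the smaller cache might be set in mongod.conf follows; the option is storage.wiredTiger.engineConfig.cacheSizeGB for WiredTiger and storage.rocksdb.cacheSizeGB for MongoRocks, and older versions may only accept integer values:

# mongod.conf, WiredTiger
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 1.6   # ~20% of 8GB RAM; round up if fractional values are rejected

# mongod.conf, MongoRocks
storage:
  rocksdb:
    cacheSizeGB: 1.6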
uncached database, maxid1=20M, loaders=4, requesters=4

load time  load rate  load size  2h qps  2h size  12h qps  12h size  server
    16252       5421        14g     113      14g      129       14g  mongo.rocks.log
    12037       7319        15g     105      16g       97       17g  mongo.wt.log
    19062       4494        69g      50      68g       46       68g  mongo.mmap.log
cached database, maxid1=2M, loaders=4, requesters=4

load time  load rate  load size  2h qps  2h size  12h qps  12h size  server
     1629       5886       2.2g    3405     3.0g     3147      3.9g  mongo.rocks.log
    12774       7530       2.5g    2966     4.1g     1996      5.0g  mongo.wt.log
     2058       4659        14g    1491      14g      627       18g  mongo.mmap.log
Hardware
The Intel NUC systems have:
- NUC5i3RYH - Intel Core i3 with 4 hyperthread cores
- 8GB RAM
- 1TB disk - HGST Travelstar, 7200 RPM
- 120GB SSD - Samsung 850 EVO