Results for cached database
The test client is LinkbenchX and the configuration is described in a previous post. The test used 10 threads for loading and 20 threads for queries. After the load there were 12 1-hour runs of the query test, and results are reported for the 2nd and 12th hours (plus the 24th and 48th hours for the configurations that ran longer).
The results below are for the test with maxid1=20M set in FBWorkload.properties, for both servers (the disk array and PCIe flash servers described in a previous post). The load rate is similar between the disk array and PCIe flash servers. The query rate is better for the server with PCIe flash, but that might be due more to its newer CPUs and extra cores than to storage performance. The load and query rates are better for WiredTiger than for RocksDB on both servers.
The server names are explained in a previous post. The oplog was enabled for all tests.
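For reference, the client settings implied by the description above look roughly like the sketch below. The thread counts and maxid1 match this post; the maxtime and requests values are placeholders, and while maxid1, loaders, requesters, requests and maxtime are standard LinkBench/LinkbenchX property names, treat this as a sketch rather than the exact files used for these tests.

# FBWorkload.properties (sketch)
# about 20M ids for the cached test; about 1B for the uncached test below
maxid1 = 20000001

# client config .properties (sketch)
# 10 load threads, 20 query threads, 1-hour query runs ended by maxtime
loaders = 10
requesters = 20
maxtime = 3600
requests = 100000000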
In the tables below, load time is in seconds, size is the database size and qps is the average query rate for that hour of the query test.

--- results for disk array

load   load   load  2h     2h    12h    12h   24h    24h
time   rate   size  qps    size  qps    size  qps    size  server
5333   16519  14g   9649   17g   9689   22g                mongo.rocks.log
3015   29223  16g   17715  30g   15654  40g                mongo.wt.log
35253  2499   65g   4918   68g   4642   78g                mongo.mmap.log

--- results for PCIe flash

load   load   load  2h     2h    12h    12h   24h    24h
time   rate   size  qps    size  qps    size  qps    size  server
5015   17565  14g   14442  17g   13925  24g   13506  29g   mongo.rocks.log
3601   28020  16g   25488  34g   22992  45g                mongo.wt.log
Results for uncached database
The results below are for the test with maxid1=1B for both servers (the disk array and PCIe flash servers from a previous post). The load rate is similar between disk and flash, and is also similar to the rates above for the cached database. Multiple secondary indexes are maintained during the load, but IO latency does not have a significant impact on the load rate, even for WiredTiger, which does more random IO than RocksDB.
The query rates are significantly lower for the disk array than for PCIe flash, so IO latency matters much more for queries than it does for the load. However, RocksDB does better than WiredTiger on the disk array, possibly because it uses less random IO for writes, which leaves more random IO capacity to serve reads.
Update - I repeated the tests for the disk-array server with different configurations and the results are better. For WiredTiger I set storage.syncPeriodSecs=600, which changes the checkpoint interval from the default of 60 seconds to 600 seconds. The benefit should be fewer disk writes, and QPS improved by more than 30% with that change. For RocksDB I used the default configuration and QPS improved by more than 20% compared to the non-default configuration I had been using (Igor did a good job choosing the defaults). For all engines I used a smaller block cache, 32G rather than 70G, to leave more space for compressed blocks in the OS filesystem cache. Results for all engines improved with the smaller block cache.
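For reference, the non-default settings described in the update look roughly like the mongod config sketches below. storage.syncPeriodSecs and storage.wiredTiger.engineConfig.cacheSizeGB are standard MongoDB options; storage.rocksdb.cacheSizeGB is the cache option I believe MongoRocks builds expose, so confirm the name against your build. These are sketches of the idea, not the exact files used for these tests.

# mongod.conf sketch for the WiredTiger run
storage:
  engine: wiredTiger
  syncPeriodSecs: 600          # checkpoint every 600 seconds instead of the default 60
  wiredTiger:
    engineConfig:
      cacheSizeGB: 32          # smaller block cache leaves more RAM for the OS filesystem cache

# mongod.conf sketch for the RocksDB run (option name assumed from MongoRocks)
storage:
  engine: rocksdb
  rocksdb:
    cacheSizeGB: 32            # smaller block cache, same reasoning as above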
--- results for disk array

load    load   load  2h    2h    12h   12h   24h   24h   48h   48h
time    rate   size  qps   size  qps   size  qps   size  qps   size  server
298901  14625  606g  580   585g  611   588g  604   590g  600   596g  mongo.tokumx.log, 70G
297159  14711  597g  731   585g  786   588g  782   592g  736   598g  mongo.tokumx.log, 32G
178923  24432  694g  343   704g  333   728g                          mongo.wt.log, default, 70G
176432  24777  696g  449   709g  434   738g  423   749g  418   757g  mongo.wt.log, non-default, 32G
271569  16097  631g  448   631g  477   632g  452   633g  471   635g  mongo.rocks.log, non-default, 70G
274780  15909  628g  458   628g  592   629g  574   631g  569   633g  mongo.rocks.log, default, 32G
--- results for PCIe flash

load    load   load  2h    2h    12h   12h   24h   24h   48h   48h
time    rate   size  qps   size  qps   size  qps   size  qps   size  server
251688  17368  630g  9670  633g  6762  644g  6768  656g              mongo.rocks.log
175740  24874  695g  8533  766g  8019  791g  7718  806g              mongo.wt.log
The table below has the mean response time in milliseconds for each of the query types in Linkbench. The first pair of columns is from the 12th 1-hour run on the disk array; the second and third pairs are from the 2nd and 12th 1-hour runs on PCIe flash. The most frequent operation is GET_LINKS_LIST, followed by MULTIGET_LINK. On the disk array the response time for these two operations is better for RocksDB, which explains why it gets more QPS than WiredTiger. For PCIe flash the response time for GET_LINKS_LIST is lower for WiredTiger, which explains its better QPS. The response time for all of the write operations is better for RocksDB than for WiredTiger on both disk and flash, but those operations are less frequent. WiredTiger does more reads from disk and flash during writes because b-tree leaf pages must be read before they can be written.
The QPS for RocksDB is higher than for WiredTiger during the 2nd 1-hour run on the PCIe flash server but lower by the 12th 1-hour run. The mean response time for the GET_LINKS_LIST operation almost doubles between those runs, and the cause might be the range-read penalty of an LSM.
                12th 1-hour run     2nd 1-hour run      12th 1-hour run
                disk      disk      flash     flash     flash     flash
                wired     rocks     wired     rocks     wired     rocks
ADD_NODE        0.717     0.444     0.327     0.300     0.324     0.276
UPDATE_NODE     23.0      21.5      1.217     1.083     1.240     0.995
DELETE_NODE     22.9      21.6      1.260     0.675     1.285     1.018
GET_NODE        22.7      20.9      0.913     2.355     0.941     0.622
ADD_LINK        47.6      23.5      2.988     1.610     3.142     2.255
DELETE_LINK     31.9      21.6      2.063     1.610     2.407     1.701
UPDATE_LINK     51.1      25.8      3.238     2.507     3.407     2.395
COUNT_LINK      16.7      10.9      0.686     0.571     0.739     0.547
MULTIGET_LINK   22.6      18.9      1.599     1.195     1.603     1.136
GET_LINKS_LIST  35.6      27.6      1.910     2.181     2.056     3.945
And the data below is example output from the end of one test: RocksDB on flash for the 12th 1-hour run.
ADD_NODE count = 627235 p25 = [0.2,0.3]ms p50 = [0.2,0.3]ms p75 = [0.2,0.3]ms p95 = [0.3,0.4]ms p99 = [1,2]ms max = 263.83ms mean = 0.276ms
UPDATE_NODE count = 1793589 p25 = [0.7,0.8]ms p50 = [0.8,0.9]ms p75 = [0.9,1]ms p95 = [1,2]ms p99 = [4,5]ms max = 301.453ms mean = 0.995ms
DELETE_NODE count = 246225 p25 = [0.7,0.8]ms p50 = [0.8,0.9]ms p75 = [0.9,1]ms p95 = [1,2]ms p99 = [4,5]ms max = 265.012ms mean = 1.018ms
GET_NODE count = 3150740 p25 = [0.4,0.5]ms p50 = [0.5,0.6]ms p75 = [0.5,0.6]ms p95 = [0.9,1]ms p99 = [3,4]ms max = 301.078ms mean = 0.622ms
ADD_LINK count = 2189319 p25 = [1,2]ms p50 = [2,3]ms p75 = [2,3]ms p95 = [3,4]ms p99 = [7,8]ms max = 317.292ms mean = 2.255ms
DELETE_LINK count = 727942 p25 = [0.4,0.5]ms p50 = [0.6,0.7]ms p75 = [2,3]ms p95 = [3,4]ms p99 = [6,7]ms max = 320.13ms mean = 1.701ms
UPDATE_LINK count = 1949970 p25 = [1,2]ms p50 = [2,3]ms p75 = [2,3]ms p95 = [3,4]ms p99 = [7,8]ms max = 393.483ms mean = 2.395ms
COUNT_LINK count = 1190142 p25 = [0.3,0.4]ms p50 = [0.4,0.5]ms p75 = [0.5,0.6]ms p95 = [0.8,0.9]ms p99 = [2,3]ms max = 296.65ms mean = 0.547ms
MULTIGET_LINK count = 127871 p25 = [0.7,0.8]ms p50 = [0.9,1]ms p75 = [1,2]ms p95 = [1,2]ms p99 = [4,5]ms max = 272.272ms mean = 1.136ms
GET_LINKS_LIST count = 12353781 p25 = [0.5,0.6]ms p50 = [1,2]ms p75 = [1,2]ms p95 = [3,4]ms p99 = [38,39]ms max = 2360.432ms mean = 3.945ms
REQUEST PHASE COMPLETED. 24356814 requests done in 3601 seconds. Requests/second = 6762
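The per-operation lines above are easy to post-process. Below is a minimal Python sketch, assuming the line format shown in this output (operation name, count, [low,high] millisecond buckets for the percentiles, and scalar max/mean); adjust the regexes if your LinkBench build prints something different.

import re

# Parse per-operation stat lines from the end of a LinkBench request phase.
LINE_RE = re.compile(r"^(?P<op>[A-Z_]+)\s+count\s*=\s*(?P<count>\d+)\s+(?P<rest>.*)$")
PCTL_RE = re.compile(r"(p\d+)\s*=\s*\[([0-9.]+),([0-9.]+)\]ms")
SCALAR_RE = re.compile(r"(max|mean)\s*=\s*([0-9.]+)ms")

def parse_stats(lines):
    stats = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        entry = {"count": int(m.group("count"))}
        # percentiles are reported as [low,high] millisecond buckets
        for name, lo, hi in PCTL_RE.findall(m.group("rest")):
            entry[name] = (float(lo), float(hi))
        for name, val in SCALAR_RE.findall(m.group("rest")):
            entry[name] = float(val)
        stats[m.group("op")] = entry
    return stats

# example (hypothetical file name): parse_stats(open("mongo.rocks.log"))["GET_LINKS_LIST"]["mean"]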