I looked for CPU regressions in recent RocksDB releases and was happy to not find them.
My workload was low concurrency (1 or 2 threads) and in-memory, run on a small server. I tested RocksDB versions 6.4 and 6.11 through 6.17. My servers use Ubuntu 20.04 which has g++ version 9.3. I wasn't able to compile versions prior to 6.4 because there were compiler errors that I didn't try to resolve. Such is the cost of using modern C++.
Running the tests
I used the all3.sh, run3.sh and rep_all3.sh scripts from my github repo. The scripts do two things for me. First, they handle changes to the db_bench options across RocksDB versions. Second, all3.sh runs tests in a sequence that is interesting to me. I need to update all3.sh as the db_bench options change in the 6.X branch.
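As a sketch of the idea (this is not the actual run3.sh), a wrapper can choose db_bench options per version. Here --some_newer_flag is a made-up placeholder for an option that only newer releases understand, and I assume each version's db_bench binary lives in a directory named after the version:
# choose db_bench options based on the RocksDB version being tested
ver=$1                                        # e.g. v64, v611, ..., v617
extra_opts=""
case $ver in
  v64) extra_opts="" ;;                       # older release, skip newer options
  *)   extra_opts="--some_newer_flag=1" ;;    # placeholder for a version-specific option
esac
./$ver/db_bench --benchmarks=overwrite $extra_opts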
I forgot to do ulimit -n 50000 prior to running tests and repeated tests after fixing that. I have forgotten to do that many times in the past.
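For reference, this is what I mean, run in the shell before starting the benchmark scripts:
ulimit -n 50000   # raise the open-file limit so db_bench can keep many SST files open
ulimit -n         # confirm the new limit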
I ran the tests in two modes: not-cached and cached. By not-cached I mean the database is larger than the RocksDB block cache; cached means the database fits in the RocksDB block cache. In both cases all data is in the OS page cache.
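A sketch of how the two modes map to db_bench: the difference is the block cache size set via --cache_size. The sizes here are examples, not the exact values the scripts use:
# cached: block cache is larger than the database
./db_bench --benchmarks=readrandom --use_existing_db=1 --cache_size=$(( 10 * 1024 * 1024 * 1024 ))
# not-cached: block cache is smaller than the database, but the data still fits in the OS page cache
./db_bench --benchmarks=readrandom --use_existing_db=1 --cache_size=$(( 1 * 1024 * 1024 * 1024 ))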
For the benchmarks: 10M KV pairs were inserted in the initial load, each test step was run for 300 seconds and the per-level sizes were made small (8M write buffer, 32M L1) to get more levels in the LSM tree.
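A rough sketch of those settings as db_bench options (not the exact commands the scripts generate):
# initial load: 10M KV pairs with a small write buffer and small L1
./db_bench --benchmarks=fillrandom --num=10000000 \
  --write_buffer_size=$(( 8 * 1024 * 1024 )) --max_bytes_for_level_base=$(( 32 * 1024 * 1024 ))
# each later test step reuses that database and runs for 300 seconds
./db_bench --benchmarks=readrandom --use_existing_db=1 --duration=300 --num=10000000 \
  --write_buffer_size=$(( 8 * 1024 * 1024 )) --max_bytes_for_level_base=$(( 32 * 1024 * 1024 ))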
Here are command lines that are useful to me, although the scripts might be inscrutable to you:
# To run for cached
bash run3.sh 100000000 64 300 8 32 $(( 10 * 1024 * 1024 ))
# To run for not-cached
bash run3.sh 100000000 64 300 8 32 $(( 1 * 1024 * 1024 ))
# To generate summaries for response time and throughput
bash rep_all3.sh v64 v611 v612 v613 v614 v615 v616 v617
Results
While there are many test steps (a test step == one run of db_bench), the most interesting are the first two (fillrandom, overwrite) and the last five (readwhilewriting, then seekrandomwhilewriting with different range sizes). Results can be misleading for the read-only tests that run in between these, because performance on them depends on the shape of the LSM tree. The amount of data in the memtable, L0 and L1 isn't deterministic and can have a large impact on the CPU overhead for queries. In MyRocks I reduce the impact from this by flushing the memtable and compacting the L0, but db_bench doesn't have options for that (yet). It does have an option to do a full compaction, but that is too much for me.
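For example, a full compaction can be requested by listing the compact benchmark before a read-only step (a sketch, assuming an existing database), but that rewrites the whole LSM tree which is more work than I want here:
# compact everything, then measure point queries against a deterministic LSM tree
./db_bench --benchmarks=compact,readrandom --use_existing_db=1 --num=10000000 --duration=300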
So I will share the results for all test steps but focus on the first two and the last five. There aren't significant regressions from v64 to v617. Results with a larger font and numbers for both response time and throughput are on github for cached and not-cached.
This has the QPS from the cached test:
v64 v611 v612 v613 v614 v615 v616 v617 test
344379 344567 340444 330834 325129 333669 331436 347029 overwrite
2817650 2779196 2877354 2886832 2887711 2774030 2710212 2823124 readseq
159063 129527 121106 123190 121325 134453 120146 137377 readrandom
77047 73334 64849 71194 49592 93552 64414 98480 seekrandom
3760630 3319862 3436593 3424777 3416794 3468542 3348936 3419843 readseq
216189 168233 170755 167659 177929 194189 170752 197241 readrandom
77176 74307 67099 73279 83014 95671 66752 97165 seekrandom
76688 73207 65771 71620 80992 93068 64924 94364 seekrandom
67994 65093 59306 65372 72698 80427 59296 83996 seekrandom
35619 33270 32388 34049 36375 38317 32091 38652 seekrandom
155204 151360 150730 151218 149980 150653 150261 148748 readwhilewriting
57080 54777 56317 55931 55271 55564 55581 56334 seekrandomwhilewriting
56184 53540 54838 54445 54450 54633 54465 54410 seekrandomwhilewriting
51143 49391 50092 50338 49548 50055 49486 50481 seekrandomwhilewriting
29553 27373 28491 28734 28586 28242 27853 28267 seekrandomwhilewriting
And this has the QPS from the not-cached test:
v64 v611 v612 v613 v614 v615 v616 v617 test
349918 349072 341224 347164 348470 340888 347850 334909 fillrandom
344040 327776 334852 332857 336480 343888 339678 332415 overwrite
2888170 2704291 2869560 2838685 2708847 2630220 2743535 2634374 readseq
167660 133981 130999 112923 120657 120273 87018 121126 readrandom
79615 58025 66542 49269 71643 71525 94862 71959 seekrandom
3784203 3284938 3411521 3404096 3414857 3409997 3448335 3366118 readseq
222893 165198 172372 175113 169132 174096 190337 166636 readrandom
80397 59224 67565 83540 73345 73666 94354 73855 seekrandom
78815 58232 65396 81865 72491 71938 92648 72689 seekrandom
70153 52933 60468 73907 64593 64626 80768 65317 seekrandom
36654 29389 32587 36881 34236 33753 38028 33618 seekrandom
154127 150561 150021 151168 148856 149113 149967 150643 readwhilewriting
57050 55440 55498 55576 55258 56255 55440 55162 seekrandomwhilewriting
56178 54348 55160 54893 54251 54699 54177 54770 seekrandomwhilewriting
51651 49415 50537 50627 49488 49281 49268 50172 seekrandomwhilewriting
29567 27303 28346 28359 28376 27969 27932 28204 seekrandomwhilewriting