Thursday, February 18, 2021

Still not finding CPU regressions in RocksDB

I repeated the search for CPU regressions in recent RocksDB releases using an m5ad.12xlarge server from AWS. I was happy to find no problems, just as I found none on a small server.

I set up the AWS host using this script and then ran the all3.sh script for 1, 4, 8 and 16 concurrent clients. I forgot to save the command line, but the only thing I changed was the number of background threads that RocksDB can use (increased from 2 to 8).
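
The loop below is a rough reconstruction rather than the saved command line. It assumes all3.sh takes the number of concurrent clients as its first argument and that the background thread change (2 to 8) was edited inside the script; both are assumptions.

# Rough reconstruction, not the saved command line. Assumes all3.sh
# takes the client count as $1 and the background thread count was
# changed by editing the script.
for nclients in 1 4 8 16; do
  bash all3.sh "$nclients"
done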

The AWS host has two direct-attached NVMe drives and I used one of them for the database directory, formatted with XFS and mounted with discard enabled. I also disabled hyperthreads when I set up the host.
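
A minimal sketch of that part of the setup, assuming the NVMe device is /dev/nvme1n1 and the database directory is /data (both names are assumptions), run as root:

# Format the direct-attached NVMe drive with XFS.
mkfs.xfs -f /dev/nvme1n1
mkdir -p /data
# Mount with discard so the filesystem issues TRIM as files are deleted.
mount -o discard /dev/nvme1n1 /data
# Disable hyperthreads (SMT) for the whole host.
echo off > /sys/devices/system/cpu/smt/control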

Results are on GitHub for one, four, eight and sixteen threads. The output is created by the rep_all4.sh script and has 3 sections. The first is the response time per operation for each test. The second is the throughput per test. The third is the relative throughput per test (relative to RocksDB version 6.4). The versions tested are (v64=6.4, v611=6.11, ..., v617=6.17).
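
The relative throughput is just each version's throughput divided by the v6.4 throughput for the same test. A minimal sketch of that computation, assuming a whitespace-separated file where column 1 is the v6.4 throughput, the middle columns are the newer versions and the last column is the test name (the exact rep_all4.sh output format is an assumption):

# Sketch: compute throughput relative to v6.4 (column 1).
# Assumes each input line is: v64 v611 ... v617 testname.
awk '{
  base = $1
  printf "1"
  for (i = 2; i < NF; i++) printf " %.2f", $i / base
  printf " %s\n", $NF
}' throughput.txt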

Below I show only the output from the third section. A value > 1 means the newer version has more throughput than RocksDB version 6.4. The values tend to be close to 1 and I don't see significant regressions.

My focus is on the first two and last five tests. The others are read-only and can suffer from noise because the state of the LSM tree isn't deterministic when they run (the amount of data in the memtable and L0 can vary). In the long run I hope for options in db_bench to flush the memtable and compact L0 into L1 to reduce this problem; a sketch of the idea follows.
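
To make the intent concrete, here is a hypothetical invocation. The flush_memtable and compact_l0l1 benchmark names are invented for illustration and are not options that db_bench provides; fillrandom and readrandom are real db_bench benchmarks.

# Hypothetical sketch: flush_memtable and compact_l0l1 are invented
# names that show the intent, not options db_bench provides today.
# The goal is a deterministic LSM tree (empty memtable, empty L0)
# before the read-only benchmarks run.
./db_bench --benchmarks=fillrandom,flush_memtable,compact_l0l1,readrandom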

Results for 1 thread

v64 v611 v612 v613 v614 v615 v616 v617 test
1 0.99 0.99 0.99 0.99 1.00 0.98 1.00 fillrandom
1 0.97 0.96 0.97 0.96 0.97 0.97 0.97 overwrite
1 0.89 0.94 0.95 1.01 0.94 0.90 0.91 readseq
1 0.79 0.77 0.82 0.80 0.81 0.69 0.76 readrandom
1 0.98 1.00 0.99 0.98 1.02 0.67 0.99 seekrandom
1 0.89 0.98 0.96 0.95 0.96 0.89 0.94 readseq
1 0.85 0.79 0.80 0.79 0.81 0.80 0.82 readrandom
1 0.92 0.97 0.96 0.96 1.00 1.09 0.98 seekrandom
1 0.96 0.98 0.96 0.91 0.97 1.10 0.97 seekrandom
1 0.96 0.96 0.96 0.97 1.00 1.07 0.96 seekrandom
1 1.01 1.04 1.05 1.00 1.07 1.09 0.99 seekrandom
1 1.06 1.01 0.87 0.98 0.95 1.03 1.00 readwhilewriting
1 0.84 0.94 0.98 1.03 0.96 0.94 0.97 seekrandomwhilewriting
1 0.97 1.03 0.97 0.98 0.99 0.97 0.96 seekrandomwhilewriting
1 0.93 0.97 0.93 0.97 0.94 0.97 0.99 seekrandomwhilewriting
1 0.95 0.97 1.02 0.95 1.01 1.01 1.01 seekrandomwhilewriting

Results for 4 threads

v64 v611 v612 v613 v614 v615 v616 v617 test
1 0.99 0.98 0.99 0.98 0.99 0.98 0.98 fillrandom
1 0.99 1.07 1.09 1.03 1.06 1.02 1.02 overwrite
1 0.92 0.94 0.94 0.97 0.96 0.95 0.93 readseq
1 0.84 0.90 0.90 0.90 0.74 0.74 0.99 readrandom
1 0.72 1.01 0.98 1.01 0.72 0.84 1.04 seekrandom
1 0.90 0.97 0.90 0.89 0.88 0.90 0.90 readseq
1 0.87 0.94 0.94 0.95 1.03 0.86 0.95 readrandom
1 0.75 1.06 0.99 1.01 1.11 0.86 1.03 seekrandom
1 0.73 1.02 0.97 0.97 1.07 0.79 0.95 seekrandom
1 0.76 1.02 0.96 1.02 1.12 0.85 1.03 seekrandom
1 0.78 0.95 0.96 0.96 1.05 0.82 0.96 seekrandom
1 0.94 0.93 0.91 0.94 0.95 0.94 0.96 readwhilewriting
1 0.96 0.92 0.94 0.94 0.94 0.92 0.97 seekrandomwhilewriting
1 0.98 0.94 0.98 0.96 0.98 0.97 0.97 seekrandomwhilewriting
1 0.97 0.93 1.04 0.98 0.98 0.95 0.96 seekrandomwhilewriting
1 0.96 1.02 1.00 1.02 0.98 1.01 0.98 seekrandomwhilewriting

Results for 8 threads

v64 v611 v612 v613 v614 v615 v616 v617 test
1 1.01 0.96 0.97 0.99 0.97 0.97 0.99 fillrandom
1 0.74 0.84 1.07 1.01 1.00 0.99 0.99 overwrite
1 0.89 0.89 1.09 1.06 1.10 1.08 1.08 readseq
1 1.01 1.01 0.89 0.97 1.01 0.72 0.88 readrandom
1 1.17 0.99 1.02 0.94 0.98 0.80 0.94 seekrandom
1 1.01 1.06 1.01 1.01 0.99 0.98 0.97 readseq
1 0.98 0.99 0.91 0.96 0.92 0.81 0.92 readrandom
1 1.13 0.92 0.95 0.88 0.92 0.74 0.89 seekrandom
1 1.12 0.94 0.96 0.91 0.96 0.77 0.91 seekrandom
1 1.15 0.96 1.00 0.94 0.95 0.79 0.92 seekrandom
1 1.04 0.96 0.91 0.92 0.92 0.81 0.90 seekrandom
1 0.97 0.97 0.99 0.99 0.98 1.02 1.00 readwhilewriting
1 0.98 1.01 0.98 0.96 0.97 0.97 0.98 seekrandomwhilewriting
1 0.95 0.94 0.95 0.96 0.95 0.92 0.94 seekrandomwhilewriting
1 0.94 0.97 0.96 0.96 0.95 0.95 0.98 seekrandomwhilewriting
1 0.98 1.01 0.98 0.96 0.96 0.94 0.92 seekrandomwhilewriting

Results for 16 threads

v64 v611 v612 v613 v614 v615 v616 v617 test
1 0.99 0.98 0.98 0.95 0.97 0.96 0.99 fillrandom
1 0.93 0.95 1.41 1.29 1.32 1.30 1.32 overwrite
1 0.95 0.99 1.12 1.13 1.12 1.09 1.12 readseq
1 1.07 0.85 0.97 0.95 0.88 0.95 0.91 readrandom
1 1.03 1.00 1.00 1.03 0.96 1.02 1.02 seekrandom
1 0.98 1.06 0.97 0.97 0.96 0.97 0.94 readseq
1 1.04 0.88 1.03 1.01 0.97 0.96 1.05 readrandom
1 0.99 1.00 0.98 0.99 0.94 0.97 0.99 seekrandom
1 1.00 1.00 0.99 1.00 0.93 0.98 1.00 seekrandom
1 1.00 0.99 0.97 1.01 0.93 0.97 0.98 seekrandom
1 0.98 0.98 0.98 0.97 0.93 0.94 0.95 seekrandom
1 0.93 0.97 0.97 0.97 0.94 0.96 0.93 readwhilewriting
1 0.97 0.97 0.92 0.96 0.99 0.97 0.98 seekrandomwhilewriting
1 0.95 0.97 0.96 0.97 0.95 0.98 0.94 seekrandomwhilewriting
1 0.97 1.00 0.99 0.97 0.99 0.99 0.94 seekrandomwhilewriting
1 0.97 1.01 1.00 1.02 1.00 0.98 0.99 seekrandomwhilewriting
