Tuesday, November 14, 2017

Sysbench, in-memory, small server: MyRocks over time

In this post I compare five MyRocks builds from February to October using in-memory sysbench and a small server. The goal is to understand where we have made MyRocks faster and slower this year.

tl;dr
  • For many tests there is no decrease in QPS from February to October
  • For some tests the QPS decreased by 3% to 8%
  • The largest regression is for read-heavy tests that run after write-heavy tests. Querying an LSM tree after many updates usually costs more in CPU and/or IO compared to querying it before the updates. But the CPU overhead might have increased since February.

Configuration

The tests used MyRocks from FB MySQL which is currently based on 5.6.35. Builds were done using HEAD from February 10, April 14, June 16, August 15 and October 16. The git hashes for these builds are:
  • February 10 - FB MySQL f3019b, RocksDB c2ca7a
  • April 14 - FB MySQL e28823, RocksDB 9300ef
  • June 16 - FB MySQL 52e058, RocksDB 7e5fac
  • August 15 - FB MySQL 0d76ae, RocksDB 50a969
  • October 16 - FB MySQL 1d0132, RocksDB 019aa7
All tests used jemalloc with mysqld. The i3 and i5 NUC servers are described here. My use of sysbench is described here. The my.cnf files are here for the i3 NUC and i5 NUC. I tried to tune my.cnf for all engines but there are a few new & changed options in that time. For all tests the binlog was enabled but fsync was disabled for the binlog and database redo log. Compression was not used.

Sysbench is run with 2 tables and 2M rows per table. Each test is repeated for 1 and 2 clients. Each test runs for 600 seconds except for the insert-only test which runs for 300 seconds. The database fits in RAM.

I repeat tests on an i5 NUC and i3 NUC. The i5 NUC has more RAM, a faster SSD and a faster CPU than the i3 NUC. I disabled turbo boost on the i5 NUC many months ago to reduce variance in performance, and with that the difference in CPU performance between these servers is smaller.

Results

All of the data for the tests is on github for the i3 NUC and the i5 NUC. Results for each test are listed separately below. The results below have the QPS for the test with 1 client relative to the QPS for the February 10 build. The tests are explained here.

Graphs

No graphs this time. The results aren't that interesting.

update-inlist

This section and the sections that follow list the QPS and relative QPS. The relative QPS is the QPS for the test with 1 client relative to the QPS for the February 10 build. Values are provided for the i3 and i5 NUC.

There is no regression.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
1713    1.00    2002    1.00    feb10
1826    1.07    2133    1.07    apr14
1605    0.94    1987    0.99    jun16
1698    0.99    2017    1.01    aug15
1761    1.03    2087    1.04    oct16
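
The ratio column can be reproduced from the QPS column. This is a minimal sketch in Python using the i3 NUC numbers from the table above. The function name relative_qps is mine, and rounding to two decimals is an assumption that happens to match the table.

```python
def relative_qps(qps_by_build, baseline="feb10"):
    """Return the QPS of each build divided by the QPS of the baseline build."""
    base = qps_by_build[baseline]
    # Rounding to 2 decimals is an assumption chosen to match the table.
    return {build: round(qps / base, 2) for build, qps in qps_by_build.items()}


# i3 NUC, update-inlist, 1 client (values copied from the table above)
i3_update_inlist = {
    "feb10": 1713,
    "apr14": 1826,
    "jun16": 1605,
    "aug15": 1698,
    "oct16": 1761,
}

ratios = relative_qps(i3_update_inlist)
for build, ratio in ratios.items():
    print(f"{i3_update_inlist[build]:<8}{ratio:<8.2f}{build}")
```

A ratio below 1.00 means the build is slower than the February 10 build for that test.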

update-one

QPS decreased by ~8%.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
8375    1.00    9295    1.00    feb10
8036    0.96    9151    0.98    apr14
7774    0.93    8602    0.93    jun16
7705    0.92    8555    0.92    aug15
7732    0.92    8620    0.93    oct16

update-index

There is no regression on the i3 NUC and QPS has decreased by 7% on the i5 NUC.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
5981    1.00    6861    1.00    feb10
5799    0.97    6722    0.98    apr14
5678    0.98    6300    0.92    jun16
5809    1.00    6306    0.92    aug15
6022    1.04    6392    0.93    oct16

update-nonindex

QPS decreased by 3% on the i3 NUC and 6% on the i5 NUC.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
6521    1.00    7184    1.00    feb10
6346    0.97    7127    0.99    apr14
5913    0.91    6516    0.91    jun16
6066    0.93    6565    0.91    aug15
6310    0.97    6724    0.94    oct16

delete

QPS decreased by 7% on the i3 NUC and 5% on the i5 NUC.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
15301   1.00    16361   1.00    feb10
14714   0.96    16552   1.01    apr14
13973   0.91    15736   0.96    jun16
14216   0.93    15447   0.94    aug15
14233   0.93    15515   0.95    oct16

read-write with range-size=100

QPS decreased by 7%.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
8101    1.00    8583    1.00    feb10
7700    0.95    8162    0.95    apr14
7333    0.91    7812    0.91    jun16
7366    0.91    7747    0.90    aug15
7532    0.93    7967    0.93    oct16

read-write with range-size=10000

QPS decreased by 5%.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
262     1.00    308     1.00    feb10
253     0.97    300     0.97    apr14
241     0.92    290     0.94    jun16
246     0.94    285     0.93    aug15
249     0.95    294     0.95    oct16

read-only with range-size=100

QPS decreased by 8% on the i3 NUC and 21% on the i5 NUC. I suspect that the 21% regression on the i5 NUC is an outlier and unlikely to repeat, but I will find out when I test a new build. I think it is an outlier because there is variance with MyRocks for read-heavy tests that follow write-heavy tests. The state of the LSM tree (number of entries in the memtable, number of files in L0) is not deterministic across test runs and that impacts read performance. This is made worse because the LSM tree can remain in that state for the duration of the read-only test. I prefer to run read-write tests when evaluating MyRocks to avoid this variance.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
8610    1.00    9272    1.00    feb10
7818    0.91    8223    0.89    apr14
7506    0.87    8463    0.91    jun16
7604    0.88    7660    0.93    aug15
7920    0.92    7321    0.79    oct16
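
To make the point about LSM tree state concrete, here is a toy model in Python of how many structures a point lookup might check. It is only a sketch under strong assumptions: there are no bloom filters and no block cache, each structure is a plain dict, and the function name point_lookup_probes is hypothetical. Real RocksDB avoids many of these checks, so this overstates the cost.

```python
def point_lookup_probes(key, memtable, l0_files, levels):
    """Count the structures checked until key is found, newest data first.

    Toy model: no bloom filters, no block cache, every structure is a dict.
    """
    # The memtable holds the newest writes and is checked first.
    probes = 1
    if key in memtable:
        return probes
    # L0 files can overlap in key range, so each one may need a check,
    # from the newest file to the oldest.
    for sst in l0_files:
        probes += 1
        if key in sst:
            return probes
    # Deeper levels are range partitioned: at most one file per level.
    for level in levels:
        probes += 1
        if key in level:
            return probes
    return probes


# State before the write-heavy tests: empty memtable, no L0 files.
loaded = {k: "v0" for k in range(10)}
before = point_lookup_probes(7, memtable={}, l0_files=[], levels=[loaded])

# State after the write-heavy tests: updates to other keys sit in the
# memtable and in three L0 files. Key 7 was not updated, so it is found last.
after = point_lookup_probes(
    7,
    memtable={1: "v3"},
    l0_files=[{2: "v2"}, {3: "v2"}, {4: "v1"}],
    levels=[loaded],
)
print(before, after)
```

In this model the lookup checks 2 structures before the writes and 5 after. The number of L0 files at the start of a read-only test depends on when flushes and compactions happened, which is why QPS can vary from run to run.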

read-only.pre with range-size=10000

QPS decreased by 4% on the i3 NUC and 3% on the i5 NUC.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
215     1.00    261     1.00    feb10
213     0.99    256     0.98    apr14
203     0.94    261     1.00    jun16
208     0.97    256     0.98    aug15
207     0.96    254     0.97    oct16

read-only with range-size=10000

QPS decreased by 6% on the i3 NUC and 8% on the i5 NUC. The decrease here is larger than for the previous test. The difference is that this test is run after write-heavy tests while the previous test is run before them. It costs more to search the LSM structures after random updates (compare the QPS here with the QPS in the previous test), and that cost may have increased. I have written more about mistakes to avoid when doing a benchmark with an LSM and if you only do read-only tests before fragmenting the LSM tree you might be an optimist.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
214     1.00    257     1.00    feb10
203     0.95    242     0.94    apr14
194     0.91    240     0.93    jun16
197     0.92    230     0.89    aug15
201     0.94    237     0.92    oct16
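
The comparison with the previous test can be done with a few lines of Python. This sketch uses the i3 NUC QPS from the two tables above and computes, per build, the QPS after the write-heavy tests as a fraction of the QPS before them. The names post_over_pre and penalty are mine.

```python
builds = ["feb10", "apr14", "jun16", "aug15", "oct16"]

# i3 NUC QPS with 1 client, copied from the two tables above.
pre_qps = [215, 213, 203, 208, 207]   # read-only.pre: before write-heavy tests
post_qps = [214, 203, 194, 197, 201]  # read-only: after write-heavy tests


def post_over_pre(pre, post):
    """QPS after the write-heavy tests as a fraction of the QPS before them."""
    return [round(after / before, 3) for before, after in zip(pre, post)]


penalty = dict(zip(builds, post_over_pre(pre_qps, post_qps)))
for build, frac in penalty.items():
    print(f"{build}  {frac:.3f}")
```

For these numbers the February 10 build keeps about 99.5% of its QPS after the write-heavy tests while the later builds keep between 94.7% and 97.1%. That is consistent with the cost having increased, but a single run per build is not enough to separate a real change from run-to-run variance.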

point-query.pre

QPS decreased by 3% on the i3 NUC and 1% on the i5 NUC.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
15775   1.00    16190   1.00    feb10
15314   0.97    16035   0.99    apr14
14504   0.92    15218   0.94    jun16
14627   0.93    15462   0.96    aug15
15277   0.97    16022   0.99    oct16

point-query

QPS decreased by 7%. The decrease here is larger than for the previous test. See the comment two sections above about read-only tests that follow write-heavy tests.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
15326   1.00    15900   1.00    feb10
14126   0.92    14556   0.92    apr14
13612   0.89    15030   0.95    jun16
13557   0.88    13721   0.86    aug15
14328   0.93    14801   0.93    oct16

random-points.pre

QPS decreased by 3% on the i3 NUC and 2% on the i5 NUC.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
1450    1.00    1527    1.00    feb10
1459    1.01    1499    0.98    apr14
1301    0.90    1360    0.89    jun16
1374    0.95    1394    0.91    aug15
1401    0.97    1502    0.98    oct16

random-points

QPS decreased by 8% on the i3 NUC and 12% on the i5 NUC. See the earlier comment about read-only tests that follow write-heavy tests.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
1063    1.00    1151    1.00    feb10
 928    0.87     940    0.82    apr14
 952    0.90     947    0.82    jun16
 962    0.90     847    0.74    aug15
 973    0.92    1008    0.88    oct16

hot-points

QPS decreased by 15%. This is like random-points except it fetches the same values for every query. It is run after the write-heavy tests. The regression is similar to random-points.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
1565    1.00    1809    1.00    feb10
1384    0.88    1531    0.85    apr14
1239    0.79    1422    0.79    jun16
1334    0.85    1341    0.74    aug15
1329    0.85    1535    0.85    oct16

insert

QPS decreased by 8% on the i3 NUC and 7% on the i5 NUC.

i3 NUC          i5 NUC
QPS     ratio   QPS     ratio   engine
8337    1.00    9102    1.00    feb10
8377    1.00    9086    1.00    apr14
7871    0.94    8723    0.96    jun16
8074    0.97    8785    0.97    aug15
7650    0.92    8446    0.93    oct16
