Tuesday, February 28, 2017

Part 3: sysbench, a large server and small database

This is part 3 of my performance report for sysbench and MyRocks. For this test I use a large server (24 cores, 48 threads, fast NVMe SSD). A small server was used for part 1 and part 2.

This test is done with a small, cached database - 8M rows. InnoDB is a great choice for such a workload and I am not surprised that MyRocks and TokuDB are not great at it. The value from MyRocks and TokuDB is better compression and less write-amplification, which usually requires a larger database, and the benefit grows when the database doesn't fit in RAM. But early evaluators might use sysbench with a small database, so I want to understand how these engines perform in that setup.

tl;dr
  • The common pattern was that InnoDB was faster than MyRocks and MyRocks was faster than TokuDB.
  • InnoDB in MySQL 5.7.10 is much faster than in MySQL 5.6.26 on tests that are write-heavy or have long range scans. But InnoDB in 5.7 isn't strictly faster than in 5.6 even at high concurrency.
  • MyRocks suffers on tests with longer range scans but does pretty well on point queries.
  • TokuDB was the slowest for all of the workloads except read-write with --oltp-range-size=10000 and read-only with --oltp-range-size set to 100 and 10000.
  • Percona fixed a bug in TokuDB that I reported. Thank you.
It can be interesting to compare these results with the results from the smaller server I use at home. The smaller server has a slower CPU (core i3) and a slower SSD. Unfortunately, before making that comparison I need to repeat tests on my work servers using the latest version of sysbench.

Details

I used a version of sysbench that is a few months old with an older version of my test scripts. For my home servers I upgraded to modern sysbench and I will soon do that at work. I will also upgrade from MySQL 5.7.10 to a more recent version, which should make a famous MySQL support guru happy.

The test server has 24 cores, 48 threads, 2 sockets and 256gb of RAM. The storage is from multiple NVMe SSDs. I tested 4 engines -- myrocks-5635 is MyRocks from FB MySQL merged to upstream MySQL 5.6.35, innodb-5626 is InnoDB from upstream MySQL 5.6.26, innodb-5710 is InnoDB from upstream MySQL 5.7.10 and tokudb-5717 is TokuDB from Percona Server 5.7.17-11. The my.cnf files are here for MyRocks, InnoDB-5.7 and InnoDB-5.6. In all cases the binlog is enabled and fsync is disabled for both the storage engine redo log and the binlog.
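
The durability settings amount to something like this in my.cnf. This is a sketch of the relevant lines - the linked my.cnf files have the full settings, and for MyRocks the analogous option is rocksdb_flush_log_at_trx_commit:

log-bin
sync_binlog=0                      # binlog enabled, no fsync per commit
innodb_flush_log_at_trx_commit=2   # write the redo log at commit, no fsync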

Tests are run for 1 to 128 concurrent clients. When describing the results I divide that into
low concurrency (<= 8 clients), medium concurrency (16 to 40 clients) and high concurrency (>= 48 clients).  The test was run at each concurrency level for 180 seconds for read-heavy tests and 300 seconds for write-heavy tests. The sysbench process is run on the same host as mysqld.
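
As an illustration, one concurrency level of the read-write test might be invoked like this with that older sysbench. This is a sketch - the lua path is a placeholder and my scripts set more options than shown:

sysbench --test=/path/to/sysbench/tests/db/oltp.lua \
    --oltp-tables-count=8 --oltp-table-size=1000000 \
    --num-threads=16 --max-time=180 --oltp-range-size=100 \
    --mysql-user=root run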

Tests are run in this order:
  • prepare - this isn't a test, it creates and loads the tables. There were 8 tables with 1M rows per table. The sysbench table has one secondary index.
  • update-only where secondary index maintenance is required
  • update-only where secondary index maintenance is not required (example statements for both are below this list)
  • read-write with --oltp-range-size=100 and then optionally with --oltp-range-size=10000
  • read-only with --oltp-range-size in 10, 100, 1000 and 10000. I don't report results for --oltp-range-size=1000
  • point queries
  • insert-only
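
The two update-only tests differ in which column they change. Based on the standard sysbench OLTP schema, where column k has a secondary index and column c does not, the statements look like this (a sketch - the literal values vary per statement):

UPDATE sbtest1 SET k=k+1 WHERE id=5000;    -- secondary index maintenance required
UPDATE sbtest1 SET c='...' WHERE id=5000;  -- no secondary index maintenance
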
Results

For the performance summaries below I use "X >> Y >> Z" to mean X is faster than Y and Y is faster than Z. I did not include graphs for one or both of these reasons: 1) that is a lot more work and 2) I want some readers to take the time to think about the numbers. Unfortunately the formatting is not great for a wide table.

Update-only with secondary index maintenance

The numbers in the table are the QPS for 1 to 128 threads.

   1       2        4       8      16      24      32      40      48      64      80      96     128   threads
2895    10363   20118   34017   38720   42138   44736   44646   44169   43633   43104   42690   42442   myrocks-5635
4029    8373    15522   24563   32448   37061   40251   42522   44219   44631   45123   45762   45808   innodb-5626
4254    8792    17020   30524   48443   60623   67052   68773   69232   70556   72134   72696   74222   innodb-5710
2145    4026     7344   13575   21222   27807   30039   29541   29106   27964   26493   25435   23210   tokudb-5717

Throughput:
  • 1 to 8 clients - MyRocks >> InnoDB-5.7 >> InnoDB-5.6 >> TokuDB. I don't know why MyRocks was lousy for 1 client but OK for 2+.
  • 16 to 40 clients - InnoDB-5.7 >> MyRocks >> InnoDB-5.6 >> TokuDB
  • 48+ clients - InnoDB-5.7 >> MyRocks, InnoDB-5.6 >> TokuDB

Update-only without secondary index maintenance

The numbers in the table are the QPS for 1 to 128 threads.

   1       2        4       8      16      24      32      40      48      64      80      96     128   threads
2901   10696    20983   37078   43756   47549   50627   52753   52318   51690   50853   50003   49074   myrocks-5635
6384   12625    23847   40501   51505   53024   53587   53523   53316   52483   51632   51276   51117   innodb-5626
5618   11060    21361   38810   58197   69717   74286   75519   75545   76520   77335   78117   79152   innodb-5710
2260    4470     8206   15322   26552   33024   37739   41461   44358   44415   41899   39626   35341   tokudb-5717

Throughput:
  • 1 to 8 clients - InnoDB-5.6 >> InnoDB-5.7 >> MyRocks >> TokuDB.  I don't know why MyRocks was lousy for 1 client but OK for 2+.
  • 16 to 40 clients - InnoDB-5.7 >> InnoDB-5.6 >> MyRocks >> TokuDB.
  • 48+ clients - InnoDB-5.7 >> InnoDB-5.6 >> MyRocks >> TokuDB

Read-write with --oltp-range-size=100

The numbers in the table are the QPS for 1 to 128 threads.

   1       2        4       8      16      24      32      40      48      64      80      96     128   threads
4798    9955    20426   46361   98537  132010  153726  165477  167280  166549  168513  166807  166722   myrocks-5635
6391   12591    27534   61157  112516  146970  172971  193107  196981  191683  190707  191579  190615   innodb-5626
6143   13051    27216   57447  108786  145127  169957  192548  204655  205408  205278  205629  206721   innodb-5710
5157    9009    17931   42903   76633   95322  108896  114310  114650  113619  112220  108055  103181   tokudb-5717

Throughput:
  • 1 to 8 clients - InnoDB-5.7, InnoDB-5.6 >> MyRocks >> TokuDB
  • 16 to 40 clients - InnoDB-5.7, InnoDB-5.6 >> MyRocks >> TokuDB
  • 48+ clients - InnoDB-5.7 >> InnoDB-5.6 >> MyRocks >> TokuDB

Read-write with --oltp-range-size=10000

The numbers in the table are the QPS for 1 to 128 threads.

   1       2        4       8      16      24      32      40      48      64      80      96     128   threads
 210     412      774    2091    4077    5288    5568    6409    6885    6235    6675    6543    5889   myrocks-5635
 306     612     1216    2403    4713    6439    7398    8098    8416    8397    8393    8364    8247   innodb-5626
 448     888     1758    3462    6699    8770    9724   10302   10482   10448   10330   10382   10360   innodb-5710
 360     722     1409    2713    5218    6932    7529    7851    7973    7954    7856    7822    7675   tokudb-5717

Throughput:
  • 1 to 8 clients - InnoDB-5.7 >> TokuDB >> InnoDB-5.6 >> MyRocks
  • 16 to 40 clients - InnoDB-5.7 >> TokuDB >> InnoDB-5.6 >> MyRocks
  • 48+ clients - InnoDB-5.7 >> InnoDB-5.6 >> TokuDB >> MyRocks

Read-only with --oltp-range-size=10

The numbers in the table are the QPS for 1 to 128 threads.

   1       2        4       8      16      24      32      40      48      64      80      96     128   threads
5659   12266    25778   61366  128290  175337  204504  230534  246007  262875  269596  272178  274313   myrocks-5635
6491   13838    30028   72355  144523  194622  230314  265632  298481  300703  302924  303685  305162   innodb-5626
6052   12700    27143   63307  129096  177797  209601  243368  273429  272255  272563  275147  276691   innodb-5710
5391   11261    23935   55313  113932  164953  195290  223643  249579  247840  246926  247897  247045   tokudb-5717

Throughput:
  • 1 to 8 clients - InnoDB-5.6 >> InnoDB-5.7 >> MyRocks >> TokuDB
  • 16 to 40 clients - InnoDB-5.6 >> InnoDB-5.7 >> MyRocks >> TokuDB
  • 48+ clients - InnoDB-5.6 >> InnoDB-5.7 >> MyRocks >> TokuDB

Read-only with --oltp-range-size=100

The numbers in the table are the QPS for 1 to 128 threads.

   1       2        4       8      16      24      32      40      48      64      80      96     128   threads
4031    8429    17417   38579   82542  116461  134208  150897  165211  166491  164694  166504  166625   myrocks-5635
5375   11273    23474   54099  110142  151311  177268  202868  225283  228292  229257  229828  231068   innodb-5626
5383   11292    23629   54050  110015  152462  176974  201112  223618  223737  224161  226272  226484   innodb-5710
4804   10062    20795   47474   98569  138827  159757  180406  198294  198680  198103  198374  196718   tokudb-5717

Throughput:
  • 1 to 8 clients - InnoDB-5.6, InnoDB-5.7 >> TokuDB >> MyRocks
  • 16 to 40 clients - InnoDB-5.6, InnoDB-5.7 >> TokuDB >> MyRocks
  • 48+ clients - InnoDB-5.6, InnoDB-5.7 >> TokuDB >> MyRocks

Read-only with --oltp-range-size=10000

The numbers in the table are the QPS for 1 to 128 threads.

   1       2        4       8      16      24      32      40      48      64      80      96     128   threads
 161     325      634    1238    2438    3385    3807    4093    4300    4261    4239    4214    4135   myrocks-5635
 242     484      961    1900    3729    5099    5857    6439    6748    6663    6562    6589    6500   innodb-5626
 357     708     1407    2788    5354    6984    7689    8095    8246    8222    8145    8138    8071   innodb-5710
 301     606     1203    2365    4583    6057    6582    6885    6992    6908    6862    6802    6772   tokudb-5717

Throughput:
  • 1 to 8 clients - InnoDB-5.7 >> TokuDB >> InnoDB-5.6 >> MyRocks
  • 16 to 40 clients - InnoDB-5.7 >> TokuDB >> InnoDB-5.6 >> MyRocks
  • 48+ clients - InnoDB-5.7 >> TokuDB >> InnoDB-5.6 >> MyRocks

Point-query

The numbers in the table are the QPS for 1 to 128 threads.

   1       2        4       8      16      24      32      40      48      64      80      96     128   threads
6784   14328    31277   80458  166882  222750  265885  306227  341095  354327  349912  345317  349403   myrocks-5635
7236   15236    33569   87991  177791  236463  278423  319764  362604  371093  374380  371092  375698   innodb-5626
6802   14177    30860   78950  163909  222998  265156  307198  345832  354835  362774  366825  367057   innodb-5710
6249   12951    27608   67805  142064  194320  230720  264171  294814  306298  308177  309542  310391   tokudb-5717

Throughput:
  • 1 to 8 clients - InnoDB-5.6 >> MyRocks >> InnoDB-5.7 >> TokuDB
  • 16 to 40 clients - InnoDB-5.6 >> MyRocks, InnoDB-5.7 >> TokuDB
  • 48+ clients - InnoDB-5.6 >> InnoDB-5.7 >> MyRocks >> TokuDB

Insert-only

The numbers in the table are the QPS for 1 to 128 threads.

   1       2        4       8      16      24      32      40      48      64      80      96     128   threads
5254    16950   31561   41074   47720   52063   54066   53899   53900   53725   53343   53098   52278   myrocks-5635
5753    11506   36083   57235   62591   62469   62577   61899   61131   60592   59080   56895   52913   innodb-5626
5291    15657   33358   55385   79656   90812   97735   99944  100714  101967  103374  104934  106194   innodb-5710
2975     5754    9033   17695   29039   37341   38410   35859   33792   30053   27362   25028   21275   tokudb-5717

Throughput:
  • 1 to 8 clients - InnoDB-5.6 >> InnoDB-5.7 >> MyRocks >> TokuDB
  • 16 to 40 clients - InnoDB-5.7 >> InnoDB-5.6 >> MyRocks >> TokuDB
  • 48+ clients - InnoDB-5.7 >> InnoDB-5.6 >> MyRocks >> TokuDB

Friday, February 24, 2017

Part 2 - sysbench, MyRocks, InnoDB and a small server

This is part 2 of the performance report for sysbench, MyRocks, InnoDB and a small server. The first part covered an IO-bound workload where the database was larger than RAM. This part covers a cached workload where the database fits in the MyRocks block cache and InnoDB buffer pool.

tl;dr
  • MyRocks is faster than InnoDB in the write-heavy tests.
  • InnoDB in MySQL 5.6 is faster than MyRocks in the read-heavy tests. I can't reproduce this on the newer/faster CPUs I use at work. For this test I used an Intel NUC with a 5th generation core i3 CPU. I am still trying to figure this out.
  • InnoDB in MySQL 5.6 is faster than in 5.7 for most write-heavy tests.
  • InnoDB in MySQL 5.7 is faster than in 5.6 for most read-heavy tests. I am curious why this depends on read-heavy vs write-heavy.
Details

The previous blog post has all of the details and shared results for the IO-bound test that used 4 tables with 40M rows per table. This test used 4 tables with 1M rows per table. Tests are run for 3 storage engines - MyRocks from FB MySQL merged to upstream MySQL 5.6.35 (myrocks-5635), InnoDB from upstream MySQL 5.6.35 (innodb-5635) and InnoDB from upstream MySQL 5.7.10 (innodb-5710). The sequence of tests is the same as described in the previous blog post.

A sample command line for the test is:
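# arguments include the number of tables (4) and rows per table (1M);
# the rest configure my helper script (durations, engine, etc.)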
bash all_small.sh 4 1000000 600 600 300 innodb 1 0 \
    ~/b/orig5710/bin/mysql none ~/b/sysbench/share/sysbench


Update-only with secondary index maintenance

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
6795    10738   13212   myrocks-5635
3907     5912    7323   innodb-5635
3286     4820    6012   innodb-5710
- optimized my.cnf
6787    10735   13103   myrocks-5635
3616     6021    7393   innodb-5635
3122     4733    6131   innodb-5710

Summary:
  • MyRocks is faster than InnoDB because secondary index maintenance is write-only (read-free). On the IO-bound test that avoided stalls from page reads. Here it saves on CPU.
  • The optimized my.cnf doesn't change performance
  • InnoDB in MySQL 5.6 is faster than in 5.7

Update-only without secondary index maintenance

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
7373    11510   14762   myrocks-5635
8050    10783   13185   innodb-5635
6528     8971   11338   innodb-5710
- optimized my.cnf
7335    11374   14609   myrocks-5635
8809    12240   15179   innodb-5635
6644     8935   11580   innodb-5710

Summary:
  • InnoDB in MySQL 5.6 is slightly faster than MyRocks. The update statement in this test doesn't require secondary index maintenance so MyRocks loses that benefit.
  • The optimized my.cnf helps InnoDB in MySQL 5.6
  • InnoDB in MySQL 5.6 is much faster than in 5.7

Delete

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       concurrency
- default my.cnf
8303    myrocks-5635
7848    innodb-5635
6461    innodb-5710
- optimized my.cnf
8265    myrocks-5635
7819    innodb-5635
6215    innodb-5710

Summary:
  • MyRocks is faster than InnoDB
  • The optimized my.cnf doesn't change performance
  • InnoDB in MySQL 5.6 is faster than in 5.7

Write-only

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
12547   19685   25236   myrocks-5635
11209   15429   18387   innodb-5635
 9701   13035   15308   innodb-5710
- optimized my.cnf
12197   19241   24533   myrocks-5635
13076   17780   21097   innodb-5635
10401   14496   17402   innodb-5710

Summary:
  • MyRocks is faster than InnoDB at concurrency >= 2
  • The optimized my.cnf helps InnoDB
  • InnoDB in MySQL 5.6 is faster than in 5.7

Read-write with --range-size=100


The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
5404    10126   13214   myrocks-5635
8507    12658   16867   innodb-5635
8983    13311   16806   innodb-5710
- optimized my.cnf
7426    12116   16076   myrocks-5635
10084   15236   20895   innodb-5635
10480   15666   20830   innodb-5710

Summary:
  • InnoDB is faster than MyRocks because there is more CPU overhead in MyRocks for range queries
  • The optimized my.cnf helps MyRocks and InnoDB
  • InnoDB has similar performance for MySQL 5.6 and 5.7

Read-write with --range-size=10000

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
153     286     311     myrocks-5635
273     484     652     innodb-5635
434     733     863     innodb-5710
- optimized my.cnf
185     330     531     myrocks-5635
278     523     684     innodb-5635
449     784     902     innodb-5710

Summary:
  • InnoDB is a lot faster than MyRocks  because of the CPU overhead for range queries
  • The optimized my.cnf helps MyRocks and InnoDB
  • InnoDB in MySQL 5.7 is much faster than in 5.6. I think something was done to make range queries more efficient in 5.7.

Read-only with --range-size=10

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
11260   19129   27348   myrocks-5635
13846   25056   37479   innodb-5635
14533   25412   37048   innodb-5710
- optimized my.cnf
13778   22240   31544   myrocks-5635
15348   27860   42859   innodb-5635
15320   27187   42294   innodb-5710

Summary:
  • InnoDB is a lot faster than MyRocks. On my work servers with newer/faster CPUs I don't reproduce this and am still trying to understand the cause.
  • The optimized my.cnf helps MyRocks and InnoDB
  • InnoDB has similar performance for MySQL 5.6 and 5.7

Read-only with --range-size=100

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
4393     8245   10191   myrocks-5635
8943    16784   25504   innodb-5635
11288   19771   28019   innodb-5710
- optimized my.cnf
8134    14686   20881   myrocks-5635
9847    18361   26953   innodb-5635
11778   20905   30065   innodb-5710

Summary:
  • InnoDB is a lot faster than MyRocks. See summary for read-only with --range-size=10
  • InnoDB in MySQL 5.7 is faster than 5.6. I think range queries were made more efficient in 5.7

Read-only with --range-size=10000


The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
106     209     242     myrocks-5635
214     411     536     innodb-5635
357     624     711     innodb-5710
- optimized my.cnf
199     380     486     myrocks-5635
227     424     556     innodb-5635
374     648     732     innodb-5710

Summary:
  • InnoDB is a lot faster than MyRocks. See summary for read-only with --range-size=10
  • InnoDB in MySQL 5.7 is faster than 5.6. I think range queries were made more efficient in 5.7

Point-select

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
16482   29111   36510   myrocks-5635
15840   30280   37841   innodb-5635
16347   29451   47411   innodb-5710
- optimized my.cnf
16411   28628   35344   myrocks-5635
18481   34911   43390   innodb-5635
19169   31806   53018   innodb-5710

Summary:
  • InnoDB is a lot faster than MyRocks. See summary for read-only with --range-size=10
  • It isn't clear whether InnoDB is faster in MySQL 5.6 or 5.7

Insert-only

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
8003    11722   13323   myrocks-5635
9548    11549   13323   innodb-5635
8065     9541    9816   innodb-5710
- optimized my.cnf
8271    12312   13804   myrocks-5635
9712    12732   14917   innodb-5635
8357    10716   11985   innodb-5710

Summary:
  • InnoDB in MySQL 5.6 is faster than MyRocks
  • The optimized my.cnf helps performance
  • InnoDB in MySQL 5.6 is faster than in 5.7

Thursday, February 23, 2017

Using modern sysbench to compare MyRocks and InnoDB on a small server

I used sysbench to compare MyRocks and InnoDB on a small server. I ran tests for an IO-bound and in-memory configuration and share results for the IO-bound setup here.

tl;dr
  • On some workloads InnoDB is faster than MyRocks but this does not reproduce on the faster/newer CPUs that I use at work. I haven't been able to explain this, but it looks like the memory-latency bound functions in RocksDB account for more of the CPU overhead on my home servers. Maybe I need to try Intel vTune; I just wish it didn't cost $899.
  • Performance with the default my.cnf is very good for all of the engines. The optimized my.cnf is more useful on the larger servers I use at work. One change in the optimized my.cnf increases the RocksDB page size from 4kb to 16kb, which can hurt throughput on IO-bound workloads (a config sketch is below this list).
  • MyRocks did much better on inserts, updates and deletes.
  • InnoDB did much better on range scans and better on point selects. There are some changes in MySQL 5.7 that make range scans faster. The goal for MyRocks is much better space & write efficiency than InnoDB (use less SSD, SSD lasts longer) with similar or good-enough read efficiency. We achieved the space and write efficiency goals. There is more work to be done for read efficiency.
  • There isn't much of a difference for InnoDB between MySQL 5.6.35 and 5.7.10. The difference is more obvious on workloads with more concurrency using larger servers.
  • There is little overhead from compression with MyRocks. There is a lot with InnoDB, especially on write-heavy tests where uncompressed InnoDB is much faster than compressed InnoDB. 
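
For MyRocks the block size change mentioned above is made through the column family options in my.cnf. This is a sketch and the my.cnf files linked below have the exact settings:

rocksdb_default_cf_options=block_based_table_factory={block_size=16384}
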
Details

The small server is an Intel NUC (NUC5i3ryh) with a 5th generation core i3 that runs at 2.1GHz. I use these at home because they are small, quiet, efficient and affordable. The server has 2 CPU cores, 4 with HT enabled, 8G of RAM, a 110gb Samsung 850 EVO m.2 SSD and a 7200 RPM disk. The OS is Ubuntu 16.04.

Tests are run for three storage engines. The first is myrocks-5635 which is MyRocks from FB MySQL merged to upstream MySQL 5.6.35. The second is innodb-5635 which is InnoDB from upstream MySQL 5.6.35. The third is innodb-5710 which is InnoDB from upstream MySQL 5.7.10. Command lines for cmake to build each binary are here.

The test was repeated three times for each engine: with the default my.cnf, with an optimized my.cnf and no compression, and with an optimized my.cnf and zlib compression. This test is the IO-bound configuration with 4 tables and 40M rows per table. The database is larger than RAM but my test scripts didn't record the database size (need to fix that).
  • myrocks-5635 - the my.cnf files for MyRocks are here. Snappy compression is used for all levels in the default my.cnf. For the optimized my.cnf with zlib compression there is no compression for L0-L2, LZ4 compression from L3 to the next-to-last level and then zlib for the last level of the LSM tree (a config sketch is below this list). Prior to the insert-only tests the database was ~34gb without compression and ~18gb with zlib compression. Compaction IO stats are here, but they suffer from a bug that is fixed but not yet in my build.
  • innodb-5635 - the my.cnf files for InnoDB in MySQL 5.6.35 are here
  • innodb-5710 - the my.cnf files for InnoDB in MySQL 5.7.10 are here.
  • innodb-5717 - the my.cnf file for InnoDB in MySQL 5.7.17 is here. In this post I don't share results for 5.7.17 but future posts use it.
  • innodb-801 - the my.cnf file for InnoDB in MySQL 8.0.1 is here. In this post I don't share results for 8.0.1 but future posts use it.
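
The per-level compression described above for MyRocks can be expressed through the column family options. This is a sketch that assumes a 7-level LSM tree - the linked my.cnf files have the exact settings:

rocksdb_default_cf_options=compression_per_level=kNoCompression:kNoCompression:kNoCompression:kLZ4Compression:kLZ4Compression:kLZ4Compression:kZlibCompression
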
The benchmark client is modern sysbench although I have begun to use my fork. It is run by a helper script that runs a sequence of workloads via the bundled lua scripts. All tests use the uniform distribution except for update-nonindex-special. The default in sysbench is a skewed distribution, but skew means that IO-bound tests (database >> RAM) are less IO-bound because there are more cache hits, so I switched.
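
For example, modern sysbench selects the distribution with --rand-type. This is a sketch of one invocation and my helper script sets more options than shown:

sysbench oltp_update_non_index --tables=4 --table-size=40000000 \
    --rand-type=uniform --threads=4 --time=600 \
    --mysql-user=root run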

Tests are run in this order:
  • prepare - this isn't a test, it creates and loads the tables. For the IO-bound test there were 4 tables with 40M rows per table. The sysbench table has one secondary index.
  • update-inlist - each update statement updates 100 rows selected by exact match on the PK via an in-list
  • update-one - each update statement updates one row. The same row in the database gets all updates.
  • update-index - uses oltp_update_index.lua to run an update-only workload and secondary index maintenance is required. MyRocks does better on this test because non-unique secondary index maintenance is read-free for it.
  • update-nonindex - uses oltp_update_non_index.lua to run an update-only workload and secondary index maintenance is not required
  • update-nonindex-special - like update-nonindex but uses the special distribution
  • delete - uses oltp_delete.lua to run a sequence of delete row, re-insert row operations. Upstream doesn't use transactions (yet) but the diff to fix that is small.
  • write-only - uses oltp_write_only.lua to get all of the writes but none of the reads from oltp_read_write.lua. I have stopped running this test.
  • read-write.rangeX - uses oltp_read_write.lua with --range-size=X. I set X to 100 and then 10,000.
  • read-only.rangeX - uses oltp_read_only.lua with --range-size=X. I set X to 10, 100, 1000 and then 10,000. I have since changed that to only run with X set to 100 and 10,000. The read-only.range10000 test is also run before the update tests so the performance can be measured before fragmentation. The test run before updates is called read-only.pre.
  • point-query - uses oltp_point_select.lua. The workload is to fetch all columns in one row by primary key. This test is run before and after the update tests so the performance can be measured before fragmentation. The test run before updates is called point-query.pre.
  • random-points - uses oltp_inlist_select.lua. Each select statement fetches 100 rows found by exact match on the PK using an in-list (see the example statement below). This test is run before and after the update tests so the performance can be measured before fragmentation. The test run before updates is called random-points.pre.
  • scan - this does a full scan of the primary index and is done before and after the update tests to determine the impact from fragmentation.
  • hot-points - uses oltp_inlist_select.lua. Each select statement fetches 100 rows found by exact match on the PK. This is similar to random-points except this fetches the same 100 rows for all queries. The working set for this test is always cached.
  • insert - uses oltp_insert.lua for an insert-only workload
Several of the read-heavy tests are run before and after the write-heavy tests. That is done to understand the impact of fragmentation on query efficiency as the database is fragmented by the write-heavy tests. Some of the read-heavy tests can be slower if the database is still doing B-Tree write back or LSM compaction, and that can happen for tests run before and after the write-heavy tests. While the read-heavy tests are run before the write-heavy tests, they are still run after the load, which is also write-heavy, and the script above doesn't do much to wait for writes to stop after the load.
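
For the in-list tests (update-inlist, random-points, hot-points) each statement finds 100 rows by exact match on the PK. A query looks something like this, with the in-list shortened here:

SELECT id, k, c, pad FROM sbtest1
WHERE id IN (127, 5981, 24503 /* ... 100 values in total */);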

Sample command lines:
# myrocks
bash all_small.sh 4 40000000 600 600 300 rocksdb 1 0 ~/b/myrocks/bin/mysql \
    none ~/b/sysbench/share/sysbench

# InnoDB in MySQL 5.6.35 without compression
bash all_small.sh 4 40000000 600 600 300 innodb 1 0 ~/b/orig5635/bin/mysql \
    none ~/b/sysbench/share/sysbench

# InnoDB in MySQL 5.6.35 with zlib compression
bash all_small.sh 4 40000000 600 600 300 innodb 1 0 ~/b/orig5635/bin/mysql \
    key_block_size=8 ~/b/sysbench/share/sysbench

Update-only with secondary index maintenance


The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
4162    6655    8585    myrocks-5635
 543     712     810    innodb-5635
 610     714     816    innodb-5710
- optimized my.cnf without compression
4236    6762    8394    myrocks-5635
 552     687     810    innodb-5635
 579     699     791    innodb-5710
- optimized my.cnf with zlib compression
3540    5665    7148    myrocks-5635
 218     403     533    innodb-5635
 237     389     469    innodb-5710

Summary:
  • MyRocks is best because non-unique secondary index maintenance is write-only (read-free)
  • The optimized my.cnf didn't help
  • Compressed InnoDB is worse than uncompressed InnoDB. The per-index mutex might be the largest problem. This isn't explained by (de)compression latency.
  • InnoDB in MySQL 5.6 and 5.7 have similar performance

Update-only without secondary index maintenance

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
4596    7726    10100   myrocks-5635
2224    2892     3295   innodb-5635
2327    2886     3176   innodb-5710
- optimized my.cnf without compression
4578    7202     9616   myrocks-5635
2887    3835     4364   innodb-5635
2694    3419     4008   innodb-5710
- optimized my.cnf with zlib compression
4031    6320     7998   myrocks-5635
 484     768      888   innodb-5635
 500     687      815   innodb-5710

Summary:
  • MyRocks is best but the difference with InnoDB is smaller because no engine has to do secondary index maintenance for this test.
  • The optimized my.cnf helped InnoDB but hurt MyRocks
  • Compressed InnoDB is much worse than uncompressed InnoDB. The per-index mutex might be the largest problem.
  • InnoDB in MySQL 5.6 is slightly faster than in 5.7

Delete

The numbers in the table are the QPS for 1 thread. I can't use more threads until I update the test script to use transactions for each delete/reinsert pair.

1       concurrency
- default my.cnf
6294    myrocks-5635
 963    innodb-5635
1010    innodb-5710
- optimized my.cnf without compression
5821    myrocks-5635
1077    innodb-5635
1090    innodb-5710
- optimized my.cnf with zlib compression
5453    myrocks-5635
 644    innodb-5635
 588    innodb-5710

Summary:
  • MyRocks is best. That might be because secondary index maintenance is write-only (read-free).
  • The optimized my.cnf didn't make a big difference
  • Compressed InnoDB is much worse than uncompressed InnoDB. The per-index mutex might be the largest problem.
  • InnoDB in MySQL 5.6 and 5.7 have similar performance

Write-only

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
8813    12937   16351   myrocks-5635
1808     2151    2344   innodb-5635
1895     2172    2340   innodb-5710
- optimized my.cnf without compression
8062    12565   16010   myrocks-5635
1886     2158    2644   innodb-5635
1830     2160    2493   innodb-5710
- optimized my.cnf with zlib compression
7250    10977   13965   myrocks-5635
 791     1030    1214   innodb-5635
 724      972    1088   innodb-5710

Summary:
  • MyRocks is best. That might be because secondary index maintenance is write-only (read-free).
  • The optimized my.cnf didn't make a big difference
  • Compressed InnoDB is much worse than uncompressed InnoDB. The per-index mutex might be the largest problem.
  • In some cases InnoDB in MySQL 5.6 is slightly faster than 5.7

Read-write with --range-size=100

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
4740    8490    11516   myrocks-5635
2800    3649     4445   innodb-5635
3392    4424     5036   innodb-5710
- optimized my.cnf without compression
5480    9508    13005   myrocks-5635
3370    4729     5555   innodb-5635
3573    4814     5614   innodb-5710
- optimized my.cnf with zlib compression
5157    8525    11883   myrocks-5635
1756    2415     2961   innodb-5635
1637    2324     2769   innodb-5710

Summary:
  • MyRocks is best because it processes writes faster than InnoDB and the writes dominate the workload
  • The optimized my.cnf helped
  • Compressed InnoDB is much worse than uncompressed InnoDB, perhaps because of the per-index mutex.
  • InnoDB in MySQL 5.6 and 5.7 have similar performance

Read-write with --range-size=10000

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
174     305     416     myrocks-5635
208     348     484     innodb-5635
346     550     723     innodb-5710
- optimized my.cnf without compression
207     370     500     myrocks-5635
241     425     596     innodb-5635
374     607     792     innodb-5710
- optimized my.cnf with zlib compression
194     363     469     myrocks-5635
168     298     416     innodb-5635
213     369     484     innodb-5710

Summary:
  • Compared to oltp_read_write.lua with --range-size=100, the bottleneck here switches from writes to reads and InnoDB is faster than MyRocks for long range scans.
  • The optimized my.cnf helps MyRocks
  • Compressed InnoDB is much worse than uncompressed InnoDB. The largest problem might be the per-index mutex but another problem is (de)compression latency.
  • InnoDB in MySQL 5.7 is much faster than in 5.6.

Read-only with --range-size=10

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
 8407   13967   19378   myrocks-5635
 7725   14452   21220   innodb-5635
 9962   17029   24023   innodb-5710
- optimized my.cnf without compression
 8089   13189   16243   myrocks-5635
11219   19683   27638   innodb-5635
11536   19198   26926   innodb-5710
- optimized my.cnf with zlib compression
 8076   12884   18263   myrocks-5635
 7549   12009   17570   innodb-5635
 7380   11700   17173   innodb-5710

Summary:
  • InnoDB is faster than MyRocks because of the read overhead from an LSM.
  • MyRocks QPS on the read-only tests has extra variance because the state of the memtable (number of keys) and the L0 (number of files) varies more when a read-only test follows write-heavy tests.
  • On this test InnoDB does better than MyRocks on my older/slower CPU (5th generation core i3) that I have at home while MyRocks does better on newer Intel Xeon CPUs I get at work.
  • Compressed InnoDB is worse than uncompressed InnoDB. I assume this is decompression latency.
  • InnoDB in MySQL 5.6 and 5.7 has similar performance

Read-only with --range-size=100

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
5130     8834   12320   myrocks-5635
5818    10236   15274   innodb-5635
7682    13208   18709   innodb-5710
- optimized my.cnf without compression
5683     9477   12781   myrocks-5635
7455    13236   19261   innodb-5635
8514    14517   20507   innodb-5710
- optimized my.cnf with zlib compression
5445     8872   12927   myrocks-5635
5140     8150   12456   innodb-5635
5432     8463   12911   innodb-5710

Summary:
  • See summary for oltp_read_only.lua with --range-size=10
  • The difference between InnoDB and MyRocks gets larger as the range scan gets larger

Read-only with --range-size=10000

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
138     240     325     myrocks-5635
175     314     445     innodb-5635
306     519     643     innodb-5710
- optimized my.cnf without compression
161     285     363     myrocks-5635
208     378     524     innodb-5635
324     520     685     innodb-5710
- optimized my.cnf with zlib compression
156     290     377     myrocks-5635
 146     270     365     innodb-5635
191     349     430     innodb-5710

Summary:
  • See summary for oltp_read_only.lua with --range-size=10
  • The difference between InnoDB and MyRocks gets larger as the range scan gets larger
  • InnoDB in MySQL 5.7 is faster than in 5.6. Something was done to make range scans faster.

Point-select

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
10839   18506   25871   myrocks-5635
 8992   16318   24090   innodb-5635
11332   19621   27239   innodb-5710
- optimized my.cnf without compression
10157   16694   22271   myrocks-5635
12944   23440   33489   innodb-5635
13123   22321   31288   innodb-5710
- optimized my.cnf with zlib compression
 9790   16486   22853   myrocks-5635
 7878   13071   18917   innodb-5635
 7484   12358   17911   innodb-5710

Summary:
  • InnoDB is faster than MyRocks
  • The optimized my.cnf helps InnoDB but hurts MyRocks
  • Compressed InnoDB is slower than uncompressed InnoDB. I assume this is decompression latency.
  • InnoDB in MySQL 5.6 and 5.7 has similar performance

Insert-only

The numbers in the table are the QPS for 1, 2 and 4 threads.

1       2       4       concurrency
- default my.cnf
7351    10654   12527   myrocks-5635
2120     6343    8264   innodb-5635
2694     6337    7784   innodb-5710
- optimized my.cnf without compression
7556    11341   12556   myrocks-5635
2356     4437    4540   innodb-5635
2227     4615    4807   innodb-5710
- optimized my.cnf with zlib compression
7210    10816   11273   myrocks-5635
3837     4732    5370   innodb-5635
2979     3885    4528   innodb-5710

Summary:
  • MyRocks is faster perhaps because secondary index maintenance is write-only (read-free).
  • The optimized my.cnf hurts InnoDB
  • Compressed InnoDB is slower than uncompressed InnoDB. I assume this is from both the per-index mutex and (de)compression latency.
  • InnoDB in MySQL 5.6 and 5.7 has similar performance

Wednesday, February 8, 2017

MyRocks, InnoDB and in-memory insert performance

This is yet another result for understanding an insert-only workload. In this case the database is small enough to be cached by the storage engine. Previous results were explained here, here and here. From several tests it is apparent that MyRocks and InnoDB from MySQL 5.6 have similar throughput, while InnoDB from MySQL 5.7 does much better for a cached, insert-only workload when the server has fast SSD.

In this result I changed a few options to reduce the chance that compaction stalls will occur with MyRocks. I increased level0_slowdown_writes_trigger to 20 and level0_stop_writes_trigger to 30. I also ran tests with universal compaction. We probably have some work to do here -- improving default option values and making compaction throttling less spiky.
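
These options are set through the MyRocks column family options in my.cnf. This is a sketch and the exact option strings I used may differ:

# leveled compaction with the higher level0 triggers
rocksdb_default_cf_options="level0_slowdown_writes_trigger=20;level0_stop_writes_trigger=30"
# universal compaction
rocksdb_default_cf_options="compaction_style=kCompactionStyleUniversal"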

tl;dr
  • MyRocks was faster than InnoDB in MySQL 5.6 and 5.7 at 1 and 2 concurrent clients
  • MyRocks and InnoDB/MySQL-5.6 have similar throughput at 4+ concurrent clients
  • InnoDB/MySQL-5.7 is fastest at 8+ concurrent clients
  • Disabling the binlog helps MyRocks more than InnoDB. We have work in progress to make binlog processing faster for MyRocks.
  • MyRocks with leveled compaction and the binlog is limited by compaction stalls in moving data from level 0 to level 2.
Note that MyRocks did great when able to avoid the bottlenecks from binlog processing and compaction stalls. That is apparent in the result for universal compaction with the binlog disabled. 

Test

For this test I inserted 400M rows using the insert benchmark. The test was run for 1, 2, 4, 8, 12 and 16 concurrent clients. There was a table per client and each table has 3 secondary indexes. The inserts are in PK order. The database cache was large enough to cache the database.
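
The insert benchmark table looks something like this. This is a sketch of the iibench-style schema and the real DDL is in the benchmark client:

CREATE TABLE purchases (
  transactionid BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  dateandtime DATETIME,
  cashregisterid INT NOT NULL,
  customerid INT NOT NULL,
  productid INT NOT NULL,
  price FLOAT NOT NULL,
  data VARCHAR(4000),
  INDEX marketsegment (price, customerid),
  INDEX registersegment (cashregisterid, price, customerid),
  INDEX pdc (price, dateandtime, customerid)
);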

Tests were repeated with the binlog enabled and disabled. For MyRocks I used a build from November 21 to match what was used in previous tests. For MyRocks I also ran tests with leveled and universal compaction. This is the first time I tried universal compaction with MyRocks.

Results

The numbers in the table below are the average insert rate, larger is better.

1       2       4       8       12      16      concurrency
34465   61843   106185  131536  131926  121988  innodb-5.6.26, binlog
35217   63857   108666  133200  133023  125196  innodb-5.6.26, no binlog
31461   60855   114058  201918  258398  281690  innodb-5.7.10, binlog
32962   62490   117028  206292  263678  282486  innodb-5.7.10, no binlog

leveled compaction
38465   71467   113186  125392  125865  125984  myrocks, binlog
44092   81950   132275  142298  139519  134998  myrocks, no binlog

universal compaction
37474   71162   115042  133556  135181  134003  myrocks, binlog
43006   79888   134590  198511  238379  252207  myrocks, no binlog

Graphs

These graphs show the row insert rate per 5-second interval for the test with 16 clients. The graph for InnoDB-5.7 ends earliest because it finishes the load first. The graph for MyRocks with leveled compaction has the most variance when ingest is faster than compaction from level 0 to level 2. Ignoring the variance, MyRocks throughput declines less over time than InnoDB.

The next graph is for InnoDB-5.7. While throughput declines over time there is not much variance.
The next graph is for InnoDB-5.6. It has more variance and loses more throughput over time compared to InnoDB-5.7. MySQL has been busy making InnoDB better.
The next graphs are for MyRocks with leveled and then universal compaction. There are more stalls with leveled compaction. The average throughput is similar and binlog processing is the bottleneck. We have work in progress to improve leveled compaction and the overhead from the binlog.


Thursday, February 2, 2017

Why MyRocks?

This is yet another attempt to explain why MyRocks is interesting. MySQL continues to get more popular and should soon become #1 on the db-engines ranking. Popularity isn't the only thing about MySQL that is improving -- performance, manageability and availability are also much better, and cloud vendors, especially Amazon, are doing remarkable work to make MySQL in the cloud much easier than it used to be.

While Oracle has been a great owner of MySQL we still benefit from external contributions to the core DBMS. But new storage engines are unlikely, even ones that had great promise like TokuDB and PBXT, because the storage engine API is hard to implement. I am surprised and thrilled by the progress we are making with MyRocks. It is in production for us today and I expect it to be in production elsewhere within the next 12 months. I am grateful to work with talented teams (MyRocks, MySQL & RocksDB) and understanding management.

Efficient performance is the reason for MyRocks. We want to provide performance similar to InnoDB but with much better storage efficiency. The technical message is that MyRocks has less space and write amplification than InnoDB without sacrificing too much read amplification. For a workload I care about it uses 1/2 the space and writes at 1/10 the rate of InnoDB. The less technical message is that with MyRocks a deployment needs less SSD and the SSD will last longer.

The other important question is when to use MyRocks. My goal is for MyRocks to be an alternative to InnoDB for any workload where the database is larger than RAM. While we have optimizations for in-memory workloads with RocksDB I don’t focus on that today. I also assume that too many workloads use too much RAM today. If you have fast storage with SSD and the database fits in RAM then you probably have too much RAM. So my pitch is to use more fast storage and less RAM and use less space, less power and possibly fewer servers.
