Friday, March 3, 2017

Part 5: sysbench, a larger server, and IO-bound database and really fast storage

I used a database that doesn't fit in the DBMS cache for part 5 in my unending series on sysbench and MyRocks. In this case I used really fast storage - the database fits in the OS page cache, reads are served from the OS page cache and writes are done to the fast SSD. A previous post has more details on the hardware.

tl;dr
  • MyRocks is much faster than compressed InnoDB except for InnoDB-5.7 with long range scans
  • MyRocks had better or similar throughput than uncompressed InnoDB-5.6 for update-only, insert-only and short range scans. It was slower for long range scans and the point query test.
  • MyRocks was faster than uncompressed InnoDB-5.6 for the update-only tests and slower for insert-only and tests with range/point queries.
  • MyRocks was faster than TokuDB for all tests except the longest range queries

Disclaimer

The goal for MyRocks is the best space efficiency, great write efficiency and good enough read efficiency. The best space efficiency means it needs less SSD. Great write efficiency means that SSD lasts longer, and I just replaced 3 SSDs on my home servers so I appreciate the value of that. Good enough read efficiency is harder to define and my goal over time has become better than good enough. I want MyRocks to match uncompressed InnoDB in response time. We aren't there yet but we are making progress.

For this workload I only share data on read efficiency. MyRocks loses one of its advantages given that the database is in the OS page cache -- when you spend less on writes you can spend more on reads. But when the database fits in the OS page cache there will be no reads from the storage device.

Tests

The tests are fully described in a previous post. In this case I used 8 tables with 10M rows per table and a 2gb DBMS cache. Tests were repeated with no compression, fast compression and slow compression. Fast compression was lz4 for MyRocks and snappy for TokuDB. Slow compression was zlib for all engines. Tests were run for 1 to 128 connections (threads) on a server with 24 CPU cores and 48 HW threads. All engines used jemalloc. The binlog was enabled but fsync for the binlog and storage engine was disabled.

Tests were run for several storage engines:
  • myrocks-5635 - MyRocks from FB MySQL merged to upstream MySQL 5.6.35. The full my.cnf was listed in a previous in a previous post and the paste is here. Then I edited that to reduce the RocksDB block cache and enable or disable compression using this.
  • innodb-5626 - InnoDB from upstream MySQL 5.6.26. In a few weeks or months I will upgrade to a more recent 5.6 build.
  • innodb-5717 - InnoDB from upstream MySQL 5.7.17
  • tokudb-5717 - TokuDB from Percona Server 5.7.17-11. The my.cnf is here.

Update-only with secondary index maintenance

The numbers in the table are the QPS for 1 to 128 threads.

   1       2        4       8      16      24      32      40      48      64      80      96     128   threads
- none
2804    9616    18823   33177   38211   41248   44460   44723   44151   43671   43289   43150   42621   myrocks-5635
2355    4903     8127   12703   15987   28899   29468   30260   30785   30991   30710   29922   28946   innodb-5626
3413    7834    14519   23386   32131   36263   36582   37110   37751   39091   40058   40945   42019   innodb-5717
2099    3958     7085   13135   20494   26263   28408   28723   28472   27786   26615   25577   23243   tokudb-5717
- lz4/snappy
2801    9638    18841   33199   38234   41270   44190   44347   44047   43404   42889   42601   42199   myrocks-5635
2070    3921     7004   12988   20146   25857   28196   28545   28322   27784   26663   25550   23293   tokudb-5717
- zlib
2683    8937    17422   31287   37139   39406   41880   42230   42337   41907   41877   41482   41040   myrocks-5635
 328    1004     1971    3346    4782    5343    5666    5863    6016    6154    6239    6260    6215   innodb-5626
 455    1164     2302    4040    5994    6513    6944    7184    7319    7516    7534    7596    7528   innodb-5717
1988    3741     6665   12441   19575   24062   26028   26211   26129   25727   24554   23746   21799   tokudb-5717

Summary:
  • Uncompressed: MyRocks >> InnoDB-5.7 >> InnoDB-5.6 >> TokuDB
  • Compressed: MyRocks >> TokuDB >> InnoDB
  • Compressed InnoDB might suffer from the per-index mutex
  • MyRocks benefits because secondary index maintenance is read-free (write-only)

Update-only without secondary index maintenance

The numbers in the table are the QPS for 1 to 128 threads.

   1       2        4       8      16      24      32      40      48      64      80      96     128   threads
- none
2772    9935    19432   35272   43243   46621   49285   51651   51780   51154   50176   49766   48711   myrocks-5635
3888    7639    12655   17174   28267   31354   34335   36421   34686   38179   39268   40883   41326   innodb-5626
2917    6166    18918   35674   55660   69910   74162   74654   74734   76431   78150   79040   80607   innodb-5717
2071    4089     7508   14548   24884   31050   34702   37987   40152   41042   39219   37496   33944   tokudb-5717
- lz4/snappy
2779    9948    19471   35516   42875   46143   48964   51227   51290   50592   49984   49372   48504   myrocks-5635
2062    4045     7432   14484   24403   30637   34182   37107   39212   40452   38857   37256   33901   tokudb-5717
- zlib
2660    9235    18026   33259   41518   44511   46920   48697   48865   48870   48668   47668   47119   myrocks-5635
 600    1125     2034    3754    5356    5946    6184    6278    6355    6417    6446    6433    6384   innodb-5626
 688    1292     2652    4776    6950    7825    8186    8329    8435    8551    8535    8518    8408   innodb-5717
1964    3839     7112   13739   23446   28967   32118   34603   36506   37325   36384   35010   32236   tokudb-5717

Summary:
  • Uncompressed <= 4 threads: MyRocks >> InnoDB-5.7 >> InnoDB-5.6 >> TokuDB
  • Uncompressed >= 8 threads: InnoDB-5.7 >> MyRocks >> InnoDB-5.6, TokuDB
  • Compressed: MyRocks >> TokuDB >> InnoDB
  • Compressed InnoDB might suffer from the per-index mutex
  • MyRocks doesn't benefit from read-free secondary index maintenance because there is no index maintenance.

Read-write with --oltp-range-size=100

The numbers in the table are the QPS for 1 to 128 threads.

   1       2        4       8      16      24      32      40      48      64      80      96     128   threads
- none
4615    9749    19889   44836   95983  128906  150236  161720  164109  164003  165044  164140  163066   myrocks-5635
4927   12937    28513   54213   85564   82470   85415  109080  124297  132075  132059  132434  130368   innodb-5626
7686   15402    31015   61110  106977  144695  167321  185388  197180  205404  207852  208216  207327   innodb-5717
4403    7548    15377   36443   66814   82179   92476   96790   97598   97450   96157   93420   91241   tokudb-5717
- lz4/snappy
4581    9728    19962   44480   95368  128023  149742  160837  164039  162767  162522  164242  161043   myrocks-5635
4087    7413    14971   35184   64774   79351   87971   92652   94746   94083   93201   91028   89282   tokudb-5717
- zlib
4300    8899    18466   40781   84814  113870  129297  143581  144207  140118  145012  142208  141812   myrocks-5635
2008    3917     7290   13214   20586   24610   26661   27897   28638   29370   30041   30478   30752   innodb-5626
2175    4234     8150   15318   25087   30660   33600   35528   36792   38187   38967   39302   39509   innodb-5717
3580    6824    14221   30690   55008   68552   76448   81532   82780   83096   81667   80006   79417   tokudb-5717

Summary:
  • Uncompressed <= 8 threads: InnoDB-5.7 >> InnoDB-5.6 >> MyRocks >> TokuDB
  • Uncompressed >= 16 threads: InnoDB-5.7 >> MyRocks >> InnoDB-5.6 >> TokuDB
  • Compressed: MyRocks >> TokuDB >> InnoDB
  • Compressed InnoDB might suffer from the per-index mutex

Read-write with --oltp-range-size=10000

The numbers in the table are the QPS for 1 to 128 threads.

   1       2        4       8      16      24      32      40      48      64      80      96     128   threads
- none
 250     503      991    1932    3771    5083    5937    6400    6354    6605    6522    6276    6556   myrocks-5635
 267     439     1070    2258    4403    5974    6889    7556    7680    7509    7212    6965    6466   innodb-5626
 471     948     1866    3675    6951    8974    9959   10497   10673   10590   10569   10470   10493   innodb-5717
- lz4/snappy
 237     473      946    1861    3597    5089    6025    6267    6465    6802    6450    6472    6643   myrocks-5635
 329     646     1292    2378    4546    6003    6504    6775    6906    6807    6751    6671    6620   tokudb-5717
- zlib
 224     457      907    1775    3459    4709    5243    5928    6008    5822    6052    5823    5639   myrocks-5635
 139     326      668    1376    2723    3851    4471    4845    5039    4997    4821    4822    4822   innodb-5626
 176     431      904    1820    3835    5204    5830    6237    6487    6515    6469    6412    6357   innodb-5717
 276     570     1136    2149    4065    5396    5861    6146    6272    6205    6157    6112    6042   tokudb-5717

Summary:
  • Uncompressed: InnoDB-5.7 >> InnoDB-5.6 >> MyRocks
  • Compressed <= 32 threads: TokuDB >> InnoDB-5.7 >> MyRocks >> InnoDB-5.6
  • Compressed >= 40 threads: InnoDB-5.7 >> TokuDB >> MyRocks >> InnoDB-5.6
  • MyRocks does better here relative to others than on the same test with a cached database.
  • Something was done in MySQL 5.7 to make InnoDB more efficient for long range scans.

Read-only with --oltp-range-size=10

The numbers in the table are the QPS for 1 to 128 threads.

   1       2        4       8      16      24      32      40      48      64      80      96     128   threads
- none
5583   12053    25340   60323  125323  173556  201507  228848  240600  260551  268105  270922  270802   myrocks-5635
5802   11302    23312   45754   86042  119485  132435  182210  278491  282393  285004  286285  286280   innodb-5626
5998   12663    27213   67057  139873  191166  229855  267232  302766  304396  304294  308140  310807   innodb-5717
4619   10658    22704   52257  108814  152185  180060  207094  229817  232031  234555  238575  233373   tokudb-5717
- lz4/snappy
5533   11988    25118   59446  123771  170236  200173  226406  241030  257188  266852  268078  265281   myrocks-5635
5019   10563    22409   51342  107286  150102  177104  204018  227579  229317  231358  232494  228827   tokudb-5717
- zlib
4989   10682    22073   50882  106786  150247  177378  202741  221048  230274  229999  230863  231839   myrocks-5635
3104    8163    18296   40498   84292  117789  136762  155641  172568  173673  174538  174994  175127   innodb-5626
4094    8760    17970   40191   85050  120897  141525  161327  179816  180670  180304  180439  182237   innodb-5717
4490    9828    20436   45973   97078  136122  160800  185617  207077  208896  211602  212022  213916   tokudb-5717

Summary:
  • Uncompressed <= 8 threads: InnoDB-5.7 >> MyRocks >> TokuDB, InnoDB-5.6
  • Uncompressed 16 to 40 threads: InnoDB-5.7 >> MyRocks >> TokuDB >> InnoDB-5.6
  • Uncompressed >= 48 threads: InnoDB-5.7 >> InnoDB-5.6 >> MyRocks >> TokuDB
  • Compressed: MyRocks >> TokuDB >> InnoDB

Read-only with --oltp-range-size=100

The numbers in the table are the QPS for 1 to 128 threads.

   1       2        4       8      16      24      32      40      48      64      80      96     128   threads
- none
4557    9527    19771   44629   95037  133472  155356  175533  189690  196465  196471  197426  196431   myrocks-5635
5076   10543    21840   49775  101828  141198  166103  190612  213432  211546  215583  216149  216561   innodb-5626
5384   11440    23741   56913  118847  165338  193619  220664  246599  250321  251309  249837  252654   innodb-5717
4525    9456    19643   43853   91855  128489  148978  168766  185897  185482  189092  190814  187785   tokudb-5717
- lz4/snappy
4486    9364    19322   43591   93011  131252  151314  173058  189554  193018  193373  192635  191932   myrocks-5635
4445    9270    19275   42894   90120  126824  146602  164909  180813  182457  183962  184567  182690   tokudb-5717
- zlib
3959    8205    16871   36958   79526  113109  130868  148129  164466  165194  165251  164002  164486   myrocks-5635
3438    6997    14207   30336   63505   90598  103758  116782  128542  128625  129293  130163  130199   innodb-5626
3553    7291    14880   32345   68779   99275  113500  127408  139032  140490  141352  141673  140846   innodb-5717
4073    8413    17317   38102   80324  113206  132046  149543  163231  165632  166191  169175  169615   tokudb-5717

Summary:
  • Uncompressed <= 8 threads: InnoDB-5.7 >> InnoDB-5.6 >> MyRocks >> TokuDB
  • Uncompressed 16 to 40 threads: InnoDB-5.7 >> InnoDB-5.6 >> MyRocks >> TokuDB
  • Uncompressed >= 48 threads: InnoDB-5.7 >> InnoDB-5.6 >> MyRocks >> TokuDB
  • Compressed: MyRocks, TokuDB >> InnoDB

Read-only with --oltp-range-size=10000

The numbers in the table are the QPS for 1 to 128 threads.

   1       2        4       8      16      24      32      40      48      64      80      96     128   threads
- none
 197     396      784    1553    3068    4211    4696    5069    5303    5260    5248    5203    5150   myrocks-5635
 227     456      908    1799    3525    4798    5538    6114    6343    6310    6287    6200    6182   innodb-5626
 381     756     1489    2926    5641    7272    7949    8407    8604    8531    8476    8442    8254   innodb-5717
 287     577     1151    2262    4419    5836    6335    6616    6695    6575    6542    6496    6427   tokudb-5717
- lz4/snappy
 189     378      749    1486    2941    4035    4555    4922    5160    5127    5096    5070    5005   myrocks-5635
 278     559     1112    2187    4276    5670    6149    6436    6558    6427    6405    6343    6278   tokudb-5717
- zlib
 159     318      628    1251    2480    3470    3886    4183    4405    4403    4340    4299    4283   myrocks-5635
 140     282      566    1117    2210    3092    3538    3853    4109    4102    4051    3903    4015   innodb-5626
 187     375      744    1488    2925    3992    4460    4716    4890    4914    4908    4881    4835   innodb-5717
 242     485      971    1920    3770    5070    5564    5836    5997    5889    5841    5791    5744   tokudb-5717

Summary:
  • Uncompressed: InnoDB-5.7 >> TokuDB >> InnoDB-5.6 >> MyRocks
  • Compressed: TokuDB >> InnoDB-5.7 >> MyRocks >> InnoDB-5.6

Point-query

The numbers in the table are the QPS for 1 to 128 threads.

   1       2        4       8      16      24      32      40      48      64      80      96     128   threads
- none
6412   13400    28931   72991  152003  206627  245257  282682  317475  320832  320487  319250  325351   myrocks-5635
6831   14592    31265   80304  165013  218684  259701  302272  341855  347161  348867  350069  346924   innodb-5626
6591   13831    29885   77583  159026  215990  258119  300289  336777  347714  351820  353239  353298   innodb-5717
5950   12385    26277   62763  130650  180310  212967  244552  271554  276659  276588  278345  281433   tokudb-5717
- lz4/snappy
6395   13451    28854   72695  151874  205623  245214  283065  317367  314263  319040  324845  323703   myrocks-5635
5828   12137    25693   60984  127016  175532  207269  237836  265395  269802  271339  273249  274790   tokudb-5717
- zlib
5859   12482    26097   62841  134703  183953  219125  253690  284868  288741  285110  291383  292424   myrocks-5635
4658    9672    19931   45142   97018  135486  158160  180528  200518  200171  203003  204169  204478   innodb-5626
4557    9471    19508   44014   94149  132992  155439  177183  197399  198790  200071  200243  200104   innodb-5717
5378   11181    23504   54303  114715  159725  189099  217428  242933  246185  248465  250252  255635   tokudb-5717

Summary:
  • Uncompressed <= 40 threads: InnoDB-5.6 >> InnoDB-5.7 >> MyRocks >> TokuDB
  • Uncompressed >= 48 threads: InnoDB-5.6, InnoDB-5.7 >> MyRocks >> TokuDB
  • Compressed: MyRocks >> TokuDB >> InnoDB

Insert-only

The numbers in the table are the QPS for 1 to 128 threads.


   1       2        4       8      16      24      32      40      48      64      80      96     128   threads
- none
5206   16893    31320   41382   48020   52252   54483   54546   54497   54196   54106   53575   53047   myrocks-5635
5743   12052    32062   49692   56744   57288   56026   53698   51452   50519   48907   47300   44191   innodb-5626
5305   15574    29411   53523   80249   91492   93392   89487   86214   84496   84736   85002   84752   innodb-5717
2938    5797     8947   17728   29184   37570   38571   35497   33231   30409   27484   24989   21330   tokudb-5717
- lz4/snappy
5210   16827    31047   40876   47854   52096   54152   54273   54120   53872   53554   53346   52546   myrocks-5635
2925    5740     8963   17824   29350   37647   38781   35571   33359   30580   27645   25080   21443   tokudb-5717
- zlib
5226   16682    31500   40534   47009   51833   53281   53719   53613   53433   53463   52761   52412   myrocks-5635
3256    5874    12234   20365   28593   31562   32943   33475   33195   33641   32953   33919   34103   innodb-5626
3297    6442    12727   22377   34871   40052   42320   43433   42824   43592   46455   46183   45197   innodb-5717
2920    5801     8931   17724   28971   37195   38456   35110   32937   30220   27492   24924   21337   tokudb-5717

Summary:
  • Uncompressed <= 40 threads: InnoDB-5.7 >> InnoDB-5.6 >> MyRocks >> TokuDB
  • Uncompressed >= 48 threads: InnoDB-5.7 >> MyRocks >> InnoDB-5.6 >> TokuDB
  • Compressed <= 8 threads: MyRocks >> InnoDB >> TokuDB
  • Compressed 16 to 40 threads: MyRocks >> InnoDB-5.7 >> TokuDB >> InnoDB-5.6
  • Compressed >= 48 threads: MyRocks >> InnoDB-5.7 >> InnoDB-5.6 >> TokuDB

4 comments:

  1. Does MyRocks compression reduce memory usage? I seem to recall that InnoDB keeps copies of both compressed and uncompressed data in memory which could increase memory usage. Is it the same case with MyRocks?

    ReplyDelete
    Replies
    1. Great question. With InnoDB table compression if a page is in the buffer pool then there must be a copy of the compressed version of it and there is optionally a copy of the uncompressed version of it. So one of the following is true for each page in the buffer pool:
      * compressed and uncompressed copy of page are cached
      * compressed copy of page is cached

      With RocksDB:
      1) I assume we are using the OS page cache for compressed pages. Note there is an option to cache them in the RocksDB block cache but that is not widely used.
      2) RocksDB block cache has uncompressed version of blocks (see #1)
      3) OS page cache has some filesystem pages. Note that RocksDB blocks are not aligned to filesystem page boundaries.

      So one of the following is true:
      * uncompressed block in block cache, all of the filesystem pages that overlap it are in the OS page cache
      * uncompressed block in block cache, some of the filesystem pages that overlap it are in the OS page cache
      * uncompressed block in block cache, none of the filesystem pages that overlap it are in the OS page cache

      Delete
    2. If the goal is to reduce memory usage/ fit more pages in RAM, would you recommend turning MyRocks compression on?

      Also is it better to use block cache (is that the equivalence of InnoDB buffer cache?) or just use OS file cache?

      Delete
    3. My current approach - http://smalldatum.blogspot.com/2016/09/tuning-rocksdb-block-cache.html

      Delete

Speedb vs RocksDB on a large server

I am happy to read about storage engines that claim to be faster than RocksDB. Sometimes the claims are true and might lead to ideas for mak...