Monday, August 31, 2015

Cached linkbench performance for MySQL 5.7.8, 5.6, WebScale and MyRocks

This extends previous results for LinkBench to compare performance for a cached database with concurrent clients. My conclusions are:
  • InnoDB compression in the Facebook patch for MySQL 5.6 is much faster for insert-heavy workloads than the same feature in upstream 5.6 and 5.7. Too bad those changes might not reach upstream.
  • InnoDB transparent page compression is faster than non-transparent compression for write-heavy workloads, assuming that feature is OK to use on your servers (see the DDL sketch after this list).
  • QPS for MyRocks degrades over time. We have work in progress to fix this. Otherwise it is already competitive with InnoDB. Compression with MyRocks is much better than with InnoDB for LinkBench data, and that has also been true for real workloads.
  • Load rates for compressed InnoDB tables are higher with partitioning in 5.6 but not in 5.7. I didn't debug the slowdown in 5.7. Partitioning has been a win in the past for IO-bound LinkBench because it reduces contention on the per-index mutex in InnoDB, and work has been done in 5.7 to reduce that contention.
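For reference, here is a minimal DDL sketch of the two styles of InnoDB compression mentioned above. The table is a simplified stand-in for the LinkBench link table, not the schema used for these tests.

    -- Non-transparent compression (5.6 and 5.7): pages are stored at a fixed
    -- compressed size chosen by KEY_BLOCK_SIZE.
    CREATE TABLE linktable_c (
      id1       BIGINT UNSIGNED NOT NULL,
      id2       BIGINT UNSIGNED NOT NULL,
      link_type BIGINT UNSIGNED NOT NULL,
      data      VARBINARY(255) NOT NULL,
      PRIMARY KEY (link_type, id1, id2)
    ) ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;

    -- Transparent page compression (5.7.8+): pages are compressed on write and
    -- stored sparse via filesystem hole punching, so the in-memory page format
    -- is unchanged.
    CREATE TABLE linktable_tc (
      id1       BIGINT UNSIGNED NOT NULL,
      id2       BIGINT UNSIGNED NOT NULL,
      link_type BIGINT UNSIGNED NOT NULL,
      data      VARBINARY(255) NOT NULL,
      PRIMARY KEY (link_type, id1, id2)
    ) ENGINE=InnoDB COMPRESSION='zlib';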

Setup

The database size was between 10G and 30G after the load. The test was run with maxid=20000001, loaders=10 & requesters=20. Otherwise the default settings for LinkBench were used. The InnoDB buffer pool was large enough to cache the database. The server has 144G of RAM, fast PCIe flash storage and 40 HW threads with HT enabled. The binlog was enabled but fsync was not done for the binlog or InnoDB redo log on commit. I tested several configurations for compression and partitioning (a DDL sketch follows the list):
  • p0.c0 - no partitioning, no compression
  • p0.c1 - no partitioning, compression (transparent & non-transparent)
  • p1.c0 - partitioning, no compression
  • p1.c1 - partitioning, compression (transparent & non-transparent)
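As an illustration of the p1.c1 configuration, the sketch below partitions and compresses one table. The column list, index definition and partition count are hypothetical; they are not the exact schema or settings used for these tests.

    -- p1.c1: partitioned and compressed (non-transparent form shown).
    CREATE TABLE linktable (
      id1        BIGINT UNSIGNED NOT NULL,
      id2        BIGINT UNSIGNED NOT NULL,
      link_type  BIGINT UNSIGNED NOT NULL,
      visibility TINYINT NOT NULL,
      data       VARBINARY(255) NOT NULL,
      `time`     BIGINT UNSIGNED NOT NULL,
      version    INT UNSIGNED NOT NULL,
      PRIMARY KEY (link_type, id1, id2),
      KEY id1_type (id1, link_type, visibility, `time`, id2, version)
    ) ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8
      PARTITION BY KEY (id1) PARTITIONS 16;
    -- p0.c0 drops both the ROW_FORMAT/KEY_BLOCK_SIZE options and the PARTITION BY
    -- clause; p0.c1 and p1.c0 drop one or the other.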
I tested several binaries:
  • myrocks - the Facebook patch for MySQL, 5.6.X and the RocksDB storage engine
  • fb56 - the Facebook patch for MySQL, 5.6.X and InnoDB
  • orig56.ps - upstream 5.6.26 with the performance schema (PS) enabled
  • orig57.ps - upstream 5.7.8 with PS enabled; uses non-transparent compression for the c1 configurations
  • orig57.tc - upstream 5.7.8 with PS enabled; uses transparent page compression for the c1 configurations
The test was done in two parts. First I measured the load performance, then I ran the query test for 12 1-hour intervals. The data below is the insert rate from the load (load ips), the database size in GB after the load (load gb), the query rate during the second and twelfth 1-hour runs (2h qps, 12h qps) and the database size after the second and twelfth 1-hour runs (2h gb, 12h gb).
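The setup above enables the binlog but skips fsync on commit for both the binlog and the InnoDB redo log. A minimal sketch of settings that approximate that behavior; the exact values used for these tests are not listed in this post:

    -- Write the redo log at commit but only fsync it about once per second.
    SET GLOBAL innodb_flush_log_at_trx_commit = 2;
    -- Never fsync the binlog from the server; leave flushing to the OS.
    SET GLOBAL sync_binlog = 0;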

p0.c0
load    load    2h      2h      12h     12h
ips     gb      qps     gb      qps     gb      config
136041  14      43784   18      24298   20      myrocks
109724  22      49881   31      48459   51      fb56
103030  24      39979   34      39582   54      orig56.ps
116343  24      48506   35      48112   58      orig57.ps

p0.c1
load    load    2h      2h      12h     12h
ips     gb      qps     gb      qps     gb      config
 73115  15      42508   20      35766   32      fb56
 45660  16      36474   22      33107   34      orig56.ps
 46737  16      40890   22      37305   36      orig57.ps
101966  17      33716   23      29695   37      orig57.tc

p1.c0
load    load    2h      2h      12h     12h
ips     gb      qps     gb      qps     gb      config
101783  26      34342   30      21883   36      myrocks
105099  24      48686   33      47369   52      fb56
 97931  27      39343   36      39000   55      orig56.ps
109230  27      46671   37      46155   59      orig57.ps

p1.c1
load    load    2h      2h      12h     12h
ips     gb      qps     gb      qps     gb      config
 91884  15      46852   21      45223   36      fb56
 74080  17      39379   23      38627   38      orig56.ps
 77037  17      45156   24      44070   40      orig57.ps
 87708  19      37062   25      32424   40      orig57.tc

Graphs!

And for people who want graphs, here are the average insert rate from the load and the average query rate from the twelfth hour for the p0.c0 test (no partitioning, no compression).


