Thursday, August 24, 2017

Sysbench, a large server and fast SSD vs MyRocks, InnoDB and TokuDB

I have new performance results to share, starting with modern sysbench on a large server with fast SSD and a database that is larger than RAM. Competitive read performance with much better space and write efficiency is why I like MyRocks, and this is an example of that. However, we have work to do on read efficiency that will be apparent in future perf reports.

tl;dr
  • For write-heavy workloads MyRocks competes with InnoDB 5.7 and 8.0 and does better than InnoDB 5.6 at all concurrency levels.
  • For read-heavy workloads MyRocks is usually within 10% of InnoDB 5.6.
  • For write-heavy workloads InnoDB in 5.7 and 8.0 does better than in 5.6.
  • For some read-heavy workloads InnoDB has a perf regression from 5.6 to 5.7/8.0.
  • TokuDB usually has the worst QPS.

Configuration

I use my sysbench fork and helper scripts, release-specific my.cnf files and a server with 48 HW threads, fast SSD and 256gb of RAM, although only 50gb was available to the database and OS page cache. The binlog was enabled and sync-on-commit was disabled for both the binlog and the database log. I remembered to disable SSL.

I tested MyRocks, TokuDB and InnoDB, with buffered IO and a 10gb database cache for MyRocks/TokuDB and O_DIRECT with a 35gb buffer pool for InnoDB. The server is shared by the sysbench client and mysqld. For MyRocks I used a build from August 15 with git hash 0d76ae and repeated the test, first without compression and then with no compression for L0/L1/L2, LZ4 for the interior levels and zstd for the max level of the leveled LSM. For TokuDB I used Percona Server 5.7.17-12 and repeated the test, first without compression and then with zlib compression. For InnoDB I used upstream 5.6.35, 5.7.17 and 8.0.2. For InnoDB 8.0.2 I used the latin1 charset and latin1_swedish_ci collation. Compression was not used for InnoDB. More details are in the release-specific my.cnf files and I used the same my.cnf for InnoDB with 8.0.1 and 8.0.2.
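A sketch of how the InnoDB settings above map to my.cnf options (this is not the actual release-specific file; the option names are standard MySQL options and the values follow the description above):

```ini
[mysqld]
# InnoDB: O_DIRECT with a 35gb buffer pool
innodb_buffer_pool_size        = 35G
innodb_flush_method            = O_DIRECT
# binlog enabled, sync-on-commit disabled for the binlog and database log
log_bin
sync_binlog                    = 0
innodb_flush_log_at_trx_commit = 2
```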

The test used 8 tables with 100M rows/table. My use of sysbench is explained here. Tests are run in an interesting pattern -- load, write-heavy, read-only, insert-only. On the large server each test is run for 1, 2, 4, 8, 16, 24, 32, 40, 48 and 64 concurrent connections for either 3 or 5 minutes per concurrency level. So each test runs for either 30 or 50 minutes total and I hope that is long enough to get the database into a steady state. An example command line to run the test with my helper scripts is:
bash all.sh 8 100000000 180 300 180 innodb 1 0 /orig5717/bin/mysql none /sysbench.new

Results without pictures

I have a lot of numbers to share and struggled with the presentation. All of the data is here. I will start with relative QPS -- the QPS for an engine versus the QPS for InnoDB from upstream 5.6.35. There is one column for MyRocks without compression, one for TokuDB without compression and one for InnoDB from upstream 5.7.17. I did not include columns for compressed MyRocks, compressed TokuDB or InnoDB from upstream 8.0.2, both to improve readability and because MyRocks/TokuDB QPS with compression is similar to QPS without compression, and InnoDB QPS from 8.0.2 is similar to QPS from 5.7.17.

When the QPS ratio is greater than one, the engine is faster than InnoDB 5.6.35. I show ratios for 1, 8 and 48 connections to compare low, medium and high concurrency workloads. Things that I notice in the results include:
  • For write-heavy workloads MyRocks competes with InnoDB 5.7 and 8.0 and does better than InnoDB 5.6 at all concurrency levels.
  • For read-heavy workloads MyRocks is usually within 10% of InnoDB 5.6.
  • For write-heavy workloads InnoDB in 5.7 and 8.0 does better than in 5.6.
  • For some read-heavy workloads InnoDB has a perf regression from 5.6 to 5.7/8.0.
  • TokuDB usually has the worst QPS.

QPS ratio:
* rocks = myrocks.none / inno5635
* inno = inno5717 / inno5635
* toku = toku5717.none / inno5635
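The ratio columns are computed by dividing each engine's QPS by the InnoDB 5.6.35 baseline for the same test. A minimal sketch of that computation, using made-up absolute QPS values for illustration:

```python
# Relative QPS: engine QPS / InnoDB 5.6.35 QPS for the same test.
# The absolute QPS values below are illustrative, not from the benchmark.
baseline = {"update-index": 1000.0, "point-query": 5000.0}  # inno5635
engine = {"update-index": 2516.0, "point-query": 4250.0}    # e.g. myrocks.none

ratios = {test: engine[test] / baseline[test] for test in baseline}
# A ratio greater than 1 means the engine is faster than InnoDB 5.6.35
# on that test.
print(ratios)  # {'update-index': 2.516, 'point-query': 0.85}
```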

1 connection
rocks   inno    toku    engine/test
2.516   1.338   1.395   update-index
1.103   1.210   0.482   update-nonindex
0.907   1.067   0.376   update-nonindex-special
1.325   1.099   0.337   delete-only
0.942   1.149   0.317   read-write.range100
1.064   1.157   1.210   read-write.range10000
0.798   0.968   0.323   read-only.range100
0.980   0.946   1.280   read-only.range10000
0.850   0.917   0.226   point-query
0.933   0.900   0.300   random-points
0.457   0.904   0.608   hot-points
0.890   1.009   0.317   insert-only

8 connections
rocks   inno    toku    engine/test
4.892   2.862   2.304   update-index
1.769   1.681   0.169   update-nonindex
1.382   1.506   0.476   update-nonindex-special
1.409   1.178   0.250   delete-only
0.989   1.082   0.202   read-write.range100
0.893   1.141   1.055   read-write.range10000
0.838   0.895   0.279   read-only.range100
0.870   1.046   1.258   read-only.range10000
0.857   0.840   0.215   point-query
0.911   0.824   0.245   random-points
0.621   1.194   0.819   hot-points
0.828   1.056   0.313   insert-only

48 connections
rocks   inno    toku    engine/test
2.294   2.223   0.817   update-index
2.685   2.726   0.393   update-nonindex
2.172   3.031   0.734   update-nonindex-special
2.830   3.054   0.283   delete-only
2.758   2.706   0.192   read-write.range100
1.113   1.394   0.577   read-write.range10000
0.810   0.996   0.101   read-only.range100
0.893   1.203   0.589   read-only.range10000
0.845   0.954   0.089   point-query
0.864   0.941   0.125   random-points
1.412   3.166   2.055   hot-points
1.111   1.855   0.579   insert-only

Results with pictures

Some people prefer graphs, so here is the relative QPS as a chart. I truncated the x-axis at 2 to make it easier to see differences. The first chart is from the test with 1 connection.
At 8 connections
At 48 connections

2 comments:

  1. As RocksDB is write-optimized I'd have expected MyRocks to be significantly faster than InnoDB in write-heavy tests. But your results show that InnoDB is actually faster in many write-heavy tests such as update-nonindex-special and insert-only. Why is that?

    Reply: Most writes (update, delete, insert) in SQL are read-modify-write (RMW) -- secondary index maintenance and enforcement of unique and PK constraints need the read. InnoDB makes the read part of that faster. Write-optimized makes the write side more efficient (from a storage perspective) and sometimes faster. Running this on slower storage might make the write optimizations more important and favor MyRocks, but the storage in this case was fast.

    MyRocks was faster for update-index because non-unique secondary index maintenance is read-free, so in that case write-optimized helps on the read side of RMW.

    I have shared results in the past, and will again, that show where MyRocks is able to sustain higher write rates. The insert benchmark is one example. But that doesn't make all write-heavy workloads faster.
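The read-modify-write pattern from the reply can be sketched as follows. Everything here is illustrative Python, not MyRocks or InnoDB internals:

```python
# Sketch of read-modify-write (RMW) for secondary index maintenance.
# All names are illustrative; this is not MyRocks or InnoDB internals.

rows = {1: {"k": "old"}}    # primary key -> row
sec_index = {"old": {1}}    # secondary key -> set of primary keys

def update_with_rmw(pk, new_k):
    # Maintaining a secondary index on UPDATE needs the old row value to
    # find the stale index entry, so the engine must read before it
    # writes -- the "read" in RMW.
    old = rows[pk]
    sec_index[old["k"]].discard(pk)
    rows[pk] = {"k": new_k}
    sec_index.setdefault(new_k, set()).add(pk)

def insert_read_free(pk, k):
    # For a non-unique secondary index an insert can blind-write the row
    # and the index entry without reading first, which is what makes
    # that index maintenance read-free.
    rows[pk] = {"k": k}
    sec_index.setdefault(k, set()).add(pk)

update_with_rmw(1, "new")
insert_read_free(2, "new")
```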
