tl;dr - RocksDB did better than InnoDB for a write-heavy workload and a range-scan heavy workload. The former is expected, the latter is a welcome surprise. I hope we get RocksDB into a supported MySQL distribution to make it easier for you to try it.
Graphs
If you just want the graphs here they are.
Test pattern
I used the small server (Intel NUC, 2 cores, 8G RAM, m.2 Samsung SSD) for this test.`The workload is the insert benchmark using the pattern described below. I use this pattern to include a mix of workloads (write-only, reads with write-heavy, reads without write-heavy). I don't include a read-only step. The mixed workload (reads + writes) is very important especially when the database is larger than RAM:
- Insert 250M rows into a table in PK order. The table has 3 secondary indexes and the inserts are in random order for each of the secondary indexes. Measure insert performance.
 - Insert 5M rows into the table with the writer limited to 1000/second and measure query performance for short range queries. The range queries use LIMIT 10 and randomly select the index (secondary or PK) and the starting offset for the range. The index is covering for the query. Assuming the writer is able to do 1000/second this takes 5000 seconds. For this step I care about throughput and response time for reads. For writes my main concern is whether the rate was able to sustain the rate of 1000/second.
 - Same as #2 except 500k rows are inserted and the writer is limited to 100/second. Assuming the writer can do 100/second this also takes 5000 seconds.
 
Performance & Efficiency
I am interested in performance and efficiency. I include both. Sometimes better efficiency explains better performance. Sometimes we need better efficiency even if that comes at the cost of decreased performance.
The metrics I use for performance are throughput and response time. For performance I measure the average rate for queries and inserts. I also want to determine whether there are stalls in throughput so the benchmark client reports average throughput per interval (~10 seconds each) and at test end I compute the 50th, 75th, 90th, 95th and 99th percentile per-interval throughput. For many tests including the insert benchmark it is even better to show throughput over time as some database engines suffer from stalls or throughput degrades. Some of that is visible via the p95 and p99 throughput metric.
For efficiency I measure HW consumed per operation and including CPU microseconds, storage reads, storage KB read/written. For efficiency I also include the database size and the RSS of the database process. I don't distinguish storage reads and writes done for user inserts versus those done for user queries. If the storage read rate is 200/second, the query rate is 10/second and the insert rate is 5/second then the metrics below will show 20 reads/query and 40 reads/insert.
The metrics I use for performance are throughput and response time. For performance I measure the average rate for queries and inserts. I also want to determine whether there are stalls in throughput so the benchmark client reports average throughput per interval (~10 seconds each) and at test end I compute the 50th, 75th, 90th, 95th and 99th percentile per-interval throughput. For many tests including the insert benchmark it is even better to show throughput over time as some database engines suffer from stalls or throughput degrades. Some of that is visible via the p95 and p99 throughput metric.
For efficiency I measure HW consumed per operation and including CPU microseconds, storage reads, storage KB read/written. For efficiency I also include the database size and the RSS of the database process. I don't distinguish storage reads and writes done for user inserts versus those done for user queries. If the storage read rate is 200/second, the query rate is 10/second and the insert rate is 5/second then the metrics below will show 20 reads/query and 40 reads/insert.
I usually ignore storage writes for 2 reasons. First, the penalty for random writes is small with SSD. Endurance is still an issue and KB written captures that. Second, it is hard to distinguish between slow and fast write operations. Some write operations are fast because they are mostly sequential or are small log writes that benefit from battery backed write cache on a RAID device.
I use vmstat and iostat to collect efficiency metrics. Note that Linux in mid-2016 still counts trim as bytes written. So with a database engine like RocksDB that deletes files as frequently as it creates them in the steady state and a kernel+filesystem+device setup to do trim on file delete then KB written as reported by iostat will be twice the real rate. You can confirm this by comparing iostat metrics with metrics reported by your SSD.
For RocksDB I used Facebook MySQL with the git hash 2406e9. For InnoDB I used 5.6.29 and 5.7.10. All mysqld builds used jemalloc. The binlog was enabled for all tests but sync-on-commit was disabled for the database engine and the binlog. The server runs Ubuntu 14.04.
MyRocks is changing fast. I used a build from May 7 and since then several performance bugs have been fixed. My my.cnf settings for RocksDB are also stale compared to the latest & greatest values suggested by Yoshinori. It is fun to work on a project that improving so quickly.
The my.cnf files for RocksDB, InnoDB in MySQL 5.6, InnoDB in MySQL 5.7.
I use vmstat and iostat to collect efficiency metrics. Note that Linux in mid-2016 still counts trim as bytes written. So with a database engine like RocksDB that deletes files as frequently as it creates them in the steady state and a kernel+filesystem+device setup to do trim on file delete then KB written as reported by iostat will be twice the real rate. You can confirm this by comparing iostat metrics with metrics reported by your SSD.
Configuration
I used a 2G block cache for RocksDB which uses the rest of RAM for compressed blocks in the OS page cache. I used a 6G buffer pool for InnoDB which used direct IO.For RocksDB I used Facebook MySQL with the git hash 2406e9. For InnoDB I used 5.6.29 and 5.7.10. All mysqld builds used jemalloc. The binlog was enabled for all tests but sync-on-commit was disabled for the database engine and the binlog. The server runs Ubuntu 14.04.
MyRocks is changing fast. I used a build from May 7 and since then several performance bugs have been fixed. My my.cnf settings for RocksDB are also stale compared to the latest & greatest values suggested by Yoshinori. It is fun to work on a project that improving so quickly.
The my.cnf files for RocksDB, InnoDB in MySQL 5.6, InnoDB in MySQL 5.7.
Results for insert-only
Summary:
- InnoDB writes between 20X and 30X more per insert than RocksDB
 - RocksDB does between 4X and 10X more inserts/second. The difference increases as the database:RAM ratio increases because the insert rate for RocksDB is almost flat over time while InnoDB drops dramatically once the database gets larger than RAM.
 
* ips.av - average insert rate
* ips.75, ips.95, ips.99 - 75th, 95th and 99th percentile insert rate over the
per-interval (~10 seconds) average rates
* r/i - storage reads per insert via iostat
* rkb/i, wkb/i - storage KB read and written per insert via iostat. iostat
* ips.75, ips.95, ips.99 - 75th, 95th and 99th percentile insert rate over the
per-interval (~10 seconds) average rates
* r/i - storage reads per insert via iostat
* rkb/i, wkb/i - storage KB read and written per insert via iostat. iostat
                 overstates the wkb/i result for RocksDB by 2X because it
                 counts trim as bytes written and the result here have
                 been corrected.
* Mcpu/i - usecs CPU per insert via vmstat
* size - database size in GB at test end
* rss - mysqld RSS in GB at test end
* Mcpu/i - usecs CPU per insert via vmstat
* size - database size in GB at test end
* rss - mysqld RSS in GB at test end
* engine - format is $e.$comp[.skip] and .skip means that unique constraints
           are not checked (only on RocksDB). For $e, "rx" is RocksDB,
"i7" is InnoDB in MySQL 5.7.10 and "i6" is InnoDB in MySQL 5.6.29.
For $comp "snap" is snappy, "zlib" is zlib level 1, "zlib-6" is zlib
level 6 and "none" is no compression.
"i7" is InnoDB in MySQL 5.7.10 and "i6" is InnoDB in MySQL 5.6.29.
For $comp "snap" is snappy, "zlib" is zlib level 1, "zlib-6" is zlib
level 6 and "none" is no compression.
ips.av  ips.75  ips.95  ips.99  r/i     rkb/i   wkb/i   Mcpu/i  size    rss     engine
23129   24515   12276    9515   0.02    0.25     1.55    3354   19g     2.47    rx.snap.skip
18177   18356    7283    6245   0.02    0.23     1.46    3805   17g     2.52    rx.zlib.skip
17705   15976   15282   14766   0.02    0.25     1.75    3944   18g     2.59    rx.snap
15357   14945   11532    8114   0.02    0.23     1.57    4920   16g     2.67    rx.zlib
12800   14557    5681    4694   0.01    0.19     1.46    5433   16g     2.53    rx.zlib-6
12008   10525    9355    9007   0.02    0.41     2.01    4585   27g     2.64    rx.none
 3811    2854    2607    2533   0.06    1.76    30.69    2715   42g     6.98    i7.none
 2699    1995    1497    1281   0.29    5.97    36.17    4565   42g     6.78    i6.none
 2120    1619    1338    1278   0.09    0.79    51.70   11142   23g     7.06    i7.zlib
 2121    1604    1335    1275   0.09    0.81    51.34   11137   23g     6.85    i6.zlib
Results with inserts limited to 1000/second
RocksDB gets more QPS because it does fewer reads from storage per query. That is expected for uncompressed InnoDB because that database is about 2X the size compared to RocksDB. But the compressed InnoDB database isn't much larger than RocksDB so it should be able to keep a similar amount of hot data in cache. I wonder if key prefix compression as done by RocksDB allows it to get a better cache hit rate.
RocksDB also benefits from not doing reads for non-unique secondary index pages during index maintenance. That is a write-only operation for RocksDB.
legend:
* ips.av, qps.av - average insert and query rates
* ips.99, qps.99 - 99th percentile query rate over the per-interval average rates
* r/q - storage reads per insert and per query query via iostat
* rkb/q, wkb/q - storage KB read and written per query
* ips.99, qps.99 - 99th percentile query rate over the per-interval average rates
* r/q - storage reads per insert and per query query via iostat
* rkb/q, wkb/q - storage KB read and written per query
* Mcpu/i - usecs CPU per query via vmstat
* size - database size in GB at test end
* rss - mysqld RSS in GB at test end
* engine - format is $e.$comp[.skip] and .skip means that unique constraints\
are not checked (only on RocksDB). For $e, "rx" is RocksDB,
"i7" is InnoDB in MySQL 5.7.10 and "i6" is InnoDB in MySQL 5.6.29.
For $comp "snap" is snappy, "zlib" is zlib level 1, "zlib-6" is zlib
level 6 and "none" is no compression.
* size - database size in GB at test end
* rss - mysqld RSS in GB at test end
* engine - format is $e.$comp[.skip] and .skip means that unique constraints\
are not checked (only on RocksDB). For $e, "rx" is RocksDB,
"i7" is InnoDB in MySQL 5.7.10 and "i6" is InnoDB in MySQL 5.6.29.
For $comp "snap" is snappy, "zlib" is zlib level 1, "zlib-6" is zlib
level 6 and "none" is no compression.
ips.av  ips.99  qps.av  qps.99  r/q     rkb/q   Mcpu/q  size    rss     engine
 998     997     2877   1579    0.27     3.40   12190   19g     2.53    rx.snap
 998     997     2635   1526    0.22     2.32   11544   16g     2.34    rx.zlib
 998     997     2652   1531    0.23     2.49   12396   15g     2.45    rx.zlib-6
 999     961      256     10    6.68   123.54   29334   42g     6.98    i7.none
 999     966      256     12    6.86   128.08   42332   42g     6.79    i6.none
 999     967      218    173    6.86    58.00  110277   23g     7.07    i7.zlib
 999     966      210    161    6.71    59.37  113406   23g     6.85    i6.zlib
Results with inserts limited to 100/second
InnoDB still does many more storage reads per user request and storage reads done for secondary index maintenance don't explain why RocksDB is more efficient.
ips.av  ips.99  qps.av  qps.99  r/q     rkb/q   Mcpu/q  size    rss     engine
 100     100    3059    2339    0.22    2.59    9754    19g     2.41    rx.snap
 100     100    2773    2312    0.18    1.61    9051    15g     2.33    rx.zlib
 100     100    2899    2158    0.18    1.60    8329    15g     2.42    rx.zlib-6
 100     100     728     438    5.78   93.25   11258    42g     6.98    i7.none
 100     100     637     463    5.82   94.04   16334    42g     6.79    i6.none
 100     100     456     439    4.64   37.10   42641    23g     7.08    i7.zlib
 100     100     457     354    4.72   37.98   43306    23g     6.85    i6.zlib


