- MyRocks loads data faster than InnoDB. This isn't a new result. Non-unique secondary index maintenance doesn't require a read before the write (unlike a B-Tree, which must read the leaf page before updating it). This is also helped by less random IO on writes and better compression.
- MyRocks compression is much better than compressed InnoDB. After 24 hours it used between 56% and 64% of the space compared to the compressed InnoDB configurations.
- MyRocks QPS degrades over time. This will be fixed real soon.
- Partitioning improves InnoDB load performance in MySQL 5.6 for compressed and non-compressed tables. This reduces stalls from the per-index mutex used by InnoDB when inserts cause or might cause a page split (pessimistic code path) because there is one mutex per partition. With MySQL 5.7 partitioning doesn't help in the non-compressed table case. There has been work in 5.7 to reduce contention on the per-index mutex and I think it helped. I suspect it is still needed with old-style compression because compressed page splits are more expensive as they include recompression.
- The Facebook patch for MySQL 5.6 is faster than upstream 5.6 and competitive with upstream 5.7.8. Too bad that patches might not reach upstream.
Configuration
My test server has 144G of RAM, 40 HW threads with HT enabled and fast PCIe flash storage. I configured linkbench with loaders=10, requesters=20 and maxid1=1B. This uses 10 clients for the load, 20 clients for the query runs and about 1B rows in the node table after the load. The linkbench clients share the server with mysqld. The my.cnf settings are explained in a previous post; a minimal sketch of the key settings follows this paragraph. The load was done with the binlog disabled. After the load there were 12 1-hour runs of the query test and I report results for hours 2 and 12. Then mysqld was restarted with the binlog enabled, 12 more 1-hour runs of the query test were done, and I report results for hours 14 and 24. Fsync for the binlog was disabled. Fsync for the InnoDB redo log was done by a background thread (innodb_flush_log_at_trx_commit=2). Note that the InnoDB page size was 8kb so I used 2X compression for the link and count tables. The node table is not compressed for InnoDB because it is unlikely to compress by 50%.
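This is a minimal sketch of the settings called out above, assuming standard MySQL option names and LinkBench property names; the full my.cnf is explained in the previous post and anything not mentioned in this post is an assumption.

  # my.cnf (excerpt, sketch)
  innodb_page_size=8192             # 8kb pages, so 2X compression uses KEY_BLOCK_SIZE=4
  innodb_flush_log_at_trx_commit=2  # redo fsync done by a background thread
  sync_binlog=0                     # fsync for the binlog disabled

  # LinkBench properties (sketch)
  loaders=10          # 10 clients for the load
  requesters=20       # 20 clients for the query runs
  maxid1=1000000000   # about 1B rows in the node table after the load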
I tested the following binaries:
- myrocks - RocksDB storage engine for MySQL using the Facebook patch for MySQL 5.6
- fb56 - InnoDB using the Facebook patch for MySQL 5.6
- orig56 - upstream MySQL 5.6.26
- orig57 - upstream MySQL 5.7.8
The partitioning and compression options are described below. For partitioning I use 32 partitions and transactions/queries don't span partitions. All of the DDL is here, and a hedged sketch of the DDL follows the list.
- p0 - no partitioning for RocksDB
- p1 - partitioning for RocksDB
- p0.c0 - no partitioning, no compression for InnoDB
- p0.c1 - no partitioning, old-style compression for InnoDB
- p1.c0 - partitioning, no compression for InnoDB
- p1.c1 - partitioning, old-style compression for InnoDB
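To make the options concrete, this is a sketch of what the p1.c1 link table might look like for InnoDB. It is not the exact LinkBench DDL (the real schema has more columns and a covering secondary index) and the choice of id1 as the partitioning column is an assumption.

  CREATE TABLE linktable (
    id1       BIGINT UNSIGNED NOT NULL,
    id2       BIGINT UNSIGNED NOT NULL,
    link_type BIGINT UNSIGNED NOT NULL,
    time      BIGINT UNSIGNED NOT NULL,
    PRIMARY KEY (link_type, id1, id2)
  ) ENGINE=InnoDB
    ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4  -- c1: old-style compression, 2X with 8kb pages
    PARTITION BY KEY (id1) PARTITIONS 32;   -- p1: 32 partitions, each with its own per-index mutex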
Results
This lists the database size in GB after the load and after the query tests at the 2nd, 12th, 14th and 24th hours. My measurement script doesn't have sufficient granularity for databases larger than 1T, so those sizes are reported with the trailing digits masked (for example, 11XX). I am not sure why compression with upstream 5.6 and 5.7 uses more space than with the Facebook patch.
Update - I removed the results for myrocks, p1 because my measurements were wrong.
load 2h 12h 14h 24h
gb gb gb gb gb config
487 493 512 514 523 myrocks, p0
.
11XX 11XX 12XX 12XX 13XX fb56, p0.c0
666 697 779 787 814 fb56, p0.c1
11XX 12XX 12XX 13XX 13XX fb56, p1.c0
707 745 803 808 826 fb56, p1.c1
.
12XX 12XX 13XX 14XX 14XX orig56, p0.c0
756 790 879 889 920 orig56, p0.c1
13XX 13XX 14XX 14XX 14XX orig56, p1.c0
803 838 901 907 930 orig56, p1.c1
.
12XX 13XX 14XX 14XX 15XX orig57, p0.c0
756 796 892 902 931 orig57, p0.c1
13XX 13XX 14XX 14XX 15XX orig57, p1.c0
803 844 844 916 940 orig57, p1.c1
This lists the insert rate during the load (load ips) and the average query rates for the 2nd, 12th, 14th and 24th hours. Note that the query rate is lousy for p0.c1 immediately after the load. The problem is that the b-tree pages are almost full after the load and then over time many of them get split. Page splits cause stalls with compression because the split includes recompression, and QPS recovers as the page split rate drops over time. A way to watch the split rate is sketched after the table.
load 2h 12h 14h 24h
ips qps qps qps qps config
165210 31826 22347 21293 17888 myrocks, p0
103145 30045 22376 21325 18387 myrocks, p1
.
109355 21151 23733 23478 24865 fb56, p0.c0
74210 8261 13928 14706 18656 fb56, p0.c1
104900 26953 26029 25161 25479 fb56, p1.c0
90162 19888 24431 22596 22811 fb56, p1.c1
.
105356 16472 16873 16575 17073 orig56, p0.c0
45966 7638 12492 13178 16516 orig56, p0.c1
98104 18797 18273 17625 17702 orig56, p1.c0
66738 17731 19854 19159 19418 orig56, p1.c1
.
122454 31009 30260 29905 29751 orig57, p0.c0
49101 9217 17552 18448 22092 orig57, p0.c1
114400 28191 26797 25820 25832 orig57, p1.c0
69746 22028 25204 23882 23983 orig57, p1.c1
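One way to watch the page split rate, assuming MySQL 5.6+ where INFORMATION_SCHEMA.INNODB_METRICS is available, is the index_page_splits counter. This is a suggestion for confirming the explanation above, not something I used in the test; the counter must be enabled first.

  SET GLOBAL innodb_monitor_enable = 'index_page_splits';
  SELECT NAME, `COUNT`
  FROM INFORMATION_SCHEMA.INNODB_METRICS
  WHERE NAME = 'index_page_splits';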
This is the same data as above, but grouped by configuration.
load 2h 12h 14h 24h
ips qps qps qps qps config
109355 21151 23733 23478 24865 fb56, p0.c0
105356 16472 16873 16575 17073 orig56, p0.c0
122454 31009 30260 29905 29751 orig57, p0.c0
.
165210 31826 22347 21293 17888 myrocks, p0
74210 8261 13928 14706 18656 fb56, p0.c1
45966 7638 12492 13178 16516 orig56, p0.c1
49101 9217 17552 18448 22092 orig57, p0.c1
.
104900 26953 26029 25161 25479 fb56, p1.c0
98104 18797 18273 17625 17702 orig56, p1.c0
114400 28191 26797 25820 25832 orig57, p1.c0
.
103145 30045 22376 21325 18387 myrocks, p1
90162 19888 24431 22596 22811 fb56, p1.c1
66738 17731 19854 19159 19418 orig56, p1.c1
69746 22028 25204 23882 23983 orig57, p1.c1
Graphs
For people who prefer graphs I include one for the load rates and another for the QPS from the configurations that use partitioning.