Compression in MyRocks is simpler than in InnoDB. You should expect the database to use about 1.1X the size of the compressed output. If rows are 10kb before compression and compress to 60% of their original size, then each row should use about 6.6kb in the database. The 1.1X adjustment accounts for space amplification from leveled compaction.
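The arithmetic above can be written as a small helper. This is only a sketch of the rule of thumb: the function name and parameters are made up for illustration, and the 1.1 factor is the space-amplification estimate quoted above, not a value read from MyRocks.

```python
def estimate_myrocks_size(raw_size, compressed_fraction, space_amp=1.1):
    """Rule-of-thumb estimate: compressed size times space amplification.

    raw_size            - size before compression (any unit)
    compressed_fraction - compressed size / raw size, e.g. 0.60
    space_amp           - assumed overhead from leveled compaction
    """
    return raw_size * compressed_fraction * space_amp


# A 10kb row that compresses to 60% of its original size.
row_kb = estimate_myrocks_size(10, 0.60)
print(f"{row_kb:.1f}kb per row")  # prints "6.6kb per row"
```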
Predicting the space used for InnoDB is harder. First, large LOB columns are not stored inline and overflow pages are not shared. Second, disk pages have a fixed size and you risk using too much space or getting too many page splits when searching for a good value for key_block_size. More details are here.
I ran two tests for two types of data. The first test is an insert-only workload in PK order for data with varying amounts of compressibility. The second test determined how fast point queries could be done on that data while rate-limited inserts were in progress. By varying amounts of compressibility I mean that there was one large varchar column per row and that 20%, 45%, 75% or 95% of the data in the column was random and the remainder was constant and easily compressed. Both tests used one connection for inserts. The query test also used one connection for queries.
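To make the pctRand idea concrete, here is a minimal sketch of how such a column value could be generated and how its compressibility could be checked with zlib level 6. This is not the benchmark client's code: the function name, the alphanumeric alphabet, the constant filler character and the 15,000 character length are all assumptions chosen for illustration.

```python
import random
import string
import zlib


def make_column(length, pct_rand, rng):
    """Build a string where pct_rand percent is random and the rest is constant."""
    n_rand = length * pct_rand // 100
    alphabet = string.ascii_letters + string.digits  # assumed character set
    rand_part = "".join(rng.choice(alphabet) for _ in range(n_rand))
    const_part = "a" * (length - n_rand)  # constant, easily compressed filler
    return rand_part + const_part


def compressed_size(value, level=6):
    """Size in bytes after zlib compression at the given level."""
    return len(zlib.compress(value.encode("ascii"), level))


rng = random.Random(42)  # fixed seed so runs are repeatable
for pct in (20, 45, 75, 95):
    value = make_column(15000, pct, rng)
    size = compressed_size(value)
    print(f"pctRand={pct} raw={len(value)} compressed={size} ratio={size / len(value):.2f}")
```

Because the random part here is drawn from a 62-character alphabet, it still compresses somewhat, so the compressed ratio will be lower than pctRand itself. The point of the sketch is only that compressed size grows with pctRand.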
The test pattern was run twice. In both cases the large column was a varchar. In the first case it had a length between 10,000 and 20,000 characters. In the second case it had a length between 100 and 1000 characters. The database block size was 16kb for MyRocks and InnoDB.
Insert only
For the insert-only workload the space used for MyRocks can be predicted from the compressibility of the data. That is much less true for InnoDB. For example, compressed InnoDB uses about the same amount of space for pctRand values of 20, 45 and 75.
MyRocks used the least amount of space. InnoDB used much more space when the column was larger (10,000 to 20,000 vs 100 to 1000). Overflow pages are the root cause.
The insert rates are better for MyRocks than for InnoDB. They were also stable for MyRocks and uncompressed InnoDB independent of the compressibility. Rates for uncompressed InnoDB are better than compressed InnoDB. While this wasn't a performance benchmark, it matches many other results I get. It is hard to get performance and compression from InnoDB. The CPU overhead per insert was similar between MyRocks and uncompressed InnoDB. CPU overheads were mostly larger for compressed InnoDB.
Legend for the data:
- ips - inserts per second
- size - database size in GB at test end
- Mcpu - microseconds of CPU per insert
- pctRand - percentage of random data in large column
- engine - rx.zlib-6 is MyRocks with zlib level 6 compression. i6n is InnoDB in MySQL 5.6.26 without compression. i6c is InnoDB in MySQL 5.6.26 with compression.
column up to 20,000    column up to 1000
ips    size   Mcpu       ips   size   Mcpu   pctRand   engine
5489    7.7   1090     34468     11    151        20   rx.zlib-6
5540     16   1127     34824     19    149        45
5532     24   1307     34517     27    166        75
5523     30   1467     34701     33    160        95

ips    size   Mcpu       ips   size   Mcpu   pctRand   engine
3995     87    933     23470     66    173        20   i6n
3981     87    928     23704     66    174        45
3981     86    917     23487     66    175        75
3995     88    914     23658     66    176        95

ips    size   Mcpu       ips   size   Mcpu   pctRand   engine
3339     36   1064     13429     33    262        20   i6c
2779     32   1278     13124     33    271        45
2133     35   1750      8767     30    392        75
1757     50   2061      7228     38    461        95
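One way to see how predictable the space is: divide each engine's size by the uncompressed InnoDB size for the same pctRand. The sketch below does that for the larger column (up to 20,000 characters) using the sizes from the table above. The variable and function names are mine.

```python
# Database size in GB at test end for the column up to 20,000 characters,
# copied from the insert-only results table.
PCT_RAND = [20, 45, 75, 95]
SIZE_GB = {
    "rx.zlib-6": [7.7, 16, 24, 30],
    "i6n": [87, 87, 86, 88],
    "i6c": [36, 32, 35, 50],
}


def relative_to_uncompressed(engine):
    """Size of each run as a fraction of uncompressed InnoDB (i6n)."""
    return [size / base for size, base in zip(SIZE_GB[engine], SIZE_GB["i6n"])]


for engine in ("rx.zlib-6", "i6c"):
    ratios = relative_to_uncompressed(engine)
    cells = ", ".join(f"pctRand={p}: {r:.2f}" for p, r in zip(PCT_RAND, ratios))
    print(f"{engine:10s} {cells}")
```

The MyRocks fractions rise steadily with pctRand, while the compressed InnoDB fractions do not follow pctRand for the first three values.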
Point queries
MyRocks provides the best compression, the best query throughput, and the least CPU overhead per query. My conclusions for InnoDB space consumption are similar to the results from the insert-only workload.
Legend for the data:
- qps - queries per second
- size - database size in GB at test end
- Mcpu - microseconds of CPU per query
- pctRand - percentage of random data in large column
- engine - rx.zlib-6 is MyRocks with zlib level 6 compression. i6n is InnoDB in MySQL 5.6.26 without compression. i6c is InnoDB in MySQL 5.6.26 with compression.
column up to 20,000    column up to 1000
qps    size   Mcpu       qps   size   Mcpu   pctRand   engine
984     9.3   4308      2214     11   1585        20   rx.zlib-6
910      19   4532      2113     19   1627        45
846      30   4952      2102     27   1601        75
795      37   5598      2051     33   1691        95

qps    size   Mcpu       qps   size   Mcpu   pctRand   engine
628     113   6240      1302     62   2527        20   i6n
624     110   6226      1300     63   2501        45
624     114   6312      1302     63   2536        75
628     115   6218      1305     66   2474        95

qps    size   Mcpu       qps   size   Mcpu   pctRand   engine
708      38   5560       770     34   4450        20   i6c
629      39   6643       687     34   4895        45
513      44   8494       589     30   6046        75
418      57  10619       576     39   6599        95
Yes - same host, but the workload was different. Here it is a modified insert benchmark client with 1 insert thread and optionally 1 query thread.