Friday, June 30, 2017

One more time with sysbench, a small server & MySQL 5.6, 5.7 and 8.0

 Update - the regression isn't as bad as I have been reporting. Read this post to understand why.

The good news is that I hope to begin debugging this problem next week. After fixing a few problems to reduce variance I am repeating tests to document the performance regression from MySQL 5.6 to 8.0. The first problem was fixed by disabling turbo boost on my Intel NUC servers to avoid thermal throttling. The other problem was the impact from mutex contention for InnoDB purge threads and I repeated tests with it set to 1 and 4. This is part of my series on low-concurrency CPU regressions for bug 86215.

tl;dr for in-memory sysbench on a small server with a fast SSD
  • most of the regression is from 5.6.35 to 5.7.17, much less is from 5.7.1 to 8.0.1
  • innodb_purge_threads=4 costs 10% to 15% of the QPS for write-heavy tests
  • QPS is 30% less for 5.7.17 & 8.0.1 vs 5.6.35 on write-only tests
  • QPS is 30% to 40% less for 5.7.17 & 8.0.1 vs 5.6.35 on read-write tests
  • QPS is 40% to 50% less for 5.7.17 & 8.0.1 vs 5.6.35 on read-only tests
  • QPS is 40% less for 5.7.17 & 8.0.1 vs 5.6.35 for point-query
  • QPS is 30% less for 5.7.17 & 8.0.1 vs 5.6.35 for insert-only

Configuration

I tested MySQL with upstream 5.6.35, 5.7.17 and 8.0.1. For 8.0.1 I used the latin1 charset and latin1_swedish_ci collation. I used the i5 NUC servers described here and the my.cnf used are here. I run mysqld and the sysbench client on the same server. The binlog is enabled but sync-on-commit is disabled.

Sysbench is run with 4 tables and 1M rows per table. The database fits in the InnoDB buffer pool. My usage of sysbench is described here. That explains the helper scripts that invoke sysbench and collect performance metrics. When I return home I will update this with the sysbench command lines that are generated by my helper scripts.

Results: write-only

Sorry, no graphs this time. I run sysbench for 1, 2 and 4 concurrent clients and share both the QPS for each test and then the QPS for MySQL 5.7.17 and 8.0.1 relative to 5.6.35. The ratio is less than 1 when the QPS is larger for 5.6.35.

All of these tests are run with innodb_purge_threads=1 which is the default for 5.6.35. The default for 5.7.17 and 8.0.1 is 4.

The first batch of results is from write-only tests. Most of the QPS regression is from MySQL 5.6.35 to 5.7.17. Excluding the update-index test, going from 5.6 to 5.7 loses about 30% of QPS.

update-index : QPS
1       2       4       concurrency/engine
5806    9837    12354   inno5635
5270    8798    11677   inno5717
4909    8176    10917   inno801

update-index : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.91     .89     .95     inno5717
.85     .83     .88     inno801

update-nonindex : QPS
1       2       4       concurrency/engine
10435   15680   18487   inno5635
 7691   11497   14989   inno5717
 7179   10845   14186   inno801

update-nonindex : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.74     .73     .81     inno5717
.69     .69     .77     inno801

delete : QPS
1       2       4       concurrency/engine
19461   28797   35684   inno5635
13525   19937   25466   inno5717
12551   18810   24023   inno801

delete : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.69     .69     .71     inno5717
.64     .65     .67     inno801

write-only : QPS
1       2       4       concurrency/engine
16892   25376   30915   inno5635
11765   17239   22061   inno5717
10729   16108   20682   inno801

write-only : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.70     .68     .71     inno5717
.64     .63     .67     inno801

Results: read-write

The next batch of results is from the classic read-write OLTP sysbench test. But I repeat it using different sizes for the range query. The regression is larger here than for the write-only tests above perhaps because of the regression for range scans. Going from 5.6.35 to 5.7.17 loses between 30% and 40% of the QPS. The regression is worse for longer range scans.

read-write.range100 : QPS
1       2       4       concurrency/engine
11653   18109   25325   inno5635
 7520   10871   14498   inno5717
 6965   10274   14098   inno801

read-write.range100 : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.65     .60     .57     inno5717
.60     .57     .56     inno801

read-write.range10000 : QPS
1       2       4       concurrency/engine
337     604     849     inno5635
202     386     443     inno5717
200     378     436     inno801

read-write.range10000 : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.60     .64     .52     inno5717
.59     .63     .51     inno801

Results: read-only
The next batch of results is from the classic read-only OLTP sysbench test. But I repeat it using different sizes for the range query. Most of the regression is from 5.6.35 to 5.7.17. Going from 5.6 to 5.7 loses between 40% and 50% of the QPS so the regression here is larger than above for the read-write tests. There isn't a larger regression for larger range queries.

read-only.range10 : QPS
1       2       4       concurrency/engine
17372   30663   50570   inno5635
10829   19021   25874   inno5717
10171   18743   25713   inno801

read-only.range10 : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.62     .62     .51     inno5717
.59     .61     .51     inno801

read-only.range100 : QPS
1       2       4       concurrency/engine
11247   20922   32930   inno5635
 6815   12823   16225   inno5717
 6475   12308   15834   inno801

read-only.range100 : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.61     .61     .49     inno5717
.58     .59     .48     inno801

read-only.range1000 : QPS
1       2       4       concurrency/engine
2590    4840    6816    inno5635
1591    2979    3408    inno5717
1552    2918    3363    inno801

read-only.range1000 : QPS relatie to MySQL 5.6.35
1       2       4       concurrency/engine
.61     .62     .50     inno5717
.60     .60     .49     inno801

read-only.range10000 : QPS
1       2       4       concurrency/engine
273     497     686     inno5635
161     304     355     inno5717
159     299     350     inno801

read-only.range10000 : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.59     .61     .52     inno5717
.68     .60     .51     inno801

Results: point-query and insert-only

Finally results for the last two tests -- point-query and insert-only. MySQL 5.7.17 loses about 40% of the QPS for point-query and 30% of the QPS for insert-only compared to 5.6.35.

point-query : QPS
1       2       4       concurrency/engine
19674   36269   55266   inno5635
11964   22941   29174   inno5717
11624   20679   29271   inno801

point-query : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.61     .63     .53     inno5717
.59     .57     .53     inno801

insert : QPS
1       2       4       concurrency/engine
11288   16268   19355   inno5635
 7951   12176   15660   inno5717
 7493   11277   14857   inno801

insert : QPS relative to MySQL 5.6.35
1       2       4       concurrency/engine
.70     .75     .81     inno5717
.66     .69     .77     inno801

innodb_purge_threads

Finally I repeated tests with innodb_purge_threads=4 to show the impact from that. On a small server (2 cores, 4 HW threads) there is too much mutex from innodb_purge_threads=4. As 4 is the default for 5.7.17 and 8.0.1 they suffer more than 5.6.35 when the default is used. The results above are for innodb_purge_threads=1 and then I repeated the tests with it set to 4. Here I show the QPS with purge_threads=4 / QPS with purge_threads=1. For the tests below QPS is reduced by 10% to 15% when innodb_purge_threads=4 on a small server. The insert-only test doesn't suffer, but there isn't anything to purge from the insert-only workload.

update-index
1       2       4       concurrency/engine
.85     .76     .75     inno5635
.76     .76     .77     inno5717
.89     .96     .89     inno801

update-nonindex
1       2       4       concurrency/engine
.82     .78     .88     inno5635
.77     .79     .86     inno5717
.86     .95     .91     inno801

delete
1       2       4       concurrency/engine
.84     .81     .82     inno5635
.84     .81     .87     inno5717
.87     .92     .94     inno801

write-only
1       2       4       concurrency/engine
.89     .85     .85     inno5635
.88     .86     .87     inno5717
.91     .95     .94     inno801

insert
1       2       4       concurrency/engine
.99     .99     .99     inno5635
.99     1.00    1.00    inno5717
1.01    1.01    1.00    inno801

No comments:

Post a Comment

RocksDB on a big server: LRU vs hyperclock, v2

This post show that RocksDB has gotten much faster over time for the read-heavy benchmarks that I use. I recently shared results from a lar...