Monday, July 10, 2017

I overstated the CPU regression in MySQL 5.7

My series of posts on the low-concurrency CPU regression greatly overstated the regression from MySQL 5.6 to 5.7. Most of the regression came from SSL, which is enabled by default starting in 5.7. My results for 5.7 and 8.0 were slower because those versions were using SSL while earlier versions of MySQL were not. I repeated the tests for 5.7 and 8.0 with SSL disabled (ssl=0 in my.cnf) and now see a QPS loss of 10% to 20% from 5.6 to 5.7/8.0.
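
One way to confirm whether SSL is in play is to check both the server and the connection from the mysql client. This is a minimal sketch assuming a local mysqld; the variable names are standard in 5.7 but the output format varies by version:

# Is SSL enabled on the server? Reports DISABLED when ssl=0 is set.
mysql -u root -e "SHOW GLOBAL VARIABLES LIKE 'have_ssl'"

# Is this connection encrypted? Ssl_cipher is empty when SSL is not in use.
mysql -u root -e "SHOW SESSION STATUS LIKE 'Ssl_cipher'"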

There is still a large regression from MySQL 5.0 to 8.0. I will continue to document it and expect to help reduce it. I expect MySQL bug 86215 to be closed as not a bug.

tl;dr for in-memory sysbench at low concurrency
  • MySQL 5.7 and 8.0 lose 10% to 20% of QPS versus 5.6 for tests not dominated by range scans
  • InnoDB range scans were made faster starting in MySQL 5.7
  • In the worst case, MySQL 5.0 gets more than 2X the QPS of 5.7 and 8.0.
  • The largest QPS regression occurs between MySQL 5.5 and 5.6.

Why

I should have noticed earlier in the process that SSL was the problem. I frequently rant about the value of explaining performance results, and this is a reminder of that value. I didn't explain why MySQL 5.7 and 8.0 did worse than 5.6, and it turned out that my results were wrong. I spent some time debugging the slowdown with PMP and Linux perf, but clearly I should have spent more time on that.
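
For what it's worth, the kind of debugging I mean is quick stack sampling of mysqld. A generic sketch, not the exact commands I ran:

# On-CPU profile: sample stacks for 30 seconds, then summarize
perf record -g -p $(pidof mysqld) -- sleep 30
perf report --stdio | head -50

# PMP-style: dump all thread stacks via gdb, to be aggregated by the PMP script
gdb -batch -ex 'set pagination 0' -ex 'thread apply all bt' -p $(pidof mysqld)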

The short story is that I was trying to do too much and didn't have time to focus. Too much, in this case, meant being so busy running tests and collecting results that I had no time to explain the results. Lesson learned.

Configuration

I repeated tests for 5.7 and 8.0 with SSL disabled and share results for in-memory sysbench in this post.

I tested MySQL with upstream 5.0.96, 5.1.72, 5.5.51, 5.6.35, 5.7.17 and 8.0.1. For 8.0.1 I used the latin1 charset and latin1_swedish_ci collation. I disabled SSL for 5.7.17 and 8.0.1 by adding ssl=0 to my.cnf. I used the i5 NUC servers described here and the my.cnf files are here. I run mysqld and the sysbench client on the same server. The binlog is enabled but sync-on-commit is disabled. I also set innodb_purge_threads to 1 to avoid mutex contention on a small server.
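
A sketch of the relevant my.cnf settings is below; sync_binlog=0 and innodb_flush_log_at_trx_commit=2 are my assumed way of expressing "sync-on-commit disabled", and the full my.cnf files linked above have the rest:

# append the relevant settings to my.cnf (path is an example)
cat >> /etc/my.cnf <<'EOF'
[mysqld]
ssl=0                              # disable SSL in 5.7.17 and 8.0.1
innodb_purge_threads=1             # reduce mutex contention on a small server
log-bin=binlog                     # binlog enabled
sync_binlog=0                      # sync-on-commit disabled (assumed setting)
innodb_flush_log_at_trx_commit=2   # sync-on-commit disabled (assumed setting)
EOF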

Sysbench is run with 4 tables and 1M rows per table. Tests are repeated for 1, 2 and 4 clients. The database fits in the InnoDB buffer pool. My usage of sysbench is described here, along with the helper scripts that invoke sysbench and collect performance metrics; a sketch of a run is below.
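
Roughly, a run looks like this with modern sysbench. My helper scripts wrap something like it; flag names vary by sysbench version, and oltp_point_select here stands in for the point-query test:

# load 4 tables with 1M rows each, then run one test with 1 client
sysbench oltp_point_select --mysql-user=root --mysql-db=test \
    --tables=4 --table-size=1000000 prepare
sysbench oltp_point_select --mysql-user=root --mysql-db=test \
    --tables=4 --table-size=1000000 --threads=1 --time=60 run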

Results

The first table has the QPS for MySQL 5.7.17 and 8.0.1 relative to 5.6.35 for sysbench with one client. A value greater than 1 means 5.7.17 or 8.0.1 is better than 5.6.35, which is the case for tests that do longer range scans because InnoDB range scans were made faster starting in 5.7. For the tests not dominated by range scans, the QPS for 5.7.17 and 8.0.1 is 10% to 20% less than for 5.6.35, and the loss from 5.6 to 5.7 is similar to the loss from 5.7 to 8.0.
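
Each entry is a ratio of the per-test QPS values listed later in this post. For example, for update-index at 1 client:

6013 / 5806 = 1.04   (inno5717 relative to inno5635)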

QPS relative to 5.6.35
----------------------
5717    801     test
1.04     .96    update-index
 .90     .84    update-nonindex
 .91     .83    delete
 .90     .82    write-only
1.06    1.03    read-write.range100
1.51    1.49    read-write.range10000
 .94     .86    read-only.range10
1.11    1.01    read-only.range100
1.49    1.42    read-only.range1000
1.44    1.41    read-only.range10000
 .95     .85    point-query
 .89     .82    insert

The next table has the QPS for 5.1.72, 5.5.51, 5.6.35, 5.7.17 and 8.0.1 relative to 5.0.96 for the same tests using 1 sysbench client. In the worst case the QPS for 5.0.96 is 2.5X that of 8.0.1 and more than 2X that of 5.7.17. The worst cases are the update-only (update-index, update-nonindex) and point-query tests.

QPS relative to 5.0.96
----------------------
5172    5551    5635    5717    801     test
.89     .73     .45     .46     .43     update-index
.89     .59     .47     .42     .39     update-nonindex
.92     .92     .75     .68     .62     delete
.93     .93     .76     .68     .62     write-only
.92     .99     .65     .69     .67     read-write.range100
.95     .92     .55     .84     .82     read-write.range10000
.92     .89     .58     .55     .50     read-only.range10
.94     .91     .57     .63     .57     read-only.range100
.96     .92     .56     .83     .80     read-only.range1000
.95     .92     .55     .80     .78     read-only.range10000
.85     .77     .47     .45     .40     point-query
.95     .93     .81     .72     .67     insert

The data below has the QPS for each test. Each test was run for 1, 2 and 4 clients and my usage of sysbench is explained in a previous post.

update-index
1       2       4       concurrency/version
13047   19521   22763   inno5096
11592   18439   19833   inno5172
 9486   14393   16389   inno5551
 5806    9837   12354   inno5635
 6013   10242   13498   inno5717
 5585    9509   12771   inno801

update-nonindex
1       2       4       concurrency/version
22333   32560   32961   inno5096
19914   29050   31829   inno5172
13082   19164   21049   inno5551
10435   15680   18487   inno5635
 9430   14120   19012   inno5717
 8730   13277   17746   inno801

delete
1       2       4       concurrency/version
25893   35659   38785   inno5096
23777   38591   41418   inno5172
23726   33260   38085   inno5551
19461   28797   35684   inno5635
17621   26110   34634   inno5717
16146   24270   32245   inno801

write-only
1       2       4       concurrency/version
22364   32090   36125   inno5096
20907   32503   35141   inno5172
20727   29856   33173   inno5551
16892   25376   30915   inno5635
15176   22631   29425   inno5717
13778   20834   27262   inno801

read-write.range100
1       2       4       concurrency/version
17972   30414   42474   inno5096
16527   30055   41973   inno5172
17857   28714   41496   inno5551
11653   18109   25325   inno5635
12387   18359   25379   inno5717
12002   17727   24253   inno801

read-write.range10000
1       2       4       concurrency/version
608     1089    1556    inno5096
580     1033    1513    inno5172
557     996     1410    inno5551
337     604      849    inno5635
509     873     1067    inno5717
501     858     1039    inno801

read-only.range10
1       2       4       concurrency/version
30016   48794   78703   inno5096
27554   49464   76052   inno5172
26863   48108   77370   inno5551
17372   30663   50570   inno5635
16383   28695   48139   inno5717
14861   26625   45225   inno801

read-only.range100
1       2       4       concurrency/version
19891   34307   56572   inno5096
18636   34808   54051   inno5172
18125   33561   51729   inno5551
11247   20922   32930   inno5635
12453   22441   34585   inno5717
11387   21221   32934   inno801

read-only.range1000
1       2       4       concurrency/version
4623    8304    12115   inno5096
4429    8114    11757   inno5172
4272    7851    11167   inno5551
2590    4840     6816   inno5635
3859    7041     8764   inno5717
3682    6877     8395   inno801

read-only.range10000
1       2       4       concurrency/version
492     882     1259    inno5096
468     827     1222    inno5172
452     808     1147    inno5551
273     497      686    inno5635
392     700      865    inno5717
386     693      841    inno801

point-query
1       2       4       concurrency/version
41951   68718   123988  inno5096
35785   66515   111602  inno5172
32385   56441    81061  inno5551
19674   36269    55266  inno5635
18749   32669    55340  inno5717
16694   30372    51878  inno801

insert
1       2       4       concurrency/version
13932   14182   12795   inno5096
13194   15157   12521   inno5172
12968   17106   14447   inno5551
11288   16268   19355   inno5635
10039   15428   19922   inno5717
 9306   14391   18763   inno801
