- Results for 4.13 are mixed: sometimes QPS is higher with the fix enabled, sometimes with the fix disabled. The typical difference is small, about 2%.
- QPS for 4.8, which doesn't have the Meltdown fix, is usually better than for 4.13. The largest difference is ~10%, and the difference tends to be larger at 1 client than at 2 or 8.
Configuration
My usage of sysbench is described here. The servers are described here. For this test I used the core i5 NUC (NUC7i5bnh) with Ubuntu 16.04. I have 3 such servers and ran tests with the fix enabled (kernel 4.13.0-26), the fix disabled via pti=off (kernel 4.13.0-26) and the old kernel (4.8.0-36) that doesn't have the fix. From cat /proc/cpuinfo I see pcid. This server uses the HWE kernels to make wireless work. I repeated tests after learning that 4.13 doesn't support the nobarrier mount option for XFS. My workaround was to switch to ext4 and the results here are from ext4.
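The pcid flag matters because PCID lets the kernel avoid a full TLB flush on each switch between user and kernel page tables, which generally reduces the cost of the fix. Here is a minimal sketch of checking for the flag in Python. The function names and the sample text are illustrative assumptions, not part of my test setup.

```python
def cpu_flags(cpuinfo_text):
    """Collect the feature flags from the 'flags' lines of /proc/cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only lines whose key is exactly "flags" list CPU features.
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def has_flag(cpuinfo_text, flag):
    """Return True when the exact flag name is present."""
    return flag in cpu_flags(cpuinfo_text)


# Made-up excerpt in the /proc/cpuinfo format, for illustration only.
SAMPLE = "processor\t: 0\nflags\t\t: fpu vme sse2 pcid\n"

print(has_flag(SAMPLE, "pcid"))
```

On a Linux host the input would be the contents of /proc/cpuinfo. Matching whole flag names rather than substrings keeps a flag such as invpcid from being mistaken for pcid.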
The servers have 2 cores and 4 HW threads. I normally use them for low-concurrency benchmarks with 1 or 2 concurrent database clients. For this test I used 1, 2 and 8 concurrent clients to determine whether more concurrency and more mutex contention would cause more of a performance loss.
The sysbench test was configured to use 1 table with 4M rows and InnoDB. The InnoDB buffer pool was large enough to cache the table. The sysbench client runs on the same host as mysqld.
I just noticed that all servers had the doublewrite buffer and binlog disabled. This was leftover from debugging the XFS nobarrier change.
Results
My usage of sysbench is described here, which explains the tests that I list below. Each test has QPS for 1, 2 and 8 concurrent clients. Results are provided for:
- pti enabled - kernel 4.13.0-26 with the Meltdown fix enabled
- pti disabled - kernel 4.13.0-26 with the Meltdown fix disabled via pti=off
- old kernel, no pti - kernel 4.8.0-36 which doesn't have the Meltdown fix
After each of the QPS sections, there are two lines for QPS ratios. The first line compares the QPS for the kernel with the Meltdown fix enabled vs disabled. The second line compares the QPS for the kernel with the Meltdown fix vs the old kernel. A value less than one means that MySQL gets less QPS with the Meltdown fix.
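The two ratio lines can be reproduced with a few lines of Python. This is a sketch, not the script used for the post: the function name is hypothetical, and truncating to 3 decimal places (rather than rounding) is an assumption inferred from the published values. The QPS numbers are the update-inlist results.

```python
import math


def qps_ratio(numerator_qps, denominator_qps, digits=3):
    """Return numerator/denominator truncated to `digits` decimal places.

    Truncation is an assumption that happens to match the published ratios.
    """
    scale = 10 ** digits
    return math.floor(numerator_qps / denominator_qps * scale) / scale


# update-inlist QPS at 1, 2 and 8 concurrent clients
pti_enabled = [5603, 7546, 8212]
pti_disabled = [5618, 7483, 8076]
old_kernel = [5847, 7613, 8149]

on_vs_off = [qps_ratio(a, b) for a, b in zip(pti_enabled, pti_disabled)]
on_vs_old = [qps_ratio(a, b) for a, b in zip(pti_enabled, old_kernel)]

print(on_vs_off)  # pti on/off
print(on_vs_old)  # pti on / old kernel
```

A ratio below 1.0 in either list means less QPS with the Meltdown fix enabled.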
update-inlist
1 2 8 concurrency
5603 7546 8212 pti enabled
5618 7483 8076 pti disabled
5847 7613 8149 old kernel, no pti
----- ----- -----
0.997 1.008 1.016 qps ratio: pti on/off
0.958 0.991 1.007 qps ratio: pti on / old kernel
update-one
1 2 8 concurrency
11764 18880 16699 pti enabled
12074 19475 17132 pti disabled
12931 19573 16559 old kernel, no pti
----- ----- -----
0.974 0.969 0.974 qps ratio: pti on/off
0.909 0.964 1.008 qps ratio: pti on / old kernel
update-index
1 2 8 concurrency
7202 12688 16738 pti enabled
7197 12581 17466 pti disabled
7443 12926 17720 old kernel, no pti
----- ----- -----
1.000 1.008 0.958 qps ratio: pti on/off
0.967 0.981 0.944 qps ratio: pti on / old kernel
update-nonindex
1 2 8 concurrency
11103 18062 22964 pti enabled
11414 18208 23076 pti disabled
12395 18529 22168 old kernel, no pti
----- ----- -----
0.972 0.991 0.995 qps ratio: pti on/off
0.895 0.974 1.035 qps ratio: pti on / old kernel
delete
1 2 8 concurrency
19197 30830 43605 pti enabled
19720 31437 44935 pti disabled
21584 32109 43660 old kernel, no pti
----- ----- -----
0.973 0.980 0.970 qps ratio: pti on/off
0.889 0.960 0.998 qps ratio: pti on / old kernel
read-write range=100
1 2 8 concurrency
11956 20047 29336 pti enabled
12475 20021 29726 pti disabled
13098 19627 30030 old kernel, no pti
----- ----- -----
0.958 1.001 0.986 qps ratio: pti on/off
0.912 1.021 0.976 qps ratio: pti on / old kernel
read-write range=10000
1 2 8 concurrency
488 815 1080 pti enabled
480 768 1073 pti disabled
504 848 1083 old kernel, no pti
----- ----- -----
1.016 1.061 1.006 qps ratio: pti on/off
0.968 0.961 0.997 qps ratio: pti on / old kernel
read-only range=100
1 2 8 concurrency
12089 21529 33487 pti enabled
12170 21595 33604 pti disabled
11948 22479 33876 old kernel, no pti
----- ----- -----
0.993 0.996 0.996 qps ratio: pti on/off
1.011 0.957 0.988 qps ratio: pti on / old kernel
read-only.pre range=10000
1 2 8 concurrency
392 709 876 pti enabled
397 707 872 pti disabled
403 726 877 old kernel, no pti
----- ----- -----
0.987 1.002 1.004 qps ratio: pti on/off
0.972 0.976 0.998 qps ratio: pti on / old kernel
read-only range=10000
1 2 8 concurrency
394 701 874 pti enabled
389 698 871 pti disabled
402 725 877 old kernel, no pti
----- ----- -----
1.012 1.004 1.003 qps ratio: pti on/off
0.980 0.966 0.996 qps ratio: pti on / old kernel
point-query.pre
1 2 8 concurrency
18490 31914 56337 pti enabled
19107 32201 58331 pti disabled
18095 32978 55590 old kernel, no pti
----- ----- -----
0.967 0.991 0.965 qps ratio: pti on/off
1.021 0.967 1.013 qps ratio: pti on / old kernel
point-query
1 2 8 concurrency
18212 31855 56116 pti enabled
18913 32123 58320 pti disabled
17907 32941 55430 old kernel, no pti
----- ----- -----
0.962 0.991 0.962 qps ratio: pti on/off
1.017 0.967 1.012 qps ratio: pti on / old kernel
random-points.pre
1 2 8 concurrency
3043 5940 8131 pti enabled
2944 5681 7984 pti disabled
3030 6015 8098 old kernel, no pti
----- ----- -----
1.033 1.045 1.018 qps ratio: pti on/off
1.004 0.987 1.004 qps ratio: pti on / old kernel
random-points
1 2 8 concurrency
3053 5930 8128 pti enabled
2949 5756 7981 pti disabled
3058 6011 8116 old kernel, no pti
----- ----- -----
1.035 1.030 1.018 qps ratio: pti on/off
0.998 0.986 1.001 qps ratio: pti on / old kernel
hot-points
1 2 8 concurrency
3931 7522 9500 pti enabled
3894 7535 9214 pti disabled
3914 7692 9448 old kernel, no pti
----- ----- -----
1.009 0.998 1.031 qps ratio: pti on/off
1.004 0.977 1.005 qps ratio: pti on / old kernel
insert
1 2 8 concurrency
12469 21418 25158 pti enabled
12561 21327 25094 pti disabled
13045 21768 21258 old kernel, no pti
----- ----- -----
0.992 1.004 1.002 qps ratio: pti on/off
0.955 0.983 1.183 qps ratio: pti on / old kernel
Thanks for sharing.
I was wondering what the host context switch rates per second ("sar -w 5", or "sar -w" to view the history if sadc is running; cs from "vmstat 5"), or the per-process/thread rates ("pidstat -wt | sort -nrk6 | head"), are when the (most contended) tests are running?
From the point-query test where InnoDB does ~18k, ~32k, ~56k QPS at 1,2,8 threads the cs rates from vmstat are ~69k, ~121k, ~107k. The number of context switches per query is ~4, ~4, ~2 for 1,2,8 threads.
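The per-query figures come from dividing the context switch rate by QPS. A quick check in Python, using the rounded values quoted above (the variable and function names are only for illustration):

```python
# Approximate rates for point-query at 1, 2 and 8 threads, as quoted above.
cs_per_second = {1: 69_000, 2: 121_000, 8: 107_000}
queries_per_second = {1: 18_000, 2: 32_000, 8: 56_000}


def context_switches_per_query(cs_rate, qps):
    """Context switches per query = context switches/sec divided by queries/sec."""
    return cs_rate / qps


per_query = {
    threads: round(context_switches_per_query(cs_per_second[threads],
                                              queries_per_second[threads]))
    for threads in cs_per_second
}
print(per_query)
```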