Performance regressions arrived in InnoDB with MySQL 8.0.30. Eventually multiple bugs were filed. The worst regressions were from changes to the hash function (perhaps fixed in 8.0.36) and from changes to how functions are inlined for InnoDB (bug 111538). The problems are obvious if you run CPU-bound workloads, and my CPU-bound workload is sysbench with a cached database.
Bug 111538 is now closed and marked as fixed in 8.0.40. Alas, there are still significant CPU perf regressions in 8.0.40 relative to 8.0.28. My advice to upstream is to stop innovating if you don't have the CI setup to catch the new performance problems that your innovation creates. Using something like Nyrkio would help.
This post has results from sysbench on several servers using MySQL 8.0.28, 8.0.30, 8.0.33, 8.0.39 and 8.0.40 to show there are large regressions starting in 8.0.30, many of which are still there in 8.0.40. Tests were repeated on 5 different servers.
tl;dr
- SELECT statements with a large in-list use much less CPU starting in MySQL 8.0.31 because bug 102037 was fixed. I found that via sysbench and filed a bug vs 8.0.22
- bug 111538 should not have been closed as fixed
- The scan microbenchmark still has regressions from 8.0.28 to 8.0.40
- For all servers the QPS is less in 8.0.40 than in 8.0.28
- On 4 of 5 servers the QPS is less in 8.0.40 than in 8.0.30
- The update microbenchmark still has regressions from 8.0.28 to 8.0.40
- In all cases the QPS is less in 8.0.40 than in 8.0.28
- From 8.0.30 to 8.0.40 -- on 2 servers the QPS is less in 8.0.40, on 2 it is about the same and one 1 it has improved
- Regressions after 8.0.28 are bad for the small servers (beelink and pn53 below) and really bad (QPS drops in half) for one of the large servers (delll32 below). That is the subject of a pending blog post.
Updates
- MySQL 8.0.40 fixes regressions for many of the read-only microbenchmarks. I failed to acknowledge that when I first posted this. But the large regression for long range scans remains.
Builds
I used MySQL 8.0.28, 8.0.30, 8.0.33, 8.0.39 and 8.0.40 compiled from source using CMAKE_BUILD_TYPE =Release, -O2 and -fno-omit-frame-pointer.
Hardware
The servers are
- beelink
- Beelink SER4 with an AMD Ryzen 7 4700 CPU with SMT disabled, 8 cores, 16G of RAM, Ubuntu 22.04 and ext4 on 1 NVMe device.
- pn53
- ASUS ExpertCenter PN53 with AMD Ryzen 7 7735HS, with SMT disabled, 8 cores, 32G RAM, Ubuntu 22.04 and ext4 on 1 NVMe device.
- socket2
- SuperMicro SuperWorkstation 7049A-T with 2 sockets, 12 cores/socket, 64G RAM, one m.2 SSD (2TB, ext4). The CPUs are Intel Xeon Silver 4214R CPU @ 2.40GHz
- dell32
- Dell Precision 7865 Tower Workstation with 1 socket, 128G RAM, AMD Ryzen Threadripper PRO 5975WX with 32-Cores, 2 m.2 SSD (each 2TB, RAID SW 0, ext4)
- ax162-s
- AMD EPYC 9454P 48-Core Processor with SMT disabled, 128G RAM, Ubuntu 22.04 and ext4 on 2 NVMe devices with SW RAID 1. This is in the Hetzner cloud.
Benchmark
I used sysbench and my usage is
explained here. A full run has 42 microbenchmarks and most test only 1 type of SQL statement. In some cases I skip the read-only tests that run prior to writes to save time. The database is cached by InnoDB.
The benchmark is run with ...
- beelink - 1 thread, 1 table, 30M rows
- pn53 - 1 thread, 1 table, 50M rows
- socket2 - 16 threads, 8 tables, 10M rows/table
- dell32 - 24 threads, 8 tables, 10M rows/table
- ax162-s - 40 threads, 8 tables, 10M rows/table
Each microbenchmark runs for 300 seconds if read-only and 600 seconds otherwise. Prepared statements were enabled.
Results
All of the results se relative QPS (rQPS) where:
- rQPS is: (QPS for my version / QPS for base version)
- base version is the QPS from MySQL 8.0.28
- my version is one of the other versions (8.0.30, 8.0.33, 8.0.39, 8.0.40)
The scan microbenchmark is the canary in the coal mine for bug 111538 as most of the CPU time is spent in InnoDB. Regressions that arrived after 8.0.28 remain unfixed in 8.0.40. The rQPS drops from MySQL 8.0.30 to 8.0.40 on 4 of the 5 servers.
rQPS rQPS
server 8.0.30 8.0.40
------ ------ ------
beelink 0.89 0.78
pn53 0.91 0.83
socket2 0.90 0.84
dell32 0.70 0.91
ax162-s 0.91 0.83
The update-index microbenchmark also has large regressions after 8.0.28. The QPS drops from 8.0.30 to 8.0.40 on 2 servers (beelink, pn53), remains about the same on 2 of them (socket2, dell32) and improves on 1 (ax162-s). The result for dell32 is lousy as update QPS drops almost in half after 8.0.28 and I will have more about that in a pending blog post.
rQPS rQPS
server 8.0.30 8.0.40
------ ------ ------
beelink 0.88 0.76
pn53 0.91 0.78
socket2 0.89 0.92
dell32 0.56 0.56
ax162-s 0.80 0.90
Results: beelink
Summary
- the big improvement to random-points starting in 8.0.33 is from fixing bug 102037
- for scan the problem is new CPU overhead after 8.0.28
- for update-index the problem is more CPU overhead, (maybe) more context switches and (maybe) more KB written to storage per update
Relative to: x.my8028_rel_o2nofp.z11a_bee.pk1
col-1 : x.my8030_rel_o2nofp.z11a_bee.pk1
col-2 : x.my8033_rel_o2nofp.z11a_bee.pk1
col-3 : x.my8039_rel_o2nofp.z11a_bee.pk1
col-4 : x.my8040_rel_o2nofp.z11a_bee.pk1
col-1 col-2 col-3 col-4
0.91 1.06 1.09 1.13 hot-points_range=100
1.00 0.91 0.92 0.92 point-query.pre_range=100
1.00 0.91 0.93 0.92 point-query_range=100
0.91 1.01 1.03 1.06 points-covered-pk.pre_range=100
0.90 1.00 1.03 1.06 points-covered-pk_range=100
0.89 1.00 1.03 1.07 points-covered-si.pre_range=100
0.89 1.00 1.03 1.06 points-covered-si_range=100
0.89 1.00 1.02 1.04 points-notcovered-pk.pre_range=100
0.89 1.00 1.02 1.05 points-notcovered-pk_range=100
0.88 0.93 0.98 1.01 points-notcovered-si.pre_range=100
0.88 0.93 0.98 1.01 points-notcovered-si_range=100
0.96 2.20 2.30 2.35 random-points.pre_range=1000
0.90 1.00 1.03 1.05 random-points.pre_range=100
0.94 0.91 0.93 0.94 random-points.pre_range=10
0.94 2.20 2.30 2.35 random-points_range=1000
0.90 1.00 1.03 1.05 random-points_range=100
0.93 0.91 0.93 0.94 random-points_range=10
0.95 0.90 0.90 0.96 range-covered-pk.pre_range=100
0.95 0.90 0.90 0.96 range-covered-pk_range=100
0.96 0.91 0.91 0.97 range-covered-si.pre_range=100
0.95 0.90 0.90 0.98 range-covered-si_range=100
0.97 0.93 0.94 0.97 range-notcovered-pk.pre_range=100
0.96 0.92 0.93 0.97 range-notcovered-pk_range=100
0.88 0.86 0.91 0.94 range-notcovered-si.pre_range=100
0.87 0.86 0.91 0.94 range-notcovered-si_range=100
0.97 0.94 0.93 0.98 read-only.pre_range=10000
0.97 0.93 0.93 0.94 read-only.pre_range=100
0.99 0.93 0.94 0.94 read-only.pre_range=10
0.98 0.95 0.94 0.99 read-only_range=10000
1.00 0.95 0.95 0.97 read-only_range=100
0.98 0.94 0.94 0.96 read-only_range=10
0.89 0.82 0.80 0.78 scan_range=100
0.95 0.88 0.89 0.90 delete_range=100
0.94 0.88 0.88 0.88 insert_range=100
0.97 0.93 0.93 0.95 read-write_range=100
0.96 0.92 0.92 0.93 read-write_range=10
0.88 0.82 0.84 0.76 update-index_range=100
0.94 0.87 0.89 0.90 update-inlist_range=100
0.96 0.89 0.90 0.91 update-nonindex_range=100
0.97 0.91 0.91 0.91 update-one_range=100
0.95 0.90 0.90 0.91 update-zipf_range=100
0.93 0.88 0.89 0.87 write-only_range=10000
For scan the problem is CPU overhead (see cpu/o below) which is 1.28X larger in 8.0.40 vs 8.0.28.
sb.met.scan.range100.pk1.dop1
--- absolute
cpu/o cs/o r/o rKB/o wKB/o o/s dbms
0.222553 2.534 0 0.001 0.035 55 x.my8028_rel_o2nofp
0.251913 6.640 0 0 0.052 49 x.my8030_rel_o2nofp
0.273314 7.107 0 0 0.039 45 x.my8033_rel_o2nofp
0.282176 15.578 0 0 0.053 44 x.my8039_rel_o2nofp
0.285792 7.622 0 0 0.041 43 x.my8040_rel_o2nofp
--- relative to first result
1.13 2.62 1 0.00 1.49 0.89 x.my8030_rel_o2nofp
1.23 2.80 1 0.00 1.11 0.82 x.my8033_rel_o2nofp
1.27 6.15 1 0.00 1.51 0.80 x.my8039_rel_o2nofp
1.28 3.01 1 0.00 1.17 0.78 x.my8040_rel_o2nofp
For update-index there are several changes after 8.0.28
- CPU per update (cpu/o) is 1.33X larger in 8.0.40 vs 8.0.28
- KB written per update (wKB/o) is 1.47X larger in 8.0.40 vs 8.0.28.
- A possible cause is that the writeback code path isn't slowed by regressions while the update codepath is slowed. But I am just waving my hands.
- This one is confusing because I use the same redo log size for all versions. However, configuration for the redo log changes in 8.0.30 (see here) and as a result I used 15 X 1G log segments in 8.0.28 vs 32 X 480M log segments in 8.0.40. I am repeating tests for 8.0.28 with 32 X 480M log segments. Using the output of the Pages ..., written line from SHOW ENGINE INNODB STATUS the pages written per update value is 0.97 for 8.0.28 vs 1.31 for 8.0.40. Perhaps 8.0.30+ is more aggressive about doing dirty page writeback but I use the same values for innodb_max_dirty_pages_pct_lwm (=80) and innodb_max_dirty_pages_pct (=90) for 8.0.28 to 8.0.40. I repeated tests for 8.0.28 using 32 redo log segments, but results (QPS, HW metrics) doesn't change with that.
- Context switches per update (cs/o) is 1.24X larger in 8.0.40 vs 8.0.28, but I am less sure this is a problem as the absolute rates (15.312 vs 18.917 per update) are not that big.
sb.met.update-index.range100.pk1.dop1
--- absolute
cpu/o cs/o r/o rKB/o wKB/o o/s dbms
0.008211 15.312 0 0 47.066 2649 x.my8028_rel_o2nofp
0.009487 17.353 0 0 66.043 2337 x.my8030_rel_o2nofp
0.009875 17.636 0 0 66.695 2178 x.my8033_rel_o2nofp
0.010162 17.959 0 0 67.203 2216 x.my8039_rel_o2nofp
0.010903 18.917 0 0 68.954 2024 x.my8040_rel_o2nofp
--- relative to first result
1.16 1.13 1 1 1.40 0.88 x.my8030_rel_o2nofp
1.20 1.15 1 1 1.42 0.82 x.my8033_rel_o2nofp
1.24 1.17 1 1 1.43 0.84 x.my8039_rel_o2nofp
1.33 1.24 1 1 1.47 0.76 x.my8040_rel_o2nofp
Results: pn53
Summary
- the big improvement to random-points starting in 8.0.33 is from fixing bug 102037
- for scan the problem is new CPU overhead after 8.0.28
- for update-index the problem is more CPU overhead, (maybe) more context switches and (maybe) more KB written to storage per update
- the changes to HW overheads here are similar to the changes above for the beelink server
Relative to: x.my8028_rel_o2nofp.z11a_c8r32.pk1
col-1 : x.my8030_rel_o2nofp.z11a_c8r32.pk1
col-2 : x.my8033_rel_o2nofp.z11a_c8r32.pk1
col-3 : x.my8039_rel_o2nofp.z11a_c8r32.pk1
col-4 : x.my8040_rel_o2nofp.z11a_c8r32.pk1
col-1 col-2 col-3 col-4
0.89 1.12 1.13 1.18 hot-points_range=100
0.97 0.94 0.93 0.94 point-query.pre_range=100
0.98 0.95 0.94 0.94 point-query_range=100
0.89 1.03 1.05 1.10 points-covered-pk_range=100
0.87 1.02 1.04 1.09 points-covered-si_range=100
0.91 1.03 1.05 1.11 points-notcovered-pk_range=100
0.84 0.96 1.01 1.06 points-notcovered-si_range=100
0.95 2.14 2.21 2.30 random-points_range=1000
0.89 1.03 1.06 1.10 random-points_range=100
0.96 0.94 0.95 0.97 random-points_range=10
0.97 0.93 0.92 0.99 range-covered-pk_range=100
0.98 0.93 0.93 0.98 range-covered-si_range=100
0.98 0.92 0.94 0.97 range-notcovered-pk_range=100
0.91 0.89 0.92 0.99 range-notcovered-si.pre_range=100
0.92 0.89 0.92 0.99 range-notcovered-si_range=100
0.97 0.94 0.98 1.01 read-only_range=10000
0.98 0.94 0.95 0.96 read-only_range=100
0.98 0.94 0.94 0.96 read-only_range=10
0.91 0.83 0.83 0.83 scan_range=100
0.96 0.91 0.92 0.92 delete_range=100
0.94 0.90 0.90 0.91 insert_range=100
0.97 0.93 0.94 0.95 read-write_range=100
0.97 0.93 0.94 0.95 read-write_range=10
0.91 0.85 0.84 0.78 update-index_range=100
0.94 0.91 0.91 0.94 update-inlist_range=100
0.96 0.92 0.92 0.93 update-nonindex_range=100
0.96 0.92 0.92 0.92 update-one_range=100
0.96 0.92 0.92 0.93 update-zipf_range=100
0.94 0.90 0.91 0.90 write-only_range=10000
For scan the problem is CPU overhead (see cpu/o below) which is 1.22X larger in 8.0.40 vs 8.0.28.
sb.met.scan.range100.pk1.dop1
--- absolute
cpu/o cs/o r/o rKB/o wKB/o o/s dbms
0.532922 6.523 0 0 0.092 23 x.my8028_rel_o2nofp
0.586497 15.751 0 0 0.122 21 x.my8030_rel_o2nofp
0.650201 17.147 0 0 0.105 19 x.my8033_rel_o2nofp
0.651073 17.145 0 0 0.098 19 x.my8039_rel_o2nofp
0.649427 17.975 0 0 0.128 19 x.my8040_rel_o2nofp
--- relative to first result
1.10 2.41 1 1 1.33 0.91 x.my8030_rel_o2nofp
1.22 2.63 1 1 1.14 0.83 x.my8033_rel_o2nofp
1.22 2.63 1 1 1.07 0.83 x.my8039_rel_o2nofp
1.22 2.76 1 1 1.39 0.83 x.my8040_rel_o2nofp
For update-index there are several changes after 8.0.28
- the changes here are similar to the ones above for the beelink server
- CPU per update (cpu/o) is 1.29X larger in 8.0.40 vs 8.0.28
- KB written per update (wKB/o) is 1.48X larger in 8.0.40 vs 8.0.28
- context switches per update (cs/o) is 1.22X larger in 8.0.40 vs 8.0.28
sb.met.update-index.range100.pk1.dop1
--- absolute
cpu/o cs/o r/o rKB/o wKB/o o/s dbms
0.004972 11.052 0 0 43.246 4198 x.my8028_rel_o2nofp
0.005658 11.948 0 0 61.606 3810 x.my8030_rel_o2nofp
0.005927 12.288 0 0 62.306 3578 x.my8033_rel_o2nofp
0.005997 12.536 0 0 62.804 3544 x.my8039_rel_o2nofp
0.006401 13.486 0 0 63.98 3294 x.my8040_rel_o2nofp
--- relative to first result
1.14 1.08 1 1 1.42 0.91 x.my8030_rel_o2nofp
1.19 1.11 1 1 1.44 0.85 x.my8033_rel_o2nofp
1.21 1.13 1 1 1.45 0.84 x.my8039_rel_o2nofp
1.29 1.22 1 1 1.48 0.78 x.my8040_rel_o2nofp
Results: socket2
Summary
- the big improvement to random-points starting in 8.0.33 is from fixing bug 102037
- for scan the problem is new CPU overhead after 8.0.28
- for update-index the problem is more CPU overhead, (maybe) more context switches and (maybe) more KB written to storage per update
- the changes to HW overheads here and the regressions here are similar to but smaller than the changes above for the beelink and pn53 servers
Relative to: x.my8028_rel_o2nofp.z11a_c24r64.pk1
col-1 : x.my8030_rel_o2nofp.z11a_c24r64.pk1
col-2 : x.my8033_rel_o2nofp.z11a_c24r64.pk1
col-3 : x.my8039_rel_o2nofp.z11a_c24r64.pk1
col-4 : x.my8040_rel_o2nofp.z11a_c24r64.pk1
col-1 col-2 col-3 col-4
0.94 1.04 1.01 1.10 hot-points_range=100
0.97 0.94 0.97 0.96 point-query.pre_range=100
0.99 0.96 0.97 0.97 point-query_range=100
0.92 1.00 1.04 1.06 points-covered-pk_range=100
0.90 0.98 1.03 1.04 points-covered-si_range=100
0.92 1.00 1.04 1.06 points-notcovered-pk_range=100
0.89 0.94 0.98 1.00 points-notcovered-si_range=100
0.93 1.76 1.81 1.87 random-points_range=1000
0.92 1.00 1.03 1.06 random-points_range=100
0.94 0.93 0.95 0.97 random-points_range=10
0.98 0.94 0.95 1.00 range-covered-pk_range=100
0.97 0.94 0.94 1.00 range-covered-si_range=100
0.98 0.94 0.95 0.98 range-notcovered-pk_range=100
0.91 0.91 0.94 0.98 range-notcovered-si.pre_range=100
0.90 0.90 0.93 0.98 range-notcovered-si_range=100
0.98 0.97 0.97 1.00 read-only_range=10000
0.99 0.96 0.96 0.98 read-only_range=100
0.99 0.95 0.96 0.97 read-only_range=10
0.90 0.84 0.80 0.84 scan_range=100
0.95 0.93 0.94 0.94 delete_range=100
0.93 0.92 0.93 0.93 insert_range=100
0.97 0.94 0.95 0.96 read-write_range=100
0.97 0.94 0.95 0.96 read-write_range=10
0.89 0.90 0.88 0.92 update-index_range=100
0.96 0.96 0.98 0.98 update-inlist_range=100
0.95 0.94 0.94 0.94 update-nonindex_range=100
0.95 0.93 0.93 0.93 update-one_range=100
0.96 0.95 0.94 0.94 update-zipf_range=100
0.91 0.90 0.92 0.90 write-only_range=10000
For scan the problem is CPU overhead (see cpu/o below) which is 1.16X larger in 8.0.40 vs 8.0.28.
sb.met.scan.range100.pk1.dop8
--- absolute
cpu/o cs/o r/o rKB/o wKB/o o/s dbms
0.180593 4.756 0 0 0.007 174 x.my8028_rel_o2nofp
0.199852 5.846 0 0 0.008 156 x.my8030_rel_o2nofp
0.213651 6.156 0 0 0.008 146 x.my8033_rel_o2nofp
0.223563 6.615 0 0 0.008 140 x.my8039_rel_o2nofp
0.209727 6.136 0 0 0.008 146 x.my8040_rel_o2nofp
--- relative to first result
1.11 1.23 1 1 1.14 0.90 x.my8030_rel_o2nofp
1.18 1.29 1 1 1.14 0.84 x.my8033_rel_o2nofp
1.24 1.39 1 1 1.14 0.80 x.my8039_rel_o2nofp
1.16 1.29 1 1 1.14 0.84 x.my8040_rel_o2nofp
For update-index there are several changes after 8.0.28
- the changes here are similar to the ones above for the beelink and pn53 servers, but not as large. And the regression here is also not as large.
- CPU per update (cpu/o) is 1.09X larger in 8.0.40 vs 8.0.28
- KB written per update (wKB/o) is 1.30X larger in 8.0.40 vs 8.0.28
- context switches per update (cs/o) is 1.05X larger in 8.0.40 vs 8.0.28
sb.met.update-index.range100.pk1.dop16--- absolute
cpu/o cs/o r/o rKB/o wKB/o o/s dbms
0.001677 11.125 0 0 16.578 38477 x.my8028_rel_o2nofp
0.001859 12.496 0 0 24.596 34256 x.my8030_rel_o2nof
0.001849 11.945 0 0 22.664 34786 x.my8033_rel_o2nofp
0.001907 12.115 0 0 24.136 33971 x.my8039_rel_o2nofp
0.001823 11.670 0 0 21.581 35336 x.my8040_rel_o2nofp
--- relative to first result
1.11 1.12 1 1 1.48 0.89 x.my8030_rel_o2nofp
1.10 1.07 1 1 1.37 0.90 x.my8033_rel_o2nofp
1.14 1.09 1 1 1.46 0.88 x.my8039_rel_o2nofp
1.09 1.05 1 1 1.30 0.92 x.my8040_rel_o2nofp
Results: dell32
Summary
- the big improvement to random-points starting in 8.0.33 is from fixing bug 102037
- for scan the problem is new CPU overhead after 8.0.28
- for update-index the problem is more CPU overhead, (maybe) more context switches and (maybe) more KB written to storage per update. I still don't understand why the regression for update-index here is so much worse than on the other servers.
Relative to: x.my8028_rel_o2nofp.z11a_c32r128.pk1
col-1 : x.my8030_rel_o2nofp.z11a_c32r128.pk1
col-2 : x.my8033_rel_o2nofp.z11a_c32r128.pk1
col-3 : x.my8039_rel_o2nofp.z11a_c32r128.pk1
col-4 : x.my8040_rel_o2nofp.z11a_c32r128.pk1
col-1 col-2 col-3 col-4
0.91 1.01 1.04 1.06 hot-points_range=100
0.98 0.94 0.93 0.95 point-query.pre_range=100
0.98 0.94 0.92 0.95 point-query_range=100
0.89 0.96 1.00 1.05 points-covered-pk.pre_range=100
0.89 0.96 1.00 1.05 points-covered-pk_range=100
0.87 0.93 1.00 1.04 points-covered-si.pre_range=100
0.88 0.94 1.01 1.05 points-covered-si_range=100
0.89 0.96 1.00 1.05 points-notcovered-pk.pre_range=100
0.89 0.96 1.00 1.05 points-notcovered-pk_range=100
0.89 0.90 0.98 1.02 points-notcovered-si.pre_range=100
0.88 0.90 0.97 1.01 points-notcovered-si_range=100
0.92 1.73 1.80 1.89 random-points.pre_range=1000
0.89 0.96 1.00 1.05 random-points.pre_range=100
0.94 0.92 0.93 0.96 random-points.pre_range=10
0.92 1.75 1.82 1.91 random-points_range=1000
0.89 0.96 1.00 1.05 random-points_range=100
0.94 0.92 0.93 0.96 random-points_range=10
0.96 0.93 0.93 0.99 range-covered-pk.pre_range=100
0.96 0.93 0.93 0.99 range-covered-pk_range=100
0.97 0.93 0.93 0.99 range-covered-si.pre_range=100
0.97 0.93 0.93 0.99 range-covered-si_range=100
0.96 0.92 0.93 0.96 range-notcovered-pk.pre_range=100
0.96 0.92 0.93 0.96 range-notcovered-pk_range=100
0.88 0.86 0.92 0.97 range-notcovered-si.pre_range=100
0.88 0.86 0.91 0.97 range-notcovered-si_range=100
0.98 0.96 0.98 1.01 read-only.pre_range=10000
0.97 0.93 0.94 0.95 read-only.pre_range=100
0.97 0.94 0.93 0.95 read-only.pre_range=10
0.99 0.96 0.97 1.01 read-only_range=10000
0.97 0.93 0.94 0.95 read-only_range=100
0.97 0.94 0.94 0.95 read-only_range=10
0.70 0.87 0.85 0.91 scan_range=100
0.94 0.92 0.93 0.93 delete_range=100
0.95 0.94 0.94 0.94 insert_range=100
0.97 0.93 0.94 0.95 read-write_range=100
0.97 0.93 0.93 0.95 read-write_range=10
0.56 0.56 0.56 0.56 update-index_range=100
0.99 0.99 1.00 1.03 update-inlist_range=100
0.96 0.95 0.95 0.95 update-nonindex_range=100
0.96 0.93 0.94 0.94 update-one_range=100
0.96 0.95 0.95 0.96 update-zipf_range=100
0.95 0.93 0.93 0.94 write-only_range=10000
For scan the problem is CPU overhead (see cpu/o below) which is 1.13X larger in 8.0.40 vs 8.0.28.
sb.met.scan.range100.pk1.dop8
--- absolute
cpu/o cs/o r/o rKB/o wKB/o o/s dbms
0.093496 3.256 0 0 0.006 246 x.my8028_rel_o2nofp
0.104925 5.301 0 0 0.007 172 x.my8030_rel_o2nofp
0.110480 4.195 0 0 0.006 215 x.my8033_rel_o2nofp
0.113259 4.306 0 0 0.006 210 x.my8039_rel_o2nofp
0.106105 4.065 0 0 0.006 225 x.my8040_rel_o2nofp
--- relative to first result
1.12 1.63 1 1 1.17 0.70 x.my8030_rel_o2nofp
1.18 1.29 1 1 1.00 0.87 x.my8033_rel_o2nofp
1.21 1.32 1 1 1.00 0.85 x.my8039_rel_o2nofp
1.13 1.25 1 1 1.00 0.91 x.my8040_rel_o2nofp
For update-index there are several changes after 8.0.28
- the regression here is huge and I am not sure about the root causes
- CPU per update (cpu/o) is 1.50X larger in 8.0.40 vs 8.0.28
- KB written per update (wKB/o) is 2.32X larger in 8.0.40 vs 8.0.28
- context switches per update (cs/o) is 1.41X larger in 8.0.40 vs 8.0.28
sb.met.update-index.range100.pk1.dop24
--- absolute
cpu/o cs/o r/o rKB/o wKB/o o/s dbms
0.001185 10.777 0 0 14.275 51319 x.my8028_rel_o2nofp
0.001739 15.475 0 0 33.68 28515 x.my8030_rel_o2nofp
0.001794 15.192 0 0 32.953 28872 x.my8033_rel_o2nofp
0.001783 15.196 0 0 33.05 28983 x.my8039_rel_o2nofp
0.001781 15.242 0 0 33.139 28966 x.my8040_rel_o2nofp
--- relative to first result
1.47 1.44 1 1 2.36 0.56 x.my8030_rel_o2nofp
1.51 1.41 1 1 2.31 0.56 x.my8033_rel_o2nofp
1.50 1.41 1 1 2.32 0.56 x.my8039_rel_o2nofp
1.50 1.41 1 1 2.32 0.56 x.my8040_rel_o2nofp
Results: ax162-s
Summary
- the big improvement to random-points starting in 8.0.33 is from fixing bug 102037
- for scan the problem is new CPU overhead after 8.0.28
- for update-index the problem is more CPU overhead, (maybe) more context switches and (maybe) more KB written to storage per update
- the changes to HW overheads here and the regressions here are similar to but smaller than the changes above for the beelink and pn53 servers
Relative to: x.my8028_rel_o2nofp.z11a_c32r128.pk1
col-1 : x.my8030_rel_o2nofp.z11a_c32r128.pk1
col-2 : x.my8033_rel_o2nofp.z11a_c32r128.pk1
col-3 : x.my8039_rel_o2nofp.z11a_c32r128.pk1
col-4 : x.my8040_rel_o2nofp.z11a_c32r128.pk1
col-1 col-2 col-3 col-4
0.89 0.93 0.99 0.99 hot-points_range=100
0.98 0.94 0.94 0.94 point-query.pre_range=100
0.96 0.93 0.93 0.94 point-query_range=100
0.90 0.96 1.01 1.02 points-covered-pk.pre_range=100
0.89 0.95 1.00 1.02 points-covered-pk_range=100
0.87 0.89 0.95 0.98 points-covered-si.pre_range=100
0.88 0.89 0.96 0.99 points-covered-si_range=100
0.90 0.96 1.01 1.02 points-notcovered-pk.pre_range=100
0.89 0.95 1.00 1.02 points-notcovered-pk_range=100
0.86 0.88 0.94 0.96 points-notcovered-si.pre_range=100
0.87 0.88 0.94 0.96 points-notcovered-si_range=100
0.95 1.61 1.68 1.72 random-points.pre_range=1000
0.90 0.96 1.01 1.02 random-points.pre_range=100
0.92 0.91 0.95 0.94 random-points.pre_range=10
0.95 1.62 1.69 1.72 random-points_range=1000
0.89 0.95 1.00 1.02 random-points_range=100
0.91 0.90 0.94 0.94 random-points_range=10
0.92 0.90 0.92 0.95 range-covered-pk.pre_range=100
0.92 0.90 0.92 0.95 range-covered-pk_range=100
0.93 0.91 0.93 0.96 range-covered-si.pre_range=100
0.94 0.91 0.94 0.96 range-covered-si_range=100
0.93 0.90 0.93 0.94 range-notcovered-pk.pre_range=100
0.93 0.90 0.93 0.94 range-notcovered-pk_range=100
0.87 0.87 0.92 0.93 range-notcovered-si.pre_range=100
0.87 0.86 0.92 0.93 range-notcovered-si_range=100
0.99 0.97 0.98 1.01 read-only.pre_range=10000
0.95 0.92 0.93 0.94 read-only.pre_range=100
0.96 0.92 0.94 0.94 read-only.pre_range=10
0.99 0.97 0.98 1.00 read-only_range=10000
0.95 0.92 0.93 0.93 read-only_range=100
0.96 0.92 0.93 0.94 read-only_range=10
0.91 0.82 0.81 0.83 scan_range=100
0.93 0.91 0.94 0.93 delete_range=100
0.96 0.94 0.94 0.94 insert_range=100
0.95 0.94 0.95 0.95 read-write_range=100
0.96 0.94 0.95 0.95 read-write_range=10
0.80 0.85 0.83 0.90 update-index_range=100
0.99 1.04 1.03 1.02 update-inlist_range=100
0.98 0.97 0.96 0.96 update-nonindex_range=100
0.96 0.93 0.94 0.95 update-one_range=100
0.97 0.97 0.96 0.95 update-zipf_range=100
0.94 0.93 0.94 0.92 write-only_range=10000
For scan the problem is CPU overhead (see cpu/o below) which is 1.20X larger in 8.0.40 vs 8.0.28.
sb.met.scan.range100.pk1.dop8
--- absolute
cpu/o cs/o r/o rKB/o wKB/o o/s dbms
0.018767 0.552 0 0 0.052 872 x.my8028_rel_o2nofp
0.020741 0.746 0 0 0.016 793 x.my8030_rel_o2nofp
0.022690 0.808 0 0 0.004 713 x.my8033_rel_o2nofp
0.023079 0.791 0 0 0.003 706 x.my8039_rel_o2nofp
0.022533 0.800 0 0 0.013 725 x.my8040_rel_o2nofp
--- relative to first result
1.11 1.35 1 1 0.31 0.91 x.my8030_rel_o2nofp
1.21 1.46 1 1 0.08 0.82 x.my8033_rel_o2nofp
1.23 1.43 1 1 0.06 0.81 x.my8039_rel_o2nofp
1.20 1.45 1 1 0.25 0.83 x.my8040_rel_o2nofp
For update-index there are several changes after 8.0.28
- the changes here are similar to the ones above for the beelink and pn53 servers, but not as large. And the regression here is also not as large
- CPU per update (cpu/o) is 1.09X larger in 8.0.40 vs 8.0.28
- KB written per update (wKB/o) is 1.37X larger in 8.0.40 vs 8.0.28
- context switches per update (cs/o) is 1.03X larger in 8.0.40 vs 8.0.28
sb.met.update-index.range100.pk1.dop40
--- absolute
cpu/o cs/o r/o rKB/o wKB/o o/s dbms
0.000550 12.157 0 0 8.354 83129 x.my8028_rel_o2nofp
0.000641 13.426 0 0 14.791 66432 x.my8030_rel_o2nofp
0.000623 12.988 0 0 13.155 70890 x.my8033_rel_o2nofp
0.000637 13.044 0 0 14.158 68996 x.my8039_rel_o2nofp
0.000600 12.514 0 0 11.416 74917 x.my8040_rel_o2nofp
--- relative to first result
1.17 1.10 1 1 1.77 0.80 x.my8030_rel_o2nofp
1.13 1.07 1 1 1.57 0.85 x.my8033_rel_o2nofp
1.16 1.07 1 1 1.69 0.83 x.my8039_rel_o2nofp
1.09 1.03 1 1 1.37 0.90 x.my8040_rel_o2nofp