Tuesday, October 29, 2024

Trying out Advanced MySQL

I recently learned of the Advanced MySQL project on github via a tweet. There is a book and a repo for an enhanced version of 8.0.40. I wish I had the time to read some of that book and learn more about the enhancements but for now I just ran my sysbench tests for it on a few large servers. Hopefully I will also run the Insert Benchmark for it on the same servers.

tl;dr

  • Advanced MySQL is good for performance
  • On the dell32 server that has a large regression for the update-index microbenchmark in MySQL 8.0.40, the Advanced MySQL project fixes most of the regression
  • On the ax162-s server for which the update-index regression is smaller, the Advanced MySQL project still does a lot better than upstream
Builds

I used upstream MySQL 8.0.28 and 8.0.40 as well as the advanced MySQL version of 8.0.40. All were compiled from source using CMAKE_BUILD_TYPE =Release, -O2 and -fno-omit-frame-pointer.

The builds are named:
  • my8028_rel_o2nofp - upstream MySQL 8.0.28
  • my8040_rel_o2nofp - upstream MySQL 8.0.40
  • my8040adv_rel_o2nofp - Advanced MySQL 8.0.40
The my.cnf files are here for 8.0.28 and for 8.0.40.

Hardware

The servers are
  • dell32
    • Dell Precision 7865 Tower Workstation with 1 socket, 128G RAM, AMD Ryzen Threadripper PRO 5975WX with 32-Cores, 2 m.2 SSD (each 2TB, RAID SW 0, ext4). 
  • ax162-s
    • AMD EPYC 9454P 48-Core Processor with SMT disabled, 128G RAM, Ubuntu 22.04 and ext4 on 2 NVMe devices with SW RAID 1. This is in the Hetzner cloud.
Benchmark

I used sysbench and my usage is explained here. A full run has 42 microbenchmarks and most test only 1 type of SQL statement.

The benchmark is run with ...
  • dell32 - 24 threads, 8 tables, 10M rows/table
  • ax162-s - 40 threads, 8 tables, 10M rows/table
Each microbenchmark runs for 300 seconds if read-only and 600 seconds otherwise. Prepared statements were enabled.

Results: overview

All of the results use relative QPS (rQPS) where:
  • rQPS is: (QPS for my version / QPS for base version)
  • base version is the QPS from MySQL 8.0.28
  • my version is one of the other versions (8.0.40 from upstream, 8.0.40 from Advanced MySQL)
Results: dell32

Summary
  • 8.0.40 from upstream and Advanced MySQL have similar performance on most microbenchmarks
  • Advanced MySQL does a lot better on the update-index microbenchmark while upstream has a large regression that started in 8.0.30 (and perhaps in 8.0.29). 
Relative to: my8028_rel_o2nofp
col-1 : my8040_rel_o2nofp
col-2 : my8040adv_rel_o2nofp

col-1   col-2
1.06    1.07    hot-points_range=100
0.95    0.95    point-query.pre_range=100
0.95    0.95    point-query_range=100
1.05    1.05    points-covered-pk.pre_range=100
1.05    1.05    points-covered-pk_range=100
1.04    1.06    points-covered-si.pre_range=100
1.05    1.07    points-covered-si_range=100
1.05    1.05    points-notcovered-pk.pre_range=100
1.05    1.06    points-notcovered-pk_range=100
1.02    1.05    points-notcovered-si.pre_range=100
1.01    1.04    points-notcovered-si_range=100
1.89    1.89    random-points.pre_range=1000
1.05    1.05    random-points.pre_range=100
0.96    0.96    random-points.pre_range=10
1.91    1.92    random-points_range=1000
1.05    1.05    random-points_range=100
0.96    0.96    random-points_range=10
0.99    0.97    range-covered-pk.pre_range=100
0.99    0.97    range-covered-pk_range=100
0.99    0.96    range-covered-si.pre_range=100
0.99    0.97    range-covered-si_range=100
0.96    0.97    range-notcovered-pk.pre_range=100
0.96    0.96    range-notcovered-pk_range=100
0.97    0.99    range-notcovered-si.pre_range=100
0.97    0.98    range-notcovered-si_range=100
1.01    1.01    read-only.pre_range=10000
0.95    0.95    read-only.pre_range=100
0.95    0.95    read-only.pre_range=10
1.01    1.01    read-only_range=10000
0.95    0.95    read-only_range=100
0.95    0.95    read-only_range=10
0.91    0.85    scan_range=100
0.93    1.00    delete_range=100
0.94    0.96    insert_range=100
0.95    0.95    read-write_range=100
0.95    0.95    read-write_range=10
0.56    0.84    update-index_range=100
1.03    1.03    update-inlist_range=100
0.95    0.97    update-nonindex_range=100
0.94    0.94    update-one_range=100
0.96    0.97    update-zipf_range=100
0.94    0.98    write-only_range=10000

From vmstat metrics both the CPU overhead (cpu/o) and context switch rate (cs/o) are much lower in Advanced MySQL 8.0.40. Metrics from vmstat and iostat for all of the microbenchmarks are here.

sb.met.update-index.range100.pk1.dop40
--- absolute
cpu/o           cs/o    r/o     rKB/o   wKB/o   o/s     dbms
0.000550        12.157  0       0       8.354   83129   my8028_rel_o2nofp
0.000600        12.514  0       0       11.416  74917   my8040_rel_o2nofp
0.000469        8.063   0       0       7.437   98180   my8040adv_rel_o2nofp
--- relative to first result
1.09            1.03    1       1       1.37    0.90    my8040_rel_o2nofp
0.85            0.66    1       1       0.89    1.18    my8040adv_rel_o2nofp

sb.met.update-nonindex.range100.pk1.dop40
--- absolute
cpu/o           cs/o    r/o     rKB/o   wKB/o   o/s     dbms
0.000363        12.236  0       0       2.611   114970  my8028_rel_o2nofp
0.000383        12.235  0       0       2.794   109896  my8040_rel_o2nofp
0.000310        8.128   0       0       2.663   121196  my8040adv_rel_o2nofp
--- relative to first result
1.06            1.00    1       1       1.07    0.96    my8040_rel_o2nofp
0.85            0.66    1       1       1.02    1.05    my8040adv_rel_o2nofp

Results: ax162-s

Summary
  • 8.0.40 from upstream and Advanced MySQL have similar performance on most microbenchmarks
  • Advanced MySQL does a lot better on the update-index microbenchmark while upstream has a large regression that started in 8.0.30 (and perhaps in 8.0.29). 

Relative to: my8028_rel_o2nofp
col-1 : my8040_rel_o2nofp
col-2 : my8040adv_rel_o2nofp

col-1   col-2
1.06    1.07    hot-points_range=100
0.95    0.95    point-query.pre_range=100
0.95    0.95    point-query_range=100
1.05    1.05    points-covered-pk.pre_range=100
1.05    1.05    points-covered-pk_range=100
1.04    1.06    points-covered-si.pre_range=100
1.05    1.07    points-covered-si_range=100
1.05    1.05    points-notcovered-pk.pre_range=100
1.05    1.06    points-notcovered-pk_range=100
1.02    1.05    points-notcovered-si.pre_range=100
1.01    1.04    points-notcovered-si_range=100
1.89    1.89    random-points.pre_range=1000
1.05    1.05    random-points.pre_range=100
0.96    0.96    random-points.pre_range=10
1.91    1.92    random-points_range=1000
1.05    1.05    random-points_range=100
0.96    0.96    random-points_range=10
0.99    0.97    range-covered-pk.pre_range=100
0.99    0.97    range-covered-pk_range=100
0.99    0.96    range-covered-si.pre_range=100
0.99    0.97    range-covered-si_range=100
0.96    0.97    range-notcovered-pk.pre_range=100
0.96    0.96    range-notcovered-pk_range=100
0.97    0.99    range-notcovered-si.pre_range=100
0.97    0.98    range-notcovered-si_range=100
1.01    1.01    read-only.pre_range=10000
0.95    0.95    read-only.pre_range=100
0.95    0.95    read-only.pre_range=10
1.01    1.01    read-only_range=10000
0.95    0.95    read-only_range=100
0.95    0.95    read-only_range=10
0.91    0.85    scan_range=100
0.93    1.00    delete_range=100
0.94    0.96    insert_range=100
0.95    0.95    read-write_range=100
0.95    0.95    read-write_range=10
0.56    0.84    update-index_range=100
1.03    1.03    update-inlist_range=100
0.95    0.97    update-nonindex_range=100
0.94    0.94    update-one_range=100
0.96    0.97    update-zipf_range=100
0.94    0.98    write-only_range=10000

From vmstat metrics both the CPU overhead (cpu/o) and context switch rate (cs/o) are much lower in Advanced MySQL 8.0.40. Metrics from vmstat and iostat for all of the microbenchmarks are here.

sb.met.update-index.range100.pk1.dop24
--- absolute
cpu/o           cs/o    r/o     rKB/o   wKB/o   o/s     dbms
0.001185        10.777  0       0       14.275  51319   my8028_rel_o2nofp
0.001781        15.242  0       0       33.139  28966   my8040_rel_o2nofp
0.001328        9.994   0       0       21.068  43145   my8040adv_rel_o2nofp
--- relative to first result
1.50            1.41    1       1       2.32    0.56    my8040_rel_o2nofp
1.12            0.93    1       1       1.48    0.84    my8040adv_rel_o2nofp

sb.met.update-nonindex.range100.pk1.dop24
--- absolute
cpu/o           cs/o    r/o     rKB/o   wKB/o   o/s     dbms
0.000610        9.890   0       0       2.788   84660   my8028_rel_o2nofp
0.000645        10.006  0       0       3.215   80750   my8040_rel_o2nofp
0.000590        7.445   0       0       3.123   81943   my8040adv_rel_o2nofp
--- relative to first result
1.06            1.01    1       1       1.15    0.95    my8040_rel_o2nofp
0.97            0.75    1       1       1.12    0.97    my8040adv_rel_o2nofp





















No comments:

Post a Comment

Battle of the Mallocators

If you use RocksDB and want to avoid OOM then use jemalloc or tcmalloc and avoid glibc malloc. That was true in 2015 and remains true in 202...