Tuesday, September 24, 2024

The size of the mysqld binary as a proxy for innovation

I have been documenting performance regressions over time in MySQL. The regressions mean that some workloads get less throughput because the server uses more CPU per SQL operation. From perf stat I see more instructions per query, along with more instruction cache and TLB activity per query. A recent blog post from me has more details.

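This is bloat if you are a pessimist and a side-effect of innovation if you are an optimist. I must repeat that the issue is more serious for low-concurrency workloads. At high concurrency the extra CPU overhead is often offset by the great work that has been done to reduce mutex contention, while at low concurrency there is little contention to remove, so the extra CPU per query shows up directly as lower throughput.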

The table below shows the size of the mysqld binary both as-is (not stripped, includes debug symbols) and stripped. The size of the stripped binary (almost) doubled from the last 5.6 release (5.6.51) to the last 5.7 release (5.7.44). It doubled again from 5.7.44 to a somewhat recent 8.0 release (8.0.28).

If the binary size is a proxy for innovation, then there has been much innovation (about 2X per major release). But I am not sure that ends well given the impact on CPU overhead.

So I ask two things:

  1. Be more careful about innovation going forward
  2. Start using Nyrkio to detect regressions early in the development cycle

          -- size in MB --
version   as-is   stripped
5.6.51      28      16
5.7.1       29      15
5.7.3       31      16
5.7.5       40      19
5.7.7       69      26
5.7.9       71      27
5.7.10      72      27
5.7.19      72      27
5.7.27      73      28
5.7.35      80      30
5.7.44      81      30
8.0.0      117      39
8.0.1      109      39
8.0.2      159      47
8.0.4      176      47
8.0.12     179      50
8.0.13     190      52
8.0.14     192      52
8.0.15     192      52
8.0.16     194      54
8.0.17     196      55
8.0.18     207      57
8.0.19     209      57
8.0.20     213      58
8.0.21     219      58
8.0.22     221      59
8.0.23     223      60
8.0.24     225      60
8.0.25     225      60
8.0.26     242      62
8.0.27     232      61
8.0.28     233      61
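
The growth factors quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the sizes are copied from the table for the three releases named in the text (5.6.51, 5.7.44 and 8.0.28), and the names SIZES_MB and growth are mine, chosen for illustration.

```python
# Sizes in MB copied from the table above, as (as-is, stripped).
# Only the three releases named in the text are included.
SIZES_MB = {
    "5.6.51": (28, 16),
    "5.7.44": (81, 30),
    "8.0.28": (233, 61),
}


def growth(old, new, stripped=True):
    """Return size(new) / size(old) for the stripped or as-is binary."""
    idx = 1 if stripped else 0
    return SIZES_MB[new][idx] / SIZES_MB[old][idx]


if __name__ == "__main__":
    pairs = [("5.6.51", "5.7.44"), ("5.7.44", "8.0.28"), ("5.6.51", "8.0.28")]
    for old, new in pairs:
        print(
            f"{old} -> {new}: "
            f"stripped {growth(old, new):.2f}x, "
            f"as-is {growth(old, new, stripped=False):.2f}x"
        )
```

For the stripped binary this gives a bit less than 2X from 5.6.51 to 5.7.44 and a bit more than 2X from 5.7.44 to 8.0.28, so almost 4X across the two major releases. The as-is binary, which includes debug symbols, grows faster at almost 3X per major release.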
