Thursday, September 26, 2024

InnoDB code bloat in MySQL 8.0 makes me sad

InnoDB uses a lot more CPU per query in MySQL 8.0, which makes it a lot slower for low-concurrency workloads that are CPU-bound. That is offset by improvements that reduce mutex contention, so it might not be slower for high-concurrency workloads.

It would be great to get the reduction in mutex contention without the increase in CPU overhead. Alas, I don't think that is ever going to happen and this makes me sad. I doubt the performance regressions for InnoDB in MySQL 8.0 will ever be fixed and the workarounds are: MyRocks, MariaDB, MySQL 5.7 forever, Postgres.

I explained in a recent post that the problems are code bloat:

  • MySQL 8.0 uses more instructions per query
  • MySQL 8.0 wastes more time on cache and TLB activity per query 
Another recent post shows that the MySQL binary is growing at an alarming rate. This post has more details with a focus on InnoDB. Again, this makes me sad. InnoDB was very good to me for a long time but that time is ending.

I have spent much time trying to find a way to undo some of the bloat and I have learned multiple times that I can't undo the damage via compile-time options, even though there are a few things that might make things 5% to 20% faster. But some of those things also make older MySQL faster.

Updates
  • v1 of this post claimed that InnoDB started to use Boost in MySQL 8.0.2. That is incorrect; it started to use STL (unordered_map, array) in 8.0.2
  • see Bloat in 8.0.2 from STL below for details on the impact of the diff that adds STL in 8.0.2

Bloat in 8.0.2 from STL

I suspect that the new usage of STL by InnoDB in fil0fil.cc contributes to the code bloat that I report below, and there is a big jump in the size of fil0fil.o from 8.0.1 to 8.0.2, but it is difficult to prove that STL is one of the causes. Even the results here don't prove it because the diff that adds STL in 8.0.2 is a large squash merge with many other changes.

I compiled MySQL at the diff that first uses STL in fil0fil.cc and the one prior to it:
  • 201b2b20d1
    • adds the usage of STL. But it is a squash merge that combines many commits. So STL remains a suspect but I am still not certain. Regardless, the results show that things grow a lot from this commit.
  • 817379925c
    • the diff prior to 201b2b20d1. 
The patches that I had to apply to get these to compile and the CMake command line are here. Too bad it is hard to automate the search for bloat because all of these builds need patches to compile. I won't blame C++ for this because I don't need patches to compile old versions of RocksDB.

The conclusion, from 817379925c to 201b2b20d1
  • libinnobase.a grows by ~5M from 26492K to 31518K
  • fil0fil.o grows by 2.6X from 548K to 1440K
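
The arithmetic behind those two bullets can be sketched in a few lines of Python. The sizes in KB are the ones reported above; the helper name `growth` is made up for this illustration.

```python
def growth(before_kb, after_kb):
    """Return (absolute growth in KB, growth ratio) for two sizes given in KB."""
    return after_kb - before_kb, after_kb / before_kb

# Sizes in KB as reported above, from 817379925c to 201b2b20d1
sizes = {
    "libinnobase.a": (26492, 31518),
    "fil0fil.o": (548, 1440),
}

for name, (before, after) in sizes.items():
    delta_kb, ratio = growth(before, after)
    # Print the absolute growth in KB and MB, then the ratio
    print(f"{name}: +{delta_kb}K ({delta_kb / 1024:.1f}M), {ratio:.2f}X")
```

The ratio matters more than the absolute number here: libinnobase.a grows by a larger amount, but fil0fil.o grows by a much larger factor.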

A few things that don't have a big impact

A short summary of things I tried:
  • MySQL 5.7 started to use -fPIC while compiling the server. That was not used in 5.6. Alas, switching from -fPIC to -fpic doesn't have a big impact.
  • Avoiding -fPIC via -DDISABLE_SHARED (in the cases where that doesn't break the build) also doesn't have a big impact. 
  • Compiling with -DWITH_LTO=ON makes things ~5% faster
  • PGO can be a big deal, but it needs more evaluation to understand whether I need a PGO build per workload, which would be costly to maintain

Measuring code bloat

InnoDB started to use STL in MySQL 8.0.2-dmr and that is definitely part of the code-bloat problem.

Here I measure bloat using several methods. I am sure some of them are more flawed than others but they all show much bloat in MySQL 8.0. The methods are:
  • lines of code (yes, this is far from perfect)
  • size of libinnobase.a for MySQL compiled with CMAKE_BUILD_TYPE=Release
  • size of object files for MySQL compiled with CMAKE_BUILD_TYPE=Release
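
The object-file method above can be sketched as a small Python script. This is not the script used for this post, only an illustration under stated assumptions: both builds already exist, and the function names (`object_sizes`, `compare`) and any directory paths passed to it are hypothetical.

```python
import os
import sys

def object_sizes(build_dir, suffix=".o"):
    """Map object-file name -> size in bytes for all files under build_dir."""
    sizes = {}
    for root, _dirs, files in os.walk(build_dir):
        for name in files:
            if name.endswith(suffix):
                path = os.path.join(root, name)
                # Key by file name only (assumption: names are unique enough
                # within the subtree being compared); sum sizes on a clash.
                sizes[name] = sizes.get(name, 0) + os.path.getsize(path)
    return sizes

def compare(old_dir, new_dir):
    """Return rows (name, old_bytes, new_bytes, ratio), largest growth first."""
    old, new = object_sizes(old_dir), object_sizes(new_dir)
    rows = []
    # Only files present in both builds can be compared
    for name in sorted(set(old) & set(new)):
        if old[name] > 0:
            rows.append((name, old[name], new[name], new[name] / old[name]))
    rows.sort(key=lambda r: r[2] - r[1], reverse=True)
    return rows

if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: script.py <old build dir> <new build dir>
    for name, old_b, new_b, ratio in compare(sys.argv[1], sys.argv[2]):
        print(f"{name:30s} {old_b // 1024:8d}K {new_b // 1024:8d}K {ratio:5.2f}X")
```

Pointing it at the InnoDB subdirectory of two Release build trees lists the files that grew the most first. Sorting by absolute growth rather than ratio keeps tiny files from dominating the top of the list.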
From the files I checked, fil0fil.cc grew the most
Up and up, keep on growing!
Based on the size of object files there is much innovation in fil0fil.cc. There is a big jump from 8.0.1 to 8.0.2 -- STL (unordered_map, array) is first used in 8.0.2, and there were many other changes.
There is small growth up to 8.0.28 and then things take off. If 8.0.40 really fixes bug 111538 then some of this will be undone.

7 comments:

  1. This comment has been removed by the author.

    Replies
    1. At this point it is more correlation than causation. I know that the usage of STL arrives in fil0fil.cc in 8.0.2. Will update the blog posts after I attempt to compile it at that commit and the prior commit.

  2. > InnoDB started to use STL in MySQL 8.0.2-dmr and that is definitely part of the code-bloat problem.
    Hmm, I realize how that could be a problem. Templatized code would lead to code bloat. So you may ignore my previous comment and this one.

  3. I wouldn't over-index on code bloat for performance: while you typically won't ever get the size of software back to where it was (it is very hard to undo adding features), you can get back performance. For example, MongoDB gradually got slower from version 4.4 to 7.3, as new features and capabilities slowed hot code paths. However, in 8.0 we addressed most of those regressions and improved performance for some important workloads to heights not seen before. Just as you can slow performance through hundreds of tiny regressions, you can also improve it. Additionally, you need to shift from just fighting regressions to looking holistically at areas where you can claw back perf.

    Replies
    1. The focus on code bloat is motivated by results (linked in this blog post) showing that iTLB and icache activity is greatly increased over time.

      While I am all for clawing back performance, I don't have the capacity to do that on my own. And if it is done without cooperation from upstream, then it will all get undone.

  4. Interesting that you mention fil0fil.cc. I am spending a lot of time in there for MySQL startup with many tables. From what I understand, the new data dictionary in 8.0 generated part of that code bloat, but it is unclear whether this impacted runtime.

    https://jfg-mysql.blogspot.com/2024/09/blog-post.html

    Replies
    1. Thanks for making MySQL better.

      It is very expensive to map regressions back to diffs so long after the fact. Anything older than MySQL 8.0.24 needs patches to compile on modern Ubuntu. While I already have the patches archived for the point releases, those patches are usually not sufficient for the diffs in between point releases.

