Showing posts with label mysql. Show all posts

Monday, August 25, 2025

MySQL 5.6 thru 9.4: small server, Insert Benchmark

This has results for the Insert Benchmark on a small server with InnoDB from MySQL 5.6 through 9.4. The workload here uses low concurrency (1 client), a small server and a cached database. I run it this way to look for CPU regressions before moving on to IO-bound workloads with high concurrency.

tl;dr

  • good news - there are no large regressions after MySQL 8.0
  • bad news - there are large regressions from MySQL 5.6 to 5.7 to 8.0
    • load in 8.0, 8.4 and 9.4 gets about 60% of the throughput vs 5.6
    • queries in 8.0, 8.4 and 9.4 get between 60% and 70% of the throughput vs 5.6

Builds, configuration and hardware

I compiled MySQL 5.6.51, 5.7.44, 8.0.43, 8.4.6 and 9.4.0 from source.

The server is an ASUS PN53 with 8 cores, AMD SMT disabled and 32G of RAM. The OS is Ubuntu 24.04. Storage is 1 NVMe device with ext4. More details on it are here.

I used the cz12a_c8r32 config file (my.cnf) which is here for 5.6.51, 5.7.44, 8.0.43, 8.4.6 and 9.4.0.

The Benchmark

The benchmark is explained here. I recently updated the benchmark client to connect via socket rather than TCP so that I can get non-SSL connections for all versions tested. AFAIK, with TCP I can only get SSL connections for MySQL 8.4 and 9.4.

The workload uses 1 client, 1 table with 30M rows and a cached database.

The benchmark steps are:

  • l.i0
    • insert 30 million rows per table in PK order. The table has a PK index but no secondary indexes. There is one connection per client.
  • l.x
    • create 3 secondary indexes per table. There is one connection per client.
  • l.i1
    • use 2 connections/client. One inserts 40 million rows per table and the other does deletes at the same rate as the inserts. Each transaction modifies 50 rows (big transactions). This step is run for a fixed number of inserts, so the run time varies depending on the insert rate.
  • l.i2
    • like l.i1 but each transaction modifies 5 rows (small transactions) and 10 million rows are inserted and deleted per table.
    • Wait for N seconds after the step finishes to reduce variance during the read-write benchmark steps that follow. The value of N is a function of the table size.
  • qr100
    • use 3 connections/client. One does range queries and performance is reported for this. The second does 100 inserts/s and the third does 100 deletes/s. The second and third are less busy than the first. The range queries use covering secondary indexes. This step is run for 1800 seconds. If the target insert rate is not sustained then that is considered to be an SLA failure. If the target insert rate is sustained then the step does the same number of inserts for all systems tested.
  • qp100
    • like qr100 except uses point queries on the PK index
  • qr500
    • like qr100 but the insert and delete rates are increased from 100/s to 500/s
  • qp500
    • like qp100 but the insert and delete rates are increased from 100/s to 500/s
  • qr1000
    • like qr100 but the insert and delete rates are increased from 100/s to 1000/s
  • qp1000
    • like qp100 but the insert and delete rates are increased from 100/s to 1000/s
Results: overview

The performance report is here.

The summary section has 3 tables. The first shows absolute throughput by DBMS tested X benchmark step. The second has throughput relative to the version from the first row of the table. The third shows the background insert rate for benchmark steps with background inserts. The second table makes it easy to see how performance changes over time. The third table makes it easy to see which DBMS+configs failed to meet the SLA. The summary section is here.

Below I use relative QPS (rQPS) to explain how performance changes. It is: (QPS for $me / QPS for $base) where $me is the result for some version and $base is the result from MySQL 5.6.51.

When rQPS is > 1.0 then performance improved over time. When it is < 1.0 then there are regressions. When it is 0.90 then I claim there is a 10% regression. The Q in relative QPS measures: 
  • insert/s for l.i0, l.i1, l.i2
  • indexed rows/s for l.x
  • range queries/s for qr100, qr500, qr1000
  • point queries/s for qp100, qp500, qp1000
Below I use colors to highlight the relative QPS values with yellow for regressions and blue for improvements.
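As a concrete sketch, rQPS is just a ratio. The numbers below are illustrative, not measured results:

```python
# Relative QPS (rQPS) as described above: > 1.0 is an improvement,
# < 1.0 is a regression. Example numbers are made up for illustration.
def rqps(qps_me: float, qps_base: float) -> float:
    return qps_me / qps_base

# If some version sustains 6000 inserts/s where the base sustained 10000:
print(round(rqps(6000.0, 10000.0), 2))  # 0.6, i.e. a 40% regression
```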

Results: details

This table is a copy of the second table in the summary section. It lists the relative QPS (rQPS) for each benchmark step where rQPS is explained above.

The benchmark steps are explained above, they are:
  • l.i0 - initial load in PK order
  • l.x - create 3 secondary indexes per table
  • l.i1, l.i2 - random inserts and random deletes
  • qr100, qr500, qr1000 - short range queries with background writes
  • qp100, qp500, qp1000 - point queries with background writes

dbms     l.i0  l.x   l.i1  l.i2  qr100 qp100 qr500 qp500 qr1000 qp1000
5.6.51   1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00   1.00
5.7.44   0.89  1.52  1.14  1.08  0.83  0.84  0.83  0.84  0.84   0.84
8.0.43   0.60  2.50  1.04  0.86  0.69  0.62  0.69  0.63  0.70   0.62
8.4.6    0.60  2.53  1.03  0.86  0.68  0.61  0.67  0.61  0.68   0.61
9.4.0    0.60  2.53  1.03  0.87  0.70  0.63  0.70  0.63  0.70   0.62



The summary is:
  • l.i0
    • there are large regressions starting in 8.0 and modern MySQL only gets ~60% of the throughput relative to 5.6 because modern MySQL has more CPU overhead
  • l.x
    • I ignore this but there have been improvements
  • l.i1, l.i2
    • there was a large improvement in 5.7 but new CPU overhead since 8.0 reduces that
  • qr100, qr500, qr1000
    • there are large regressions from 5.6 to 5.7 and then again from 5.7 to 8.0
    • throughput in modern MySQL is ~60% to 70% of what it was in 5.6


    Thursday, August 21, 2025

    Sysbench for MySQL 5.6 thru 9.4 on a small server

    This has performance results for InnoDB from MySQL 5.6.51, 5.7.44, 8.0.43, 8.4.6 and 9.4.0 on a small server with sysbench microbenchmarks. The workload here is cached by InnoDB and my focus is on regressions from new CPU overheads. This work was done by Small Datum LLC and not sponsored. 

    tl;dr

    • Low concurrency (1 client) is the worst case for regressions in modern MySQL
    • MySQL 8.0, 8.4 and 9.4 are much slower than 5.6.51 in all but 2 of the 32 microbenchmarks
      • The bad news - performance regressions aren't getting fixed
      • The good news - regressions after MySQL 8.0 are small

    Builds, configuration and hardware

    I compiled MySQL from source for versions 5.6.51, 5.7.44, 8.0.43, 8.4.6 and 9.4.0.

    The server is an ASUS ExpertCenter PN53 with AMD Ryzen 7 7735HS, 32G RAM and an m.2 device for the database. More details on it are here. The OS is Ubuntu 24.04 and the database filesystem is ext4 with discard enabled.

    The my.cnf files are here.

    Benchmark

    I used sysbench and my usage is explained here. To save time I only run 32 of the 42 microbenchmarks and most test only 1 type of SQL statement. Benchmarks are run with the database cached by InnoDB.

    The tests are run using 1 table with 50M rows. The read-heavy microbenchmarks run for 600 seconds and the write-heavy for 900 seconds.

    Results

    All files I saved from the benchmark are here and the spreadsheet is here.

    The microbenchmarks are split into 4 groups -- 1 for point queries, 2 for range queries, 1 for writes. For the range query microbenchmarks, part 1 has queries that don't do aggregation while part 2 has queries that do aggregation. 

    I provide charts below with relative QPS. The relative QPS is the following:
    (QPS for some version) / (QPS for MySQL 5.6.51)
    When the relative QPS is > 1 then some version is faster than 5.6.51. When it is < 1 then there might be a regression. Values from iostat and vmstat divided by QPS are also provided here. These can help to explain why something is faster or slower because they show how much HW is used per request.
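    A minimal sketch of that normalization, with made-up numbers (not from these runs):

```python
# Divide a vmstat/iostat rate by QPS to estimate HW consumed per request.
def per_query(counter_per_sec: float, qps: float) -> float:
    return counter_per_sec / qps

# e.g. 4.0 CPU seconds/s of user+sys time at 20000 queries/s:
cpu_per_query = per_query(4.0, 20000.0)  # 0.0002 CPU seconds per query
```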

    Results: point queries

    Based on results from vmstat the regressions are from new CPU overheads.
    Results: range queries without aggregation

    Based on results from vmstat the regressions are from new CPU overheads.
    Results: range queries with aggregation

    Based on results from vmstat the regressions are from new CPU overheads.
    Results: writes

    Based on results from vmstat the regressions are from new CPU overheads.


    Thursday, May 1, 2025

    The impact of innodb_doublewrite_pages in MySQL 8.0.41

    After reading a blog post from JFG on changes to innodb_doublewrite_pages and bug 111353, I wanted to understand the impact from that on the Insert Benchmark using a large server.

    I test the impact from:

    • using a larger (non-default) value for innodb_doublewrite_pages
    • disabling the doublewrite buffer

    tl;dr

    • Using a larger value for innodb_doublewrite_pages improves QPS by up to 10%
    • Disabling the InnoDB doublewrite buffer is great for performance, but bad for durability. I don't suggest you do this in production.

    Builds, configuration and hardware

    I compiled upstream MySQL 8.0.41 from source.

    The server is an ax162-s from Hetzner with 48 cores (AMD EPYC 9454P), 128G RAM and AMD SMT disabled. It uses Ubuntu 22.04 and storage is ext4 using SW RAID 1 over 2 locally attached NVMe devices. More details on it are here. At list prices a similar server from Google Cloud costs 10X more than from Hetzner.

    The MySQL configuration files are:
    • cz11a_c32r128 - the base configuration file that does not set innodb_doublewrite_pages and gets innodb_doublewrite_pages=8
    • cz11e_c32r128 - adds innodb_doublewrite_pages=128 to the base config
    • cz11f_c32r128 - adds innodb_doublewrite=0 to the base config (disables doublewrite)
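    As a sketch, the two non-base configs each differ from the base by one line (option names as given above; the [mysqld] section header is assumed):

```ini
[mysqld]
# cz11e_c32r128: a larger doublewrite batch than the default of 8
innodb_doublewrite_pages=128

# cz11f_c32r128 instead disables the doublewrite buffer (bad for durability):
# innodb_doublewrite=0
```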
    The Benchmark

    The benchmark is explained here and is run with 20 clients and a table per client with an IO-bound workload. The database is larger than memory with 200M rows per table and 20 tables.

    The benchmark steps are:

    • l.i0
      • insert 200 million rows per table in PK order. The table has a PK index but no secondary indexes. There is one connection per client.
    • l.x
      • create 3 secondary indexes per table. There is one connection per client.
    • l.i1
      • use 2 connections/client. One inserts 4M rows per table and the other does deletes at the same rate as the inserts. Each transaction modifies 50 rows (big transactions). This step is run for a fixed number of inserts, so the run time varies depending on the insert rate.
    • l.i2
      • like l.i1 but each transaction modifies 5 rows (small transactions) and 1M rows are inserted and deleted per table.
      • Wait for X seconds after the step finishes to reduce variance during the read-write benchmark steps that follow. The value of X is a function of the table size.
    • qr100
      • use 3 connections/client. One does range queries and performance is reported for this. The second does 100 inserts/s and the third does 100 deletes/s. The second and third are less busy than the first. The range queries use covering secondary indexes. This step is run for 1800 seconds. If the target insert rate is not sustained then that is considered to be an SLA failure. If the target insert rate is sustained then the step does the same number of inserts for all systems tested.
    • qp100
      • like qr100 except uses point queries on the PK index
    • qr500
      • like qr100 but the insert and delete rates are increased from 100/s to 500/s
    • qp500
      • like qp100 but the insert and delete rates are increased from 100/s to 500/s
    • qr1000
      • like qr100 but the insert and delete rates are increased from 100/s to 1000/s
    • qp1000
      • like qp100 but the insert and delete rates are increased from 100/s to 1000/s
    Results: overview

    The performance report is here.

    The summary section in the performance report has 3 tables. The first shows absolute throughput by DBMS tested X benchmark step. The second has throughput relative to the version from the first row of the table. The third shows the background insert rate for benchmark steps with background inserts and all systems sustained the target rates. The second table makes it easy to see how performance changes over time. The third table makes it easy to see which DBMS+configs failed to meet the SLA.

    Below I use relative QPS to explain how performance changes. It is: (QPS for $me / QPS for $base) where $me is the result with the cz11e_c32r128 or cz11f_c32r128 configs and $base is the result from the cz11a_c32r128 config. The configs are explained above: cz11e_c32r128 increases innodb_doublewrite_pages and cz11f_c32r128 disables the doublewrite buffer.

    When relative QPS is > 1.0 then performance improved over time. When it is < 1.0 then there are regressions. The Q in relative QPS measures: 
    • insert/s for l.i0, l.i1, l.i2
    • indexed rows/s for l.x
    • range queries/s for qr100, qr500, qr1000
    • point queries/s for qp100, qp500, qp1000
    Below I use colors to highlight the relative QPS values with red for <= 0.95, green for >= 1.05 and grey for values between 0.95 and 1.05.

    Results: more IO-bound

    The performance summary is here.

    From the cz11e_c32r128 config that increases innodb_doublewrite_pages to 128:
    • the impact on write-heavy steps is mixed: create index was ~7% slower and l.i2 was ~10% faster
    • the impact on range query + write steps is positive but small. The improvements were 0%, 0% and 4%. Note that these steps are not as IO-bound as point query + write steps and the range queries do ~0.3 reads per query (see here).
    • the impact on point query + write steps is positive and larger. The improvements were 3%, 8% and 9%. These benchmark steps are much more IO-bound than the steps that do range queries.
    From the cz11f_c32r128 config that disables the InnoDB doublewrite buffer:
    • the impact on write-heavy steps is large -- from 1% to 36% faster.
    • the impact on range query + write steps is positive but small. The improvements were 0%, 2% and 15%. Note that these steps are not as IO-bound as point query + write steps and the range queries do ~0.3 reads per query (see here).
    • the impact on point query + write steps is positive and larger. The improvements were 14%, 41% and 42%.

    Sunday, March 16, 2025

    At what level of concurrency do MySQL 5.7 and 8.0 become faster than 5.6?

    Are MySQL 5.7 and 8.0 faster than 5.6? That depends a lot on the workload -- both types of SQL and amount of concurrency. Here I summarize results from sysbench on a larger server (48 cores) using 1, 4, 6, 8, 10, 20 and 40 clients to show how things change.

    tl;dr

    • the workload here is microbenchmarks with a database cached by InnoDB
    • 5.7.44 is faster than 8.0.x at all concurrency levels on most microbenchmarks
    • for 5.6.51 vs 8.0.x
      • for point queries, 5.6.51 is faster at <= 8 clients
      • for range queries without aggregation 5.6.51 is always faster
      • for range queries with aggregation 5.6.51 is faster except at 40 clients
      • for writes, 5.6.51 is almost always faster at 10 or fewer clients (excluding update-index)
    Performance summaries

    For point queries:
    • 5.7.44 is always faster than 8.0
    • 8.0.28 suffers from bug 102037 - found by me with sysbench, fixed in 8.0.31
    • at what level of concurrency do most things get faster in 5.7 & 8.0 vs 5.6?
      • 5.7.44 becomes faster than 5.6.51 at 6+ clients
      • 8.0.x becomes faster than 5.6.51 at between 10 and 20 clients
      • Two of the microbenchmarks are always faster in 5.6.51 - both do non-covering queries on a secondary index
    For range queries without aggregation
    • 5.7.44 is always faster than 8.0.x
    • 5.6.51 is always faster than 5.7.44 and 8.0.x
    For range queries with aggregation
    • 5.7.44 is almost always faster than 8.0.x
    • 5.7.44 becomes faster than 5.6.51 at 6+ clients
    • 8.0.x becomes faster than 5.6.51 at 40 clients
    For writes
    • For update-index
      • 5.7.44 and 8.0.x are always faster than 5.6.51 at 4+ clients
      • There is an odd drop from ~6X to ~3X for 8.0.32 and 8.0.39 at 20 clients
    • 5.7.44 is mostly faster than 8.0.x for 1 to 20 clients and they have similar results at 40 clients
    • 5.7.44 & 8.0.x are always faster than 5.6.51 at 20+ clients
    Builds, configuration and hardware

    I compiled MySQL from source for versions 5.6.51, 5.7.44, 8.0.28, 8.0.32, 8.0.39 and 8.0.41.

    The server is an ax162-s from Hetzner with 48 cores (AMD EPYC 9454P), 128G RAM and AMD SMT disabled. It uses Ubuntu 22.04 and storage is ext4 with SW RAID 1 over 2 locally attached NVMe devices. More details on it are here. At list prices a similar server from Google Cloud costs 10X more than from Hetzner.

    The configuration files are named my.cnf.cz11a_c32r128 and are here for 5.6.51, 5.7.44, 8.0.28, 8.0.32, 8.0.39 and 8.0.41.

    Benchmark

    I used sysbench and my usage is explained here. To save time I only run 27 of the 42 microbenchmarks and most test only 1 type of SQL statement. Benchmarks are run with the database cached by InnoDB.

    The tests run with 8 tables and 10M rows/table. There are 40 client threads, read-heavy microbenchmarks run for 180 seconds and write-heavy run for 300 seconds.

    The command lines to run all tests are:
    bash r.sh 8 10000000 180 300 md2 1 1 1
    bash r.sh 8 10000000 180 300 md2 1 1 4
    bash r.sh 8 10000000 180 300 md2 1 1 6
    bash r.sh 8 10000000 180 300 md2 1 1 8
    bash r.sh 8 10000000 180 300 md2 1 1 10
    bash r.sh 8 10000000 180 300 md2 1 1 20
    bash r.sh 8 10000000 180 300 md2 1 1 40

    Results

    For the results below I split the microbenchmarks into 4 groups: point queries, range queries without aggregation, range queries with aggregation, and writes. The spreadsheet with all data is here. Files with performance summaries for relative and absolute QPS are here. Values from iostat and vmstat per microbenchmark are here for 1 client, 4 clients, 6 clients, 8 clients, 10 clients, 20 clients and 40 clients. These help to explain why something is faster or slower because they show how much HW is used per query.

    The relative QPS is the following where $version is >= 5.7.44.
    (QPS for $version) / (QPS for MySQL 5.6.51)
    The numbers in the spreadsheets are the relative QPS. When the relative QPS is > 1 then $version is faster than MySQL 5.6.51. When it is 3.0 then $version is 3X faster than the base case.

    Results: charts 

    Notes on the charts

    • the y-axis shows the relative QPS
    • the y-axis starts at 0.80 to make it easier to see differences
    • in some cases the y-axis truncates the good outliers, cases where the relative QPS is greater than 1.5. I do this to improve readability for values near 1.0. Regardless, the improvements are nice.
    Results: point queries

    Summary
    • 5.7.44 is always faster than 8.0
    • 8.0.28 suffers from bug 102037 - found by me with sysbench, fixed in 8.0.31
    • at what level of concurrency do most things get faster in 5.7 & 8.0 vs 5.6?
      • 5.7.44 becomes faster than 5.6.51 at 6+ clients
      • 8.0.x becomes faster than 5.6.51 at between 10 and 20 clients
      • Two of the microbenchmarks are always faster in 5.6.51 - both do non-covering queries on a secondary index
    Results: range queries without aggregation

    Summary
    • 5.7.44 is always faster than 8.0.x
    • 5.6.51 is always faster than 5.7.44 and 8.0.x
    Results: range queries with aggregation

    Summary
    • 5.7.44 is almost always faster than 8.0.x
    • 5.7.44 becomes faster than 5.6.51 at 6+ clients
    • 8.0.x becomes faster than 5.6.51 at 40 clients
    Results: writes

    The relative speedup for the update-index microbenchmark is frequently so large that it obscures the smaller changes on other microbenchmarks. So here I truncate the y-axis for some of the charts (for 6+ clients) and the section that follows has the charts without truncation.

    Summary
    • For update-index
      • 5.7.44 and 8.0.x are always faster than 5.6.51 at 4+ clients
      • There is an odd drop from ~6X to ~3X for 8.0.32 and 8.0.39 at 20 clients but you can't see that on the charts in this section because of the truncation. It is visible in the next section. From vmstat I see an increase in CPU/operation (cpu/o) and context switches /operation (cs/o) at 20 clients but not at 40 clients.
    • 5.7.44 is mostly faster than 8.0.x for 1 to 20 clients and they have similar results at 40 clients
    • 5.7.44 & 8.0.x are always faster than 5.6.51 at 20+ clients
    Results: charts for writes without truncation

    The y-axis is truncated in the charts for writes in the previous section for 6+ clients. This section has those charts without truncation.

    Thursday, February 20, 2025

    How to find Lua scripts for sysbench using LUA_PATH

    sysbench is a great tool for benchmarks and I appreciate all of the work the maintainer (Alexey Kopytov) put into it as that is often a thankless task. Today I struggled to figure out how to load Lua scripts from something other than the default location that was determined when sysbench was compiled. It turns out that LUA_PATH is the thing to set, but the syntax isn't what I expected.

    My first attempt was this, because the PATH in LUA_PATH implies directory names. But that failed.
      LUA_PATH="/mnt/data/sysbench.lua/lua" sysbench ... oltp_insert run

    It turns out that LUA_PATH uses special semantics and this worked:
      LUA_PATH="/mnt/data/sysbench.lua/lua/?.lua" sysbench ... oltp_insert run


    The usage above replaces the existing search path. The usage below prepends the new path to the existing (compiled in) path:

      LUA_PATH="/mnt/data/sysbench.lua/lua/?.lua;;" sysbench ... oltp_insert run
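    To see why the '?' is needed, here is a bash sketch of the template substitution Lua performs for require(): each '?' in a LUA_PATH entry is replaced by the module name being loaded (the path is from the example above):

```shell
#!/usr/bin/env bash
# Sketch of Lua's package.path matching: for require("oltp_common"),
# each '?' in a LUA_PATH template is replaced by the module name.
module="oltp_common"
template="/mnt/data/sysbench.lua/lua/?.lua"
candidate="${template/\?/$module}"
echo "$candidate"   # /mnt/data/sysbench.lua/lua/oltp_common.lua
```

    The trailing ';;' in the last example is Lua's shorthand for "append the default (compiled-in) path here", which is why that form prepends rather than replaces.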


    Wednesday, February 19, 2025

    My database communities

    I have been working on databases since 1996. In some cases I just worked on the product (Oracle & Informix), in others I consider myself a member of the community (MySQL, Postgres & RocksDB). And for MongoDB I used to be in the community.

    I worked on Informix XPS in 1996. I chose Informix because I could live in Portland OR and walk to work. I was fresh out of school, didn't know much about DBMS, but got a great starter project (star query optimization). The company wasn't in great shape so I left by 1997 for Oracle. I never used Informix in production and didn't consider myself as part of the Informix community.

    I was at Oracle from 1997 to 2005. The first 3 years were in Portland implementing JMS for the app server team and the last 5 years at Oracle HQ working on query execution.  I fixed many bugs, added support for ieee754 types, rewrote sort and maintained the sort and bitmap index row sources. The people there were great and I learned a lot but I did not enjoy the code base and left for a startup. I never used Oracle in production and don't consider myself as part of the Oracle community.

    I led the MySQL engineering teams at Google for 4 years and at Facebook/Meta for 10 years. I was very much immersed in production and have been active in the community since 2006. The MySQL teams got much done at both Google (GTID, semi-sync, crash-safe replication, rewrote the InnoDB rw lock) and Facebook/Meta (MyRocks and too many other things to mention). Over the years at FB/Meta my job duties got in the way of programming so I used performance testing as a way to remain current. I also filed many bugs and might still be in the top-10 for bug reports. While Oracle has been a great steward for the MySQL project I have been critical about the performance regressions from older MySQL to newer MySQL. I hope that eventually stops because it will become a big problem.

    I contributed some code to RocksDB, mostly for monitoring. I spent much more time doing performance QA for it, and filing a few bugs. I am definitely in the community.

    I don't use Postgres in production but have spent much time doing performance QA for it over the past ~10 years. A small part of that was done while at Meta, I had a business case, and was able to use some of their HW and my time. But most of this has been a volunteer effort -- more than 100 hours of my time and 10,000+ hours of server time. Some of those server hours are in public clouds (Google, Hetzner) so I am also spending a bit on this. I found a few performance bugs. I have not found large performance regressions over time which is impressive. I have met many of the contributors working on the bits I care about, and that has been a nice benefit.

    I used to be a member of the MongoDB community. Like Postgres, I never supported it in production but I spent much time doing performance QA with it. I wrote mostly positive blog posts, filed more than a few bugs and even won the William Zola Community Award. But I am busy enough with MySQL, Postgres and RocksDB so I haven't tried to use it for years. Regardless, I continue to be impressed by how fast they pay down tech debt, with one exception (no cost-based optimizer).

    Thursday, January 9, 2025

    Sysbench performance over time for InnoDB and MyRocks: part 4

    This is part 4 in my (possibly) final series on performance regressions in MySQL using cached sysbench as the workload. For previous posts, see part 1, part 2 and part 3. This post covers performance differences between InnoDB in upstream MySQL 8.0.32, InnoDB in FB MySQL 8.0.32 and MyRocks in FB MySQL 8.0.32 using a server with 32 cores and 128G of RAM.

    I don't claim that the MyRocks CPU overhead isn't relevant, but this workload (CPU-bound, database is cached) is a worst-case for it.

    tl;dr 

    • InnoDB from FB MySQL is no worse than ~10% slower than InnoDB from upstream
    • Fixing bug 1506 is important for InnoDB in FB MySQL
    • MyRocks is ~30% slower than upstream InnoDB at low concurrency and ~45% slower at high, as it uses ~1.5X more CPU/query
    • For writes, MyRocks does worse at high concurrency than at low
    Updates: For writes, MyRocks does worse at high concurrency than at low

    I looked at vmstat metrics for the update-nonindex benchmark and the number of context switches per update is about 1.2X larger for MyRocks vs InnoDB at high concurrency. 

    Then I looked at PMP stacks and MyRocks has more samples for commit processing. The top stacks are here. This should not be a big surprise because MyRocks does more work at commit time (pushes changes from a per-session buffer into the memtable). But I need to look at this more closely.

    I browsed the code in Commit_stage_manager::enroll_for, which is on the call stack for the mutex contention, and it is kind of complicated. I am trying to figure out how many mutexes are locked in there and figuring that out will take some time.

    Benchmark, Hardware

    Much more detail on the benchmark and hardware is here. I am trying to avoid repeating that information in the posts that follow. 

    Results here are from the c32r128 server with 32 CPU cores and 128G of RAM. The benchmarks were repeated for 1 and 24 threads. On the charts below that is indicated by NT=1 and NT=24.

    Builds

    The previous post has more detail on the builds, my.cnf files and bug fixes.

    The encoded names for these builds are:
    • my8032_rel_o2nofp
      • InnoDB from upstream MySQL 8.0.32
    • fbmy8032_rel_o2nofp_end_241023_ba9709c9_971
      • FB MySQL 8.0.32 at git hash ba9709c9 (as of 2024/10/23) using RocksDB 9.7.1. This supports InnoDB and MyRocks.
    • fbmy8032_rel_o2nofp_241023_ba9709c9_971_bug1473_1481_1482_1506
      • FB MySQL 8.0.32 at git hash ba9709c9 (as of 2024/10/23) using RocksDB 9.7.1 with patches applied for bugs 1473, 1481, 1482 and 1506. This supports InnoDB and MyRocks.
    The my.cnf files are:
    Relative QPS

    The charts and summary statistics that follow use a number that I call the relative QPS (rQPS) where:
    • rQPS is: (QPS for my version) / (QPS for base version)
    • base version is InnoDB from upstream MySQL 8.0.32 (my8032_rel_o2nofp)
    • my version is one of the other versions
    Results

    The microbenchmarks are split into three groups: point queries, range queries, writes. The tables below have summary statistics for InnoDB and MyRocks using the relative QPS (the same data as the charts).

    Results are provided in two formats: charts and summary statistics. The summary statistics table have the min, max, average and median relative QPS per group (group = point, range and writes).

    The spreadsheets and charts are also here. I don't know how to prevent the microbenchmark names on the x-axis from getting truncated in the png files I use here but they are easier to read on the spreadsheet.

    The charts use NT=1, NT=16 and NT=24 to indicate whether sysbench was run with 1, 16 or 24 threads. The charts and table use the following abbreviations for the DBMS versions:
    • fbinno-nofix
      • InnoDB from fbmy8032_rel_o2nofp_end_241023_ba9709c9_971
    • fbinno-somefix
      • InnoDB from fbmy8032_rel_o2nofp_241023_ba9709c9_971_bug1473_1481_1482_1506
    • myrocks-nofix
      • MyRocks from fbmy8032_rel_o2nofp_end_241023_ba9709c9_971
    • myrocks-somefix
      • MyRocks from fbmy8032_rel_o2nofp_241023_ba9709c9_971_bug1473_1481_1482_1506

    Summary statistics: InnoDB

    Summary:

    • InnoDB from FB MySQL is no worse than ~10% slower than InnoDB from upstream
    • Fixing bug 1506 is important for InnoDB in FB MySQL
    1 thread

    fbinno-nofix    min   max   average  median
    point           0.89  0.96  0.92     0.91
    range           0.63  0.93  0.82     0.82
    writes          0.86  0.98  0.89     0.88

    fbinno-somefix  min   max   average  median
    point           0.92  1.00  0.96     0.95
    range           0.89  0.96  0.91     0.91
    writes          0.89  0.99  0.92     0.92

    24 threads

    fbinno-nofix    min   max   average  median
    point           0.92  0.96  0.94     0.94
    range           0.62  0.96  0.81     0.82
    writes          0.84  0.94  0.88     0.87

    fbinno-somefix  min   max   average  median
    point           0.94  0.99  0.97     0.98
    range           0.78  0.99  0.89     0.91
    writes          0.86  0.95  0.90     0.88

    Summary statistics: MyRocks

    Summary:

    • MyRocks does better at low concurrency than at high. The fix might be as simple as enabling the hyper clock block cache
    • MyRocks is ~30% slower than upstream InnoDB at low concurrency and ~45% slower at high
    • For writes, MyRocks does worse at high concurrency than at low
    1 thread

    myrocks-nofix   min   max   average  median
    point           0.52  0.75  0.66     0.68
    range           0.37  0.72  0.60     0.60
    writes          0.65  1.21  0.79     0.73

    myrocks-somefix min   max   average  median
    point           0.51  0.79  0.68     0.70
    range           0.43  0.76  0.62     0.61
    writes          0.66  1.23  0.80     0.74

    24 threads

    myrocks-nofix   min   max   average  median
    point           0.40  0.76  0.49     0.43
    range           0.40  0.71  0.58     0.60
    writes          0.44  1.37  0.65     0.55

    myrocks-somefix min   max   average  median
    point           0.48  0.77  0.55     0.51
    range           0.43  0.71  0.60     0.60
    writes          0.45  1.39  0.66     0.55

    Results: c32r128 with InnoDB and point queries

    Summary
    • InnoDB from FB MySQL is no worse than 10% slower than upstream

    Results: c32r128 with MyRocks and point queries

    Summary
    • at low concurrency the worst case for MyRocks are the tests that do point lookup on secondary indexes because that uses a range scan rather than a point lookup on the LSM tree, which means that bloom filters cannot be used
    • at high concurrency the difference between primary and secondary index queries is less significant, perhaps this is dominated by mutex contention from the LRU block cache and solved by using hyper clock

    Results: c32r128 with InnoDB and range queries

    Summary

    • the worst case for InnoDB from FB MySQL are the long range scans and fixing bug 1506 will be a big deal

    Results: c32r128 with MyRocks and range queries

    Summary

    • while long range scans are the worst case here, bug 1506 is not an issue as that is InnoDB-only

    Results: c32r128 with InnoDB and writes

    Summary

    • results are stable here, InnoDB from FB MySQL is no worse than ~10% slower than upstream but results at high concurrency are a bit worse than at low

    Results: c32r128 with MyRocks and writes

    Summary

    • MyRocks does much better than InnoDB for update-index because it does blind writes rather than RMW for non-unique secondary index maintenance
    • MyRocks does worse at high concurrency than at low



