Tuesday, January 19, 2021

Sysbench: in-memory scans for MyRocks, InnoDB and Postgres

I added a test that does a full scan of the test table(s) to the sequence of tests I use with sysbench and repeated the full sequence with an in-memory workload, but I only share the results for scan here. 

Summary

  • For in-memory scans, InnoDB was fastest, then Postgres and MyRocks was the slowest. MyRocks was also reading from storage (SSD) on this test for a table that should fit in cache.
  • Throughput decreases by ~15% for InnoDB in MySQL 8.0 vs 5.6 (see CPU overhead)
  • Throughput increases by ~16% for Postgres 13.1 vs 11.10 (see CPU overhead)
  • I am wary of having strong conclusions for a test that takes < 10 seconds

Overview

As explained in a previous post I share numbers for absolute throughput, relative throughput and HW efficiency. The HW efficiency numbers are computed as metric / throughput and the rows scanned/second is the value for throughput in the denominator. The files that have all of the results are in github.

Tests were run using 1 table with 10M rows and 1 client that scans the table. The SELECT statement has a filter predicate such that the result set is empty. The table is (should be) cached by the DBMS and the scan finishes in a few seconds and I am wary of putting too much faith into results from a short running test. A test like this is better done on a server with more memory than the 16G in my NUC servers. I shared the DBMS configs in previous posts and won't reshare here but I will mention that parallel query was disabled for Postgres.

Finally, I suspect the clustered PK B-Tree index scanned for InnoDB has not suffered much fragmentation based on the sysbench workload and with more fragmentation the results for InnoDB will suffer.

I use ratios to describe performance and efficiency and my usage is fully explained in a previous post. Below I will indicate which DBMS provides values for the numerator and denominator in the ratios. A throughput ratio > 1 means the DBMS in the numerator is faster than the one in the denominator. A HW efficiency ratio > 1 means the DBMS in the numerator uses more HW/operation than the one in the denominator.

The results below are explained using several comparisons:

  1. MyRocks-only - MyRocks in MySQL 5.6.35 & 8.0.17
  2. InnoDB-only - InnoDB in MySQL 5.6.49, 5.7.31 & 8.0.22
  3. Postgres-only - Postgres in versions 11.10, 12.4 and 13.1
  4. InnoDB-vs-Postgres - MySQL 5.6.49 vs Postgres 11.10, MySQL 8.0.22 vs Postgres 13.1
MyRocks-only
  • Scan rates for 5.6, 8.0 were 1.178, 1.034 in millions of rows/second
  • Scan rate for 8.0 relative to 5.6 was 0.88
  • CPU/row ratio for 8.0 relative to 5.6 was 1.18
  • Both 5.6 and 8.0 did ~100M/s of read IO for a table that should fit in cache
InnoDB-only
  • Scan rates for 5.6, 5.7, 8.0 were 4.673, 3.939, 3.901 in millions of rows/second
  • Scan rates for 5.7, 8.0 relative to 5.6 were 0.84, 0.83
  • CPU/row ratio for 5.7, 8.0 relative to 5.6 were 1.17, 1.33
Postgres-only
  • Scan rates for 11.10, 12.4, 13.1 were 1.917, 2.216, 2.216 in millions of rows/second
  • Scan rates for 12.4, 13.1 relative to 11.10 were 1.16, 1.16
  • CPU/row ratio for 12.4, 13.1 relative to 11.10 were 0.86, 0.86
InnoDB-vs-Postgres
  • Scan rates are listed above
  • Scan rate for Postgres 11.10 relative to MySQL 5.6.49 was 0.41
  • Scan rate for Postgres 13.1 relative to MySQL 8.0.22 was 0.57

1 comment:

  1. When I developed some scan optimisations for MySQL NDB Cluster 7.6 a few years ago I measured that after optimisation of 3x NDB could handle about 4M rows per second in range scans, didn't focus so much on full table scans that most likely is slightly slower therefore, at the time InnoDB was slightly slower, remember that this means NDB scanning rows using pushed conditions, so no involvement of MySQL code other than for result rows

    ReplyDelete

Fixing some of the InnoDB scan perf regressions in a MySQL fork

I recently learned of Advanced MySQL , a MySQL fork, and ran my sysbench benchmarks for it. It fixed some, but not all, of the regressions f...