My primary goal with the benchmarks I run has been to identify performance regressions, especially ones that can be fixed to make open source databases better. And so I focus on comparing old and new versions of one DBMS at a time to identify where things get better or worse. But here I compare Postgres with MySQL (InnoDB & MyRocks) to show that neither is the best for the Insert Benchmark -- all are good, but none are perfect.
The per-DBMS results are here for Postgres, InnoDB and MyRocks. Those posts also have links to the configurations and builds that I used. This post shares the same result but makes it easier to compare across DBMS.
Results here are from a small server (8 cores) with a low concurrency workload (1 client, <= 3 concurrent connections). Results from a larger server are pending and might not be the same as what I share here.
Summary of throughput for the IO-bound workload
- Initial load in key order (l.i0)
- Postgres is fastest
- Write-only with secondary index maintenance (l.i1, l.i2)
- MyRocks is fastest
- Range queries (qr100, qr500, qr1000)
- Postgres is fastest
- Point queries (qp100, qp500, qp1000)
- MyRocks is fastest, Postgres failed to sustain the target write rate for qp1000
- Space efficiency
- MyRocks is best, Postgres/InnoDB used ~4X/~3x more space
- Write efficiency
- MyRocks is best and on the l.i1 benchmark step Postgres and InnoDB write ~9X and ~80X more KB to storage per insert than MyRocks.
- Read efficiency
- MyRocks is the best and that might surprise people. Both InnoDB and Postgres do more read IO per query for both point and range queries. Bloom filters and less space amplification might explain this.
- All DBMS have noise (variance) in some cases. Results for MyRocks aren't any worse than for Postgres or InnoDB.
Versions tested
- pg162_def.cx9a2a_bee
- Postgres 16.2 and the cx9a2_bee config
- my8036_rel.cz10a_bee
- Upstream MySQL 8.0.36 with InnoDB and the cz10a_bee config
- fbmy8028_rel_221222.cza1_bee
- MyRocks 8.0.28 from code as of 2023-12-22 at git hash 2ad105fc, RocksDB 8.7.0 at git hash 29005f0b, cza1_bee config
- Compression is enabled, which saves space at the cost of more CPU
- cached - database has 30M rows and fits in memory
- IO-bound - database has 800M rows and is larger than memory,
The benchmark steps are:
- l.i0
- insert X million rows per table in PK order. The table has a PK index but no secondary indexes. There is one connection per client. X is 30M for cached and 800M for IO-bound.
- l.x
- create 3 secondary indexes per table. There is one connection per client.
- l.i1
- use 2 connections/client. One inserts Y rows and the other does deletes at the same rate as the inserts. Each transaction modifies 50 rows (big transactions). This step is run for a fixed number of inserts, so the run time varies depending on the insert rate. Y is 80M for cached and 4M for IO-bound.
- l.i2
- like l.i1 but each transaction modifies 5 rows (small transactions) and Y is 20M for cached and 1M for IO-bound.
- Wait for X seconds after the step finishes to reduce variance during the read-write benchmark steps that follow.
- qr100
- use 3 connections/client. One does range queries for Z seconds and performance is reported for this. The second does does 100 inserts/s and the third does 100 deletes/s. The second and third are less busy than the first. The range queries use covering secondary indexes. This step is run for a fixed amount of time. If the target insert rate is not sustained then that is considered to be an SLA failure. If the target insert rate is sustained then the step does the same number of inserts for all systems tested. Z is 3600 for cached and 1800 for IO-bound.
- qp100
- like qr100 except uses point queries on the PK index
- qr500
- like qr100 but the insert and delete rates are increased from 100/s to 500/s
- qp500
- like qp100 but the insert and delete rates are increased from 100/s to 500/s
- qr1000
- like qr100 but the insert and delete rates are increased from 100/s to 1000/s
- qp1000
- like qp100 but the insert and delete rates are increased from 100/s to 1000/s
- insert/s for l.i0, l.i1, l.i2
- indexed rows/s for l.x
- range queries/s for qr100, qr500, qr1000
- point queries/s for qp100, qp500, qp1000
- the base case is Postgres 16.2, numbers in red mean Postgres is faster
- comparing InnoDB and MyRocks with the base case
- l.i0
- InnoDB - relative QPS is 0.75
- MyRocks - relative QPS is 0.77
- l.x - I ignore this for now
- l.i1, l.i2
- InnoDB - relative QPS is 0.86, 1.63
- MyRocks - relative QPS is 1.12, 1.47
- qr100, qr500, qr1000
- InnoDB - relative QPS is 0.40, 0.42, 0.41
- MyRocks - relative QPS is 0.19, 0.16, 0.16
- qp100, qp500, qp1000
- InnoDB - relative QPS is 0.82, 0.81, 0.82
- MyRocks - relative QPS is 0.71, 0.70, 0.69
- the base case is Postgres 16.2, numbers in red mean Postgres is faster
- comparing InnoDB and MyRocks with the base case
- l.i0
- InnoDB - relative QPS is 0.74
- MyRocks - relative QPS is 0.77
- l.x - I ignore this for now
- l.i1, l.i2
- InnoDB - relative QPS is 0.83, 18.35
- MyRocks - relative QPS is 11.45, 73.55
- qr100, qr500, qr1000
- InnoDB - relative QPS is 0.42, 0.47, 0.55
- MyRocks - relative QPS is 0.07, 0.06, 0.06
- qp100, qp500, qp1000
- InnoDB - relative QPS is 1.56, 1.46, 1.44
- MyRocks - relative QPS is 2.15, 2.13, 2.21
- Postgres failed to sustain the target write rate during qp1000. The target was ~1000/s and it sustained 927/s.
Here I focus on the results from the IO-bound workload. The efficiency section of the IO-bound perf report has a lot of information.
- Data is here
- Postgres uses the least CPU per statement (see cpupq, CPU per query). It is ~1.2X larger with InnoDB and MyRocks. CPU probably explains the perf difference.
- MyRocks write the least to storage per statement (see wkbpi, KB written per insert)
- Data is here
- MyRocks does the fewest reads from storage per statement (see rpq, reads per query). The rate is ~278X larger for Postgres and ~859X larger for InnoDB. The ratio is so large because non-unique secondary index maintenance is read free for MyRocks. The InnoDB change buffer provides a similar but less significant benefit (I enabled the change buffer for these tests). Alas, with Postgres the leaf pages for secondary indexes must undergo read-modify-write as the heap-only tuple optimization can't be used for this schema.
- MyRocks uses the least CPU per statement (see cpupq, CPU per query). It is ~3X larger with Postgres and ~5X larger with InnoDB.
- MyRocks has the best write efficiency (see wkbpi, KB written to storage per insert). It is ~9X larger for Postgres and ~80X larger for InnoDB.
- Data is here
- Results are similar to l.i1 above with one exception. The CPU overhead for Postgres was ~3X larger than MyRocks for l.i1 but here it is more than 20X larger because of the problem with the optimizer spending too much time in get_actual_variable_range.
- Data is here
- The read IO overhead is similar for Postgres and MyRocks (see rpq, read per query) while it is ~8X larger for InnoDB. A standard hand-waving analysis would predict that MyRocks wasn't going to be as read IO efficient as Postgres, but prefix bloom filters and less space amplification help it here.
- Postgres has the smallest CPU overhead (see cpupq, CPU per query). It is ~2.6X larger for InnoDB and ~15X larger for MyRocks. I hope to explain why MyRocks uses so much more CPU.
- Data is here
- MyRocks has the best read IO efficiency (see rpq, read per query). It is ~2.2X and ~1.3X larger for Postgres and InnoDB. Bloom filters and better space amplification might explain this.
- All DBMS have a similar CPU overhead (see cpupq, CPU per query).
- Data is here for Postgres, InnoDB and MyRocks
- Results are stable for all DBMS but MyRocks has the most noise
- Data is here for Postgres, InnoDB and MyRocks
- The insert/s rate declines over time for Postgres which is expected but it grows over time for InnoDB from ~1000/s to ~3000/s. I assume that InnoDB initially suffers more from page splits and perf increases as that is less likely over time.
- Data is here for Postgres, InnoDB and MyRocks
- The insert/s and delete/s rate for Postgres decreases slightly over time. I assume the issue is that the optimizer CPU overhead for delete statements grows over time which is apparent on the chart for max delete response time (start here and scroll down).
- Data is here for qr100, qr500 and qr1000
- For qr100 measured at 1-second intervals
- For Postgres the query rate has noise (ranges between 8000/s and 12000/s, the max insert response time is noisy and the max delete response time is stable but grows over time.
- For InnoDB the query rate is stable, the max insert response time is noisy and the max delete response time is stable.
- For MyRocks the query rate has noise (ranges between 500/s and 600/s), the max insert response time is stable and the max delete response time is stable.
- For qr1000 measured at 1-second intervals
- For Postgres the query rate has noise and declines over time, the max insert response time is noisy and the max delete response time is stable but grows over time.
- For InnoDB the query rate is stable, the max insert response time is noisy and the max delete response time is noisy.
- For MyRocks the query rate is noisy, the max insert response time is noisy and the max delete response time is noisy.
- Data is here for qp100, qp500 and qp1000
- For qp100 measured at 1-second intervals
- For Postgres the query rate has noise, the max insert response time has noise, the max delete response time has noise and is growing
- For InnoDB the query rate is stable, the max insert response time has noise, the max delete response time is stable
- For MyRocks the query rate is stable, the max insert response time has noise, the max delete response time has noise
- For qp1000 measured at 1-second intervals
- For Postgres the query rate has noise, the max insert response time has too much noise especially from 1200s to 1600s which explains why it failed to sustain the target insert and delete rates, the max delete response time is stable and growing.
- For InnoDB the query rate is stable, the max insert response time has too much noise, the max delete response time has much noise
- For MyRocks the query rate is stable, the max insert response time has noise, the max delete response time has noise