Wednesday, January 20, 2021

Sysbench: IO-bound and Postgres

This has results for IO-bound sysbench with Postgres versions 11.10, 12.4 and 13.1. The test is similar to what I used for in-memory sysbench except the table has 400M rows instead of 10M and the test table is much larger than memory. The goal is to understand how performance and efficiency change over time. I also have posts for InnoDB and MyRocks.

Summary:

  • Many tests get between 5% and 10% more throughput in 13.1 relative to 11.10
  • Tests that do covering queries on secondary indexes do ~20% less read IO per query; I am not sure whether this is a benefit from index deduplication
  • There aren't significant regressions from 11.10 to 13.1

Overview

I use my sysbench fork. I have yet to change the code, but I added Lua scripts for new tests. Tests are run in a sequence (prepare, pause to let writeback catch up, read-only, write-heavy, pause again, read-only, delete, insert) via all_small.sh, which calls another helper script, run.sh, to run tests and collect HW metrics.
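A hypothetical invocation for one of the read-only steps might look like the following; the Lua script name and option values here are assumptions for illustration, not copied from my fork or from run.sh:

```shell
# Hypothetical sysbench invocation for one read-only test step.
# Script name, database name and paths are assumptions, not from the fork.
sysbench oltp_point_select.lua \
  --db-driver=pgsql --pgsql-db=test \
  --tables=1 --table-size=400000000 \
  --threads=1 --time=300 \
  run
```

The prepare step would use the same options with `prepare` in place of `run`, and all_small.sh loops over tests and thread counts.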

The tests use 1 table with 400M rows and each test is run for 300 seconds for 1, 2 and 3 threads. The test servers have 4 CPU cores with HT disabled, 16G of RAM and NVMe SSD. The test table is much larger than RAM and I call this an IO-bound setup.

Tests used Postgres versions 11.10, 12.4 and 13.1. The servers use Ubuntu 20.04 and XFS. 

Results

The tests are in 5 groups based on the sequence in which they are run: load, read-only run before write-heavy, write-heavy, read-only run after write-heavy and insert/delete. 

I have scripts that generate 3 summaries -- absolute throughput, relative throughput and HW efficiency. Absolute throughput is the QPS or TPS for a test. Relative throughput is the QPS or TPS relative to the base case. The HW efficiency report has absolute and relative results for CPU and IO per operation. In this post the base case is the result for Postgres 11.10.

I use ratios (relative throughput & relative HW efficiency) to explain performance. For this post the denominator (the base case) is Postgres 11.10 and the numerator is Postgres 12.4 or 13.1. A throughput ratio < 1 means that 12.4 or 13.1 are slower. For HW efficiency, CPU and IO per operation, a ratio > 1 means that 12.4 or 13.1 use more CPU or IO per operation.

Files are in github, including summaries for absolute throughput, relative throughput and HW efficiency. I annotate the results below with a focus on the 11.10 vs 13.1 comparison. The Postgres config files are also in github for 11.10, 12.4 and 13.1.

Below I describe throughput and efficiency for 13.1 and 12.4 relative to 11.10. I use v13 for 13.1 and v12 for 12.4 to improve readability.

Load:
  • Inserts/second ratio for v13 is 1.14 (here)
  • Inserts/second ratio for v12 is 1.01
  • CPU/insert ratio is 0.99 for v12 and 0.94 for v13 (here)
  • IO read KB/insert ratio is 1.00 for v12 and 1.34 for v13
  • IO write KB/insert ratio is 1.00 for v12 and 0.93 for v13
Read-only before write-heavy:
  • For the first 4 tests that do point queries
    • QPS ratios for v13 are 1.07, 1.04, 0.96, 0.83 (here to here)
    • QPS ratios for v12 are 1.07, 1.02, 1.00, 1.00
    • CPU/query ratios for v13 are 0.85, 0.92, 1.05, 1.03 (here to here)
  • The next 3 tests have range scans from oltp_read_write.lua with ranges of size 10, 100 & 10,000
    • QPS ratios for v13 are 1.02, 0.92, 0.98 (here to here)
    • QPS ratios for v12 are 1.07, 0.99, 0.98
    • CPU/query ratios for v13 are 0.94, 1.10, 1.02 (here to here)
  • The next 2 tests do point queries via in-lists that are covering and not covering for the PK index
    • QPS ratios for v13 are 0.96, 1.00 (here to here)
    • QPS ratios for v12 are 0.98, 1.02
    • CPU/query ratios for v13 are 1.04, 0.98 (here to here)
  • The next 2 tests are similar to the previous but use the secondary index
    • QPS ratios for v13 are 1.07, 1.08 (here to here)
    • QPS ratios for v12 are 0.99, 1.00
    • CPU/query ratios for v13 are 0.99, 0.92 (here to here)
    • IO read KB/query ratios for v13 are 0.82, 0.91
  • The next 2 tests do range queries that are covering and not covering for the PK index
    • QPS ratios for v13 are 1.00, 0.99 (here to here)
    • QPS ratios for v12 are 1.03, 1.02
    • CPU/query ratios for v13 are 1.00, 1.03 (here to here)
  • The next 2 tests are similar to the previous but use the secondary index
    • QPS ratios for v13 are 1.14, 0.99 (here to here)
    • QPS ratios for v12 are 1.01, 1.01
    • CPU/query ratios for v13 are 0.91, 1.01 (here to here)
    • IO read KB/query ratios for v13 are 0.82, 1.01
Write-heavy:
  • For the next 5 tests that are update-only
    • QPS ratios for v13 are 1.02, 1.02, 1.01, 1.02, 0.97 (here to here)
    • QPS ratios for v12 are 1.01, 1.00, 1.01, 1.00, 0.99
    • CPU/statement ratios for v13 are 0.99, 0.99, 0.98, 0.98, 0.98 (here to here)
  • The next test is write-only that has the writes from oltp_read_write.lua
    • QPS ratio for v13 is 1.01 (here)
    • QPS ratio for v12 is 0.98
    • CPU/transaction ratio is 1.02 for v12 and 0.99 for v13 (here)
  • The next 2 tests are the traditional sysbench tests with ranges of size 10 & 100
    • QPS ratios for v13 are 0.98, 0.99 (here to here)
    • QPS ratios for v12 are 0.99, 1.01
    • CPU/transaction ratios for v13 are 1.04, 1.00 (here to here)
Read-only after write-heavy includes tests that were run before write-heavy.
  • The next 3 tests have range scans from oltp_read_write.lua with ranges of size 10, 100 & 10,000
    • QPS ratios for v13 are 0.95, 1.00, 0.98 (here to here). 
    • QPS ratios for v12 are 0.93, 0.98, 0.98
    • CPU/transaction ratios for v13 are 1.10, 1.01, 1.01 (here to here)
  • The next 5 tests do point queries
    • QPS ratios for v13 are 1.08, 1.01, 1.00, 1.00, 0.96 (here to here)
    • QPS ratios for v12 are 1.08, 0.99, 1.00, 1.00, 0.99
    • CPU/query ratios for v13 are 0.89, 0.99, 0.99, 1.02, 1.03 (here to here)
  • The next 2 tests do point queries via in-lists that are covering and not covering for the PK index
    • QPS ratios for v13 are 0.97, 1.00 (here to here)
    • QPS ratios for v12 are 0.98, 0.98
    • CPU/query ratios for v13 are 1.03, 0.99 (here to here)
  • The next 2 tests are similar to the previous test but use the secondary index
    • QPS ratios for v13 are 1.08, 1.08 (here to here)
    • QPS ratios for v12 are 0.99, 1.00
    • CPU/query ratios for v13 are 0.97, 0.94 (here to here)
    • IO read KB/query ratios for v13 are 0.81, 0.91
  • The next 2 tests do range queries that are covering and not covering for the PK index
    • QPS ratios for v13 are 1.03, 0.99 (here to here)
    • QPS ratios for v12 are 1.05, 0.97
    • CPU/query ratios for v13 are 0.97, 1.02 (here to here)
  • The next 2 tests are similar to the previous but use the secondary index
    • QPS ratios for v13 are 1.13, 1.00 (here to here)
    • QPS ratios for v12 are 1.00, 1.02
    • CPU/query ratios for v13 are 0.92, 0.98 (here to here)
    • IO read KB/query ratios for v13 are 0.82, 1.00
  • The next test does a single-threaded full scan of the test table with a filter so that the result set is empty.
    • Rows scanned/second ratio for v13 is 1.14 (here)
    • Rows scanned/second ratio for v12 is 1.14
    • CPU/query ratio for v13 is 0.87 (here)
    • It isn't in the tables I shared but the SSD read 417 MB/s for v13
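The scan test's actual filter is not shown in this post; a hypothetical query of that shape (table and column names are assumptions, not taken from the Lua script) is:

```shell
# Hypothetical: a full scan whose predicate matches nothing, so the
# result set is empty. Table/column names are assumptions.
psql -d test -c "SELECT * FROM sbtest1 WHERE c LIKE '%no-such-value%'"
```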
Insert/delete:
  • QPS ratios for v13 are 1.01 for delete and 1.01 for insert
  • QPS ratios for v12 are 1.02 for delete and 0.98 for insert
  • CPU/statement ratios for v13 are 0.99 for delete and 1.00 for insert
