I run many benchmarks to understand DBMS performance which means I repeat a lot of benchmarks to try new tuning options or because I made a mistake. This post is about the impact of huge pages for Postgres and InnoDB using the sysbench benchmark and AWS c6i.16xlarge instances. After sharing my last round of sysbench results a Postgres expert suggested I try it with huge pages enabled.
tl;dr
- Throughput with huge pages was up to 1.32X better for Postgres
- Throughput with huge pages was up to 1.06X better for InnoDB
- In future work I might try to explain why this was much better for Postgres than for InnoDB
- I will try to enable huge pages for future InnoDB and Postgres benchmarks. This makes my automation more complicated because I won't enable huge pages for MyRocks.
Benchmark
I use upstream sysbench with a few extra Lua scripts from here driven by shell scripts from here. The benchmark is invoked by this script (r.sh) which then invokes cmp_pg.sh, cmp_in.sh and cmp_rx.sh. All scripts end up calling all_small.sh to run sysbench for a sequence of Lua scripts. Each Lua script was run for 10 minutes using 8 tables with 32 clients (threads, connections).
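A sketch of what one benchmark step looks like when run directly. The real runs go through all_small.sh from the repos linked above and use extra Lua scripts; oltp_point_select here is a stock sysbench script, and the exact options in my scripts may differ:

```shell
# Sketch only: the real driver is all_small.sh from the linked repos.
TABLES=8       # 8 tables
THREADS=32     # 32 clients (threads, connections)
SECS=600       # 10 minutes per benchmark step
cmd="sysbench oltp_point_select --db-driver=pgsql --tables=$TABLES --threads=$THREADS --time=$SECS run"
echo "$cmd"
```

The same command with --db-driver=mysql covers the InnoDB runs.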
The benchmark was run on a c6i.16xlarge instance with Ubuntu 22.04, 64 vCPU, 128G RAM and EBS (io2, 2T, 49k IOPs).
I used Postgres 15.1 and MySQL 8.0.31.
The test database needs ~42G to fit in the DBMS buffer pool.
Enabling huge pages
Advice is here for InnoDB and here for Postgres. I started with this my.cnf for InnoDB and then added large_pages to it. I started with this config file for Postgres and then changed huge_pages=off to huge_pages=on.
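The config changes themselves are one line each. A sketch of the edits (option names are from the MySQL and Postgres docs; Postgres also accepts huge_pages=try, which is its default):

```
# my.cnf -- enable huge pages for InnoDB
[mysqld]
large_pages=ON

# postgresql.conf -- was huge_pages = off
huge_pages = on
```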
To enable huge pages:
- Determine how many 2M pages are needed for a ~42G buffer pool -- I used 45,000
- Add vm.nr_hugepages=45000 to /etc/sysctl.conf
- Run: sysctl -p
- Start the DBMS
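The steps above as a shell sketch. The 45,000 count is a judgment call, not a formula: a 42G buffer pool needs at least 21,504 2M pages, and the rest is headroom for shared memory beyond the buffer pool itself:

```shell
# Minimum 2M pages to back a 42G buffer pool (headroom goes on top of this):
buf_gb=42
min_pages=$(( buf_gb * 1024 / 2 ))
echo "$min_pages"    # 21504

# Reserve the pages (run as root; 45000 x 2M reserves ~88G of the 128G of RAM):
# echo 'vm.nr_hugepages=45000' >> /etc/sysctl.conf
# sysctl -p
# ...then start the DBMS
```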
After starting the DBMS, the huge page counters in /proc/meminfo showed:
FileHugePages: 0 kB
HugePages_Total: 45000
HugePages_Free: 43631
HugePages_Rsvd: 43577
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 92160000 kB
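A sanity check on the output above: the Hugetlb line should equal HugePages_Total times Hugepagesize:

```shell
# 45000 pages * 2048 kB/page
echo $(( 45000 * 2048 ))    # 92160000 kB, matching the Hugetlb line
```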
Results
I don't have scripts that generate nice graphs, but I do have reporting scripts that generate text files listing absolute results and results relative to a base case. Here I share four reports, two for InnoDB and two for Postgres. For each, the base case is without huge pages and the results with huge pages are compared to it. In each pair, one report is for sysbench run with most queries using a PK index (the default) and one is for it run with most queries using a secondary index (sysbench --secondary).
In these reports:
- The 2 lines after absolute list throughput (o/s is operations/second) and HW efficiency (cpu/o is CPU per operation). The 2 lines are the base case (huge pages off) and then with huge pages on.
- The one line after relative lists the values from the second line (huge pages on) relative to the values from the first line (huge pages off).
- From the results for Postgres using a secondary index on the random-points.range100 benchmark step, throughput is 1.32X better with huge pages on per the o/s column here: QPS improved from 56792/s to 74883/s while the CPU/query overhead (cpu/o) decreased from 0.001683 to 0.001276.
- Postgres with a secondary index (here)
- Postgres with a PK index (here)
- InnoDB with a secondary index (here)
- InnoDB with a PK index (here)
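The relative lines are simple ratios of the huge-pages-on values to the base case. For example, using the cpu/o numbers quoted above for Postgres with random-points.range100:

```shell
# relative cpu/o = (huge pages on) / (huge pages off)
awk 'BEGIN { printf "%.2f\n", 0.001276 / 0.001683 }'    # prints 0.76
```

A relative value below 1 for cpu/o means less CPU per operation, which lines up with the throughput gain.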
With InnoDB, I imagine the gain comes from more efficient address translation (each TLB entry covers more memory). With PostgreSQL, there is a similar gain plus a memory gain because of the multi-process architecture: the page table in each process is smaller, which can account for a significant amount of memory at scale. Is the 1.32X gain for PostgreSQL mainly due to this second effect?
That is an interesting question. Hopefully someone tries to get the answer.
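A rough estimate related to the question above, assuming x86-64 with 8-byte leaf page-table entries: mapping a 42G buffer pool costs about 84M of page tables per process with 4K pages but only about 168K with 2M pages, and Postgres pays that cost per backend process:

```shell
# Bytes of leaf page-table entries needed to map a 42G buffer pool
# (x86-64, 8 bytes per PTE; upper-level tables ignored for this estimate):
buf_bytes=$(( 42 * 1024 * 1024 * 1024 ))
echo $(( buf_bytes / 4096 * 8 ))                 # 88080384 bytes with 4K pages
echo $(( buf_bytes / (2 * 1024 * 1024) * 8 ))    # 172032 bytes with 2M pages
```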