Vacuum arrived in the original POSTGRES storage system in the 1980s. Today it serves several purposes: 1) reclaim space for old versions of rows that are no longer visible, 2) update statistics for the query planner, 3) update bits in the visibility map to help index-only queries and 4) preventing transactionID wraparound.
I didn't expect vacuum to help much with the insert benchmark because it doesn't do updates or deletes and there is no space for vacuum to reclaim. Also, query plans were OK and there weren't enough transactions for wraparound to be an issue.
What I missed was the impact of the visibility map on the index-only range scans run during the insert benchmark. These bits are only set by vacuum and if the bit is not set for a heap page, then the base row must be read from the heap page during an index-only scan for any index entries that point to that heap page. The impact from reading the base row is significant when the working set is not cached but still visible when the working set is cached.
Fortunately, the insert benchmark pattern of inserts seems to be friendly to Postgres in that a heap page will become full after some inserts and then doesn't get more inserts. So once the visibility map bit gets set it remains set for that heap page.
Autovacuum is triggered for a table based on the number of updates and deletes done to the table. So while an insert-only workload can benefit from vacuum when it also does range scans, the inserts won't trigger vacuum. I haven't read the code to confirm that inserts don't trigger autovacuum. The main docs aren't clear as the routine vacuuming section states that When enabled, autovacuum checks for tables that have had a large number of inserted, updated or deleted tuples. But the docs for autovacuum_vacuum threshold state that it is only updates and deletes: Specifies the minimum number of updated or deleted tuples needed to trigger a
VACUUMin any one table.
I now run the insert benchmark as a sequence of steps: load some data with only PK indexes, create secondary indexes, load more data, then read+write. The read+write step does short range scans as fast as possible and the writes are rate-limited inserts where there is a target on the insert rate per second. As part of this I optionally run vacuum after creating the secondary indexes. Each read+write test runs for 1 hour and vacuum is started at the end of each hour. While my scripts wait for vacuum to finish after creating the indexes, and don't wait for it to finish during the read+write tests that doesn't matter much because vacuum is fast during the insert benchmark.
This is a brief description. Hopefully it is truthy. The key point is that vacuum can require full index scans for every index of the vacuumed table. That can take a long time. Alas, this isn't an issue for the insert benchmark because it is insert-only and doesn't create old versions of rows so there are no full scans of all indexes during autovacuum.
The Internals of Postgres web-site is a great place to start after reading the online Postgres docs if you want to learn more -- see chapter 5 and chapter 6. Note that chapter 6 of the Internals site lists steps 3 before step 2 and I need to figure that out before I write too many more blog posts on this topic. There are many useful blog posts on this topic including this blog post.
I don't think vacuum in original POSTGRES did full index scans. I have read some of the discussion that explains why a full index scan is done today but I didn't save links to that. My preference is for vacuum to do index probes rather than full index scans to reduce the time between consecutive vacuums for the same table.
Regardless let me explain some of the work that vacuum can do to make it obvious that vacuum for a table can take a long time, and does when I run Linkbench.
- Scan heap pages for which visibility map bits are not set. From each heap page if a row is an old version (because delete or update) and not visible to any current or future transaction then copy the row's CTID (into an array?)
- If CTIDs were found in the previous step then scan all secondary indexes for this table and for each index entry determine whether its CTID is in the array from step 1. If yes then reclaim the space for that index entry.
- Repeat step 1 but this time reclaim the space for the dead rows.
Finally, the memory used to buffer CTIDs is determined by maintenance_work_mem. If that limit is reached then the steps (and full index scans) are repeated.