There are at least three ways to do compression for InnoDB - classic, holepunch and outsource.
The classic approach (table compression) was used and enhanced by the FB MySQL team. It might not have been widely used elsewhere. While it works, even for read/write workloads, the implementation is complicated and it isn't clear that it has a bright future.
The holepunch approach (page compression) is much simpler than the classic approach. Alas, I am skeptical that a filesystem will be happy tracking the metadata from doing a holepunch for every (or most) page written. I am also skeptical that unlink() response times of seconds to minutes (because of the holepunch usage) will be good for a production DBMS. I wrote a few posts about my experience with the holepunch approach: here, here, here and here.
The outsource approach is the simplest from the perspective of InnoDB - let the filesystem or storage do the compression for you. In this case InnoDB continues to do filesystem reads and writes as if pages have a fixed size and the FS/storage compresses prior to writing to storage, decompresses after reading from storage and does all of the magic to make this work. While there are log-structured filesystems in OSS that might make this possible, such filesystems aren't widely used relative to XFS and the EXT family. There is also at least one storage device on the market that supports this.
tl;dr - the outsource approach is useful when the data is sufficiently compressible and the cost of this approach (more write-amp) is greatly reduced when it provides atomic page writes.
After publishing I noticed this interesting Twitter thread on support for atomic writes.
I have a simple performance model to understand when the outsource approach will work. As always, the performance model makes assumptions that can be incorrect. Regardless, the model is a good starting point when comparing the space and write amplification of the outsource approach relative to uncompressed InnoDB.
- The log-structured filesystem has 1+ log segments open for writing. Compressed (variable length) pages are written to the end of an open segment. Such a write can create garbage - the previous version of the page stored elsewhere in a log segment. Garbage collection (GC) copies live data from previously written log segments into open log segments to reclaim space from old log segments that have too much garbage.
- The garbage in log segments is a source of space amplification. Garbage collection is a source of write amplification.
- g - represents the average percentage of garbage (free space) in a previously written log segment.
- r - represents the compression rate. With r=0.5, a 16kb page is compressed to 8kb.
- Write and space amplification are functions of g:
- write-amp = 100 / g
- space-amp = 100 / (100 - g)
- Assumes that g is constant across log segments. A better perf model would allow for variance.
- Assumes that r is constant across pages. A better perf model might allow for variance.
- Estimates of write and space amplification might be more complicated than the formulas above.
- s is the space-amp ratio where s = r * 100 / (100 - g).
- w1 is the write-amp ratio assuming the doublewrite buffer is enabled where w1 = r * 100/g.
- w2 is the write-amp ratio assuming the doublewrite buffer is disabled. This assumes that outsource provides atomic page writes for free. The formula is w2 = r/2 * 100/g.
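The model above is easy to plug numbers into. Here is a minimal sketch in Python; the function names and the example values (g=20, r=0.5) are my own illustration, not from the model itself:

```python
def space_amp(g):
    """Space amplification from garbage in log segments: 100 / (100 - g), g in percent."""
    return 100.0 / (100.0 - g)

def write_amp(g):
    """Write amplification from garbage collection: 100 / g, g in percent."""
    return 100.0 / g

def s_ratio(r, g):
    """s: space-amp ratio of outsource vs uncompressed InnoDB, s = r * 100 / (100 - g)."""
    return r * space_amp(g)

def w1_ratio(r, g):
    """w1: write-amp ratio with the doublewrite buffer enabled, w1 = r * 100 / g."""
    return r * write_amp(g)

def w2_ratio(r, g):
    """w2: write-amp ratio with doublewrite disabled (atomic page writes), w2 = r/2 * 100 / g."""
    return (r / 2.0) * write_amp(g)

if __name__ == "__main__":
    g, r = 20.0, 0.5  # 20% garbage per segment, 2:1 compression
    print(f"s  = {s_ratio(r, g):.3f}")   # 0.625 -> outsource uses less space
    print(f"w1 = {w1_ratio(r, g):.3f}")  # 2.500 -> more write-amp with doublewrite
    print(f"w2 = {w2_ratio(r, g):.3f}")  # 1.250 -> atomic writes cut that in half
```

With these inputs the outsource approach saves space (s < 1) but still writes more (w1, w2 > 1), and getting atomic page writes for free halves the write-amp penalty - which is the tl;dr above in numbers.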