Wednesday, October 23, 2019

Write vs space amplification for an LSM and index+log

There is an inverse relationship between write and space amplification for the LSM and index+log index structures -- with more space (write) there is less write (space) amplification. I expect that the tradeoff is better for index+log than for an LSM with leveled compaction which means that index+log has less write-amp for a given amount of space-amp. It will be interesting to determine whether my expectation is true.

My expectation about index+log is based on the following assumptions and all models are wrong, some are useful applies here.
  • For index+log write-amp is 100 / (100-pfull), space-amp is 100/pfull, pfull is the percentage of the device capacity that can be used and (100-pfull) is the percentage of device capacity set aside to reduce write-amp.
  • For an LSM with leveled compaction write-amp is f * L * total-fo^(1/L), space-amp is 1 + 1/total-fo^(1/L), f is ~0.8 (see this paper), L is the number of levels and total-fo is the total fanout -- size(database) / size(memtable)
The following chart displays write-amp as a function of space-amp assuming that total fanout is 1024 for the LSM. The numbers are based on the formulas above. It shows that write-amp for index+log is always less than write-amp for the leveled LSM for a similar amount of space-amp. Of course lunch isn't free and you usually pay elsewhere with index+log via more cache-amp and more expensive range scans.


Capacity vs Endurance

There is another way to illustrate the relationship between write-amp and space-amp for index+log. I use %cap and %end to indicate the percentage of device capacity and device endurance that is available to a user after accounting for the impact of pfull. When pfull is larger then write-amp increases and space-amp decreases.

It is interesting that %cap + %end = 100.

Assuming:
pfull is percentage of device capacity available to user
space-amp = 100/pfull
write-amp = 100/(100-pfull)
%cap = 100/space-amp

%end = 100/write-amp

Then:
%cap + %end 
  = 100/space-amp + 100/write-amp
  = 100/(100/pfull) + 100/(100/(100-pfull))
  = pfull + (100-pfull)
  = 100

pfull  write-amp  space-amp  %cap  %end
90     10          1.11      90    10
80      5          1.25      80    20
70      3.33       1.43      70    30
60      2.5        1.67      60    40
50      2          2         50    50
40      1.67       2.5       40    60
30      1.43       3.33      30    70
20      1.24       5         20    80
10      1.11      10         90    10

No comments:

Post a Comment

RocksDB on a big server: LRU vs hyperclock, v2

This post show that RocksDB has gotten much faster over time for the read-heavy benchmarks that I use. I recently shared results from a lar...