Small Datum: Geek code for database algorithms

Monday, May 14, 2018

Geek code for database algorithms

I like to read academic papers on database systems but I usually don't have time to do more than browse. If only there were a geek code for this. Part of the geek code would explain the performance vs efficiency tradeoff. While it helps to know that something new is faster, I want to know the cost of faster. Does it require more storage (tiered vs leveled compaction)? Does it hurt SSD endurance (update-in-place vs write-optimized)? Read, write, space and cache amplification are a framework for explaining the tradeoffs.
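As a rough illustration of how the amplification factors can be turned into numbers, here is a minimal sketch. It assumes the commonly used ratio definitions (bytes moved to or from storage per byte the application asked for, and bytes on disk per byte of live data). The `IoCounters` class and its field names are hypothetical, not taken from any particular database or tool. Cache amplification is left out because the post does not define it.

```python
from dataclasses import dataclass


@dataclass
class IoCounters:
    """Hypothetical counters collected over one benchmark run."""
    user_bytes_written: int      # bytes the application asked to write
    storage_bytes_written: int   # bytes actually written to the device
    user_bytes_read: int         # bytes returned to the application
    storage_bytes_read: int      # bytes actually read from the device
    logical_data_bytes: int      # size of the live (logical) data
    database_bytes_on_disk: int  # size of the database files on disk


def amplification(c: IoCounters) -> dict:
    """Return read, write and space amplification as simple ratios."""
    return {
        "read": c.storage_bytes_read / c.user_bytes_read,
        "write": c.storage_bytes_written / c.user_bytes_written,
        "space": c.database_bytes_on_disk / c.logical_data_bytes,
    }


# Example with made-up numbers, chosen only to show the arithmetic.
example = IoCounters(
    user_bytes_written=100,
    storage_bytes_written=1000,
    user_bytes_read=50,
    storage_bytes_read=200,
    logical_data_bytes=400,
    database_bytes_on_disk=600,
)
print(amplification(example))
```

With these made-up counters, an algorithm that is faster for writes but stores 1.5 bytes per byte of live data would show up as a higher space amplification, which is the kind of cost a geek code should surface.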

The next part of the geek code is to group algorithms into one of page-based, LSM, index+log or something else. I suspect that few will go into the something-else group. These groups can be used for both tree-based and hash-based algorithms, so I am redefining LSM to mean log-structured merge rather than log-structured merge tree.
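To make the grouping concrete, here is one way such a code might be encoded. This is only a sketch: the single-letter tags, the `Family` and `Structure` enums, and the output format of `geek_code` are all invented for illustration and are not a proposed standard.

```python
from enum import Enum


class Family(Enum):
    """The four groups named in the post; the letter tags are made up."""
    PAGE_BASED = "P"
    LSM = "L"
    INDEX_LOG = "I"
    OTHER = "X"


class Structure(Enum):
    """Tree-based vs hash-based; the letter tags are made up."""
    TREE = "t"
    HASH = "h"


def geek_code(family, structure, read_amp, write_amp, space_amp):
    """Build a short, hypothetical geek code string for an algorithm."""
    return (
        f"{family.value}{structure.value} "
        f"r{read_amp:g} w{write_amp:g} s{space_amp:g}"
    )


# A tree-based LSM with illustrative (not measured) amplification values.
print(geek_code(Family.LSM, Structure.TREE, 4, 10, 1.5))
```

Keeping the family and the structure as separate fields reflects the point above that the groups apply to both tree-based and hash-based algorithms.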

3 comments:

Bradley C. Kuszmaul, May 14, 2018 at 9:08 AM
Your redefinition isn't even a redefinition, since it should be "LSM Tree"
Woonhak Kang, May 25, 2018 at 2:22 PM
I had a framework to check SSD endurance inside the SSD. It uses S.M.A.R.T. info from the SSD, which is provided by vendors. If you want, I can share that with you.