Monday, May 14, 2018

Geek code for database algorithms

I like to read academic papers on database systems but I usually don't have time to do more than browse. If only there were a geek code for this. Part of the geek code would explain the performance vs efficiency tradeoff. While it helps to know that something new is faster, I want to know the cost of faster. Does it require more storage (tiered vs leveled compaction)? Does it hurt SSD endurance (update-in-place vs write-optimized)? Read, write, space and cache amplification are a framework for explaining the tradeoffs.
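As a rough illustration of that framework, here is a minimal sketch that turns raw counters into amplification factors. The `WorkloadCounters` fields and the `amplification` helper are hypothetical names made up for this example, and the ratios use common byte-based definitions that are an assumption here, not taken from any particular paper. Cache amplification is left out because it needs a definition specific to the index structure.

```python
from dataclasses import dataclass


@dataclass
class WorkloadCounters:
    """Hypothetical counters gathered while running a workload."""
    user_bytes_written: int      # bytes the application asked to write
    storage_bytes_written: int   # bytes actually written to the device
    user_bytes_read: int         # bytes returned to the application
    storage_bytes_read: int      # bytes actually read from the device
    logical_db_bytes: int        # size of the live data
    physical_db_bytes: int       # size of the database files on storage


def amplification(c: WorkloadCounters) -> dict:
    """Return write, read and space amplification as simple ratios.

    Assumed definitions: storage work (or space) divided by the
    work (or space) the application logically needed.
    """
    return {
        "write_amp": c.storage_bytes_written / c.user_bytes_written,
        "read_amp": c.storage_bytes_read / c.user_bytes_read,
        "space_amp": c.physical_db_bytes / c.logical_db_bytes,
    }


# Made-up numbers, only to show the shape of the result.
example = WorkloadCounters(
    user_bytes_written=100,
    storage_bytes_written=1000,
    user_bytes_read=50,
    storage_bytes_read=200,
    logical_db_bytes=1000,
    physical_db_bytes=1500,
)
result = amplification(example)
```

With numbers like these, a paper's claim of "faster" could be read next to what it costs in writes, reads and space.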

The next part of the geek code is to group algorithms into one of page-based, LSM, index+log or something else. I suspect that few will go into the something else group. These groups can be used for both tree-based and hash-based algorithms, so I am redefining LSM to mean log-structured merge rather than log-structured merge tree.

3 comments:

  1. Your redefinition isn't even a redefinition, since it should be "LSM Tree"

    1. I agree, but I wrote this because some people (maybe including me) assume the "tree" part.

  2. I had a framework to check SSD endurance from inside the SSD. It uses S.M.A.R.T. info from the SSD, which is provided by vendors. If you want, I can share that with you.

