First there was the RUM Conjecture - an index structure can't be optimal for all of read, write & space efficiency. It became the CRUM Conjecture when I added a C for cache amplification because some algorithms require more data in cache per indexed row if they are to be performant and DRAM is costly (see key-value separation).
Can I add another letter to it and rename it the CRUMT conjecture where T is for Tuning amplification? By this I mean the cost (in HW and my time) from repeating tests after changing config options. I spend much time repeating tests and the common reasons are 1) I made a mistake and 2) I want to change a config option. While my mistakes are hard to avoid, it would be nice to spend less time tweaking options and self-tuning index structures will be great for production.
There are efforts in place (thanks OtterTune) to apply ML to the problem. But I hope for an additional effort to move the cleverness into the DBMS. Don't treat this as a black box. Figure out cost models, add objective functions and SLAs and then build things that are better at self-tuning.
Wednesday, January 13, 2021
T is for Tuning amplification
Subscribe to:
Post Comments (Atom)
Evaluating vector indexes in MariaDB and pgvector: part 2
This post has results from the ann-benchmarks with the fashion-mnist-784-euclidean dataset for MariaDB and Postgres (pgvector) with conc...
-
This provides additional results for Postgres versions 11 through 16 vs Sysbench on a medium server. My previous post is here . The goal is ...
-
I often use HWE kernels with Ubuntu and currently use Ubuntu 22.04. Until recently that meant I ran Linux 6.2 but after a recent update I am...
-
I am trying out a dedicated server from Hetzner for my performance work. I am trying the ax162-s that has 48 cores (96 vCPU), 128G of RAM a...
No comments:
Post a Comment