Posts

Showing posts from December, 2018

New small servers for performance testing

My old NUC cluster found a new home and I downsized to 2 new NUC servers. The new server is NUC8i7beh with 16g RAM , 500g Samsung 860 EVO for the OS and 500g Samsung 970 EVO for performance. The Samsung 860 is SATA and the Samsung 970 is an m.2 device. I expect to wear out the performance devices as I have done that in the past. With the OS on a separate device I avoid the need to reinstall the OS when that happens. The new NUC has a post-Skylake CPU ( i7-8559u ), provides 4 cores (8 HW threads) compared to 2 cores (4 HW threads) in the old NUCs. I disabled turbo boost again to avoid performance variance as mentioned in the old post. I am not sure these have sufficient cooling for sustained boost and when boost isn't sustained there are frequent changed in CPU performance. I also disabled hyperthreads out of concern for both the impact from Spectre fixes and to avoid a different syscall overhead each time I update the kernel. I might use these servers to examine the impact

LSM math - size of search space for LSM tree configuration

I have written before and will write again about using 3-tuples to explain the shape of an LSM tree. This makes it easier to explain the configurations supported today and configurations we might want to support tomorrow in addition to traditional tiered and leveled compaction. The summary is that n LSM tree has N levels labeled from L1 to Ln and Lmax is another name for Ln. There is one 3-tuple per level and the components of the 3-tuple are (type, fanout, runs) for Lk (level k) where: type is Tiered or Leveled and explains compaction into that level fanout is the size of a sorted run in Lk relative to a sorted run from Lk-1, a real and >= 1 runs is the number of sorted runs in that level, an integer and >= 1 Given the above how many valid configurations exist for an LSM tree? There are additional constraints that can be imposed on the 3-tuple but I will ignore most of them except for limiting fanout and runs to be <= 20. The answer is easy - there are an infinite num

LSM math - how many levels minimizes write amplification?

How do you configure an LSM tree with leveled compaction to minimize write amplification? For a given number of levels write-amp is minimal when the same fanout (growth factor) is used between all levels, but that does not explain the number of levels to use. In this post I answer that question. The number of levels that minimizes write-amp is one of ceil(ln(T)) or floor(ln(T)) where T is the total fanout -- sizeof(database) / sizeof(memtable) When #1 is done then the per-level fanout is e when the number of levels is ln(t) and a value close to e when the number of levels is an integer. Introduction I don't recall reading this result elsewhere, but I am happy to update this post with a link to such a result. I was encouraged to answer this after a discussion with the RocksDB team and thank Siying Dong for stating #2 above while leaving the math to me. I assume the original LSM paper  didn't address this problem because that system used a fixed number of levels. One r

Pixelbook review

Image
This has nothing to do with databases. This is a review of a Pixelbook (Chromebook laptop) that I got on sale last month. This one has a core i5, 8gb RAM and 128gb storage. It runs Linux too but I haven't done much with that. I expected a lot from this given that my 2013 Nexus 7 tablet is still awesome. I have been mostly happy with the laptop but if you care about keyboards and don't like the new Macs thanks to the butterfly keyboard then this might not be the laptop for you. My 3 complaints: keyboard is hard to read. It is grey on grey and too hard to read when there is light on my back even with the backlight (backlit?) turned all the way up. I don't get it -- grey on grey. So this is a great device for using in a dark room or for improving your touch typing skills. touchpad control is too coarse grained so it is either too fast or too slow. The settings has 5 values via a slider (1=slowest, 5=fastest). I have been using it at 3 which is a bit too fast for me while 2