Small Datum: Comparative benchmarks and a question about describing data

Friday, January 24, 2020

Comparative benchmarks and a question about describing data

I enjoy working on database performance benchmarks. I also enjoy writing about benchmarketing. Some of my focus is on comparative benchmarks rather than competitive benchmarks. Let me try to distinguish them. Both try to do the right thing, as in get the best result for each DBMS and try to explain differences. I will try to use these definitions going forward.

The goal for a competitive benchmark is to show that your system is faster than their system and when it is that result will be published.
The goal for a comparative benchmark is to determine where your system is slower than their system and file feature requests for things that can be improved.

I haven't done many competitive benchmarks because I haven't been on a product team for a long time, although MyRocks was kind of a product that I was happy to promote. I have been doing comparative benchmarks for a long time and that will continue on my new job at MongoDB. My product for 10 years was making MySQL better and marketing bugs and feature requests was my job. I did Ok at that.

This post is a follow up to the slow perf movement manifesto.

A question

What is a good way to describe N metrics from a benchmark that used two configurations? I still struggle with this and seek something that is concise and consistent. By concise I want this to be easy to read and use less text. By consistent I want the comparison directions to not change -- use Oracle in the numerator in all cases.

First is the table of results that I want to describe. These numbers are made up in case any lawyers for Microsoft or Oracle read my blog. I didn't violate the DeWitt Clause but I am curious about results for Linkbench and the insert benchmark on your products.

Server	QPS	CPU/query	Cost	Bogomips
Oracle	110	75	180	5
Microsoft	100	100	100	100

I can describe this using percentages or ratios. My preferences is ratios v2.

Percentages - Oracle gets 10% more QPS, Oracle uses 25% less CPU, Oracle costs 80% more bitcoins, Oracle gets 95% fewer bogomips
Ratios v1 - Oracle gets 1.10X more QPS, Oracle uses 0.75X ... CPU, Oracle costs 1.80X more bitcoins, Oracle gets 0.05X ... bogomips
Ratios v2 - QPS ratio is 1.10, CPU ratio is 0.75, Cost ratio is 1.80, bogomips ratio is 0.05

I prefer ratios v2 because it concise -- in the example above the less concise approaches use more than one line which hurts readability. There are other challenges in the other approaches:

In the percentages approach when more and less are used there is the burden of less vs fewer
In the percentages approach I slow down when I read X% more or Y% less. I stop and think too much about percentage change even though the math is easy.
The percentages approach obscures big differences. In the example above there is a huge difference in bogomips as Microsoft gets 20X more. But this is described as 95% less for Oracle, and 95 is close to 100. Focus on 95% rather than 95% less and you miss that it is 20X.
In the ratios v1 approach the CPU difference description is awkward. Writing uses 0.75X more CPU is confusing because Oracle doesn't use more CPU. Writing uses 0.25X less CPU isn't clear. Writing uses 0.75X the CPU doesn't fit the pattern of more/less/fewer.

Small Datum

Friday, January 24, 2020

Comparative benchmarks and a question about describing data

No comments:

Post a Comment

Postgres 17.4 vs sysbench on a large server, revisited part 2