Thursday, January 16, 2020

The slow perf movement

I struggle with the presentation of performance data. This isn't about benchmarketing -- I try to explain performance and inform rather than create hype and FUD. This is about how I present results. One day I should take the time to learn from the experts on this topic.

Two frequent problems for me are:
  1. Balancing impact vs content
  2. Describing performance differences
Balancing impact vs content

I am a member of the slow perf (report) movement. I don't want to make it too easy to read a performance report. Credit for the name goes to Henrik Ingo.

It isn't easy to write a good performance report. The challenge is the right balance of content and conclusions. The conclusions create impact -- X is 3X faster than Y! -- while the content helps the conclusions look truthy. Write the content, then add the conclusions (executive summary). You don't want the reader to suffer through pages of detail trying to learn something.

Performance reports have a context. Too often the conclusion is remembered while the context is lost. It is great that X is faster than Y but I want to know the conditions for which it is true. Alas a perf report with context takes longer to read -- thus the slow perf movement.

A DBMS has a variety of bottlenecks. Whether one or another limits performance depends on the context (workload & hardware). The conclusion, X is faster than Y, doesn't always survive changes in the context.

One example of this is the use of bar charts. I add at least one to most perf blog posts but most of the data I present is in tables. Bar charts are great for impact, tables are better for information density. Too many bar charts is bad for users on mobile devices waiting for a post to download.

Describing performance differences

I am frequently confused when I read that X is 30% or 300% faster or slower than Y. My confusion is about the math used. I prefer to use ratios as they are less ambiguous to me and more concise. Assume I have 8 performance counters measured while running the same workload for MySQL and Postgres -- counter-1 to counter-8 (CPU/query, IO/query, etc).

Then I can report these as ratios where MySQL (or Postgres) is always the numerator:
  • counter-1 ratio is 0.8
  • counter-2 ratio is 2.3
Or I can report these as percentage differences, and hope we all agree on how to compute percentage change.
  • counter-1 is reduced by X%
  • counter-2 is increased by Y%
Sometimes I use a hybrid approach
  • counter-1 is reduced by X%
  • counter-2 is increased by 2.3X
I prefer the first approach. I frequently start with the second approach but then switch to the hybrid approach when there are huge differences for some counters. Then I realize that I am adding to the confusion with the hybrid approach.

Another problem with consistency when comparing two things (servers, configurations, etc) is using the value for thing A in the numerator for some comparisons but in the denominator for others. This is another mistake that I make. One reason for my mistake might be a desire to make all results have a ratio > 1 or a percentage increase when such a result implies an improvement.

How I think about perf reports

I still struggle with the presentation of performance data. Performance reports need to be efficient to write because I am short on time and likely to rewrite a report each time I make a mistake and have to repeat my tests. It needs to be easy to read because it isn't worth writing if nobody reads it. 

I am fan of geek codes (for LSM and other index structures) as they are an efficient way to convey information -- pay the cost once of learning the geek code and save time when you use it. This is great for people who read many of your perf reports and blog posts. This isn't always great for a first-time reader. 

I have layers of scripts to extract and condense performance data. My goal is one line per test with N columns of the most useful data for showing results and explaining differences. With one line per test it is easy to compare results across tests (where each test might be a different configuration or DBMS). But I risk using a too large value for N leading to too long lines.

It is easy for me to paste such data into a report and the result looks OK with a fixed width font. I usually format such tables twice in my scripts -- once as CSV, once as TSV. The CSV can be imported into a spreadsheet. The TSV can be pasted into reports. Pasting the TSV is easy. Using an editor (Google Docs, Blogger, Word, etc) to reformat into a proper table takes time. So I am reluctant to do that but formatting HTML tables could be added to my scripts.

I am reluctant to use bar charts for such data. First, because a bar chart takes up more space than a table of text. Second, because a bar chart takes more time to format. In the past I have had scripts that generate HTML or URLs for use with Google charts. I have yet to script the creation of bar charts.

No comments:

Post a Comment

Evaluating vector indexes in MariaDB and pgvector: part 2

This post has results from the ann-benchmarks with the   fashion-mnist-784-euclidean  dataset for MariaDB and Postgres (pgvector) with conc...