Tuesday, January 7, 2020

It is all about the constant factors

If I put the performance bugs I have worked on into one of three buckets -- too much big-O, too much constant factor and other -- then too much big-O would have the fewest entries. Maybe we worry too much about big-O and not enough about everything else? This post was inspired by a post and a tweet from Dan Luu.

By too much big-O I mean using an algorithm with higher computational complexity when one with lower complexity is available, and the difference causes problems in production. A common case is using an O(n^2) algorithm when O(n log n) or O(n) is possible. By too much constant factor I mean an inefficient implementation. Everything else goes into the other bucket.
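
To make the constant factor bucket concrete, here is a minimal sketch (not one of the bugs from my history): summing the same values from a std::vector and a std::list is O(n) either way, but the contiguous vector scan usually wins by a large constant factor thanks to cache locality.

#include <chrono>
#include <cstddef>
#include <cstdio>
#include <list>
#include <numeric>
#include <vector>

int main() {
  const std::size_t n = 10000000;
  std::vector<long> vec(n, 1);
  std::list<long> lst(n, 1);

  // Time a sequential sum over any container of long.
  auto time_sum = [](const auto& c) {
    auto start = std::chrono::steady_clock::now();
    long sum = std::accumulate(c.begin(), c.end(), 0L);
    auto usec = std::chrono::duration_cast<std::chrono::microseconds>(
        std::chrono::steady_clock::now() - start).count();
    std::printf("sum=%ld in %lld usec\n", sum, static_cast<long long>(usec));
  };

  time_sum(vec);  // O(n) with a small constant: contiguous, cache-friendly
  time_sum(lst);  // O(n) with a large constant: a pointer chase per element
  return 0;
}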

"It is all about the constant factors" was a reply to someone who dismissed performance concerns with "it is just a constant factor". That still amuses me. Given my classification scheme, I assume that too much big-O is the least likely cause of performance problems. This is not a rigorous study, and I wasn't able to find many too much big-O bugs in my bug history. The search is complicated because it isn't easy to search for O(n*n) -- what terms do you use: O(n*n), O(n^2), squared?
  1. Pagination is O(n*n) - this happened several times in production, so it counts as many bugs. Be careful with pagination queries.
  2. O(n*n) parsing in MySQL client - this was MySQL bug 34578
  3. O(n*n) allocation on array resize - Dan Luu explained an example of this. I have also encountered it, but my memory of the details is vague. A sketch of the problem class follows this list.
  4. O(n*n) common subexpression elimination - while working on a closed-source DBMS I fixed code that did CSE with an O(n*n) algorithm. It fell over on machine-generated SQL that had thousands of terms in the WHERE clause.
  5. O(n*n) wakeup in InnoDB synchronization - many years ago the InnoDB rw-lock and mutex had O(n*n) overhead for n threads trying to get through them (lock/unlock) because of the way that wake-all on unlock was implemented. See MySQL bug 52806.
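
For the array resize entry (item 3 above), here is a minimal sketch of the problem class, not the code from Dan Luu's example: growing a buffer by a fixed increment copies O(n*n) elements across n appends, while doubling the capacity keeps the total copy cost at O(n). The growth sizes here are made up for illustration.

#include <cstdio>

// Count element copies needed for n appends under a growth policy.
// A real resize would memcpy the existing elements, which is what we count.
static long long copies_for(long long n, bool doubling) {
  long long capacity = 0, size = 0, copied = 0;
  for (long long i = 0; i < n; i++) {
    if (size == capacity) {
      copied += size;  // existing elements are copied to the new buffer
      capacity = doubling ? (capacity ? capacity * 2 : 1)
                          : capacity + 1024;  // fixed increment of 1024
    }
    size++;
  }
  return copied;
}

int main() {
  const long long n = 1000000;
  std::printf("fixed increment: %lld element copies\n", copies_for(n, false));
  std::printf("doubling:        %lld element copies\n", copies_for(n, true));
  return 0;
}
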
Perhaps shared_ptr deserves its own bucket. I wrote about finding shared_ptr perf bugs in MyRocks and MongoDB. I found at least one more in MyRocks, someone else reported one for RocksDB and I have seen a few more problems from this in production.
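
A minimal sketch of the shared_ptr problem class -- hypothetical code, not the actual MyRocks, MongoDB or RocksDB fixes: passing a shared_ptr by value on a hot path pays for an atomic reference-count increment and decrement on every call, while passing by const reference avoids that.

#include <cstdio>
#include <memory>

struct Config { int level = 0; };

// Copies the shared_ptr: one atomic increment on entry, one decrement on return.
static int by_value(std::shared_ptr<Config> c) { return c->level; }

// No reference-count traffic; fine when the caller keeps the object alive.
static int by_ref(const std::shared_ptr<Config>& c) { return c->level; }

int main() {
  auto cfg = std::make_shared<Config>();
  long long sum = 0;
  for (int i = 0; i < 50000000; i++) {
    sum += by_value(cfg);  // pays the atomic ref-count cost on every call
    // sum += by_ref(cfg); // the cheaper alternative
  }
  std::printf("%lld\n", sum);
  return 0;
}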

3 comments:

  1. https://bugs.mysql.com/bug.php?id=84958
    (InnoDB's MVCC has O(N^2) behaviors)

    https://bugs.mysql.com/bug.php?id=67044
    (Secondary index updates make consistent reads do O(N^2) undo page lookup)

  2. This reminded me of MySQL bug 49047, reported by Harrison - deadlock detection could be O(n^2), and that was later improved.

    Big-O is useful, but I agree, maybe overly relied-on. One algorithm might perform much better on real datasets, but have a worse big-O because of particular, unusual worst-case scenarios. There are so many factors besides worst-case analyses.

    I like to point out my O(1) - for a fixed disk size - lookup algorithm, better than any index, apparently - scan the entire disk for your entry.

    Whether you scan RAM or SSD or HDD is "just a constant factor" for performance, so who needs caches, buffers, pre-fetching, etc?
