Thursday, October 15, 2020

Better vs optimal

I read many systems papers and appreciate how research makes my career more interesting. Some interesting papers use optimal when they mean better in describing performance and I wonder if I am an outlier in thinking that optimal is occasionally misused. Optimal is formal and requires one of:

  • A proof
  • An objective function and math to show where it is minimized or maximized
  • Exhaustive search
In the context of tuning a DBMS, exhaustive search is rarely done because the search space is large. Assume there are N tuning options that each use a B-bit integer, then the search space has N*B bits and N*B quickly becomes a too-big number for all of the systems on which I work.

Objective functions and math are wonderful but they work on performance models and it is a challenge to accurately model a complex system. Proofs share that challenge.

Fortunately, there is clever work in progress to get better results without doing exhaustive search. Alas those results are better, not optimal. And now an unsponsored ad -- I appreciate that OtterTune is not only doing interesting work but they have been careful about the distinction between better and optimal.

5 comments:

  1. Optimal in a generic term probably never happens, so guess that should never be used :) I have had a few cases in my life where I felt that I had achieved the optimal performance. One case was in building a networking product where I defined optimality as always achieving to handle the interrupts before they were gone. There was 2 RS 485 interrupts per 88 microseconds and 2 RS 232 interrupts per 500 microseconds and the CPU was running at a clock speed of 1 MHz with most instructions consuming 2-5 cycles.

    The second case was in building an interpreter where I felt that I achieved an optimal implementation in the context of an interpreter not doing any major compilation efforts. Obviously in a wider definition a just-in-time compiler can far outperform such an interpreter, but in my narrow definition of implementing an interpreter I felt that I achieved an optimal implementation.

    Within NDB I have been getting close to an optimal implementation, obviously again this optimality is under the condition that one has to implement things the way I decided to do in NDB, in the specific architecture selected, with the selected data structures and so forth.

    So under specific conditions optimality can be achieved, but I don't think any computer program can live up to the conditions stated in your
    blog, this can only happen with narrow mathematical problems. So for me optimality is more when I feel satisfied that I have done everything possible to achieve the best performance under the conditions I decided.

    At reaching such a point one is ready to move on either to a new project or by redefining the conditions.

    ReplyDelete
    Replies
    1. Thank you. I hope there will be more posts about the history of NDB.

      Delete
    2. I have also read several papers that provide a model of the system and then show optimality within the model. I like that and have learned both from the model and then the math techniques to show optimality. Models are useful even when not perfect.

      Delete
  2. I think 'optimal' generally means 'state of the art' for a given algorithm, or comparison of algorithms. Just recently a more 'optimal' algorithm was discovered for the 'travelling salesperson' problem, for example, but that does not more that a more optimal solution won't be discovered in the future

    ReplyDelete
    Replies
    1. Definitely true for algorithms. But harder to apply to systems composed of several algorithms, caching, non-uniform distribution and so many constant factors.

      Delete

RocksDB on a big server: LRU vs hyperclock, v2

This post show that RocksDB has gotten much faster over time for the read-heavy benchmarks that I use. I recently shared results from a lar...