MySQL is frequently used as the comparison system in papers, and when I read the results I often think that my MySQL is faster than their MySQL. Then I doubt the results, which makes me doubt the paper, and the paper has less impact. We can fix that. Get an MBA (MySQL Benchmark Advisor) to consult on your usage of MySQL before submitting the paper for review.
Benchmarking is hard. For an example, see some of the things you should consider when doing performance tests for the LevelDB family. It gets harder as the number of products being compared increases. One difference between benchmarking and benchmarketing is that a benchmark explains the difference in performance. In the context of an academic paper the new idea is almost always faster, otherwise the paper would not get published (see this post for more on the issue). It might be faster because it is better. Or it might be faster because:
- it was compared to products that were misconfigured
- the test wasn't run long enough to fragment the file structure (b-tree, LSM, etc)
- the test wasn't run long enough to force flash garbage collection to start
- the test database was too small relative to the storage device size
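A minimal sketch of how the pitfalls in this list might be turned into mechanical checks before trusting a result. The `RunSummary` fields, the function name and both thresholds are hypothetical and chosen only for illustration. Using "device capacities written" as a stand-in for whether flash garbage collection and fragmentation have started is also an assumption, not a measured rule.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """Hypothetical description of one benchmark run (all names are illustrative)."""
    database_bytes: int    # size of the loaded database on disk
    device_bytes: int      # capacity of the storage device under test
    bytes_written: int     # total bytes written to the device during the run
    config_reviewed: bool  # did someone who knows the product review its config?


def benchmark_warnings(run: RunSummary,
                       min_fill_ratio: float = 0.5,
                       min_device_overwrites: float = 1.0) -> list[str]:
    """Return one warning per pitfall that the run appears to hit.

    Both thresholds are assumptions; choose values that make sense for
    the device and storage engine being tested.
    """
    warnings = []
    if not run.config_reviewed:
        warnings.append("configuration was not reviewed by someone who knows the product")

    # Pitfall: database too small relative to the storage device.
    fill_ratio = run.database_bytes / run.device_bytes
    if fill_ratio < min_fill_ratio:
        warnings.append(f"database fills only {fill_ratio:.1%} of the device")

    # Pitfall: test too short to fragment the file structure or start flash GC.
    overwrites = run.bytes_written / run.device_bytes
    if overwrites < min_device_overwrites:
        warnings.append(f"only {overwrites:.2f} device capacities written; "
                        "fragmentation and flash GC may not have started")
    return warnings
```

An empty list does not mean the result is right. It only means these particular checks did not fire.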
I have a lot of experience with benchmarks for MySQL, RocksDB and locally attached storage. This means I have made a lot of mistakes and sometimes learned from them. I have also published and retracted bogus results, repeated many tests and spent a lot of time explaining what I see. In my case I have to explain the result if I want to fix the problem or expect the vendor to fix the problem. Performance results in the MySQL community are subject to peer review. Much of that happens in public when other gurus question, praise or debug your results.
MySQL and InnoDB are a frequent choice when comparisons are needed for a conference paper. I have read more than one paper with results that I don't trust. There is a lot of ambiguity because papers rarely include enough information to repeat the test (client source, my.cnf, steps to run the test, storage description, ...), and repeatability is an open problem for the academic database community.
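One way to reduce that ambiguity is to publish a small manifest next to the results. The sketch below records the items named above: the my.cnf, the client source, the steps and a storage description. The function names, the manifest layout and the choice of SHA-256 checksums are assumptions for illustration, not an established format.

```python
import hashlib
import json
import platform
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum a file so readers can confirm they have the same bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_manifest(my_cnf: Path, client_source: Path,
                   steps: list[str], storage: str) -> dict:
    """Collect what a reader would need to repeat the test (hypothetical layout)."""
    return {
        # The full my.cnf is embedded because it is small and decisive.
        "my_cnf": {
            "path": str(my_cnf),
            "sha256": sha256_of(my_cnf),
            "contents": my_cnf.read_text(),
        },
        # The client is referenced by checksum; publish the source alongside.
        "client_source": {
            "path": str(client_source),
            "sha256": sha256_of(client_source),
        },
        "steps": steps,       # commands or prose, in the order they were run
        "storage": storage,   # free-text description of the storage device
        "host": {"os": platform.platform(), "machine": platform.machine()},
    }


def write_manifest(manifest: dict, out_path: Path) -> None:
    """Write the manifest as JSON next to the benchmark results."""
    out_path.write_text(json.dumps(manifest, indent=2))
```

The manifest is only as useful as its `steps` and `storage` entries, which still have to be written by hand.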
So this is an open offer to the academic database & systems community. The MySQL community (at least me) is willing to offer advice on results, tuning & performance debugging. This offer is limited to the academic community. Startups or vendors are best served by the excellent consultants in the MySQL community. I expect similar degrees to be created for MongoDB (MBA) and PostgreSQL (PBA).
Sign me up, I'm happy to participate on both sides (reviewing benchmarks and having mine reviewed).
TokuDB and TokuMX already gets peer review after results are published.
OK, edit that. I'm happy to review.
This is a very generous offer. I hope that you have a lot of people take you up on it.
I have already granted 2 more MBA degrees - one to Dimitri Kravchuk (http://dimitrik.free.fr/blog) and another to Tim Callaghan (https://twitter.com/tmcallaghan).
I will grant Domas an MBA too. His first degree!