tag:blogger.com,1999:blog-9149523927864751087.post3591324338431467566..comments2024-03-26T09:43:01.052-07:00Comments on Small Datum: Peak benchmarketing season for MySQLMark Callaghanhttp://www.blogger.com/profile/09590445221922043181noreply@blogger.comBlogger11125tag:blogger.com,1999:blog-9149523927864751087.post-18813278327045415332016-09-16T10:36:23.277-07:002016-09-16T10:36:23.277-07:00Good question. Our focus is on OLTP, but maybe som...Good question. Our focus is on OLTP, but maybe someone in the community will add it. The API is too simple today to take advantage of bitmap-and, bitmap-or, etc so that would have to be designed.Mark Callaghanhttps://www.blogger.com/profile/09590445221922043181noreply@blogger.comtag:blogger.com,1999:blog-9149523927864751087.post-21818619733585501352016-09-16T09:12:28.418-07:002016-09-16T09:12:28.418-07:00You mentioned using the Rocks API for new engines....You mentioned using the Rocks API for new engines. Would it be possible to instead add a new index type to RocksDB? Justin Swanharthttps://www.blogger.com/profile/08193089637089861226noreply@blogger.comtag:blogger.com,1999:blog-9149523927864751087.post-83613410020812614772016-09-16T08:45:48.987-07:002016-09-16T08:45:48.987-07:00Didn't know it was open source. I read about b...Didn't know it was open source. I read about bitmap index work LBL was doing when I was at Oracle. LBL did word-aligned, Oracle citations implied byte-aligned and the person behind the work at Oracle was highly productive - http://dl.acm.org/citation.cfm?id=874730<br /><br />I never got to evaluate word vs byte aligned. My memory is that word aligned saves CPU (faster queries), byte aligned saves disk space. 
Once again the RUM Conjecture applies -- can't be optimal for read, write and space.Mark Callaghanhttps://www.blogger.com/profile/09590445221922043181noreply@blogger.comtag:blogger.com,1999:blog-9149523927864751087.post-31647905064916784972016-09-16T08:40:45.119-07:002016-09-16T08:40:45.119-07:00Have you seen this fun hack, embedded a bitmap ind...Have you seen this fun hack, embedded a bitmap-index-based RDBMS through the UDF interface:<br />https://github.com/greenlion/FastBit_UDFJustin Swanharthttps://www.blogger.com/profile/08193089637089861226noreply@blogger.comtag:blogger.com,1999:blog-9149523927864751087.post-71255947140618556172016-09-16T08:39:39.555-07:002016-09-16T08:39:39.555-07:00The SSB is based on the TPC-H so it has a "...The SSB is based on the TPC-H so it has a "scale factor". Scale factor 1 is 512MB of data, scale factor 20 is 10GB, etc. I used scale factor one to find an interesting difference in performance of the benchmark when the defaults changed between versions:<br />https://www.percona.com/blog/2013/03/11/mysql-5-6-vs-5-5-on-the-star-schema-benchmark/<br /><br />The point of the SSB is to test join and filtering performance of databases. Star schema are particularly bad for nested-loop joins, and proprietary RDBMS like Oracle have specialized join strategies for star schema. <br /><br />I generally test it with a MySQL capable of hash joins and excellent compression (ICE). The queries that Shard-Query generates are simple enough to avoid any bugs in ICE, and there are workarounds for any other bugs I encountered in it, so it is an excellent choice for data marts. 
The only downside is that it doesn't support partitioning, thus there is currently a need to shard data over multiple schema for shard-query parallelism.<br /><br />It is intended normally to restart the database and flush the filesystem buffers between each query run, but I don't generally do this for most of my tests, because I'm simply comparing parallel to serial performance, or in the case of redshift, you really can't flush anything.Justin Swanharthttps://www.blogger.com/profile/08193089637089861226noreply@blogger.comtag:blogger.com,1999:blog-9149523927864751087.post-22093054258351149292016-09-16T08:24:52.204-07:002016-09-16T08:24:52.204-07:00With InnoDB you usually get compression or perform...With InnoDB you usually get compression or performance, you rarely get both. In an odd case I might try to use compressed InnoDB for in-memory -- if that lets me fit all data in memory.<br /><br />With MyRocks you don't have to worry as much as enabling compression doesn't have a huge impact on performance. When we also get zstandard into production then the impact from compression is even less.<br /><br />How big is a database with SSB? I used the star schema benchmark many years ago when I worked on bitmap indexes at Oracle. I met Patrick O'Neil many years before that when I worked at Informix. Now I work on LSM. Prof. O'Neil made major contributions to LSM, bitmap indexes, and the star schema benchmark. Mark Callaghanhttps://www.blogger.com/profile/09590445221922043181noreply@blogger.comtag:blogger.com,1999:blog-9149523927864751087.post-12117537097050376012016-09-16T07:20:05.085-07:002016-09-16T07:20:05.085-07:00>MyRocks can still be faster than InnoDB for in...>MyRocks can still be faster than InnoDB for in-memory workloads. That is more likely when the bottleneck for InnoDB is page write-back performance. So write-heavy/in-memory can still be a winner for MyRocks.<br /><br />I wanted to clarify if this was with or without compression in InnoDB. 
I generally wouldn't use compression for InnoDB for an in-memory workload, but I assumed this would be slower than non-compressed (uses CPU and disk). I wondered if MyRocks in this particular instance was as fast as even non-compressed InnoDB, because IO is so much slower than in-memory operations, and InnoDB does a lot of "extra" IO in that workload. It is impressive that a compressed engine can compete in all of the areas that you speak about: QOS, efficiency, and QPS.<br />Justin Swanharthttps://www.blogger.com/profile/08193089637089861226noreply@blogger.comtag:blogger.com,1999:blog-9149523927864751087.post-25182419029009638672016-09-16T06:59:36.552-07:002016-09-16T06:59:36.552-07:00We see 2X better compression and less than 1/10 th...We see 2X better compression and less than 1/10 the write rate to storage for MyRocks vs compressed InnoDB using both linkbench and production, if that is what you are asking about.Mark Callaghanhttps://www.blogger.com/profile/09590445221922043181noreply@blogger.comtag:blogger.com,1999:blog-9149523927864751087.post-72949679642029151462016-09-16T06:43:15.351-07:002016-09-16T06:43:15.351-07:00I meant specifically in this case, when you are ta...I meant specifically in this case, when you are talking about one particular in-memory workload being comparable to InnoDB. Did you mean compressed or un-compressed InnoDB? <br /><br />I am going to test MyRocks, InnoDB compressed, and TokuDB on the SSB benchmark w/ shard-query when the data size is significantly larger than the buffer pool size. I've only published the results of TokuDB in memory, and that was a while ago. I got surprisingly good SSB results from TokuDB when data was larger than memory. This benchmark isn't likely to start any wars, as it doesn't represent the typical workload unless you are using Shard-Query, Spark (which Percona recently blogged about), or some other tool that can do parallel scans of partitions. 
Only Shard-Query can push down all operations including aggregation, joins, and the finest possible grain of filtering.<br /><br />Justin Swanharthttps://www.blogger.com/profile/08193089637089861226noreply@blogger.comtag:blogger.com,1999:blog-9149523927864751087.post-19178891921369315822016-09-15T13:33:28.045-07:002016-09-15T13:33:28.045-07:00I usually compare to both and do my best to explai...I usually compare to both and do my best to explain that in the perf reports. I haven't shared many benchmark results this year as I didn't want to start any benchmarketing battles.Mark Callaghanhttps://www.blogger.com/profile/09590445221922043181noreply@blogger.comtag:blogger.com,1999:blog-9149523927864751087.post-29713940527360576572016-09-15T13:16:48.493-07:002016-09-15T13:16:48.493-07:00When you compare MyRocks to InnoDB, do you mean In...When you compare MyRocks to InnoDB, do you mean InnoDB with or without compression?Justin Swanharthttps://www.blogger.com/profile/08193089637089861226noreply@blogger.com