Posts

Showing posts from April, 2018

MyRocks, malloc and fragmentation -- a strong case for jemalloc

While trying to reproduce a MyRocks performance problem I ran a test using a 4gb block cache and tried both jemalloc and glibc malloc. The test server uses Ubuntu 16.04 which has glibc 2.23 today. The table below lists the VSZ and RSS values for the mysqld process after a test table has been loaded. RSS with glibc malloc is 2.6x larger than with jemalloc. MyRocks and RocksDB are much harder on an allocator than InnoDB and this test shows the value of jemalloc. VSZ(gb) RSS(gb) malloc     7.9     4.8     jemalloc-3.6.0 13.6     12.4     glibc-2.23 I am not sure that it is possible to use a large RocksDB block cache with glibc malloc, where large means that it gets about 80% of RAM. I previously shared results  for MySQL and for MongoDB . There have been improvements over the past few years to make glibc malloc perform better on many-core servers. I don't know whether that work also made it better at avoiding fragmentation.

Fun with caching_sha2_password in MySQL 8.0.11

I want to get benchmark numbers with MySQL 8.0.11. This is my first impression. The default auth method was changed to caching_sha2_password. See this post for more details. There will be some confusion with this change. By confusion I mean the difference between "error" and "OK because cached" below. I am not alone. See the experience that an expert had with replication . Fun with caching_sha2_password occurs even with clients compiled as part of 8.0.11: install MySQL 8.0.11, disable SSL but use mostly default my.cnf bin/mysql -u... -p... -h127.0.0.1 -e ... -> error bin/mysql -u... -p... -e ... -> OK bin/mysql -u... -p... -h127.0.0.1 -e ... -> OK because cached The error in step 2 is:  ERROR 2061 (HY000): Authentication plugin 'caching_sha2_password' reported error: Authentication requires secure connection. From show global variables I see the default is caching_sha2_password: default_ auth entication_plugin   caching_sha2_p

Index Structures, Access Methods, whatever

I prefer to use Index Structures when writing about algorithms for indexing data stored on disk and SSD but Access Methods is another popular name. Recently I have been working on a high-level comparison (lots of hand waving) of index structures in terms of read, write, space and cache amplification . I started by dividing the index structures into tree-based and hash-based. Tree-based support range queries while hash-based do not. Most of my experience is with tree-based approaches. There was a lot of research on hash-based in the 80s (extendible, dynamic and linear hashing), they are in production today but not as prominent as b-trees, and there is recent research on them. This paper by Seltzer and Yigit is a great overview on hash-based index structures. The next classification is by algorithm - page-based, index+log and LSM. I use this for both tree-based and hash-based so LSM means Log Structured Merge rather than Log Structured Merge Tree. Most DBMS implement one or two of