Tuesday, December 10, 2019

Slides for talks I have given on MySQL, MongoDB and RocksDB

It all started for me with the Google patch for MySQL in April 2007. The Register summary of that included a Cringely story repeated by Nick Carr that Google might have shared the patch as part of a plan to dominate IT via cloud computing. I thought that was ridiculous. AWS said hold my beer and brought us Aurora.

I donated to the Wayback Machine to offset my HW consumption for those links.

A list of talks from the RocksDB team is here.

This is an incomplete list of slide decks and videos from me:

Historical - summary of the Google MySQL effort

This summarizes work we did on MySQL at Google. These posts used to be shared at code.google.com but it was shutdown. After reformatting most of these (it was a fun day for me, but sorry for the spam) I remembered that somone had already done that in 2015. Thank you.

It all started for me with the Google patch for MySQL in April 2007. The Register summary of that included a Cringely story repeated by Nick Carr that Google might have shared the patch as part of a plan to dominate IT via cloud computing. I thought that was ridiculous. AWS said hold my beer and brought us Aurora.

My reformatted posts:

Posts from the git wiki pages via my fork of upstream:

Historical - InnoDB IO Performance

This post was shared on code.google.com many years ago but code.google has been shutdown. It describes work done by my team at Google. I am interested in the history of technology and with some spare time have been enable to republish it.

This is a collection from several posts about InnoDB IO performance

Max dirty pages

InnoDB provides a my.cnf variable, innodb_max_dirty_pages_pct, to set the maximum percentage of buffer pool pages that should be dirty. It then appears to ignores said variable for IO bound workloads (see this post from DimitriK). It doesn't really ignore the value. The problem is that it does not try hard enough to flush dirty pages even when there is available IO capacity. Specific problems include:
  • one thread uses synchronous IO to write pages to disk. When write latency is significant (because O_DIRECT is used, SATA write cache is disabled, network attached storage is used, ext2 is used) then this thread becomes a bottleneck. This was fixed in the v2 Google patch and is also fixed in MySQL 5.4 and Percona builds.
  • rate limits are too small. InnoDB has a background thread that schedules writes for 100 dirty pages at a time when there are too many dirty pages. The limit of 100 is reasonable for a single-disk server. It must be larger for a high IOPs server. The v2 Google patch, MySQL 5.4 (maybe) and Percona branches use the innodb_io_capacity my.cnf variable to determine the number of pages per second that should be written for this case and the amount of IO that should be done in other cases. All work is expressed as a fraction of this variable, rather than as a fixed number of IO operations.
  • request arrays are too small. On my servers, each array has 256 slots. For a server that can do 1000 IOPs, this is too small. The v4 patch makes the size of the array a function of the value of innodb_io.capacity
  • user sessions are not constrained. InnoDB delay's user sessions when the purge thread gets too far behind. Otherwise, not much is done to delay a user session. The v4 patch adds code to force user session's to stop and flush dirty pages when the maximum number of dirty pages has been exceeded. Hopefully, this code does nothing as the background thread is more likely to keep up given other changes in the v4 patch.

IO Performance

This provides performance results for work done to improve InnoDB IO performance. TODO - fix the links:

It is one thing to publish performance results. It is another to understand them. The results here need more analysis and the code needs to be tested by others in the community.

This describes work to make InnoDB faster on IO bound workloads. The goal is to make it easy to use InnoDB on a server that can do 1000 to 10000 IOPs. Many of problems must be fixed for that to be possible, but this is a big step towards that goal. These changes improve performance by 20% to more than 400% on several benchmarks. At a high level, these changes make InnoDB:
  • more efficient when processing IO requests
  • more likely to use available IO capacity
  • better at balancing different IO tasks
  • easier to monitor

One day, Heikki will write the Complete Guide to InnoDB (edit - Jeremy Cole did a lot to explain them), until then you need to consult multiple sources to understand the internals. It also helps to read the source code. These may help you to understand it:

Features

  • Changes the computation of the percentage of dirty buffer pool pages. Before this change the percentage exluded pages borrowed from the buffer pool for other uses. While that may be more accurate, it also requires the caller to lock/unlock a hot mutex. It also made the percentage vary a bit too much as the insert buffer grew and shrank. The v4 patch doesn't exclude the borrowed pages. As most of the borrowed pages should be used in the insert buffer and the insert buffer should be smaller (thanks to ibuf_max_pct_of_buffer), this is probably a good thing.
  • (edit removed many links to other project pages)

Background IO

InnoDB starts a thread, the main background IO thread, to perform background IO operations. This has operations that run once per second, once per 10 seconds and only when the server is idle. This is implemented with a for loop that iterates 10 times. Each time through the loop, the thread sleeps for 1 second unless too much work was done on the previous iteration of the loop. At the end of 10 iterations, the once per 10 seconds tasks are run.

It is hard to understand the behavior of this loop because the sleep is optional dependent on the amount of work done on the previous iteration of the loop. And there are costs from this complexity. For example, one of the 1 second tasks is to flush the transaction log to disk to match the expected behavior from innodb_flush_log_at_trx_commit=2. However, when the 1 second loop runs much more frequently than once per second there will be many more fsync calls then expected.

In the v4 patch, the sleep is not optional. Other changes to the main background IO thread make it possible for each loop iteration to do enough work that there is no need to skip the sleep.

In the v4 patch all of the code that submits a large number of async IO requests makes sure that the number of requests does not exceed the number of free slots in the array. Otherwise, the async IO requests block until there are free slots.

CPU overhead from IO

There are several factors that consume CPU time during IO processing:
  • checksum computation and verification - the v4 patch does not make this faster. Using -O3 rather than -O2 with gcc makes this faster. On a server that does 10,000 IOPs, this will consume a lot of CPU time. Domas wrote about this. We may need to consider alternative checksum algorithms and machine-specific optimizations.
  • request array iteration - InnoDB maintains requests for IO in an array. It frequently iterates on the array and called a function to get the next element in the array. That has been changed to use pointer arithmetic. This makes a big difference when the array is large.
  • request merging - InnoDB merges requests for adjacent blocks so that one large IO operation is done instead of several page size operations. Up to 64 page requests can be merged into one large (1MB) request. The merging algorithm was O(N*N) on the size of the request array and has been changed to be O(N). This will merge fewer requests but use much less CPU. A better change might be to replace each array with two lists: one that maintains requests in file order and the other in arrival order. But that must wait for another day.

my.cnf options for IO performance

These InnoDB my.cnf variables are new in the Google patches:
  • innodb_max_merged_io - maximum number of IO requests merged to issue large IO from background IO threads
  • innodb_read_io_threads - number of background read I/O threads in InnoDB
  • innodb_write_io_threads - number of background write I/O threads in InnoDB
  • innodb_adaptive_checkpoint - makes the background IO thread flush dirty pages when are there old pages that will delay a checkpoint. OFF provides traditional behavior
  • innodb_check_max_dirty_foreground - make user sessions flush some dirty pages when innodb_max_dirty_pages_pct has been exceeded. OFF provides traditional behavior
  • innodb_file_aio_stats - compute and export per-file IO statistics for InnoDB
  • innodb_flush_adjacent_background - when background IO threads flush dirty pages, flush adjacent dirty pages from the same extent. ON provides traditional behavior.
  • innodb_flush_adjacent_foreground - when user sessions flush dirty pages, flush adjacent dirty pages from the same extent. ON provides traditional behavior
  • innodb_ibuf_flush_pct - percent of innodb_io_capacity that should be used for prefetch reads used to merge insert buffer entries
  • innodb_ibuf_max_pct_of_buffer - soft limit for the percent of buffer cache pages that can be used for the insert buffer. When this is exceeded background IO threads work harder to merge insert buffer entries. The hard limit is 50%. The traditional value is 50%.
  • innodb_ibuf_reads_sync - use sync IO to read blocks for insert buffer merges. ON provides traditional behavior. 
  • innodb_io_capacity - maximum number of concurrent IO requests that should be done to flush dirty buffer pool pages. CAUTION -- setting this too high will use a lot of CPU to schedule IO requests and more than 1000 might be too high. The traditional value is 100.

Insert Buffer Improvements

InnoDB performance on many IO bound workloads is much better than expected because of the insert buffer. Unfortunately, InnoDB does not try hard enough to keep the insert buffer from getting full. And when it gets full it kills performance because it continues to use memory from the buffer pool but cannot be used to defer IO for secondary index maintenance.

The v4 patch has several changes to fix this:
  • the my.cnf variable innodb_ibuf_max_pct_of_buffer specifies a soft limit on the size of the buffer pool. The hard limit is 50%. When the hard limit is reached no more inserts are done to the insert buffer. When the soft limit is reached, the main background IO thread aggressively requests prefetch reads to merge insert buffer records.
  • the my.cnf variable innodb_ibuf_flush_pct specifies the number of prefetch reads that can be submitted at a time as a percentage of innodb_io_capacity. Prior to the v4 patch, InnoDB did 5 prefetch read requests at a time and this was usually done once per second.
  • the my.cnf variable innodb_ibuf_reads_sync determines whether async IO is used for the prefetch reads. Prior to the v4 patch, sync IO was used for the prefetch reads done to merge insert buffer records. This variable was added for testing as the default value (skip_innodb_ibuf_reads_sync) should be used in production.
  • code is added to delay user sessions and make them merge insert buffer records when the size of the insert buffer exceeds the soft limit.

Freeze InnoDB IO

This feature wasn't useful in production. It added the commands:
  • set global innodb_disallow_writes=ON
  • set global innodb_disallow_writes=OFF

These enable and disable all Innodb file system activity except for reads. If you want to take a database backup without stopping the server and you don't use LVM, ZFS or some other storage software that provides snapshots, then you can use this to halt all destructive file system activity from InnoDB and then backup the InnoDB data files. Note that it is not sufficient to run FLUSH TABLES WITH READ LOCK as there are background IO threads used by InnoDB that may still do IO.

Async IO for InnoDB

InnoDB supports asynchronous IO for Windows. For Linux, it uses 4 threads to perform background IO tasks and each thread uses synchronous IO. There is one thread for each of:
  • insert buffer merging
  • log IO
  • read prefetch requests
  • writing dirty buffer cache pages
InnoDB issues prefetch requests when it detects locality in random IO and when it detects a sequential scan. However, it only uses one thread to execute these requests. Multi-disk servers are best utilized when more IO requests can be issued concurrently.

For deployments that use buffered IO rather than direct IO or some type of remote disk (SAN, NFS, NAS), there is not much of a need for more write threads because writes complete quickly into the OS buffer cache. However, as servers with many GB of RAM are used, it is frequently better to use direct IO.

We changed InnoDB to support a configurable number of background IO threads for read and write requests. This is controlled by the parameters:
  • innodb_max_merged_io - Max number of IO requests merged to issue large IO from background IO threads
  • innodb_read_io_threads - the number of background IO threads for read prefetch requests
  • innodb_write_io_threads - the number of background IO threads for writing dirty pages from the buffer cache

Historical - Patch for MySQL 5.0

This post was shared on code.google.com many years ago but code.google has been shutdown. It describes work done by my team at Google. I am interested in the history of technology and with some spare time have been enable to republish it.

This describes the patch for MySQL 5.0 provided by my team at Google. The early patches from us were difficult for others because they tended to include too many diffs. I didn't have time to better. After I moved to FB Steaphan Greene undertook the heroic effort to do proper diff management and the team has continued to follow his example.

Introduction

The code has been changed to make MySQL more manageable, available and scaleable. Many problems remain to be solved to improve SMP performance. This is a good start. The v3 patch and all future patches will be published with a BSD license which applies to code we have added and changed. Original MySQL sources has a GPL license.

I am not sure if the patches were lost after the googlecode shutdown.

These have the same functionality as the MySQL 4 patches. There are several patch sets:
  • v1 patch published in 2007
  • v2 patch with all of our changes for MySQL 5.0.37
  • v3 patch with all of our changes for MySQL 5.0.37 as of May 6, 2009. This adds global transaction IDs, row-change logging and more InnoDB SMP performance fixes.
  • v4 patch [http://google-mysql-tools.googlecode.com/svn/trunk/old/mysql-as of June 1, 2009
  • semisync v1 patch published in 2007
  • mutexstats patch MySQL 5.1.26
  • SMP perf patch for MySQL 5.0.67. This has two changes:
    • use atomic memory instructions for the InnoDB mutex and rw-mutex. This is only done for x86 platforms that use a recent (>= 4.1) version of GCC.
    • disable the InnoDB memory heap. This is done for all platforms
  • SMP plugin for the InnoDB 1.0.1 plugin in MySQL 5.1
  • Patch to enable/disable IO to InnoDB files for MySQL 5.0.37
  • Patch to use pthread_mutex_t instead of mutex_t for rw_lock_struct::mutex in InnoDB
  • Patch for global transaction IDs and binlog event checksums stand-alone patch] extracted out of the big V3 patch and ported to mysql-5.0.68 as of 

Feedback, Problems and Comments

Use the deprecated Google group.

Disclaimer

We have changed a lot of code. Not all of the changes are described here and some of the changes to default behavior from new my.cnf variables can break your applications. Unless your name rhymes with Domas, it might be better to take pieces of the patch rather than try to use all of it.

The code has been tested on 32-bit and 64-bit Linux x86. We may have broken the build for other platforms.

The embedded server, *--with-embedded-server*, cannot be built with these changes. We have broken the build for it.

Many of the Makefile.in and Makefile.am files have been changed in the big patch because we changed InnoDB to use the top-level configure.

If you try to install the big patch, treat it like [http://dev.mysql.com/doc/refman/5.0/en/installing-source-tree.html installing from a source tree].

Authors

Many people contributed to this:
  • Wei Li
  • Gene Pang
  • Eric Rollins
  • Ben Handy
  • Justin Tolmer
  • Larry Zhou
  • Yuan Wei
  • Robert Banz
  • Chip Turner
  • Steve Gunn
  • Mark Callaghan

The v2 patch

This has many new features and a few non-features. Embedded MySQL will not work with this patch.
  • SqlChanges
  • SemiSyncReplication
  • InnodbSmp
  • NewShowStatus
  • NewShowInnodbStatus
  • NewConfiguration
  • UserTableMonitoring
  • TransactionalReplication
  • MysqlRoles
  • MysqlRateLimiting
  • MoreLogging
  • InnodbAsyncIo
  • FastMasterPromotion
  • MirroredBinlogs
  • InnodbSampling
  • NewSqlFunctions
  • InnodbStatus
  • LosslessFloatDump
  • MysqlHttp
  • InnodbIoTuning
  • MutexContentionStats
  • FastMutexes
  • InnodbFreeze

The v3 patch

This has many new features and a few non-features. Embedded MySQL will not work with this patch. Also, I generated the patch after running 'make distclean' so there are some files that must be regenerated after this patch is applied, including sql_yacc.cc and sql_yacc.h. By doing this, the patch diff is smaller but maybe a bit confusing. Also, I did not update any of the files in libmysqld/ that are copied from sql/.
  • GlobalTransactionIds
  • OnlineDataDrift
  • BatchKeyAccess
  • InnodbMutexContention2
  • BinlogEventChecksums

The v4 patch

This makes InnoDB much faster on IO bound workloads and fixes bugs in new features.
  • InnodbIoPerformance

Not yet released
  • MysqlThreadPool

Historical - Make User Delayed

This post was shared on code.google.com many years ago but code.google has been shutdown. It describes work done by my team at Google. I am interested in the history of technology and with some spare time have been enable to republish it.

I added support to rate limit DBMS accounts that were too busy. It wasn't successful in production for the obvious reason that it just shifts the convoy from the database to the app server -- the problem still exists. The better solution is to fix the application or improve DBMS capacity but that takes time.

This describes SQL commands added to rate limit queries per account and per client IP.

Per account rate limiting

Per-account query delays use new SQL commands to set a query delay for an account. The delay is the number of milliseconds to sleep before running a SQL statement for the account. These values are transient and all reset to zero delay on server restart. The values are set by the command MAKE USER 'user' DELAYED 100 where the literals user and 100 are the account and number of milliseconds to sleep. There is no delay when the value is 0. The values are displayed by the command SHOW DELAYED USER.

MySQL had a feature to limit the number of queries per hour for an account. This is done by setting the _user.max_questions_ column for the account. We have changed this to be the max queries per minute so that when an account reaches the limit, it doesn't have to wait for an hour for the reset.

These don't change the behavior for existing connections. There must be a reconnect to get the new values.

Per client IP rate limiting

Per-client rate limiting is done by the command MAKE CLIENT 'IP-address' DELAYED 100 where the literal IP-address is the exact match for the client IP that should be delayed and 100 is the number of milliseconds to delay each statement. The delays are displayed by the command SHOW DELAYED CLIENT.

Historical - Adding Roles to MySQL

This post was shared on code.google.com many years ago but code.google has been shutdown. It describes work done by my team at Google. I am interested in the history of technology and with some spare time have been enable to republish it.

I added support for roles to MySQL circa 2008. They arrived upstream with MySQL 8 in 2018. I wasn't able to wait. I enjoyed the project more than expected. It wasn't hard in terms of algorithms or performance but I had to avoid mistakes to avoid security bugs and the upstream code was well written. I had a similar experience implementing BINARY_FLOAT and BINARY_DOUBLE at Oracle. There I got to learn about the IEEE754 standard and had to go out of my way to catch all of the corner cases. Plus I enjoyed working with Minghui Yang who did the PL/SQL part of it.

MySQL roles and mapped users

The access control model in MySQL does not scale for a deployment with thousands of accounts and thousands of tables. The problems are that similar privileges are specified for many accounts and that the only way to limit an account from accessing a table is to grant privileges at the table or column level in which case the mysql.user table has millions of entries.

Privileges may be associated once with a role, and then many accounts may be mapped to that role. When many accounts have the same privileges, this avoids the need to specify the privileges for each account.

We have implemented mapped users in the MySQL access control model. These are used to simulate roles and solve one of these problems. A mapped user provides authentication credentials and is mapped to a _role_ for access control. A new table, mysql.mapped_user, has been added to define mapped users. Entries in an existing table, mysql.user, are reused for roles when there are entries from mysql.mapped_user that reference them.

To avoid confusion:
  • mapped user - one row in mysql.mapped_user
  • role - one row in mysql.user referenced by at least one row in mysql.mapped_user

This provides several features:
  • multiple passwords per account
  • manual password expiration
  • roles
  • transparent to users (mysql -uuser -ppassword works regardless of whether authentication is done using entries in mysql.mapped_user or mysql.user)

Use Case

Create a role account in mysql.user. Create thousands of private accounts in mysql.mapped_user that map to the role. By map to I mean that the value of mysql.mapped_user.Role is the account name for the role.

Implementation

Authentication in MySQL is implemented using the _mysql.user_ table. mysqld sorts these entries and when a connection is attempted, the first entry in the sorted list that matches the account name and hostname/IP of the client is used for authentication. A challenge response protocol is done using the password hash for that entry.
A new table is added to support mapped users. This table does not have columns for privileges. Instead, each row references an account name from mysql.user that provides the privileges. The new table has a subset of the columns from mysql.user:
  • User - the name for this mapped user
  • Role - the name of the account in mysql.user from which this account gets its privileges
  • Password - the password hash for authenticating a connection
  • PasswordChanged - the timestamp when this entry was last updated or created. This is intended to support manual password expiration via a script that deletes all entries where PasswordChanged less than the cutoff.
  • ssl_type, ssl_cipher, x509_issuer, x509_subject - values for SSL authentication, note that code has yet to be added in the server to handle these values

DDL for the new table:
CREATE TABLE mapped_user (
  User char(16) binary DEFAULT '' NOT NULL,
  Role char(16) binary DEFAULT '' NOT NULL,
  Password char(41) character set latin1 collate latin1_bin DEFAULT '' NOT NULL,
  PasswordChanged Timestamp DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP NOT NULL,
  ssl_type enum('','ANY','X509','SPECIFIED') character set utf8 NOT NULL default '',
  ssl_cipher blob NOT NULL,
  x509_issuer blob NOT NULL,
  x509_subject blob NOT NULL,
  PRIMARY KEY (User, Role, Password)
) engine=MyISAM
CHARACTER SET utf8 COLLATE utf8_bin
comment='Mapped users';

Authentication

Entries from mysql.mapped_user are used to authenticate connection attempts only when authentication fails with entries in mysql.user. The failure may have occurred because there was no entry in mysql.user for the user/host or because the password was wrong. If authentication succeeds using an entry in mysql.mapped_user, the mysql.mapped_user.Role column in that entry and the client's hostname/IP are used to search mysql.user for a matching entry. And if one is found, that entry provides the privileges for the connection. By provides the privileges I mean that:
  • the values of mysql.user.User and mysql.user.Host are used to search the other privilege tables
  • the global privileges stored in mysql.user for the matching entry are used

The mysql.mapped_user table supports multiple passwords per account. When a user tries to create a connection with a username that is in the mysql.mapped_user table and there are multiple entries with a matching value in mysql.mapped_user.User, then authentication is attempted for one entry at a time using the password hash in mysql.mapped_user.Password until authentication succeeds or there are no more entries. Note that the order in which the entries from mysql.mapped_user are checked is *not* defined, but this is only an issue when there are entries in mysql.mapped_user with the same value for _User_ and different values for _Role_ and that deployment model should not be used. Also note that this does not require additional RPCs during client authentication.

Entries are ignored from mysql.mapped_user when:
  • Role is the empty string
  • User is the empty string
  • Password is the empty string

There is no constraint between the values in mysql.mapped_user.User and mysql.user.User.  Thus, a bogus mapping (Role references an account that does not exist in mysql.user) can be created. In that case, the entry in mysql.mapped_user cannot be used to create connections and will get access denied errors.

There is a primary key index on mysql.mapped_user, but that is not sufficient to enforce all of the integrity constraints that are needed. Entries with the same values for User and Role but different passwords are allowed, and the primary key forces the password to be different. Entries with the same value for User but different values for _Role_ should not be allowed. However, this can only be enforced with a check constraint on the table and MySQL does not enforce check constraints. We can write a tool to find such entries.

SQL Interfaces

Roles can be added via the _create mapped user_ command that is similar to create user but extended to support options for SSL connections. Roles can be dropped by the drop mapped user command that is similar to drop user. These commands update internal data structures and update the mysql.mapped_user table. There is no need to run flush privileges with these commands.

The following have been changed to print the value of mysql.mapped_user.User rather than the value of mysql.user.User when a role is used to create a connection.
  • error messages related to access control
  • select current_user()
  • select user()
  • show user_statistics
  • show processlist

The output of show grants has not been changed and will display the privileges for the role (the entry in _mysql.user).

set password = password(STRING)_ fails for accounts that use a role. The only way to change a password for an entry in mysql.mapped_user is by an insert statement.

how processlist with roles displays the role for connections from mapped users rather than the mapped user name. show processlist displays the value from mysql.mapped_user.

show user_statistics with roles displays statistics aggregated by role for connections from mapped users. show user_statistics displays values aggregated by the value from mysql.mapped_user.

Mapped users can be created by inserting into mysql.mapped_user and then running FLUSH PRIVILEGES. They are also created by the _create mapped user_ command. An example is create mapped user mapped_readonly identified by 'password' role readonly.

Mapped users can be dropped by deleting from mysql.mapped_user and then running FLUSH PRIVILEGES. They are also dropped by the _drop mapped user_ command. An example is *drop mapped user foo*. This drops all entries from mysql.mapped_user with that user name. A delete statement must be used to drop an entry matching either (username, role) or (username, role, password).

select user() displays the value of the mapped user name when connected as a mapped user. select current_user() displays the value of the role when connected as a mapped user. This is done because current_user() is defined to return the name of the account used for access control.

make user delayed is done on the value of the account name. It does not matter whether the account is listed in mysql.user or mysql.mapped_user.

mysql.mapped_user does not have columns for resource limits such as max connections and max QPS. Limits are enforced per role.

This feature is only supported when the configuration variable mapped_users is used (add to /etc/my.cnf). This feature is disabled by default. Also, the mysql.mapped_user table must exist. This table does not exist in our current deployment. It must be created before the feature is enabled. The scripts provided by MySQL to create the system databases will create the table, but we do not use those scripts frequently.

The value of the mysql.user.Host column applies to any mapped users trying to create a connection. This can be used to restrict clients to connect from prod or corp hosts.

Open Requests
  • Add a unique index on (User, Password)
  • Add an email column to mysql.mapped_user
  • Inherit limits (hostname/IP address from which connections are allowed, connection limits, max queries per minute limit) from the mysql.user table.
  • Implement support for SSL -- the mysql.mapped_user table has columns for SSL authentication. Code has not been added to the server to handle them.


Historical - changes to my.cnf

This post was shared on code.google.com many years ago but code.google has been shutdown. It describes work done by my team at Google. I am interested in the history of technology and with some spare time have been enable to republish it.

TODO - find the linked pages including:
  • MysqlHttp - we added an http server to mysql for exporting monitoring. This was work by Nick Burrett
  • InnodbAsyncIo - this explains perf improvements we made for InnoDB
  • InnoDbIoTuning - explains more perf improvements we made for InnoDB

We added these options:
  • http_enable - start the embedded HTTP demon when ON, see MysqlHttp
  • http_port - port on which HTTP listens, see MysqlHttp
  • innodb_max_merged_io - max number of IO requests merged into one large request by a background IO thread
  • innodb_read_io_threads, innodb_write_io_threads - number of background IO threads for prefetch reads and dirty page writes, see InnodbAsyncIo
  • show_command_compatible_mysql4 - make output from some SHOW commands match that used by MySQL4
  • show_default_global - make SHOW STATUS use global statistics
  • global_status_update_interval - the interval at which per-thread stats are read for SHOW STATUS. When SHOW STATUS is run more frequently cached values are used rather than locking and reading data from each thread.
  • google_profile[=name] - enable profiling using Google Perftools and write output to this file. Server must have been compiled to use Google Perftools.
  • equality_propagation - enables use of equality propagation in the optimizer because the overhead was too much in a few releases (bug filed & fixed)
  • trim_trailing_blanks - trim trailing blanks on varchar fields when set
  • allow_view_trigger_sp_subquery - allow use of views, triggers, stored procedures and subqueries when set
  • allow_delayed_write - allow use of delayed insert and replace statements
  • local-infile-needs-file - LOAD DATA LOCAL INFILE requires the FILE privilege when set  
  • audit_log[=name] - log logins, queries against specified tables, and startup
  • audit_log_tables=name - log queries that use these tables to the audit log (comma separated)
  • log_root - log DML done by users with the SUPER privilege
  • repl_port[=#] - extra port on which mysqld listens for connections from users with SUPER and replication privileges
  • rpl_always_reconnect_on_error - slave IO thread always tries to reconnect on error when set
  • rpl_always_enter_innodb - slave SQL thread always enter innodb when set regardless of innodb concurrency ticket count
  • rpl_event_buffer_size=# - size of the per-connection buffer used on the master to copy events to a slave. Avoids allocating/deallocating a buffer for each event.
  • reserved_super_connections=# - number of reserved connections for users with SUPER privileges.
  • rpl_always_begin_event - always add a BEGIN event at the beginning of each transaction block written to the binlog. This fixes a bug.
  • rpl_semi_sync_enabled - enable semisync replication on a master
  • rpl_semi_sync_slave_enabled - semisync replication on a slave
  • rpl_semi_sync_timeout - timeout in milliseconds for semisync replication in the master
  • rpl_semi_sync_trace_level - trace level for debugging for semisync replication
  • rpl_transaction_enabled - use transactional replication on a slave
  • innodb_crash_if_init_fails - crash if InnoDB initialization fails
  • innodb_io_capacity - number of disk IOPs the server can do, see InnodbIoTuning
  • innodb_extra_dirty_writes - flush dirty buffer pages when dirty pct is less than max dirty pct
  • connect_must_have_super - only connections with SUPER_ACL, REPL_SLAVE_ACL or REPL_CLIENT_ACL are accepted (yes, this is dynamic)
  • readonly_databases - prevents writes to any DB except for mysql
  • readonly_mysql - prevents writes to mysql DB will fail.
  • fixup_binlog_end_pos - fix for MySQL bug 23171 which updates the end_log_pos of  binlog events as they are written to the  bin log
  • log_slave_connects - log connect and disconnect messages for replication slaves
  • mapped_users - use the mapped_user table to map users to roles
  • xa_enabled - enable support for XA transactions (I like to disable this)