Monday, July 17, 2023

my.cnf options that influence create index performance

I measured the time to create indexes for InnoDB and MyRocks using data and schemas from the insert benchmark.

tl;dr

  • For InnoDB changing the my.cnf options was more significant when the secondary indexes were created via one alter table statement
  • Increasing innodb_ddl_threads helps, up to a point, when all indexes are created at once
  • Increasing innodb_ddl_buffer_size helps, up to a point, when all indexes are created at once
  • Increasing rocksdb_merge_combine_read_size didn't help

Benchmarks

All of my scripts and output files are here and the spreadsheet with all results is here

The table was created with a PK index, then loaded with N rows (N=1B, 2B, 4B, 8B) and then I measured the time to create three secondary indexes. The secondary indexes were created using two methods:

  • one at a time - create each index separately (three alter table statements)
  • three at once - create all indexes at once (one alter table statements)
Example my.cnf files are here for InnoDB and for MyRocks. I used upstream 8.0.32 for InnoDB and FB MySQL 8.0.33 for MyRocks. The goal was to understand the impact of innodb_ddl_threads, innodb_ddl_buffer_size and rocksdb_merge_combine_read_size.

The server has 80 cores with hyperthreads enabled, 256G of RAM and fast storage.

To understand the impact of innodb_ddl_threads I measured the time to create the secondary indexes with it set to 1, 2, 4, 8, 12, 16 and 32 while innodb_ddl_buffer_size was fixed at 2G. Assuming innodb_ddl_buffer_size is the per-thread memory limit then this means that with a larger value of innodb_ddl_threads there was more memory used. Regardless, when indexes were created one at a time the value of innodb_ddl_threads has little impact. But when all indexes were created via one alter table statement there was an improvement up to innodb_ddl_threads=8.

To understand the impact of innodb_ddl_buffer_size I measured the time to create secondary indexes with it set to 32M, 64M, 128M, 256M, 512M, 1G and 2G while innodb_ddl_threads was fixed at 4. When indexes were created one at a time the value of innodb_ddl_buffer_size has a small impact. But when all indexes were created via one alter table statement the impact was larger. Data is missing for 64m at 8B because the create index statement failed.

The value of rocksdb_merge_combine_read_size has little impact on the time to create indexes. The default is 1G and I frequently reduce it to reduce peak RSS for mysqld (and avoid OOM).



No comments:

Post a Comment

Evaluating vector indexes in MariaDB and pgvector: part 2

This post has results from the ann-benchmarks with the   fashion-mnist-784-euclidean  dataset for MariaDB and Postgres (pgvector) with conc...