I made parallel create index for InnoDB up to 2.4X faster by doing one of:
- performance_schema=0
- removing this line of code
I filed bug 113505 for this. The problem is that the parallel create index threads all call that function and there is too much contention on the counter that is incremented. The problem wasn't obvious from Linux perf when using the default metric (cycles). I then tried perf stat and IPC (instructions/cycle) was about 2X larger during create index with performance_schema=0. Finally, I repeated the perf measurements using memory system counters (cache-misses, etc) and with perf record, perf report and perf annotate the problem was obvious.
I am still deciding whether this 2X faster bug fix is more exciting then the one I found for Falcon.
I am still deciding whether this 2X faster bug fix is more exciting then the one I found for Falcon.
Results
The results here are from a server with 2 sockets, 12 cores/socket, 64G of RAM running Ubuntu 22.04 and one m.2 device with 2TB and XFS. All tests use MySQL 8.0.35 and all of the my.cnf files are here.
The test is the create index benchmark step from the Insert Benchmark with one client, 80M rows and 3 secondary indexes created in one ALTER statement. With innodb_ddl_threads=N if N=1 then only one thread handles the index create work, otherwise it is done in parallel over N threads -- this is assuming innodb_parallel_read_threads >= innodb_ddl_threads, I am not sure I want to remember how they interact.
I tested my.cnf configs with innodb_ddl_threads and innodb_parallel_read_threads set to =1, =4, =8 and =16. Note that the default is =4. Tests were repeated with performance_schema=1 and =0 and for =1, nothing else was set for the perf schema.
I tested my.cnf configs with innodb_ddl_threads and innodb_parallel_read_threads set to =1, =4, =8 and =16. Note that the default is =4. Tests were repeated with performance_schema=1 and =0 and for =1, nothing else was set for the perf schema.
The following configurations are tested:
- ddl1 - innodb_ddl_threads=1, innodb_parallel_read_threads=1, performance_schema=1
- ddl1+ps0 - like ddl1 but with perfomance_schema=0
- ddl4 - innodb_ddl_threads=4, innodb_parallel_read_threads=4, performance_schema=1
- ddl4+ps0 - like ddl4 but with perfomance_schema=0
- ddl8 - innodb_ddl_threads=8, innodb_parallel_read_threads=8, performance_schema=1
- ddl8+ps0 - like ddl8 but with perfomance_schema=0
- ddl16 - innodb_ddl_threads=16, innodb_parallel_read_threads=16, performance_schema=1
- ddl16+ps0 - like ddl16 but with perfomance_schema=0
The chart shows the time to create indexes for MySQL 8.0.35 using two builds. Note that shorter bars are better here. The as-is build is 8.0.35 unchanged and the fixed build is 8.0.35 with this line removed.
- When performance_schema=0 then there is no difference between the as-is and fixed builds
- When performance_schema=1 then the fixed build is ~2X faster than the as-is build
No comments:
Post a Comment