Small Datum: Concurrent, read-only & cached: MongoDB, TokuMX, MySQL

Tuesday, April 22, 2014

Concurrent, read-only & cached: MongoDB, TokuMX, MySQL

This has results for a read-only workload where all data is cached. The test query fetches all columns in one doucment/row by PK. For InnoDB all data is in the buffer pool. For TokuMX and MongoDB all data is in the OS filesystem cache and accessed via mmap'd files. The test server has 40 CPU cores with HT enabled and the test clients share the host with mysqld/mongod to reduce variance from network latency. This was similar to a previous test, except the database is in cache and the test host has more CPU cores. The summary of my results is:

MongoDB 2.6 has a performance regression from using more CPU per query. The regression might be limited to simple queries that do single row lookups on the _id index. I spent a bit of time rediscovering how to get hierarchical CPU profile data from gperftools to explain this. JIRAs 13663 and 13685 are open for this.
MySQL gets much more QPS at high concurrency than MongoDB and TokuMX
MySQL gets more QPS using the HANDLER interface than SELECT. I expect the InnoDB memcached API to be even faster than HANDLER but did not test it.
MySQL uses more CPU per query in 5.7.4 than in 5.6.12 but this didn't have an impact on QPS

Setup

The test was repeated for 1, 2, 4, 8, 16, 32 and 40 concurrent clients. It uses my forked versions of the MongoDB and C clients for sysbench. There are 8 collections/tables in one database. Each table has 400M rows but queries are limited to the first 1M. I don't know yet whether using a database per collection would improve the MongoDB results. Each query fetches all columns in one document/row by PK. I have yet to push my changes to the MongoDB sysbench client to make it fetch all columns. I tested these binaries:

fb56.handler - MySQL 5.6.12 with the Facebook patch and 8k pages. Uses HANDLER to fetch data.
fb56.sql - MySQL 5.6.12 with the Facebook patch and 8k pages. Uses SELECT to fetch data.
orig57.handler - MySQL 5.7.4 without the Facebook patch and 8k pages. Uses HANDLER to fetch data.
orig57.sql - MySQL 5.7.4 without the Facebook patch and 8k pages. Uses SELECT to fetch data.
tokumx - TokuMX 1.4.1 using quicklz and 32kb pages. There should be no decompression during the test as all data used by the test (1M documents) is much smaller than 50% of RAM.
mongo24 - MongoDB 2.4.9
mongo26 - MongoDB 2.6.0

Results

At last I included a graph. I have been reluctant to include graphs on previous posts comparing MongoDB, TokuMX and MySQL because I want to avoid benchmarketing and drive-by analysis. These tests have been time consuming to run and document and I don't want to make it too easy to misinterpret the results. Results for MySQL 5.7.4 are not in the graph to make it easier to read. The top two bars (blue & red) are for MySQL and you can see that QPS increases with more concurrency. QPS for MongoDB and TokuMX saturates at a lower level of concurrency.

Numbers used for the graph above.

point queries per second
1 2 4 8 16 32 40 clients
17864 32397 60294 106374 184566 298276 350665 fb56.handler
11730 22884 39646 73485 131533 215487 249402 fb56.sql
18161 33262 59413 107505 185894 306084 371045 orig57.handler
11775 21838 40528 75322 135331 227450 266917 orig57.sql
14298 25219 45743 83214 142489 168498 161840 tokumx
17203 30158 52476 94705 161922 174453 170177 mongo24
10705 19502 34318 61977 109684 152667 151555 mongo26

Analysis

I used vmstat to measure the average CPU utilization (user + system) during the test. The numbers below are: (CPU_utilization / QPS) * 1,000,000. There are some interesting details.

the values are larger for MySQL 5.7 than for 5.6 at low concurrency. Note that in both cases the performance schema was disabled at compile time.
the values are much larger for MongoDB 2.6 than for 2.4 and hopefully this can be fixed via JIRAs 13663 and 13685.

(CPU_utilization / QPS) * 1,000,000

1 2 4 8 16 32 40 clients

218 197 197 208 216 251 268 fb56.handler

323 310 287 298 304 352 372 fb56.sql

357 279 240 216 215 248 250 orig57.handler

407 380 313 288 302 342 359 orig57.sql

272 269 251 254 266 302 296 tokumx

232 215 219 225 234 257 252 mongo24

373 333 340 342 355 425 422 mongo26

I also used vmstat to measure the context switch rate and the table below lists the number of context switches per query. Note that the rate decreases with concurrency for MySQL but not for MongoDB and TokuMX. I don't know enough about Linux internals to interpret this.

vmstat.cs / QPS

context switch per query

1 2 4 8 16 32 40 clients

4.44 4.14 4.01 3.79 3.47 3.05 2.19 fb56.handler

4.61 4.32 4.03 3.84 3.59 3.23 2.65 fb56.sql

4.53 4.27 4.07 3.88 3.52 3.08 2.20 orig57.handler

4.81 4.48 4.19 3.96 3.63 3.07 2.19 orig57.sql

4.59 4.30 4.08 3.87 3.77 4.32 4.32 tokumx

4.54 4.23 4.03 3.84 3.79 4.29 4.30 mongo24

4.80 4.43 4.21 3.99 3.93 4.58 4.63 mongo26

Small Datum

Tuesday, April 22, 2014

Concurrent, read-only & cached: MongoDB, TokuMX, MySQL

Setup

Results

Analysis

No comments:

Post a Comment

Postgres 18 beta2: large server, Insert Benchmark, part 2