A simple test to measure CPU per IO

What should I expect with respect to CPU overhead and latency when using the public cloud? I won't name the vendor here because they might have a DeWitt Clause.

Hardware

My server has 16 real cores with HT/SMT disabled, runs Ubuntu 22.04, and uses ext4 in all cases. The two IO setups tested are:

  • local - 2 NVMe devices with SW RAID 0 (a setup sketch follows this list)
  • network - 1TB of fast cloud block storage that is backed by SSD and advertised as targeted for database workloads
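
For the local setup, this is a minimal sketch of how two NVMe devices might be striped with Linux SW RAID 0 and formatted with ext4. The device names and mount point here are assumptions for illustration, not copied from my setup:

# assumed device names and mount point; adjust for your host
sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
sudo mkfs.ext4 /dev/md0
sudo mount /dev/md0 /data/m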
Updates:
  • Fixed a silly mistake in the math for CPU usecs per block read
Benchmark

This uses fio with O_DIRECT to do 4kb block reads. My benchmark script is here; it is run by the command lines below, and I ignore the result of the first run:
for d in 8 16 32 ; do bash run.sh local2_iod${d} /data/m/t.fio io_uring $d 300 512G ; done
for d in 4 8 16 32 ; do bash run.sh network_iod${d} /data2/t.fio io_uring $d 300 900G ; done
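
The script itself is linked above. As a rough sketch, the core of it is one fio invocation per run, something like the following for local at queue depth 8. The exact flags in run.sh may differ and the mapping of script arguments to fio options is my guess:

fio --name=randread --filename=/data/m/t.fio --direct=1 --rw=randread --bs=4k --ioengine=io_uring --iodepth=8 --runtime=300 --time_based --size=512G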

Results

I compute CPU usecs per read as: (((vmstat.us + vmstat.sy) / 100) * 16 * 1M) / IOPS, where
  • vmstat.us, vmstat.sy - the average values of the us (user) and sy (system) columns in vmstat
  • 16 - the number of CPU cores
  • 1M - scales from CPU seconds to CPU microseconds
  • IOPS - the average reads/s (the r/s column) reported by fio
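
As a worked check against the local result below, ~10.16 CPU usecs/read at ~54k reads/s implies a combined us+sy of about 3.43%. That 3.43 is back-derived from the result for illustration, not copied from vmstat output:

# prints 10.16 CPU usecs/read
awk -v cpu=3.43 -v iops=54000 'BEGIN { printf "%.2f CPU usecs/read\n", (cpu / 100) * 16 * 1000000 / iops }'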
With a queue depth of 8:
  • local: ~54k reads/s at ~150 usecs latency and ~10.16 CPU usecs/read
  • network: ~15k reads/s at ~510 usecs latency and ~12.61 CPU usecs/read
At a queue depth of 16 I still get ~15k reads/s from network storage, so that setup is already saturated at a queue depth of 8; I ignore the results for queue depth 16. As a sanity check, the queue depth 8 numbers are consistent with Little's law (concurrency = IOPS x latency): 54k reads/s x 150 usecs is ~8.1 in-flight requests for local and 15k x 510 usecs is ~7.7 for network.

From these results, and others that I have not shared, the CPU overhead per read from using cloud block storage is ~2.5 CPU usecs in absolute terms (12.61 - 10.16) and ~24% in relative terms (2.45 / 10.16). I don't think that is bad.
