Monday, April 25, 2022

clang vs crc32c in RocksDB

By default RocksDB uses crc32c as a block checksum. The source is here and a benchmark is here (db_bench --benchmarks=crc32c). By default RocksDB is compiled with gcc on Linux but it is easy to switch to clang. I did that switch to compare benchmark performance between gcc and clang builds and the initial results were interesting.

The overhead for crc32c is apparent for benchmarks that do a lot of IO from fast storage (fast SSD or the OS page cache).

tl;dr for x86 HW

  • For crc32c
    • throughput with gcc 9.4 is more than 2X that of clang versions 10 through 13
    • the difference drops to 1.36X with clang 14
  • For xxh3
    • clang versions 10 to 13 are ~1.08X faster than gcc 9.4
    • the difference drops to ~1.04X faster for clang 14
  • clang versions 14 and 15 have similar performance, as do clang versions 10 through 13

Update - issue 55153 filed for LLVM

Results

Legend:

  • cc - compiler
  • crc32c, xxhash, xxh3 - db_bench benchmark names
  • others are abbreviated names for db_bench benchmarks: xxh64 = xxhash64, comp = compress, uncomp = uncompress

The numbers in the table are the throughput in MB/s reported by db_bench. The row labeled clang is clang 10.

cc      crc32c  xxhash  xxh64   xxh3    comp    uncomp
gcc     19327   5036    9796    26805   661     5201
clang    8935   5043    9847    29327   658     5185
clang11  8367   5050    9849    29344   660     5184
clang12  7594   5024    9832    28392   658     4858
clang13  7571   5021    9832    29031   659     5234
clang14 14239   5008    9847    27946   660     4770
clang15 14266   5027    9811    28020   660     5170

The variance is not shown here, but the results for the uncompress test varied too much between runs, so I ignore them. Perhaps the test needs to run for more time.
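The ratios quoted in the tl;dr can be recomputed from the table. The following is a minimal sketch that copies the crc32c and xxh3 columns into a shell variable and uses awk to divide against the gcc row. The function name `ratios` and the output labels are made up for this example.

```shell
# Throughput in MB/s copied from the table above: name, crc32c, xxh3.
results='gcc 19327 26805
clang 8935 29327
clang11 8367 29344
clang12 7594 28392
clang13 7571 29031
clang14 14239 27946
clang15 14266 28020'

# The first row (gcc) is the baseline. For each clang row print
# gcc/clang for crc32c and clang/gcc for xxh3.
ratios() {
  printf '%s\n' "$results" | awk '
    NR == 1 { crc = $2; xxh3 = $3; next }
    { printf "%s crc32c_gcc_over_clang=%.2f xxh3_clang_over_gcc=%.2f\n", $1, crc / $2, $3 / xxh3 }'
}

ratios
```

A value above 1 in the first column means gcc is faster for crc32c, and a value above 1 in the second column means clang is faster for xxh3.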

Setup

Most of my tests used an Intel NUC described here. This has Ubuntu 20.04 with gcc 9.4.0 and clang 10.0.0-4ubuntu1. After noticing that gcc 9.4 was much faster than clang 10, I tried clang versions 11, 12, 13 and 14. The scripts here made it easy to install the newer versions of clang.

Other notes:

  • RocksDB is compiled with -O2 and all of the needed flags/includes to get a fast crc32c.
  • From clang --version the versions were 11.1.0, 12.0.1, 13.0.1 and 14.0.1. The clang 15 results in the table are from 15.0.0.
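Before comparing compilers it can help to confirm that the CPU supports the instructions a hardware-accelerated crc32c would use. The sketch below assumes, rather than confirms from the post, that the relevant x86 features are SSE4.2 and PCLMUL, which Linux reports as `sse4_2` and `pclmulqdq` in /proc/cpuinfo. The helper `has_cpu_flag` is a name invented for this example.

```shell
# Return success if the named CPU flag appears in the given cpuinfo file.
# The second argument defaults to /proc/cpuinfo and exists mainly for testing.
has_cpu_flag() {
  grep -qw -- "$1" "${2:-/proc/cpuinfo}" 2> /dev/null
}

# Assumption: sse4_2 and pclmulqdq are the flags the fast crc32c path needs.
for flag in sse4_2 pclmulqdq; do
  if has_cpu_flag "$flag"; then
    echo "$flag: present"
  else
    echo "$flag: missing"
  fi
done
```

If either flag is reported missing, a slow crc32c result would say more about the hardware than about the compiler.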

I then compiled RocksDB using gcc and clang:

# Compile for gcc 9.4
make clean; make DISABLE_WARNING_AS_ERROR=1 DEBUG_LEVEL=0 V=1 VERBOSE=1 -j4 static_lib db_bench; mv db_bench db_bench.gcc.use1

# Compile for clang 10.0.0-4ubuntu1
make clean; CC=/usr/bin/clang CXX=/usr/bin/clang++ USE_CLANG=1 make DISABLE_WARNING_AS_ERROR=1 DEBUG_LEVEL=0 V=1 VERBOSE=1 -j4 static_lib db_bench; mv db_bench db_bench.clang.use1

# Compile for clang versions 11, 12, 13, 14
for v in 11 12 13 14; do make clean; CC=/usr/bin/clang-${v} CXX=/usr/bin/clang++-${v} USE_CLANG=1 make DISABLE_WARNING_AS_ERROR=1 DEBUG_LEVEL=0 V=1 VERBOSE=1 -j4 static_lib db_bench; mv db_bench db_bench.clang${v}.use1 ; done

And then I ran each of the CPU-intensive microbenchmarks 3 times and reported the median result:


for bm in crc32c xxhash xxhash64 xxh3 compress uncompress ; do
  echo; echo $bm; echo
  for x in gcc.use1 clang.use1 clang11.use1 clang12.use1 clang13.use1 clang14.use1 ; do
    echo; echo $x
    for z in 1 2 3; do
      ./db_bench.$x --benchmarks=$bm --stats_per_interval=1 --stats_interval_seconds=600 2> /dev/null | grep ^"$bm"
    done
  done
done
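The loop above prints three result lines per binary and leaves picking the median to the reader. A minimal sketch of automating that step follows. The helpers `mbps` and `median` are names invented for this example, and `mbps` assumes that each db_bench result line contains the throughput as a number followed by " MB/s".

```shell
# Extract the throughput from result lines on stdin.
# Assumption: each line contains "<number> MB/s".
mbps() {
  grep -o '[0-9.][0-9.]* MB/s' | awk '{ print $1 }'
}

# Print the median of the numbers on stdin, one per line.
# For an even count this returns the lower of the two middle values.
median() {
  sort -n | awk '{ v[NR] = $1 } END { if (NR) print v[int((NR + 1) / 2)] }'
}

# Example with three hypothetical readings, not measured values.
printf '%s\n' 19327 19102 19410 | median
```

To use it with the loop, the three grep results for one binary could be piped through `mbps | median` instead of being printed directly.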

Clang versions

$ /usr/bin/clang-11 --version
Ubuntu clang version 11.1.0-++20211011094159+1fdec59bffc1-1~exp1~20211011214622.5
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

$ /usr/bin/clang-12 --version
Ubuntu clang version 12.0.1-++20211029101322+fed41342a82f-1~exp1~20211029221816.4
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

$ /usr/bin/clang-13 --version
Ubuntu clang version 13.0.1-++20220120110924+75e33f71c2da-1~exp1~20220120231001.58
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

$ /usr/bin/clang-14 --version
Ubuntu clang version 14.0.1-++20220423123024+9a3e81e1f91f-1~exp1~20220423003108.124
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

$ /usr/bin/clang-15 --version
Ubuntu clang version 15.0.0-++20220424052740+3f0f20366622-1~exp1~20220424172822.231
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
