Tuesday, June 28, 2022

Setting up a server on GCP

This is mostly a note to myself to explain what I do to set up a server on GCP for database benchmarks.

Create the instance

  1. Confirm that quota limits have not been reached on the Quotas page.
  2. Go to the VM instances page and click on Create Instance.
  3. Edit the instance name.
  4. Edit the region (us-west1 for me).
  5. Choose the instance type. Click on Compute Optimized, select the c2 series, and then for Machine Type select c2-standard-60.
  6. Disable hyperthreading to reduce benchmark variance. Click on CPU Platform and GPU, click on vCPUs to core ratio and choose 1 vCPU per core.
  7. Scroll down to Boot disk and click on Change. Click on Operating System and select Ubuntu. Click on Version and select Ubuntu 22.04 LTS. Don't change Boot disk type (the default is Balanced persistent disk). Change Size (GB) to 100. Then click on Select.
  8. Scroll down to Identity and API access and select Allow full access to all Cloud APIs. This enables read and write access to Cloud Object Storage buckets where I upload benchmark results and download binaries and other test files. If you forget to do this, you can stop the server, change the setting and continue.
  9. Scroll down to Networking, Disks and ... then click on Disks, then click on Add New Disk. Change the disk name (I use $instance-name + "db"). Change Disk type to SSD Persistent Disk. Change the Size: I use 1000 GB for cached workloads and 3000 GB for IO-bound workloads. Then scroll down and for Deletion rule select Delete disk. If you forget to do this then you will continue to rent the storage after deleting the VM, and you can visit the Disks page to delete it.
  10. Scroll down and click on Create. At this point you will return to the VM Instances page while the instance is started.
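The console steps above can also be expressed as a single gcloud command. This is a sketch rather than exactly what I run -- the instance name, zone, and disk name are placeholders, and the flags should be checked against the current gcloud release:

gcloud compute instances create my-bench-server \
    --zone=us-west1-b \
    --machine-type=c2-standard-60 \
    --threads-per-core=1 \
    --image-family=ubuntu-2204-lts --image-project=ubuntu-os-cloud \
    --boot-disk-size=100GB --boot-disk-type=pd-balanced \
    --scopes=cloud-platform \
    --create-disk=name=my-bench-server-db,size=1000GB,type=pd-ssd,auto-delete=yes

Here --threads-per-core=1 disables hyperthreading, --scopes=cloud-platform corresponds to Allow full access to all Cloud APIs, and auto-delete=yes matches the Delete disk deletion rule.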

Prepare the instance

  1. From the VM Instances page find the entry for the instance and click on the arrow under the Connect column for that instance. Select View gcloud command and copy the command line (an example is sketched after this list). This assumes you have installed the Google Cloud SDK on your laptop.
  2. Clone the RocksDB repo (optional): git clone https://github.com/facebook/rocksdb.git
  3. Install Ubuntu updates, then install packages. Some of the packages are only there in case I want to build RocksDB:
    • sudo apt-get update; sudo apt-get upgrade
    • sudo apt install -y numactl fio sysstat
    • sudo apt install -y libgflags-dev libsnappy-dev zlib1g-dev liblz4-dev libzstd-dev
    • sudo apt install -y gcc g++ default-jdk make libjemalloc-dev
  4. Setup the filesystem for the cloud block storage
    • sudo mkfs.xfs /dev/sdb; sudo mkdir /data; sudo mount -o discard,defaults /dev/sdb /data; sudo chown mcallaghan /data; df -h | grep data; mkdir -p /data/m/rx
    • I am mcallaghan; you might not be, so edit the chown command accordingly
    • I use /data/m/rx as the database directory
    • If you reboot the host, then you must do: sudo mount -o discard,defaults /dev/sdb /data
  5. sudo reboot now -- in case a new kernel arrived
  6. Things to do after each reboot (a script combining these is sketched after this list):
    • sudo mount -o discard,defaults /dev/sdb /data
    • ulimit -n 70000 -- I have wasted many hours by forgetting to do this. RocksDB likes to have far more open file descriptors than the default limit of 1024.
    • screen -S me -- or use tmux. This is useful for long-running benchmark commands.
    • The default behavior for systemd is to remove your files from /dev/shm when you log out, even if a screen session is still running as you -- see the logind.conf documentation for RemoveIPC. This removes files that Postgres needs. To avoid that:
      1. add RemoveIPC=no to /etc/systemd/logind.conf
      2. sudo systemctl restart systemd-logind.service
  7. Run your benchmarks
    • I usually archive the db_bench binaries in an object storage bucket, so I copy them from that bucket onto the host (see the gsutil example after this list)
    • Since the RocksDB repo was cloned above, I can cd to $rocksdb_root/tools to find benchmark_compare.sh and benchmark.sh
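For step 1 above the copied connect command looks roughly like this, with a placeholder project, zone, and instance name:

gcloud compute ssh --project=my-project --zone=us-west1-b my-bench-server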
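For step 6, a small script can combine the per-reboot steps so that none are forgotten. A sketch, assuming the /dev/sdb device and /data mountpoint from step 4:

#!/usr/bin/env bash
# post-reboot.sh -- per-reboot setup for the benchmark host
set -e

# Remount the database disk unless it is already mounted
mountpoint -q /data || sudo mount -o discard,defaults /dev/sdb /data

# RocksDB wants far more open file descriptors than the default limit of 1024
ulimit -n 70000

# Long-running benchmark commands survive a lost ssh connection inside screen
screen -S me

Because screen is started by the script, shells inside the session inherit the raised file descriptor limit; raising it in a different shell after the fact would not help the benchmark.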
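For step 7, the copy from object storage can be done with gsutil; the bucket name and paths here are placeholders:

gsutil -m cp -r gs://my-bench-bucket/db_bench /data/m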
Try fio

I am trying this out: as a first step, characterize IO read performance with fio:
sudo fio --filename=/dev/sdb --direct=1 --rw=randread \
    --bs=4k --ioengine=libaio --iodepth=256 --runtime=300 \
    --numjobs=8 --time_based --group_reporting \
    --name=iops-test-job --eta-newline=1 --eta-interval=1 \
    --readonly --eta=always >& o.fio.randread.4k.8t

sudo fio --filename=/dev/sdb --direct=1 --rw=randread \
    --bs=1m --ioengine=libaio --iodepth=1 --runtime=300 \
    --numjobs=8 --time_based --group_reporting \
    --name=iops-test-job --eta-newline=1 --eta-interval=1 \
    --readonly --eta=always >& o.fio.randread.1m.8t

For both disk sizes (1000G and 3000G of block storage) I get ~30k IOPS with 4k reads. For the 1M reads I get ~480 MB/s with 1000G of storage and ~12 GB/s with 3000G of storage. The gap for large reads is expected because persistent disk performance scales with the provisioned size.
