Tuesday, June 28, 2022

Setting up a server on GCP

This is mostly a note to myself to explain what I do to set up a server on GCP for database benchmarks.

Create the instance

  1. Confirm that quota limits have not been reached on the Quotas page.
  2. Go to the VM instances page and click on Create Instance.
  3. Edit the instance name.
  4. Edit the region (us-west1 for me).
  5. Choose the instance type. Click on Compute Optimized, select the c2 series, and for Machine Type select c2-standard-60.
  6. Disable hyperthreading to reduce benchmark variance. Click on CPU Platform and GPU, click on vCPUs to core ratio and choose 1 vCPU per core.
  7. Scroll down to Boot disk and click on Change. Click on Operating System and select Ubuntu. Click on Version and select Ubuntu 22.04 LTS. Don't change Boot disk type (the default is Balanced persistent disk). Change Size (GB) to 100. Then click on Select.
  8. Scroll down to Identity and API access and select Allow full access to all Cloud APIs. This enables read and write access to Cloud Object Storage buckets where I upload benchmark results and download binaries and other test files. If you forget to do this, you can stop the server, change the setting and continue.
  9. Scroll down to Networking, Disks and ... then click on Disks, then click on Add New Disks. Change the disk name (I use $instance-name + "db"). Change Disk type to SSD Persistent Disk. Change the Size. I use 1000 GB for cached workloads and 3000 GB for IO-bound workloads. Then scroll down and for Deletion rule select Delete disk. If you forget to do this, then you will continue to rent the storage after deleting the VM and will have to visit the Disks page to delete it.
  10. Scroll down and click on Create. At this point you will return to the VM instances page while the instance is started. A rough gcloud equivalent of these steps is sketched below.
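
For reference, here is a rough gcloud equivalent of the steps above. The instance and disk names are hypothetical, and the flags should be confirmed against the docs for your version of the SDK; the console flow above is what I actually use:

gcloud compute instances create my-server \
    --zone=us-west1-b \
    --machine-type=c2-standard-60 \
    --threads-per-core=1 \
    --image-family=ubuntu-2204-lts \
    --image-project=ubuntu-os-cloud \
    --boot-disk-size=100GB \
    --boot-disk-type=pd-balanced \
    --scopes=cloud-platform \
    --create-disk=name=my-server-db,size=1000GB,type=pd-ssd,auto-delete=yes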

Prepare the instance

  1. From the VM instances page find the entry for the instance and click on the arrow under the Connect column for that instance. Select View gcloud command and copy the command line. This assumes you have installed the Google Cloud SDK on your laptop. An example of the copied command is in the sketch after this list.
  2. Clone the RocksDB repo (optional): git clone https://github.com/facebook/rocksdb.git
  3. Install Ubuntu updates, then install packages. Some of the packages are only there in case I want to build RocksDB.
    • sudo apt-get update; sudo apt-get upgrade
    • sudo apt install -y numactl fio sysstat
    • sudo apt install -y libgflags-dev libsnappy-dev zlib1g-dev liblz4-dev libzstd-dev
    • sudo apt install -y gcc g++ default-jdk make libjemalloc-dev
  4. Setup the filesystem for the cloud block storage
    • sudo mkfs.xfs /dev/sdb; sudo mkdir /data; sudo mount -o discard,defaults /dev/sdb /data; sudo chown mcallaghan /data; df -h | grep data; mkdir -p /data/m/rx
    • I am mcallaghan; you probably are not, so edit the chown command accordingly
    • I use /data/m/rx as the database directory
    • If you reboot the host, then you must do: sudo mount -o discard,defaults /dev/sdb /data
  5. sudo reboot now -- in case a new kernel arrived
  6. Things to do after each reboot (a combined sketch follows this list):
    • sudo mount -o discard,defaults /dev/sdb /data
    • ulimit -n 70000 -- I have wasted many hours by forgetting to do this. RocksDB likes to have far more than 1024 file descriptors open, and 1024 is the default.
    • screen -S me -- or use tmux. This is useful for long running benchmark commands
    • The default behavior for systemd is to remove your files from /dev/shm when you log out, even if a screen session is still running as you (see the RemoveIPC section in the logind.conf documentation). This removes files that Postgres needs. To avoid that:
      1. add RemoveIPC=no to /etc/systemd/logind.conf
      2. sudo systemctl restart systemd-logind.service
  7. Run your benchmarks.
    • I usually archive the db_bench binaries into an object storage bucket, so I copy that bucket onto the host.
    • Since the RocksDB repo was cloned above, I can cd to $rocksdb_root/tools to find benchmark_compare.sh and benchmark.sh. A hypothetical invocation of benchmark.sh is sketched below.
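
The gcloud connect command copied in step 1 looks roughly like this, with hypothetical zone, instance and project names:

gcloud compute ssh --zone us-west1-b my-server --project my-project

Here is a minimal sketch that bundles the after-each-reboot steps from step 6. Source it rather than execute it so that the ulimit applies to your login shell; the mount point, username and session name match the examples above:

# post-reboot.sh (hypothetical name) -- run as: source post-reboot.sh
# Remount the database disk; the mount does not persist across reboots
# unless an entry is added to /etc/fstab.
sudo mount -o discard,defaults /dev/sdb /data
# Raise the file descriptor limit for this shell. RocksDB needs far
# more than the default of 1024 open file descriptors.
ulimit -n 70000
# Start a screen session for long running benchmark commands.
screen -S me

Finally, benchmark.sh is driven by environment variables. The key count, thread count and test name below are hypothetical; check the comments at the top of the script for the variables it requires:

DB_DIR=/data/m/rx WAL_DIR=/data/m/rx NUM_KEYS=20000000 NUM_THREADS=16 \
    bash benchmark.sh readrandom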

Try fio

I am trying this out as my first step: characterize IO read performance with fio.
sudo fio --filename=/dev/sdb --direct=1 --rw=randread \
    --bs=4k --ioengine=libaio --iodepth=256 --runtime=300 \
    --numjobs=8 --time_based --group_reporting \
    --name=iops-test-job --eta-newline=1 --eta-interval=1 \
    --readonly --eta=always >& o.fio.randread.4k.8t

sudo fio --filename=/dev/sdb --direct=1 --rw=randread \
    --bs=1m --ioengine=libaio --iodepth=1 --runtime=300 \
    --numjobs=8 --time_based --group_reporting \
    --name=iops-test-job --eta-newline=1 --eta-interval=1 \
    --readonly --eta=always >& o.fio.randread.1m.8t

For both instances (1000G or 3000G of block storage) I get ~30K IOPS with 4KB reads. For the 1MB reads I get ~480 MB/s with 1000G of storage and ~12 GB/s with 3000G of storage.
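
To pull the summary numbers out of those output files, assuming fio's default human-readable output format:

grep -E 'IOPS=|BW=' o.fio.randread.4k.8t o.fio.randread.1m.8t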
