Monday, October 24, 2022

Small servers for performance testing, v4

I am setting up my fourth cluster of small servers to test open source database software. Cluster might be an overstatement because each cluster is limited to 2 or 3 servers. The clusters were/are:

  • v1 - Intel NUC5i3ryh (5th gen core i3), 8G RAM, SATA disk for OS, Samsung 850 EVO m.2 for db
  • v2 - Intel NUC7i5bnh (7th gen core i5), 16G RAM, Samsung 850 EVO SATA for OS, Samsung 960 EVO m.2 for db
  • v3 - Intel NUC8i7beh (8th gen core i7), 16G RAM, Samsung 860 EVO SATA for OS, Samsung 970 EVO m.2 for db
  • v4 - Beelink SER 4700u with Ryzen 7 4700u, 16G RAM, WD Blue 1T SATA for OS, Kingston NVMe for db
  • v5 - Beelink SER7 7840HS with Ryzen 7 7840HS, 32G RAM, 1T m.2 SSD for OS, 2T Samsung Pro 990 for the database. This is also my first mini PC with a fan and TDP is 65w which is a bit larger than what came before. Passmark single-thread rating is 3771 vs 2532 for the v4 server.
  • v6 - SuperMicro SuperWorkstation 7049A-T with 2 sockets, 12 cores/socket, 64G RAM, one m.2 SSD (2TB, XFS). CPU is Intel Xeon Silver 4214R CPU @ 2.40GHz
  • v7 - Dell Precision 7865 Tower Workstation with 1 socket, 128G RAM, AMD Ryzen Threadripper PRO 5975WX 32-Cores, 2 m.2 SSD (each 2TB, RAID SW 0, XFS)
More details on previous clusters are here for v1 and v2 and then for v3. I use a separate disk for the OS because I expect the database SSD to wear out and I don't want to reinstall the OS when that happens. Posts on monitoring for endurance are here and here but in the past I have neglected to catch that early enough and some SSDs greatly exceeded their endurance ratings.
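
As an example of the kind of check I mean, something like this shows the wear level (a sketch -- /dev/nvme0n1 and /dev/sda are just examples, and the SATA attribute names vary by vendor):
sudo smartctl -a /dev/nvme0n1 | grep -i 'percentage used'
sudo smartctl -A /dev/sda | grep -i -e wear -e percent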

The CPUs for each cluster:

  • v1 - Intel i3-5010U with 2 cores. I think I left hyperthreading enabled to get 4 HW threads.
  • v2 - Intel i5-7260U with 2 cores. Again, I think I left hyperthreading enabled to get 4 HW threads. But this was about 2X faster on the compile MySQL 8.1 benchmark. Turbo boost was also disabled to reduce performance variance.
  • v3 - Intel i7-8559U with 4 cores and hyperthreading disabled. Turbo boost was also disabled to reduce performance variance so the CPU runs at 2.70 GHz.
  • v4 - AMD Ryzen 7 4700u with 8 cores and 8 HW threads. Turbo core was disabled to reduce performance variance. I am still figuring out what the base clock speed is.
  • v5 - AMD Ryzen 7 7840HS with 8 cores and 16 HW threads. Turbo core was disabled to reduce performance variance.
The Beelink (v4) server comes with 16G of RAM and 512G of NVMe m.2 installed. Then I installed a 1T SATA SSD. So putting the HW together was a bit easier with Beelink than with the NUC as the NUC kits I ordered required me to install RAM, m.2 and SATA SSDs.

The SER 4700u web page claims that Crucial DDR4 DRAM and Kingston NVMe m.2 are used. I confirmed that Kingston was provided. I was happy to learn this comes with quality components, and again the price is great. (Update) After wearing out one of the Kingston SSDs I replaced them with Samsung 980 Pro (1TB) which is a lot faster.

Why AMD?

I have been happy with my Intel NUC clusters but I chose AMD this time because the prices and reviews for Beelink were great and because I am not the target use case for the new Intel NUCs. The Intel NUC is currently on the 12th generation. I had to go back to the 10th gen to find a NUC that would work for me. The newer ones were either targeted at gaming, home video or had a mix of performance and efficiency cores. More than anything, I need consistent performance and can't make use of both performance and efficiency cores.

The NUCs were reliable for me. Their only weak spot was the wires attached to the base, which have to flex when you remove the base to replace SSDs; the wires on one server eventually failed at the flex point. I shipped it back to Intel and received a new one. The Beelink server only has one ribbon for SATA connected to the base and it is much more flexible.

All of the cluster servers claim a low TDP. I like that as I don't want to trip circuit breakers or have them heat the server room. I also want to be able to use them, at least at night, in the summer when it starts to get warm. While CPU performance is unlikely to vary much given that I disabled turbo, I still wonder about SSD performance variance due to heat.

Setup

Setting up the Beelink was easy -- remove 4 screws on the bottom, insert a small allen wrench into a gap to pry the base off (the hardest part) and then add the SATA SSD. In the BIOS (hold "delete" on boot) I changed the boot order to move SATA before NVMe. I am not sure if I needed to reorder the USB when I used that to install Linux. 

The server comes with some flavor of Windows on the m.2 SSD and will boot into the Windows setup if you are not careful. This was confusing because the first few screens of that process don't make it clear that you are about to set up Windows. Reboot and hold f7 to quickly get to a screen where you can change the boot order.

One small feature that is extra useful is that the Beelink server lists the BIOS prompt keys on the bottom plate -- delete to get the full BIOS and f7 to get the boot order screen. I wish the NUC had that as I always relearn it by experimenting. I think it is f2. While the Intel NUC BIOS was easier to navigate it is also a visual BIOS so I get a bit more exercise finding a mouse whenever I have to fiddle with it, and I recently had to disable secure boot on the NUCs to make blktrace work.

I installed Ubuntu 22.04 Server via a thumb drive. This was easy. Soon after the install I removed cloud-init as it slows the boot process and adds a bit too much text during boot. I was able to get a wifi connection during the install, but after the install the wifi setup step would hang during boot. I am still not sure why that happened -- my unproven but educated guesses were: wifi worked better when the boxes weren't next to each other, and wifi worked better when the boxes connected to my wifi base router rather than a wifi extender.
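
For reference, removing cloud-init on Ubuntu 22.04 is roughly this (a sketch, not the exact commands I ran):
sudo apt purge -y cloud-init
sudo rm -rf /etc/cloud /var/lib/cloud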

While I can disable turbo boost in the BIOS on the Intel NUCs, with AMD there was no BIOS option to disable turbo core. But there are things that can be done after boot. This is fine for me given that I already run scripts at startup to enable the usage of gdb and mount my database filesystem. The scripts do the following. The last line disables turbo core.

echo -1 > /proc/sys/kernel/perf_event_paranoid    # allow perf for non-root users
echo 0 > /proc/sys/kernel/yama/ptrace_scope       # allow gdb to attach to running processes
sudo sh -c " echo 0 > /proc/sys/kernel/kptr_restrict"    # expose kernel addresses so perf can symbolize them
echo 1 > /proc/sys/kernel/sysrq                   # enable the magic SysRq key
echo x > /proc/sysrq-trigger
mount -o noatime,nodiratime,discard,noauto /dev/nvme0n1 /data    # mount the database filesystem
echo '0' > /sys/devices/system/cpu/cpufreq/boost  # disable turbo core
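
I won't claim this is how I run them, but one option for running such a script at boot is a systemd oneshot unit. A minimal sketch, where the unit name and script path are made up for the example:
# /etc/systemd/system/bench-setup.service (hypothetical name)
[Unit]
Description=benchmark host setup
After=local-fs.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/bench-setup.sh

[Install]
WantedBy=multi-user.target

Enable it via: sudo systemctl enable bench-setup.service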

Debugging

From the output of cpupower frequency-info and turbostat I think the base clock is 2GHz, but I am not certain yet. Modern CPUs are complicated. Example output is here. I hadn't used these tools before and this post was useful both for the tools and as an introduction to how clock frequency can change (C-states and more).
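
For reference, the commands look something like this (the turbostat flags are just one way to limit the output; both tools come from the linux-tools packages on Ubuntu):
cpupower frequency-info
sudo turbostat --quiet --interval 5 --show Busy%,Bzy_MHz,TSC_MHz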

I needed to debug a performance difference -- the per-CPU IOPs from fio were ~20k on one server vs ~45k on the other. I ran perf top and the difference was obvious: read_hpet was ~50% of the CPU on the slow server and not visible on the fast server. Also, the ratio of user to system CPU time was very different between the servers.

This line in dmesg output was a strong hint:
TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'

More details are here and eventually I found this Reddit post. The fix is below. I don't have an opinion on whether this is a HW issue or whether Linux should be more tolerant when measuring the TSC at startup while choosing a clock source. Regardless, it is confusing to debug.
  1. edit /etc/default/grub -> GRUB_CMDLINE_LINUX_DEFAULT="clocksource=tsc tsc=reliable"
  2. update-grub
  3. reboot
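After the reboot this confirms which clock source was selected and which are available:
cat /sys/devices/system/clocksource/clocksource0/current_clocksource
cat /sys/devices/system/clocksource/clocksource0/available_clocksource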
Bad SSD

Both servers are getting READ FPDMA QUEUED and WRITE FPDMA QUEUED errors from the SATA SSD (WD Blue SA510). I will guess that the problem is the WD Blue SSD and will soon replace it with a Samsung 870. The errors in dmesg look like this. The error has yet to repro after the replacement, but I also used screws to lock down the drive during the replacement and they were not used prior. So perhaps the lack of screws was the problem.
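
To check whether the SATA device has logged errors I can use something like this (assuming the SATA SSD is /dev/sda):
sudo smartctl -l error /dev/sda
sudo dmesg | grep -i fpdma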

Worn out SSD

The Kingston SSD that shipped with the boxes quickly reached its endurance limit. It has been replaced with a 1TB Samsung 980 Pro.

Disappearing SSD

While the Samsung 980 Pro endurance has been great, one of the devices reached EoL too soon. The endurance consumed is likely less than 10% but the device disappeared. Time to buy a replacement. Details are here.

systemd

The default behavior for systemd is to remove your files from /dev/shm when you log out, even if a screen session is still running as you -- see here. This removes files that Postgres needs. To avoid that:
  1. add RemoveIPC=no to /etc/systemd/logind.conf
  2. sudo systemctl restart systemd-logind.service
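The same change as a script, for automation (a sketch; it assumes the stock logind.conf where the RemoveIPC line is present but commented out):
sudo sed -i -E 's/^#?RemoveIPC=.*/RemoveIPC=no/' /etc/systemd/logind.conf
sudo systemctl restart systemd-logind.service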
netplan

I always forget these:
sudo vi /etc/netplan/00-installer-config-wifi.yaml
sudo netplan apply
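
For reference, a minimal wifi config for that file looks roughly like this (the interface name, SSID and password are placeholders):
network:
  version: 2
  wifis:
    wlp1s0:
      dhcp4: true
      access-points:
        "my-ssid":
          password: "my-password"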

Problems with the v4 server

I get XFS errors and the device either goes into read-only mode or disappears (see here for details). So I am working on monitoring temperatures on the device. The SSD temp is easy to get, but the sensors command provided by the lm-sensors package doesn't find any CPU temp when using Ubuntu 22.04 with the 5.15 kernel. So I upgraded to the HWE kernels to get a 6.2 kernel which is able to read CPU temps. To upgrade the kernel I used: sudo apt install linux-generic-hwe-22.04. I can monitor the NVMe devices via one of:
sudo smartctl -a /dev/nvme0n1
sudo nvme smart-log /dev/nvme0n1
The XFS errors occur for the Samsung 990 Pro device I added but smartctl shows that the error log is empty and the temp never entered the warning range (see here). That device stores the benchmark database and is very busy. For the NVMe device that has the OS install and only gets benchmark output (not very busy) there are logged errors and the device enters the warning temp range (see here). The logged error message is here. Info from fwupdmgr on the two NVMe devices is here. The BIOS is set to Balanced rather than Performance which means the TDP is lower.

I have been disabling turbo on the CPU via this command. After my current attempt to repro the problem I will try another round without this:
echo '0' > /sys/devices/system/cpu/cpufreq/boost
Next up is repeating the benchmarks with scripts running in the background to monitor temperatures.
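
A sketch of what that background monitor might look like (the device name and the sensors labels will vary; sensors comes from the lm-sensors package):
while true; do
  date
  sudo nvme smart-log /dev/nvme0n1 | grep -i temperature
  sensors | grep -i -e tctl -e composite
  sleep 60
done >> temps.log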

Problems with the v5 server

The v4 Beelink has a Mediatek wifi chip that uses the mt7921e driver. The v5 Beelink uses the Intel AX200 wifi chip and there are problems -- the driver frequently crashes and restarts (see here). There was also a corruption problem with XFS (see here) but that occurred after many wifi crashes so I am now running with wifi disabled to see if the XFS error repeats. 
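
One way to keep wifi disabled across reboots is to blacklist the driver module -- a sketch, assuming the AX200 uses the iwlwifi driver:
echo 'blacklist iwlwifi' | sudo tee /etc/modprobe.d/blacklist-wifi.conf
sudo update-initramfs -u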

Possibly related error reports for the wifi problem:

2 comments:

  1. I don't know if you'll notice the same problem on your model, but with the SER-5 I noticed that USB storage performs very poorly using the front ports compared to the rear USB ports (nearly unusable if a keyboard and a USB live installer are both on the front ports).
    I also had to install Fedora, as Debian 12 alpha installer versions have trouble with the wifi (modprobe hangs looking for a firmware patch).
    Otherwise, I am very happy so far.

    Reply: I didn't notice the USB issue but I rarely use USB for storage except while installing Ubuntu via a thumb drive.

      I have wifi problems but all appear to be due to the servers connecting to a mesh point rather than the base router. I enabled "router steering" and hope that helps.
