I need stable performance from the servers I use for benchmarks. I also need servers that don't run too hot because too-hot servers cause HW to fail and introduce variance when CPU speeds are throttled. This post is about my latest adventures managing CPU speed. My previous post is here.
At a high level my solution is:
- disable turbo boost
- (optional) cap the max frequency the CPU can use
Background reading
The AMD pstate scaling drivers (amd-pstate, amd-pstate-epp) make life interesting for some of us. The common advice today is to stay with acpi-cpufreq for servers. I agree and assume it is best to wait for things to settle, for the scaling drivers to be feature complete, for docs to catch up and for new AMD CPUs that support all of the new features to arrive in your servers. It is very confusing today -- the non-expert user experience isn't great and there is too much advice on the interweb that is wrong and/or out of date.
The amd-pstate-epp scaling driver used in active mode doesn't appear to have a way to disable or enable turbo boost. It also isn't obvious it has a way to limit the max CPU frequency. So expect your server to run fast and hot, then throttle the CPU, then repeat forever. That might be fine for a laptop but isn't good for my use case.
There is much to learn, but sometimes I prefer to focus on my problems (database storage engines) and not have to spend too much time on topics like this:
- an overview on AMD pstate drivers from AMD (see here)
- an overview of CPU frequency scaling from Arch Linux (see here)
- good configuration advice on Reddit (see here)
- benchmark results from Phoronix (see here)
- kernel docs on frequency governors (see here)
- a good user experience post (see here)
- an overview from RedHat (see here)
The solution
The concrete steps are:
- disable turbo boost
- use the acpi-cpufreq scaling driver
- use the performance frequency governor
- (optionally) cap the max CPU frequency (this only works on some of my servers)
Run this to disable turbo boost. Note that the boost file exists when using the acpi-cpufreq scaling driver. With amd-pstate-epp the file is absent in active mode and present in guided mode.
echo '0' | sudo tee /sys/devices/system/cpu/cpufreq/boost
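Because the boost file is not always present, it can help to check for it before writing to it. This is a minimal sketch: boost_status is a made-up helper name, and the optional directory argument exists only so the function can be pointed at a scratch directory for testing.

```shell
# Report the state of the global boost knob.
# $1 (optional, hypothetical override): directory that holds the "boost" file.
boost_status() {
    dir="${1:-/sys/devices/system/cpu/cpufreq}"
    if [ ! -e "$dir/boost" ]; then
        # Expected with amd-pstate-epp in active mode, per the note above.
        echo "missing"
        return 1
    fi
    case "$(cat "$dir/boost")" in
        0) echo "disabled" ;;
        1) echo "enabled" ;;
        *) echo "unknown" ;;
    esac
}
```

Calling boost_status with no argument reads the real sysfs file. Reading it does not need root, only writing does.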
To get the acpi-cpufreq scaling driver, edit /etc/default/grub to add one of these lines (nosmt disables AMD SMT), then run sudo update-grub and then reboot.
GRUB_CMDLINE_LINUX_DEFAULT="nosmt amd_pstate=disable"
GRUB_CMDLINE_LINUX_DEFAULT="amd_pstate=disable"
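After the reboot it is worth confirming that the kernel picked the driver you asked for. A minimal sketch, assuming the usual sysfs layout where each CPU has a cpufreq/scaling_driver file. The scaling_drivers name and its optional root argument are made up for illustration.

```shell
# Print the distinct scaling drivers in use across all CPUs.
# $1 (optional, hypothetical override): root of the per-CPU sysfs tree.
scaling_drivers() {
    root="${1:-/sys/devices/system/cpu}"
    # cpu[0-9]* matches cpu0, cpu1, ... but not the cpufreq or cpuidle directories.
    cat "$root"/cpu[0-9]*/cpufreq/scaling_driver 2>/dev/null | sort -u
}
```

If the amd_pstate=disable option took effect, running scaling_drivers should print a single line that reads acpi-cpufreq.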
Run this to use the performance frequency governor:
sudo cpupower frequency-set --governor performance
And then run this to confirm you enabled the performance governor:
cpupower -c all frequency-info | grep gov ; cpupower frequency-info
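The cpupower output is long on servers with many CPUs. A shorter check, sketched below, counts how many CPUs use each governor by reading scaling_governor from sysfs. The governor_summary name and its optional root argument are made up for illustration.

```shell
# Print one "governor=count" line per governor in use.
# $1 (optional, hypothetical override): root of the per-CPU sysfs tree.
governor_summary() {
    root="${1:-/sys/devices/system/cpu}"
    cat "$root"/cpu[0-9]*/cpufreq/scaling_governor 2>/dev/null \
        | sort | uniq -c | awk '{print $2 "=" $1}'
}
```

More than one line of output means that some CPUs did not pick up the governor change.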
While disabling turbo boost goes a long way to avoiding a too-hot CPU, sometimes you might want to reduce the CPU frequencies even more, and sometimes that is possible via the cpupower command. However, this doesn't have an impact on all of my servers. Fortunately, summer has passed and I don't have to worry as much about overheating for a few months.
Here is one example of a user having a problem similar to mine (the max value is ignored).
sudo cpupower frequency-set -u 2.40GHz
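One way to see whether the cap was accepted is to read scaling_max_freq back from sysfs, which reports kHz (so 2.40GHz is 2400000). This is a sketch under assumptions: check_max_cap and its arguments are made-up names, and it only checks the policy limit the kernel reports, not the clock speed the hardware actually runs at.

```shell
# Return non-zero if any CPU reports a max frequency above the requested cap.
# $1: requested cap in kHz
# $2 (optional, hypothetical override): root of the per-CPU sysfs tree.
check_max_cap() {
    want_khz="$1"
    root="${2:-/sys/devices/system/cpu}"
    status=0
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_max_freq; do
        [ -r "$f" ] || continue
        have_khz=$(cat "$f")
        if [ "$have_khz" -gt "$want_khz" ]; then
            echo "cap ignored: $f is $have_khz kHz"
            status=1
        fi
    done
    return $status
}
```

For the command above the call would be check_max_cap 2400000. No output and a zero exit status mean every CPU reports a limit at or below the cap.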
Setting up the adventure
This adventure began when I had to replace m.2 devices on several of my small servers because they were at or near endurance limits. I use Ubuntu 22.04 and then decided to update the installs which brought new kernel versions. And while doing that I updated all of my servers to use an HWE kernel which is now 6.8.something with Ubuntu 22.04.5.
Ubuntu 22.04 Server uses the schedutil frequency governor by default when using the acpi-cpufreq scaling driver. I noticed a few months back that schedutil caused odd performance results for MyRocks on some of my servers, and switching to the performance governor gave me up to ~2X more QPS (see here). So I decided to switch all of my small servers to use the performance governor.
Servers
I have two types of small servers. Both use Ubuntu 22.04 and everything has the 6.8.0-45-generic HWE-enabled kernel today.
The older type has an AMD Ryzen 4700u CPU with 8 cores, 16G of RAM, no support for AMD SMT and no support for CPPC. I did not check to see if CPPC support was disabled in the BIOS. And by no support for CPPC I mean that these directories do not exist:
/sys/devices/system/cpu/cpu*/acpi_cppc
However, on the same server this shows that CPPC is supported: lscpu | grep cppc
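The directory check can be scripted so it is easy to repeat on each server. A minimal sketch: count_cppc_dirs and its optional root argument are made-up names, and a count of 0 corresponds to what I call "no support for CPPC" above.

```shell
# Count the CPUs that expose an acpi_cppc directory in sysfs.
# $1 (optional, hypothetical override): root of the per-CPU sysfs tree.
count_cppc_dirs() {
    root="${1:-/sys/devices/system/cpu}"
    n=0
    for d in "$root"/cpu[0-9]*/acpi_cppc; do
        # When nothing matches, the unexpanded pattern is not a directory and is skipped.
        if [ -d "$d" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

Comparing this count with the output of lscpu | grep cppc shows the mismatch described above: the CPU flag can be present while the directories are missing.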
The newer small server has 8 cores with AMD SMT disabled, 32G RAM and an AMD Ryzen 7 CPU -- either 7840HS or 7735HS. These have support for CPPC.
Results
I used this script to determine the behavior of scaling driver (acpi-cpufreq, amd-pstate-epp in active mode, amd-pstate-epp in guided mode), frequency governor (schedutil, performance, powersave) and energy performance preference (the EPP in amd-pstate-epp).
I decided not to share the results. Perhaps I am grouchy after spending a few too many hours on this.
Update 1 - cpupower idle-set
One of my smart friends suggested I might need to use cpupower idle-set -D1 to avoid some sources of variance. While I would rather spend more time on storage engines and less on tuning frequency management, I suppose I need to look at this.
I ran cpupower idle-info on the CPUs in my home servers. The CPUs and their per-state latencies (in microseconds) are listed below. One interesting thing is that the gap between C1 and C2 is huge for the 4700u CPU (1 to 350) but much smaller on all of the other (and newer) CPUs.
- AMD Ryzen 7 4700u (laptop class, the oldest & slowest of the bunch)
  - Latency for (Poll, C1, C2, C3) = (0, 1, 350, 400)
- AMD Ryzen 7 7735HS (laptop class)
  - Latency for (Poll, C1, C2, C3) = (0, 1, 18, 350)
- AMD Ryzen 7 7840HS (laptop class)
  - Latency for (Poll, C1, C2, C3) = (0, 1, 18, 350)
- Intel Xeon Silver 4214R
  - Latency for (Poll, C1, C1E, C6) = (0, 2, 10, 133)
- AMD Ryzen Threadripper PRO 5975WX
  - Latency for (Poll, C1, C2) = (0, 1, 18)
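As I read the cpupower-idle-set man page, -D disables every idle state whose latency is equal to or higher than the given value. The sketch below previews which states that covers by reading the name and latency files under cpuidle in sysfs. The idle_states_at_or_above name and its arguments are made up for illustration, and it only reads values, it does not change any state.

```shell
# List the idle states of one CPU whose exit latency is >= a threshold.
# $1: latency threshold in microseconds
# $2 (optional, hypothetical override): sysfs directory of the CPU to inspect.
idle_states_at_or_above() {
    min_latency="$1"
    cpu_dir="${2:-/sys/devices/system/cpu/cpu0}"
    for s in "$cpu_dir"/cpuidle/state[0-9]*; do
        [ -r "$s/latency" ] || continue
        lat=$(cat "$s/latency")
        if [ "$lat" -ge "$min_latency" ]; then
            echo "$(cat "$s/name") $lat"
        fi
    done
}
```

With the 4700u latencies listed above, a threshold of 1 selects C1, C2 and C3 and leaves only POLL, while a threshold of 2 selects C2 and C3. If you disable states with sudo cpupower idle-set -D, then sudo cpupower idle-set -E should enable all of them again.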
I get the following output from cpupower idle-info:
AMD Ryzen 7 4700u
CPUidle driver: acpi_idle
CPUidle governor: menu
analyzing CPU 6:
Number of idle states: 4
Available idle states: POLL C1 C2 C3
POLL:
Flags/Description: CPUIDLE CORE POLL IDLE
Latency: 0
Usage: 18496
Duration: 924722
C1:
Flags/Description: ACPI FFH MWAIT 0x0
Latency: 1
Usage: 89774
Duration: 31663862
C2:
Flags/Description: ACPI IOPORT 0x414
Latency: 350
Usage: 25768
Duration: 24190992
C3:
Flags/Description: ACPI IOPORT 0x415
Latency: 400
Usage: 405313
Duration: 85575022637
AMD Ryzen 7 7735HS
CPUidle driver: acpi_idle
CPUidle governor: menu
analyzing CPU 4:
Number of idle states: 4
Available idle states: POLL C1 C2 C3
POLL:
Flags/Description: CPUIDLE CORE POLL IDLE
Latency: 0
Usage: 5236511
Duration: 33151236
C1:
Flags/Description: ACPI FFH MWAIT 0x0
Latency: 1
Usage: 68356802
Duration: 1954121741
C2:
Flags/Description: ACPI IOPORT 0x414
Latency: 18
Usage: 25603592
Duration: 1986129966
C3:
Flags/Description: ACPI IOPORT 0x415
Latency: 350
Usage: 14551352
Duration: 77114754509
AMD Ryzen 7 7840HS
CPUidle driver: acpi_idle
CPUidle governor: menu
analyzing CPU 5:
Number of idle states: 4
Available idle states: POLL C1 C2 C3
POLL:
Flags/Description: CPUIDLE CORE POLL IDLE
Latency: 0
Usage: 2754712
Duration: 91604147
C1:
Flags/Description: ACPI FFH MWAIT 0x0
Latency: 1
Usage: 101206334
Duration: 5540660093
C2:
Flags/Description: ACPI IOPORT 0x414
Latency: 18
Usage: 17879457
Duration: 1736766115
C3:
Flags/Description: ACPI IOPORT 0x415
Latency: 350
Usage: 3887844
Duration: 73780112353
Intel Xeon Silver 4214R
CPUidle driver: intel_idle
CPUidle governor: menu
analyzing CPU 13:
Number of idle states: 4
Available idle states: POLL C1 C1E C6
POLL:
Flags/Description: CPUIDLE CORE POLL IDLE
Latency: 0
Usage: 1780886844
Duration: 8636888464
C1:
Flags/Description: MWAIT 0x00
Latency: 2
Usage: 16431990427
Duration: 337814322622
C1E:
Flags/Description: MWAIT 0x01
Latency: 10
Usage: 5084275890
Duration: 309067690957
C6:
Flags/Description: MWAIT 0x20
Latency: 133
Usage: 1368088856
Duration: 3588025308542
AMD Ryzen Threadripper PRO 5975WX
CPUidle driver: acpi_idle
CPUidle governor: menu
analyzing CPU 27:
Number of idle states: 3
Available idle states: POLL C1 C2
POLL:
Flags/Description: CPUIDLE CORE POLL IDLE
Latency: 0
Usage: 410444172
Duration: 1861530465
C1:
Flags/Description: ACPI FFH MWAIT 0x0
Latency: 1
Usage: 14490956799
Duration: 422501068392
C2:
Flags/Description: ACPI IOPORT 0x814
Latency: 18
Usage: 3995240933
Duration: 3752298780474