I often use HWE kernels with Ubuntu and currently use Ubuntu 22.04. Until recently that meant I ran Linux 6.2 but after a recent update I am now on Linux 6.5.
I am far from an expert on this topic and what I write here might just be notes to myself. Be wary of following my advice.
Disabling turbo boost yesterday
I have been disabling turbo boost for many years on my home test servers to reduce performance variance from hardware, especially as the weather gets warm because I don't have a server room with AC. The problem with turbo boost on some of my servers was cyclical behavior:
- CPU cools, turbo boost does its thing
- benchmark runs faster
- CPU gets hot
- turbo boost stops doing its thing
- benchmark runs slower
- repeat
On my Intel servers I disable turbo boost via BIOS settings. On my AMD servers that used to be done via a script because I was using acpi-cpufreq: echo 0 > /sys/devices/system/cpu/cpufreq/boost
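A minimal sketch of that script, assuming the acpi-cpufreq boost file shown above. The function name set_boost and the BOOST_FILE override are made up for illustration; the override exists only so the function can be tried against a scratch file.

```shell
#!/bin/sh
# Sketch: turn CPU boost off (0) or on (1) through the acpi-cpufreq sysfs file.
# BOOST_FILE is a hypothetical override; the default is the path used in this post.
BOOST_FILE="${BOOST_FILE:-/sys/devices/system/cpu/cpufreq/boost}"

set_boost() {
    # $1 must be 0 (disable boost) or 1 (enable boost)
    case "$1" in
        0|1) ;;
        *) echo "usage: set_boost 0|1" >&2; return 2 ;;
    esac
    if [ ! -e "$BOOST_FILE" ]; then
        # The file is absent when the loaded driver does not expose it
        echo "no boost file at $BOOST_FILE" >&2
        return 1
    fi
    # Write the new value, then print what the file now reports
    echo "$1" > "$BOOST_FILE" && cat "$BOOST_FILE"
}
```

Writing to the real file needs root, so the call would be set_boost 0 from a root shell after sourcing the script. The setting does not survive a reboot, which is why it lives in a script.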
My goal is repeatable performance and I am willing to sacrifice peak HW performance to get that. Avoiding the cycle described above helps to achieve that. Alas this is a spectrum -- I tolerate other things (CPU cache, database cache) that improve performance while adding variance. I assume that I want CPU frequency to stay within a narrow range. It isn't clear that I was getting a narrow range even when using acpi-cpufreq, but it did help.
For the Ryzen 7 7840HS CPU I use on these servers, the AMD specs state that the base speed is 3.8GHz and the max boost is up to 5.1GHz. With acpi-cpufreq the CPU cores can be in one of three frequency levels, and from cpupower frequency-info they are:
available frequency steps: 3.80 GHz, 2.20 GHz, 1.60 GHz
So even with turbo boost disabled (see the echo command above) there is still room for variance. But I don't know enough to determine whether I need to do more tuning.
Disabling turbo boost today
After a recent update on Ubuntu 22.04 with HWE kernels I now run 6.5.0-27-generic and acpi-cpufreq has been replaced by amd-pstate. I am sure there are many benefits from this change. Alas, it also brings complexity and confusion for users who now have server cooling problems (because things are running faster) and are trying to figure out how to fix them. Notes on setting up the server are here.
I noticed this change because with the default (amd-pstate in active mode) this file doesn't exist:
/sys/devices/system/cpu/cpufreq/boost
On a Ryzen 7 CPU I get the amd-pstate-epp driver in active mode. Output from /proc/cpuinfo and cpupower frequency-info from this state is below. Note that /sys/devices/system/cpu/cpufreq/boost doesn't exist in active mode but does exist in guided or passive mode. So I either need to switch to guided or passive mode or roll back to the acpi-cpufreq driver, which means I need to understand a bit more.
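A quick way to see which of these states a server is in is to read a few sysfs files. This is a sketch, not something from my setup scripts: the function name cpufreq_summary and the SYSFS_CPU override are made up, and the amd_pstate/status path is taken from the kernel amd-pstate documentation rather than from anything shown in this post.

```shell
# Sketch: report the cpufreq driver, the amd-pstate mode, and whether the
# boost file exists. SYSFS_CPU is a hypothetical override used for testing.
SYSFS_CPU="${SYSFS_CPU:-/sys/devices/system/cpu}"

cpufreq_summary() {
    driver_file="$SYSFS_CPU/cpu0/cpufreq/scaling_driver"
    status_file="$SYSFS_CPU/amd_pstate/status"   # assumed path, per kernel docs
    boost_file="$SYSFS_CPU/cpufreq/boost"

    if [ -r "$driver_file" ]; then
        echo "driver: $(cat "$driver_file")"
    else
        echo "driver: unknown"
    fi
    if [ -r "$status_file" ]; then
        echo "amd_pstate mode: $(cat "$status_file")"
    else
        echo "amd_pstate mode: not available"
    fi
    if [ -e "$boost_file" ]; then
        echo "boost file: present ($(cat "$boost_file"))"
    else
        echo "boost file: missing"
    fi
}
```

Calling cpufreq_summary needs no special privileges. The kernel documentation also describes switching modes at runtime by writing passive or guided to the status file as root; treat that as untested here.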
There is a lot of documentation for the amd-pstate driver. It isn't meant for the casual user.
- from AMD
- useful comments (grep for linuxlion)
- other workarounds
- more details
- notes on CPU frequency scaling
- generic notes on turbo boost
- another good thread on Reddit
- my post on the Beelink support forum
driver: amd-pstate-epp
CPUs which run at the same hardware frequency: 7
CPUs which need to have their frequency coordinated by software: 7
maximum transition latency: Cannot determine or is not supported.
hardware limits: 400 MHz - 5.61 GHz
available cpufreq governors: performance powersave
current policy: frequency should be within 400 MHz and 5.61 GHz.
The governor "powersave" may decide which speed to use
within this range.
current CPU frequency: Unable to call hardware
current CPU frequency: 2.97 GHz (asserted by call to kernel)
boost state support:
Supported: yes
Active: no
For now I will just roll back to using the acpi-cpufreq driver while figuring this out and possibly waiting for Linux 6.6 to show up on Ubuntu 22.04. I am not sure how mature amd-pstate is, and I won't get support for cpupower set --turbo-boost 1 until 6.6 arrives.
I now have this in /etc/default/grub:
GRUB_CMDLINE_LINUX_DEFAULT="pcie_aspm=off nosmt amd_pstate=disable"
- pcie_aspm=off is there to avoid correctable PCI errors (maybe Beelink BIOS needs an update)
- nosmt disables hyperthreads because BIOS doesn't have an option for that
- amd_pstate=disable lets me use the acpi-cpufreq driver
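Changes to /etc/default/grub only take effect after running sudo update-grub and rebooting, so it is worth confirming that the running kernel actually saw the parameters. A small sketch that checks /proc/cmdline follows; the function names and the CMDLINE_FILE override are made up for illustration.

```shell
# Sketch: check whether the running kernel was booted with given parameters.
# CMDLINE_FILE is a hypothetical override used for testing.
CMDLINE_FILE="${CMDLINE_FILE:-/proc/cmdline}"

has_boot_param() {
    # Succeeds only if $1 matches a whole word on the kernel command line
    tr -s ' ' '\n' < "$CMDLINE_FILE" | grep -qxF -- "$1"
}

check_boot_params() {
    # Print one "name: set" or "name: missing" line per argument
    for p in "$@"; do
        if has_boot_param "$p"; then
            echo "$p: set"
        else
            echo "$p: missing"
        fi
    done
}
```

For the settings above the call would be check_boot_params pcie_aspm=off nosmt amd_pstate=disable.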
CPU frequencies with acpi-cpufreq
This shows the CPU frequencies I get from an idle server with the acpi-cpufreq driver. Note that I mostly get only 3 values when boost is disabled (set to 0).
With /sys/devices/system/cpu/cpufreq/boost set to 0
current CPU frequency: 1.50 GHz (asserted by call to kernel)
current CPU frequency: 3.80 GHz (asserted by call to kernel)
current CPU frequency: 1.60 GHz (asserted by call to kernel)
current CPU frequency: 1.60 GHz (asserted by call to kernel)
current CPU frequency: 1.60 GHz (asserted by call to kernel)
current CPU frequency: 1.60 GHz (asserted by call to kernel)
current CPU frequency: 3.80 GHz (asserted by call to kernel)
With /sys/devices/system/cpu/cpufreq/boost set to 1
current CPU frequency: 2.04 GHz (asserted by call to kernel)
current CPU frequency: 2.11 GHz (asserted by call to kernel)
current CPU frequency: 1.60 GHz (asserted by call to kernel)
current CPU frequency: 1.60 GHz (asserted by call to kernel)
current CPU frequency: 1.57 GHz (asserted by call to kernel)
current CPU frequency: 1.60 GHz (asserted by call to kernel)
current CPU frequency: 3.21 GHz (asserted by call to kernel)
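Counting the distinct values by eye gets tedious with more cores, so here is a sketch that builds a small histogram from cpupower output read on stdin. The function name freq_histogram is made up, and it assumes the "asserted by call to kernel" line format shown in the listings above.

```shell
# Sketch: count how often each kernel-reported frequency appears in
# cpupower frequency-info output read from stdin.
freq_histogram() {
    # Keep only the kernel-asserted lines, strip the surrounding text,
    # then count identical values and print the most common first.
    grep 'current CPU frequency:.*asserted by call to kernel' |
        sed -e 's/.*current CPU frequency: *//' -e 's/ *(asserted.*//' |
        sort | uniq -c | sort -rn |
        awk '{ printf "%s %s x%d\n", $2, $3, $1 }'
}
```

Usage would be cpupower -c all frequency-info | freq_histogram. For the listing with boost set to 0 it reports 1.60 GHz four times, 3.80 GHz twice and 1.50 GHz once.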
Appendix
Note that cpupower frequency-info only shows frequencies for one core. To see them all use cpupower -c all frequency-info.
Output from cpupower frequency-info with active mode
driver: amd-pstate-epp
CPUs which run at the same hardware frequency: 7
CPUs which need to have their frequency coordinated by software: 7
maximum transition latency: Cannot determine or is not supported.
hardware limits: 400 MHz - 5.61 GHz
available cpufreq governors: performance powersave
current policy: frequency should be within 400 MHz and 5.61 GHz.
The governor "powersave" may decide which speed to use
within this range.
current CPU frequency: Unable to call hardware
current CPU frequency: 2.97 GHz (asserted by call to kernel)
boost state support:
Supported: yes
Active: no
Output from cpupower frequency-info with guided mode
driver: amd-pstate
CPUs which run at the same hardware frequency: 7
CPUs which need to have their frequency coordinated by software: 7
maximum transition latency: 20.0 us
hardware limits: 400 MHz - 5.61 GHz
available cpufreq governors: conservative ondemand userspace powersave performance schedutil
current policy: frequency should be within 400 MHz and 5.61 GHz.
The governor "schedutil" may decide which speed to use
within this range.
current CPU frequency: Unable to call hardware
current CPU frequency: 1.44 GHz (asserted by call to kernel)
boost state support:
Supported: yes
Active: yes
AMD PSTATE Highest Performance: 214. Maximum Frequency: 5.61 GHz.
AMD PSTATE Nominal Performance: 145. Nominal Frequency: 3.80 GHz.
AMD PSTATE Lowest Non-linear Performance: 42. Lowest Non-linear Frequency: 1.10 GHz.
AMD PSTATE Lowest Performance: 16. Lowest Frequency: 400 MHz.
Output from cpupower frequency-info with passive mode
driver: amd-pstate
CPUs which run at the same hardware frequency: 7
CPUs which need to have their frequency coordinated by software: 7
maximum transition latency: 20.0 us
hardware limits: 400 MHz - 5.61 GHz
available cpufreq governors: conservative ondemand userspace powersave performance schedutil
current policy: frequency should be within 400 MHz and 5.61 GHz.
The governor "schedutil" may decide which speed to use
within this range.
current CPU frequency: Unable to call hardware
current CPU frequency: 2.74 GHz (asserted by call to kernel)
boost state support:
Supported: yes
Active: yes
AMD PSTATE Highest Performance: 214. Maximum Frequency: 5.61 GHz.
AMD PSTATE Nominal Performance: 145. Nominal Frequency: 3.80 GHz.
AMD PSTATE Lowest Non-linear Performance: 42. Lowest Non-linear Frequency: 1.10 GHz.
AMD PSTATE Lowest Performance: 16. Lowest Frequency: 400 MHz.
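The AMD PSTATE lines pair each performance level with a frequency. A rough check, assuming frequency scales linearly with the performance value from the nominal point (145 at 3.80 GHz), reproduces most of them. The function name perf_to_mhz is made up and the linear model is my assumption, not something the output states.

```shell
# Sketch: estimate a frequency in MHz from an AMD PSTATE performance level,
# assuming frequency scales linearly with the performance value.
perf_to_mhz() {
    # $1 = performance level, $2 = nominal performance, $3 = nominal frequency (MHz)
    echo $(( $1 * $3 / $2 ))
}

perf_to_mhz 214 145 3800   # highest performance           -> 5608
perf_to_mhz 42 145 3800    # lowest non-linear performance -> 1100
perf_to_mhz 16 145 3800    # lowest performance            -> 419
```

The first two agree with the reported 5.61 GHz and 1.10 GHz. The last gives about 419 MHz while cpupower reports 400 MHz, so the linear model is only an approximation at the low end.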
Output from /proc/cpuinfo
vendor_id : AuthenticAMD
cpu family : 25
model : 116
model name : AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
stepping : 1
microcode : 0xa704103
cpu MHz : 3800.000
cache size : 1024 KB
physical id : 0
siblings : 8
core id : 7
cpu cores : 8
apicid : 14
initial apicid : 14
fpu : yes
fpu_exception : yes
cpuid level : 16
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload vgif x2avic v_spec_ctrl vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid overflow_recov succor smca flush_l1d
bugs : sysret_ss_attrs spectre_v1 spectre_v2 spec_store_bypass srso
bogomips : 7585.46
TLB size : 2560 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14] [15]
The problem isn't resolved. I tried both the current HWE and non-HWE kernels with a variety of kernel boot params but the result was that one or both of the new SER7 servers were slower. I tried all of these:
GRUB_CMDLINE_LINUX_DEFAULT=""
Results from tests are here. In many cases the CPU overhead (user+system) is significantly different on the new servers compared to the old one.
The selection of small servers (with small TDP) that don't have a mix of performance and efficiency cores is limited (the few available use AMD Ryzen CPUs). I might replace the Beelink SER7 with ASUS PN53.
I asked Beelink support for a copy of the v28 BIOS that the old (good) server uses. They provided it and I installed it, but the errors remain.
From dmidecode -t bios the BIOS versions are:
v28 (good one used by old server)
BIOS Information
Vendor: American Megatrends International, LLC.
Version: SER7PRO_P5C8V28
Release Date: 08/14/2023
v38 (used by both new servers that have the errors)
BIOS Information
Vendor: American Megatrends International, LLC.
Version: SER7PRO_P5C8V38
Release Date: 01/10/2024
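When comparing several servers it helps to reduce the dmidecode output to one line per host. A sketch, assuming the Version and Release Date field names shown above; the function name bios_version is made up.

```shell
# Sketch: print "VERSION (RELEASE DATE)" from dmidecode -t bios output on stdin.
bios_version() {
    awk -F': *' '
        /^[[:space:]]*Version:/      { version = $2 }
        /^[[:space:]]*Release Date:/ { date = $2 }
        END { if (version != "") print version " (" date ")" }
    '
}
```

Usage would be sudo dmidecode -t bios | bios_version, which for the old server should print SER7PRO_P5C8V28 (08/14/2023).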
An example of the errors is:
Dear Mark,
I have amd ryzen 5600h processor in my laptop. I use ubuntu 22.04 lts version.
I have tried switching back to old acpi driver. But amd_pstate_epp working best for me from energy and performance perspective. I have only one issue though, if i could limit its cpu max freq or somehow disable the turbo boost, both seems impossible for me. After reading so many arch wikis, ubuntu help pages and cpu control tools website, finally I gave up for now. Hopefully new control mechanism comes in upstream kernels. After reading your webpage I understood exactly how you might be feeling. Please do post the solution if you found one in near future. Thank you.
Let me know if you find something better for your use case. Linux has to serve many different workloads.
By default of Turbo Boost on Intel Xeon-SP processor, the frequency varies according to the number of active cores. Depending on the CPU model, the CPU frequency with 1 active core (single core Turbo) and all active cores (all core Turbo) could vary a lot. That will make performance inconsistent as you mentioned. For recent Intel Xeon processor, there is a feature named Per Core Turbo. It allows you to set the turbo frequency of individual core. For your case, you can match the single core turbo frequency to all core turbo frequency. The variation can be reduced significantly but the performance is better than running at base frequency. You can refer to this paper for the script which can configure the per core turbo. https://cdrdv2-public.intel.com/739138/739138_Intel_TBT_Config_Per_Core_Turbo_Overview_TG_Rev1p0.pdf
https://cdrdv2-public.intel.com/739138/739138_Intel_TBT_Config_Per_Core_Turbo_Overview_TG_Rev1p0.pdf
For desktop, this can be adjusted thru BIOS. https://forums.tomshardware.com/threads/how-to-fix-my-cpu-core-ratio-max-turbo-boost-multipliers.3791956/ I have not tested the script on the Intel processor based desktops ... :)
Thank you for the advice. I ended up revisiting this and details on that are here: https://smalldatum.blogspot.com/2024/10/managing-cpu-frequency-for-amd-on.html