The USL is worth understanding. USL is short for universal scalability law and was created by Neil Gunther for use in capacity planning. I don't do much capacity planning but the USL is also valuable for explaining performance. Performance problems for the old InnoDB rw-lock (circa 2007) would have been easy to describe via the USL because "wake all" on unlock is an N2 overhead -- see the 𝛽 parameter in the USL.
A longer version of the USL is here. The USL predicts how throughput will change (increase or decrease) with concurrency. One version of the formula where X(i) is throughput with i concurrent clients is below.
X(1) * N
X(N) = -------------------
1 + α(N-1) + ꞵN(N-1)
The denominator determines how throughput changes with concurrency. When it is one then there is linear scaleup. The best way to get linear scaleup is to cheat and choose an easy base case but otherwise the denominator is greater than 1 and the USL explains less than linear scaleup. The components in the denominator represent the overhead from contention and coherency. The alpha term represents contention and the beta term represents coherency.
I write above that "wake all" on unlock has an N2 overhead. By this I mean that when N threads are trying to get through a synchronization object one at a time and all waiters wake on unlock then there are O(N2) wakeups -- (N-1) + (N-2) + (N-3) + ... + 3 + 2 + 1. The beta term in the denominator represents the coherency overhead and has an N2 term as 𝛽N(N-1) = 𝛽N2-𝛽N.
Amdahl's Law
The long post states that when B=0 and X(1)=1 then the USL is reduced to Amdahl's Law. I will derive the math for that here. The key point is that α=(1-p) where p is the fraction of the workload that is parallel in Amdahl.
# Amdahl's law where p is the fraction of the workload
# that is parallel, N is same as for USL
1
Speedup = -----------
(1-p) + p/N
# Start with USL
N
X(N) = -------------------
1 + α(N-1) + ꞵN(N-1)
# Update for X(1)=1, ꞵ=0, α=(1-p)
N
X(N) = ----------
1 + (1-p)(N-1)
# Algebra for denominator, expand and then reduce
N
X(N) = ----------
N - pN + p
# Multiply numerator and denominator by 1/N
1
X(N) = ----------
(1 - p) + p/N
# Now it looks like Amdahl where α=(1-p)
Predicted max throughput
The long post states that the max throughput occurs at sqrt((1-α)/ꞵ). That is the value of N for which X(N)' = 0. I won't derive it here but X(N)' is:
1 - α - ꞵN2
X(N)' = ---------------------
(1+α(N-1) + ꞵN(N-1))2
Then X(N)'=0 when the numerator is zero and that happens when N=sqrt((1-α)/ꞵ). That determines a critical point for X(N) which is either a min or max. In this case it is a max. I assume that X(N) is concave without deriving X(N)''.
Subscribe to:
Post Comments (Atom)
RocksDB on a big server: LRU vs hyperclock, v2
This post show that RocksDB has gotten much faster over time for the read-heavy benchmarks that I use. I recently shared results from a lar...
-
This provides additional results for Postgres versions 11 through 16 vs Sysbench on a medium server. My previous post is here . The goal is ...
-
I often use HWE kernels with Ubuntu and currently use Ubuntu 22.04. Until recently that meant I ran Linux 6.2 but after a recent update I am...
-
I am trying out a dedicated server from Hetzner for my performance work. I am trying the ax162-s that has 48 cores (96 vCPU), 128G of RAM a...
No comments:
Post a Comment