To pick up again, imagine that in your Kubernetes pod spec, you asked for 250 millicores of CPU to run your container.
Something has to turn that abstract request of 250m of CPU, along with any limits, into concrete allocations or constraints on a running process.
Most Kubernetes resource abstractions are implemented by the kubelet and the container runtime using Linux control groups (cgroups) and their per-group settings.
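To make that translation concrete, here is a minimal Go sketch of the arithmetic involved, mirroring the conversion helpers used by the kubelet and runc at the time of writing; treat the exact constants, and the hypothetical 500m limit in the example, as illustrative assumptions rather than canonical values.

```go
package main

import "fmt"

const (
	minShares     = 2      // kernel minimum for cpu.shares (cgroup v1)
	sharesPerCPU  = 1024   // one full core == 1024 shares
	quotaPeriodUs = 100000 // default CFS period in microseconds
)

// milliCPUToShares shows how a CPU request becomes cgroup v1 cpu.shares.
func milliCPUToShares(milliCPU int64) int64 {
	if milliCPU == 0 {
		return minShares
	}
	shares := milliCPU * sharesPerCPU / 1000
	if shares < minShares {
		return minShares
	}
	return shares
}

// sharesToWeight maps cpu.shares [2..262144] onto cgroup v2 cpu.weight [1..10000].
func sharesToWeight(shares int64) int64 {
	return 1 + ((shares-2)*9999)/262142
}

// milliCPUToQuota shows how a CPU limit becomes a CFS quota (the cpu.max numerator).
func milliCPUToQuota(milliCPU int64) int64 {
	return milliCPU * quotaPeriodUs / 1000
}

func main() {
	req, limit := int64(250), int64(500) // hypothetical: 250m request, 500m limit
	shares := milliCPUToShares(req)
	fmt.Printf("request %dm -> cpu.shares=%d (v1), cpu.weight=%d (v2)\n",
		req, shares, sharesToWeight(shares))
	fmt.Printf("limit   %dm -> cpu.max=\"%d %d\" (v2)\n",
		limit, milliCPUToQuota(limit), quotaPeriodUs)
}
```

For the 250m request this prints cpu.shares=256 and cpu.weight=10; the 500m limit becomes a quota of 50000 microseconds out of every 100000-microsecond period.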
When Kubernetes sets cpu.max (the limit), it caps how much CPU time the container can consume per scheduling period, but it doesn't change the process's proportional priority while it's runnable; that priority comes from cpu.weight, which is derived from the request.
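One way to see where those values land is to read the cgroup v2 interface files from inside a running container. A small sketch, assuming a cgroup v2 (unified hierarchy) node where the container's own subtree appears at /sys/fs/cgroup, which is typical for recent runtimes:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Print the cgroup v2 CPU interface files for the current cgroup.
// Assumes a cgroup v2 node where the container's own subtree is
// visible at /sys/fs/cgroup.
func main() {
	for _, name := range []string{"cpu.weight", "cpu.max", "cpu.stat"} {
		data, err := os.ReadFile("/sys/fs/cgroup/" + name)
		if err != nil {
			fmt.Printf("%s: %v\n", name, err)
			continue
		}
		fmt.Printf("%s:\n%s\n\n", name, strings.TrimSpace(string(data)))
	}
}
```

Here cpu.weight carries the request-derived priority, cpu.max holds the limit as a "quota period" pair (or the literal string max when no limit is set), and the nr_throttled and throttled_usec counters in cpu.stat show the limit taking effect as throttling rather than as any change in scheduling weight.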
A node often has moment-to-moment spare CPU capacity that isn't guaranteed to any particular container by its CPU requests.
The limits approach can feel tempting at first, but an industry consensus has been building around not using CPU limits at all in general-purpose workload templates and instead relying on requests alone.
People usually intuit that limits are what provide fairness, making sure every workload gets its allotted time.
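In practice, it is the request-derived weights that deliver proportional fairness when the node is contended: the scheduler divides CPU among sibling cgroups in proportion to cpu.weight. A small worked sketch of that arithmetic, with invented pod names and a hypothetical four-core node:

```go
package main

import "fmt"

// Proportional split of a fully contended node's CPU by cpu.weight,
// which Kubernetes derives from each container's CPU request.
// The pod names and the 4-core node are hypothetical.
func main() {
	nodeCores := 4.0
	weights := map[string]float64{
		"api (250m request)":    10, // cpu.weight derived from 250m
		"worker (750m request)": 30, // cpu.weight derived from 750m
	}
	var total float64
	for _, w := range weights {
		total += w
	}
	for name, w := range weights {
		fmt.Printf("%-22s gets ~%.2f cores when everyone is busy\n",
			name, nodeCores*w/total)
	}
}
```

With both containers runnable, the 250m requester ends up with roughly one core and the 750m requester with roughly three, the same 1:3 ratio as their requests, without any limit being involved.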
With CPU and CFS cgroup settings out of the way, it’s time to move on to memory.
How do memory requests and limits translate to Linux process settings?
Will the same proportional model apply to memory the way it does to CPU?