Reading through the release notes of Kubernetes 1.4, I came across some fantastic news. News so good, I should have expected it. News that could not come at a better time. I’m talking about container guarantees, or what Kubernetes calls Resource Quality of Service. Let me be frank here: it’s like the Kubernetes team was just trying to confuse me. I’m sure the rest of you immediately knew what they were talking about, but I’m a simpleton. So after reading it five times, I think I finally got ahold of it.
In a nutshell, when resource minimum and maximum values (requests and limits) are set, quality of service dictates which containers take priority when a server is oversubscribed.
Let me say this another way: we can oversubscribe server resources AND decide which containers stay alive and which ones get killed off.
Think of it like the Linux OOM killer, but with more fine-grained control. With the Linux OOM killer, the only thing you can do to influence what does or does not get killed off is adjust oom_score_adj per process. Which, as it turns out, is exactly what Kubernetes is doing.
Here are the details:
There are 3 levels of priority.
BestEffort – These are the containers Kubernetes will kill off first when under memory pressure.
Guaranteed – These take top priority over everything else. Kubernetes will try everything to keep them alive.
Burstable – These are likely to be killed off once no BestEffort pods remain and they have exceeded their requested amount.
And there are two parameters you need to consider.
request – The baseline amount of resources (CPU and RAM) a container wants at runtime.
limit – The upper bound the container can consume, if those resources are not already in use elsewhere.
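To see how requests and limits allow oversubscription, here is a minimal Python sketch. The node size, container names and numbers are all made up for illustration; the one thing it relies on is that containers are placed on a node according to their requests, not their limits.

```python
# Hypothetical node and containers; memory figures are in MiB and invented for illustration.
NODE_MEMORY_MIB = 4096

containers = [
    {"name": "web", "request": 1024, "limit": 2048},
    {"name": "worker", "request": 1024, "limit": 2048},
    {"name": "cache", "request": 1024, "limit": 2048},
]

total_requests = sum(c["request"] for c in containers)
total_limits = sum(c["limit"] for c in containers)

# Placement is based on requests, so all three containers fit on the node...
fits = total_requests <= NODE_MEMORY_MIB
# ...even though their limits together add up to more memory than the node has.
oversubscribed = total_limits > NODE_MEMORY_MIB

print(fits, oversubscribed)
```

If all three containers burst toward their limits at the same time, something has to give, and that is where the priority levels come in.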
Notice how I mentioned memory pressure up above. Under CPU pressure, nothing will be killed off. Containers will simply get throttled instead.
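To make the memory-pressure ordering a bit more concrete, here is a rough Python sketch of how an oom_score_adj value could be derived for each priority level. The constants and the Burstable formula follow my reading of the Kubernetes documentation; treat them as assumptions and check them against the version you run. The function name and parameters are hypothetical, not a Kubernetes API.

```python
def oom_score_adj(qos_class: str, memory_request_bytes: int = 0,
                  node_memory_bytes: int = 1) -> int:
    """Approximate the oom_score_adj a container would get for its QoS class."""
    if qos_class == "Guaranteed":
        # Assumption: constant taken from the docs; verify for your release.
        return -997
    if qos_class == "BestEffort":
        # Highest possible score, so these are first in line for the OOM killer.
        return 1000
    # Burstable: the larger the memory request relative to the node,
    # the lower the score and the less likely the container is to be killed.
    score = 1000 - (1000 * memory_request_bytes) // node_memory_bytes
    return min(max(2, score), 999)


GIB = 1024 ** 3
print(oom_score_adj("Burstable", memory_request_bytes=1 * GIB, node_memory_bytes=8 * GIB))
```

A higher score means the process is a more attractive target for the OOM killer, which matches the ordering described above: BestEffort first, then Burstable, with Guaranteed last.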
So how do we determine which priority level a container will have?
Guaranteed if request == limit OR only the limits are set
which looks like:
    containers:
    - name: mywebapp
      resources:
        limits:
          cpu: 10m
          memory: 1Gi
        requests:
          cpu: 10m
          memory: 1Gi
OR
    containers:
    - name: foo
      resources:
        limits:
          cpu: 10m
          memory: 1Gi
        # Setting requests is optional
Burstable if request is less than limit OR one of the containers has nothing set
    containers:
    - name: foo
      resources:
        limits:
          cpu: 10m
          memory: 1Gi
        requests:
          cpu: 10m
          memory: 1Gi
    - name: bar
Notice there are two containers above, one with nothing specified. That container on its own would be BestEffort, which makes the Pod as a whole Burstable.
OR
    containers:
    - name: foo
      resources:
        limits:
          memory: 1Gi
    - name: bar
      resources:
        limits:
          cpu: 100m
The config above has two containers with different resources set: one has only memory and the other only CPU. Neither has limits for both, so once again, Burstable.
BestEffort if no resources are defined at all.
    containers:
    - name: foo
      resources:
    - name: bar
      resources:
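The rules above can be sketched as a small Python function. This is only an illustration of the logic as described in this post, not the actual Kubernetes implementation: qos_class is a hypothetical helper, containers are plain dicts shaped like the YAML above, and quantities are compared as literal strings (so "1Gi" and "1024Mi" would be treated as different).

```python
def qos_class(containers):
    """Classify a pod (a list of container dicts) as Guaranteed, Burstable or BestEffort."""
    any_set = False         # does any container set a request or a limit?
    all_guaranteed = True   # does every container pin both cpu and memory?
    for container in containers:
        resources = container.get("resources") or {}
        limits = resources.get("limits") or {}
        requests = resources.get("requests") or {}
        if limits or requests:
            any_set = True
        for name in ("cpu", "memory"):
            limit = limits.get(name)
            # An omitted request is treated as equal to the limit.
            request = requests.get(name, limit)
            if limit is None or request != limit:
                all_guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"


pod = [
    {"name": "foo", "resources": {"limits": {"memory": "1Gi"}}},
    {"name": "bar", "resources": {"limits": {"cpu": "100m"}}},
]
print(qos_class(pod))
```

Feeding it the other examples from this post should give the same answers as the text: limits only (or requests equal to limits) on every container is Guaranteed, nothing set anywhere is BestEffort, and everything in between is Burstable.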
This is just the tip of the iceberg on container guarantees.
There is a lot more to it around cgroups, swap and compressible vs incompressible resources.
Head over to the GitHub page to read more.