Resource Management in Kubernetes
This article explores how resource management works on Kubernetes and walks through technical details and examples.
Kubernetes is a container orchestration platform that automates application management tasks such as deployment and scaling. Among its many benefits, one key feature is its resource management.
Primary Resource Types
Kubernetes manages two primary compute resources:
- Central processing unit (CPU): This is measured in cores.
- Memory (RAM): This is measured in bytes.
Requests and Limits
Kubernetes lets you specify requests and limits for every resource on each container within a pod.
- Requests: The minimum amount of CPU or memory guaranteed to the container. The scheduler uses these values to decide which node a pod can be placed on.
- Limits: The upper bound. If a container exceeds its limits, it is either throttled (CPU) or terminated (memory).
Example Specification for Requests and Limits
Resource requests and limits are declared in the pod specification. Below is a YAML example of a pod spec that sets them.
apiVersion: v1
kind: Pod
metadata:
  name: test-app
spec:
  containers:
  - name: test-app-container
    image: nginx
    resources:
      requests:
        memory: "100Mi"
        cpu: "500m"
      limits:
        memory: "512Mi"
        cpu: "1"
From the pod spec above, the test-app-container requests 500 millicores of CPU and 100 MiB of memory, with limits of 1 CPU core and 512 MiB.
How Pod Scheduling Works
To schedule a pod, Kubernetes sums the requests of all of the pod's containers and places the pod only on a node with enough unreserved capacity to satisfy that sum. In other words, the total requests of all pods on a node must never exceed the node's allocatable capacity; this is what keeps the node able to run everything scheduled onto it.
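As an illustration, here is a hypothetical two-container pod (the pod and container names are made up for this sketch). The scheduler adds the requests of both containers (500m + 1 = 1.5 CPU, 256Mi + 512Mi = 768Mi) and will only place the pod on a node with at least that much unreserved capacity:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: two-container-demo   # hypothetical name
spec:
  containers:
  - name: web
    image: nginx
    resources:
      requests:
        cpu: "500m"
        memory: "256Mi"
  - name: sidecar
    image: busybox
    resources:
      requests:
        cpu: "1"
        memory: "512Mi"
```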
Resource Overcommitment
A user may configure overcommitment so that the sum of the requested resources stays below the node's capacity while the sum of the resource limits exceeds it. This is permitted because not every container on a node will need its maximum resources at the same time.
Example of Overcommitment
Here is an example of the resources on a node:
- CPU: 2
- Memory: 4Gi
You can run multiple pods with the following:
# Pod 1
resources:
  requests:
    memory: "1Gi"
    cpu: "500m"
  limits:
    memory: "2Gi"
    cpu: "1"

# Pod 2
resources:
  requests:
    memory: "1Gi"
    cpu: "500m"
  limits:
    memory: "2Gi"
    cpu: "1"
The total requests (1 CPU and 2Gi of memory) fit within the node's capacity, while the total limits (2 CPU and 4Gi) add up to exactly the node's capacity, so both pods can be scheduled and can each burst up to their limits.
Quality of Service (QoS) Classes
Kubernetes assigns each pod a QoS class based on its resource requests and limits:
- Guaranteed: These pods have the highest priority; every container's requests equal its limits.
- Burstable: Assigned to pods that set requests but whose limits are greater than their requests.
- BestEffort: Assigned when no requests or limits are specified at all.
Example of QoS Classes
# Guaranteed QoS
resources:
  requests:
    memory: "512Mi"
    cpu: "1"
  limits:
    memory: "512Mi"
    cpu: "1"

# Burstable QoS
resources:
  requests:
    memory: "128Mi"
    cpu: "500m"
  limits:
    memory: "512Mi"
    cpu: "1"

# BestEffort QoS
# No requests or limits specified
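For completeness, a BestEffort pod is simply one whose containers declare no resources block at all. A minimal sketch (the pod name is hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: besteffort-demo   # hypothetical name
spec:
  containers:
  - name: app
    image: nginx
    # no resources block: Kubernetes assigns the BestEffort QoS class
```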
Requests vs. Limits: Memory and CPU
To make Kubernetes resource management work for you, it is important to understand how requests and limits behave differently for CPU and for memory.
Requests and Limits for CPU
- CPU requests: The minimum CPU a container is guaranteed. The scheduler considers this when placing the pod on a node.
- CPU limits: The maximum CPU a container can use. If a container tries to use more CPU than its limit, Kubernetes throttles the CPU rather than terminating the container.
Requests and Limits for Memory
- Memory requests: The amount of memory a container is guaranteed to get. The scheduler uses this value to pick a node with enough available memory for the pod.
- Memory limits: The maximum amount of memory a container can use. If a container goes beyond its memory limit, it is killed (OOMKilled) and then potentially restarted.
Example for CPU and Memory Throttling and Termination
Let’s say we have a container running a web server with the following:
apiVersion: v1
kind: Pod
metadata:
  name: test-demo
spec:
  containers:
  - name: test-app-container
    image: nginx
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
CPU Requests and Limit Behavior
- If the container uses more than its 250-millicore request, Kubernetes will give it more CPU if available. The container can consume CPU above its request without being throttled until it reaches its limit.
- If the container tries to exceed 500 millicores, Kubernetes throttles the CPU. The container is capped at 500 millicores, so the application may slow down but will not use more than its share of CPU.
Exceeding Memory Requests and Limits
- If the container asks for more than its 64 MiB request, Kubernetes will try to provide the extra memory. This is not guaranteed; under high memory pressure the container might get less than it needs.
- If memory usage goes over 128 MiB, Kubernetes kills and restarts the container. This is called an OOMKill (out-of-memory kill). The application crashes, causing downtime or reduced functionality until the container restarts.
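Whether the container actually comes back after an OOMKill is governed by the pod's restartPolicy, which defaults to Always for standalone pods. A minimal sketch (the pod name is illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: oom-demo   # hypothetical name
spec:
  restartPolicy: Always   # kubelet restarts the container (with backoff) after an OOMKill
  containers:
  - name: test-app-container
    image: nginx
    resources:
      limits:
        memory: "128Mi"   # exceeding this triggers the OOMKill
```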
Resource Management and Monitoring
There are several tools available to monitor and manage resources:
- kubectl top: Shows CPU and memory usage for nodes and pods.
- Metrics Server: Gathers resource metrics and supplies them to the Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA).
Example of kubectl top Usage
- kubectl top node
- kubectl top pod
ResourceQuotas and LimitRanges
Kubernetes administrators can limit resource usage by setting resource quotas and limits at the namespace level.
- ResourceQuota: Acts as a guardrail that caps the total resources consumed within a namespace.
- LimitRange: A policy that constrains, and provides defaults for, the resources of individual objects (such as containers) within a namespace.
Example of ResourceQuota and LimitRange
apiVersion: v1
kind: ResourceQuota
metadata:
  name: resource-quota
spec:
  hard:
    requests.cpu: "4"
    requests.memory: "8Gi"
    limits.cpu: "8"
    limits.memory: "16Gi"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: limit-range
spec:
  limits:
  - default:
      cpu: "500m"
      memory: "512Mi"
    defaultRequest:
      cpu: "200m"
      memory: "256Mi"
    type: Container
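With a LimitRange like the one above active in a namespace, a container created without a resources block receives the defaults automatically at admission time. A minimal sketch (the pod name is hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: defaulted-pod   # hypothetical name
spec:
  containers:
  - name: app
    image: nginx
    # no resources specified: the LimitRange injects requests of
    # 200m CPU / 256Mi memory and limits of 500m CPU / 512Mi memory
```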
Conclusion
Resource management is key to running applications reliably in Kubernetes. By understanding and configuring resource requests, limits, and QoS classes, you can make sure your applications perform well under load and use cluster resources efficiently. Using kubectl top, ResourceQuota, and LimitRange to monitor and enforce resource usage policies will give you a balanced and optimal Kubernetes environment.
We at ZippyOPS provide consulting, implementation, and management services for DevOps, DevSecOps, DataOps, MLOps, AIOps, Cloud, Automated Ops, Microservices, Infrastructure, and Security.
Services offered by us: https://www.zippyops.com/services
Our Products: https://www.zippyops.com/products
Our Solutions: https://www.zippyops.com/solutions
For demos and videos, check out our YouTube playlist: https://www.youtube.com/watch?v=4FYvPooN_Tg&list=PLCJ3JpanNyCfXlHahZhYgJH9-rV6ouPro
If this seems interesting, please email us at [email protected] for a quick call.
Relevant Blogs:
Simplify Kubernetes Resource Management With Sveltos, Carvel ytt, and Flux
Kubernetes Resource Limits: How To Make Limits and Requests Work for You
How to Create Customer Resource Definitions in Kubernetes
A Controller To Identify Unused and Unhealthy Kubernetes Resources