Hardening Your Kubernetes Cluster: Threat Model

The NSA and CISA recently released a guide on Kubernetes hardening. We'll cover this guide in a three-part series. First, let's explore the Threat Model and how it maps to K8s components.

The National Security Agency (NSA) and the Cybersecurity and Infrastructure Security Agency (CISA) recently released a Cybersecurity Technical Report titled "Kubernetes Hardening Guidance".

This report details the threats facing Kubernetes environments and provides configuration guidance to minimize risk.

With this new three-part series, we aim to:

Provide a recap on Kubernetes components in order to understand its threat model
List the guide's security rules and explain their rationale
Offer a hands-on guide showing how to implement these rules on a cluster

This first part is a reminder of the key Kubernetes concepts needed to understand the threat model defined by the two government agencies.

A Quick Lookback

Kubernetes (K8s) is about seven years old, and over the last three years it has grown in popularity to become, consistently, one of the most loved orchestration platforms.

To be fair, there isn't anything really new in Kubernetes. When one instance of an application cannot handle the workload, we add another instance, then we load-balance between them. We already knew how to do that ten years ago. Back in 2010, I was working on the architecture team at one of the largest search engine companies in China, and we were already capable of scheduling nodes and apps with redundancy, high availability, and automated failover. Back then, though, it took a whole team building these wheels so that the other vertical product teams didn't have to. Somebody still had to worry about it.

Not anymore. With Kubernetes scheduling and autoscaling, engineers no longer need to worry about load balancers and failover; they don't need to investigate which tool is best for the job and build the wheels themselves. Instead, they are freed from the mundane daily maintenance routine so that they can focus on more important work. That's why K8s is prevailing: it automates the things you would otherwise have to worry about.

It's now not uncommon for the latest generation of software and system engineers to have never touched, or even seen, a physical server in a rack. Why would they need to? Everything is cloud-based and managed by code!

Why Should You Care About This Guide?

However, as a complex platform, Kubernetes is not straightforward to set up correctly and run smoothly. It has many components, and it's fair to say that every one of them is critical. There is a learning curve, and the curve isn't flat. That's why so many data leaks and hacked clusters can be traced back to misconfiguration.

Since K8s has become the de facto place to run your workloads, and it integrates well with whatever cloud provider you are using, cluster security is becoming increasingly important. That's why the NSA and CISA decided to step in and say something about it.

Who Are the NSA and CISA?

You might have already heard of the National Security Agency from movies. It saves lives, protects privacy rights, and advances U.S. goals and alliances, often through clandestine operations. What you might not know is that it also leads the protection of U.S. communications networks and information systems.

The Cybersecurity and Infrastructure Security Agency, on the other hand, was established much more recently (2018) to focus on the government's cybersecurity protections against private and nation-state hackers. It's deep into topics like cybersecurity, infrastructure security, emergency communications, and so on.

Both share the mission of keeping the U.S. safe from cyber threats, and that mission includes sharing military-grade recommendations to harden private infrastructure on a national scale.

Now, without further ado, let's dive right into it.

1. Kubernetes Components

First, let's do a quick recap of the components in Kubernetes:

A K8s cluster consists of some worker nodes (which run containerized apps) and a control plane that manages the worker nodes and the Pods in the cluster.

On each worker node runs:

container runtime: responsible for running containers (for example, Docker or containerd)
kubelet: an agent which makes sure that containers are running in a Pod
kube-proxy: a network proxy that maintains network rules on nodes, allowing network communication to your Pods
cluster DNS: a DNS server serving DNS records for Kubernetes services (an add-on that typically runs as Pods on the worker nodes rather than as a per-node agent)

The control plane makes global decisions about the cluster (for example, scheduling) and detects and responds to cluster events (for example, starting up a new pod when a deployment's replicas field is unsatisfied).
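The "detect and respond" behaviour described above is often called a reconciliation loop: compare the desired state with the observed state, then act on the difference. Here is a minimal sketch of that idea in Python. It is not how Kubernetes implements it (the real controllers are written in Go and work through the API server), and the names reconcile, desired_replicas, and running_pods are made up for illustration.

```python
from typing import List


def reconcile(desired_replicas: int, running_pods: List[str], name: str = "web") -> List[str]:
    """Return the pod list after one reconciliation pass.

    Hypothetical helper: it compares the desired replica count with the
    pods observed as running and adds or removes pods until they match.
    """
    desired_replicas = max(desired_replicas, 0)  # a negative count makes no sense
    pods = list(running_pods)  # copy so the caller's list is not mutated

    # Too few pods: "start" new ones, skipping names that are already taken.
    index = 0
    while len(pods) < desired_replicas:
        candidate = f"{name}-{index}"
        if candidate not in pods:
            pods.append(candidate)
        index += 1

    # Too many pods: remove the surplus from the end of the list.
    while len(pods) > desired_replicas:
        pods.pop()

    return pods
```

Calling reconcile(3, ["web-0"]) adds two pods to reach the desired count, while reconcile(1, ["web-0", "web-1"]) removes one. A real controller runs this comparison continuously, which is why a deleted Pod belonging to a Deployment reappears.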

The major control plane components are:

kube-apiserver: exposes the K8s API, acting as the "frontend" for the K8s control plane.
etcd: a consistent and highly available key-value data store, used as K8s' storage for all cluster data.
kube-scheduler: watches for newly created Pods with no assigned node and schedules them (selects a node for them to run on).
kube-controller-manager: runs controller processes. Examples: the node controller, responsible for noticing and responding when nodes go down; the endpoints controller, which populates the Endpoints object.
cloud-controller-manager: this is how your K8s cluster interacts with your cloud (for example, on AWS EKS, creating a Service of type LoadBalancer automatically provisions a cloud load balancer). If you are running Kubernetes on-premises, the cluster does not have a cloud controller manager.

2. Kubernetes Threat Model

The first part of the NSA/CISA guide presents their threat model, which is structured around three categories:

1. Supply Chain Risk

The supply chain encompasses any element that makes up the end product, which can mean a lot of things! From components to third-party services or software used to manage a cluster, these risks arise from a lot of sources and are often challenging to mitigate.

On the container/application level, a malicious container, a known vulnerability in a third-party library, or even a single dangerous function call could provide hackers with a way into the cluster. Secrets forgotten in source code are one of the favorite attack vectors because they are often very easy to recover and exploit.
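To show why forgotten secrets are so easy to recover, here is a minimal sketch of a pattern-based scanner in Python. The three rules are illustrative assumptions only; dedicated secret scanners use far larger rule sets plus entropy and validity checks, and a simple regex like the password rule will produce false positives.

```python
import re
from typing import List, Tuple

# Illustrative patterns only (assumptions for this sketch, not a complete rule set).
SECRET_PATTERNS = {
    # AWS access key IDs commonly start with "AKIA" followed by 16 characters.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM header of a private key pasted into a file.
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    # A quoted literal assigned to something called "password".
    "hardcoded_password": re.compile(r"\bpassword\s*=\s*['\"][^'\"]{4,}['\"]", re.IGNORECASE),
}


def scan_text(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, rule_name) for every line that matches a pattern."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, rule_name))
    return findings
```

Running scan_text over the contents of a source file returns the line numbers worth reviewing. An attacker with read access to a repository or a container image layer can run exactly the same kind of search.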

On the infrastructure level, the software and libraries running on the worker nodes could also have vulnerabilities that hackers could exploit to get into the system.

2. Malicious Actors

Malicious threat actors can exploit vulnerabilities and misconfigurations in components of the Kubernetes architecture, such as the control plane, worker nodes, or containerized applications.

The control plane's multiple components (API server, controller manager, cloud controller manager, etc.) communicate with each other to manage the cluster. Hackers frequently take advantage of exposed control plane components that lack appropriate access control.
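As an illustration of what "lacking appropriate access control" can look like, the sketch below checks a list of kube-apiserver command-line arguments for two well-known settings, --anonymous-auth and --authorization-mode. The function names are hypothetical, and the defaults assumed when a flag is absent (anonymous auth enabled, AlwaysAllow authorization) should be verified against the documentation for your Kubernetes version and distribution.

```python
from typing import Dict, List


def parse_flags(args: List[str]) -> Dict[str, str]:
    """Turn ['--name=value', ...] into {'name': 'value'}; bare flags map to 'true'."""
    flags = {}
    for arg in args:
        if not arg.startswith("--"):
            continue  # ignore the binary name and positional arguments
        name, _, value = arg[2:].partition("=")
        flags[name] = value if value else "true"
    return flags


def audit_apiserver_flags(args: List[str]) -> List[str]:
    """Return human-readable findings for risky API server settings."""
    flags = parse_flags(args)
    findings = []

    # Assumed default when the flag is absent: anonymous requests are allowed.
    if flags.get("anonymous-auth", "true").lower() == "true":
        findings.append("anonymous requests are not disabled (--anonymous-auth)")

    # Assumed default when the flag is absent: AlwaysAllow.
    modes = flags.get("authorization-mode", "AlwaysAllow").split(",")
    if "AlwaysAllow" in modes:
        findings.append("every request is authorized (--authorization-mode=AlwaysAllow)")
    if "RBAC" not in modes:
        findings.append("RBAC is not enabled (--authorization-mode)")

    return findings
```

On a kubeadm-style cluster these arguments usually live in the API server's static Pod manifest; passing them to audit_apiserver_flags returns an empty list when anonymous auth is off and the authorization modes include RBAC without AlwaysAllow.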

Worker nodes run the container engine, kubelet, and kube-proxy service, which are potentially exploitable by hackers as well.

On the app level, the containers running inside the cluster are common targets as well. Many apps are frequently accessible outside of the cluster, making them reachable remotely. A hacker can then start from an already compromised Pod, escalate privileges within the cluster, and access more.
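How far an attacker gets from a compromised Pod depends heavily on how that Pod was configured. The following sketch inspects a Pod manifest, already loaded into a Python dictionary, for a few settings that make escalation easier. The function name is hypothetical and the checks are a small, non-exhaustive selection; the field names themselves (hostNetwork, hostPID, hostIPC, securityContext.privileged, securityContext.allowPrivilegeEscalation) come from the Pod API.

```python
from typing import Any, Dict, List


def risky_pod_settings(pod: Dict[str, Any]) -> List[str]:
    """Return findings for Pod settings that widen the blast radius of a compromise."""
    spec = pod.get("spec", {})
    findings = []

    # Sharing host namespaces lets a container see or reach host-level resources.
    for field in ("hostNetwork", "hostPID", "hostIPC"):
        if spec.get(field) is True:
            findings.append(f"spec.{field} is enabled")

    for container in spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext") or {}

        # A privileged container has almost the same access as a process on the host.
        if ctx.get("privileged") is True:
            findings.append(f"container {name}: privileged")

        # Treated as allowed unless explicitly set to False.
        if ctx.get("allowPrivilegeEscalation", True):
            findings.append(f"container {name}: allowPrivilegeEscalation not disabled")

    return findings
```

A Pod whose only container sets allowPrivilegeEscalation to False and uses no host namespaces yields no findings; a privileged container on the host network yields three.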

3. Insider Threats

Insider threats can be administrators, users, or even cloud service providers. Insiders with special access to an organization's Kubernetes infrastructure may be able to abuse these privileges.

For example, admins may have control over running containers and the ability to execute arbitrary commands inside containerized environments. RBAC authorization can help reduce the risk by restricting access to sensitive capabilities, but it could be misconfigured. What's more, admins often have physical access to the systems or hypervisors, which could also be used to compromise the Kubernetes environment.
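One common form of RBAC misconfiguration is a Role or ClusterRole that grants more than intended. As a rough sketch, the function below walks the rules of a role manifest loaded into a Python dictionary and flags wildcards and a couple of sensitive resources. The function name and the choice of "sensitive" resources (secrets and pods/exec) are assumptions made for this example, not a list taken from the guide.

```python
from typing import Any, Dict, List

# Assumption for this sketch: resources that deserve a second look when granted.
SENSITIVE_RESOURCES = {"secrets", "pods/exec"}


def overly_broad_rules(role: Dict[str, Any]) -> List[str]:
    """Return findings for rules that use wildcards or touch sensitive resources."""
    findings = []
    for index, rule in enumerate(role.get("rules", [])):
        verbs = set(rule.get("verbs", []))
        resources = set(rule.get("resources", []))

        # "*" matches every verb or every resource in the listed API groups.
        if "*" in verbs or "*" in resources:
            findings.append(f"rule {index}: uses a wildcard in verbs or resources")

        sensitive = sorted(resources & SENSITIVE_RESOURCES)
        if sensitive:
            findings.append(f"rule {index}: grants access to {', '.join(sensitive)}")

    return findings
```

A finding is not automatically a problem (cluster administrators legitimately need broad rights), but each one is a place where an insider, or an attacker using an insider's credentials, could do more than the role's author expected.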

Conclusion

You should now be familiar with the NSA/CISA threat model and how it relates (more or less specifically) to the distinct components found in Kubernetes.

Keep in mind that this is not the only threat model for Kubernetes. If you are interested in this topic, you should definitely check out the one from the CNCF financial user group, which takes a distinct approach to the same subject.
