Securing Containers With Seccomp Filters
In this article we present a novel way to protect your container applications post-exploitation. This additional protection is called Seccomp-BPF.
Many businesses are adopting containers as a foundational
technology used to manage and run their applications. If you’ve worked much
with containers, it’s easy to see why: they enable entirely new levels of
portability and scalability. But the adoption of containers, like any other new
technology, also means new ways to exploit applications.
Depending on the container’s configuration, an exploited application can eventually lead to the compromise of the host that the container is running on. There are also other implications to consider, such as secrets stored as environment variables in the container and what those secrets grant access to. If you want to know more about Docker container security best practices specifically, GitGuardian offers a useful cheat sheet.
A mature software development lifecycle already includes security processes such as vulnerability scanning and software composition analysis, but there is a need for more. Most available application security technology exists to prevent an application from being vulnerable, but little of it contains the damage that can be done once an application is successfully exploited. To help with that, I’ve been researching a novel way to protect your container applications post-exploitation. In this post, I’ll share what it is and how it can be integrated into your established software development processes. The additional protection I’m referring to is called Seccomp-BPF, and I need to explain a little about what it is before diving into how to use it.
Background
The programs that we run on computers rely heavily on the underlying operating system to do anything. Tasks like opening files and spawning new processes are abstracted in modern programming languages, but under the hood, the code is making kernel requests called system calls (or syscalls). How important are syscalls for a program to function? Well, there are around 400 syscalls available in the Linux kernel, and even a basic “Hello, World!” program written in C needs at least two of them: write to print the message and exit to terminate (in practice, program startup adds several more).
Code running in so-called “user space” can’t do anything without going through the kernel. Eventually, some smart Linux kernel developers decided to use that fact to create a powerful security feature: Linux 3.5, released in July 2012, added support for something called Seccomp-BPF.
Seccomp-BPF is a Linux kernel feature that allows you to restrict the syscalls that a process can make by creating a special filter.
In theory, you can create a Seccomp-BPF filter that only allows a process to make the exact syscalls that it needs to function and nothing more. This is useful when an app turns out to be exploitable in a way that lets an adversary spawn additional processes. Spawning a process requires syscalls of its own, so if the filter doesn’t allow them, there’s a good chance it will thwart the attacker.
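As a quick aside, you can check whether a process is already running under Seccomp by reading the Seccomp field in /proc/<pid>/status, which the kernel sets to 0 (disabled), 1 (strict mode), or 2 (filter mode). The snippet below is only a minimal sketch: seccomp_mode is a helper name I made up for illustration, and it assumes a Linux system with /proc mounted.

```shell
# Print the seccomp mode of a process (defaults to the current shell).
# The "Seccomp:" field in /proc/<pid>/status is 0, 1, or 2.
seccomp_mode() {
    pid="${1:-$$}"
    awk '/^Seccomp:/ { print $2 }' "/proc/${pid}/status"
}

case "$(seccomp_mode)" in
    0) echo "no seccomp filter is active" ;;
    1) echo "seccomp strict mode is active" ;;
    2) echo "a seccomp-BPF filter is active" ;;
    *) echo "seccomp status is unavailable" ;;
esac
```

Run inside a Docker or podman container with default settings, I would expect this to report filter mode, since both runtimes apply a default Seccomp profile.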
Seccomp is super cool, and it’s even integrated into container runtime and orchestration tools like Docker and Kubernetes. That raises the question: “Why isn’t Seccomp widely used?” I think the answer is that there aren’t enough resources out there that bridge the gap between a low-level kernel feature like Seccomp and modern software development processes. Not every organization has a low-level developer who knows a ton about syscalls. There’s also the overhead of figuring out which syscalls your program needs and updating that list with every new feature you implement in your code.
I was thinking about how to solve that problem, and I had an idea: “What if we record the syscalls that a program makes while it’s running?” I was telling one of my co-workers about my idea, and the next day he sent me a link to a tool he found on GitHub. It turned out that some folks at Red Hat had already made a tool called oci-seccomp-bpf-hook that does exactly what I wanted!
Creating a Seccomp-BPF Filter
The tool oci-seccomp-bpf-hook was made to work with Linux containers. OCI stands for “Open Container Initiative,” and it’s a set of standards for container runtimes that defines what kinds of interfaces they should be able to provide. OCI-compliant container runtimes (like Docker) provide a mechanism called “hooks” that allows you to run code before a container is spun up and after a container is torn down. Rather than explain how Red Hat’s tool uses these hooks, I think a demonstration will be clearer.
Red Hat developed oci-seccomp-bpf-hook for use with their container runtime, podman. Podman is backward-compatible with Docker, for the most part, so the syntax in my examples will look mostly familiar if you’ve used Docker. Additionally, the OCI hook is currently only available in Red-Hat-related DNF repositories unless you install it from source. To make things less complicated for this demo, I’m just using a Fedora server (if you don’t have a Fedora environment, I recommend running a Fedora virtual machine on something like VirtualBox or VMware to follow along).
The first thing you’ll need to do to start using oci-seccomp-bpf-hook is to make sure you have it installed along with podman. To do that, we can run the following command:
sudo dnf install podman oci-seccomp-bpf-hook
Now that we have podman and the OCI hook, we can finally dive into how to generate a Seccomp-BPF filter. From the readme, the syntax is:
sudo podman run --annotation io.containers.trace-syscall="if:[absolute path to the input file];of:[absolute path to the output file]" IMAGE COMMAND
Let’s run the ls command in a basic container and redirect the output to /dev/null. While we’re doing that, we’re going to record the syscalls that the ls command makes and save them to a file at /tmp/ls.json.
sudo podman run --annotation io.containers.trace-syscall=of:/tmp/ls.json fedora:35 ls / > /dev/null
Since we are redirecting the output of the ls command to /dev/null, there should be no output in the terminal. However, after the command is done, we can look at the file that we saved the syscalls to. Its contents show that the command did work and the syscalls were captured.
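The generated file is a JSON Seccomp profile. Here is a small sketch for listing the syscalls it allows, assuming jq is installed and that the file follows the usual profile layout used by Docker and podman (a defaultAction plus syscalls entries that each carry a names array and an action). The helper name allowed_syscalls is my own and is not part of the hook.

```shell
# List the syscall names a seccomp profile allows, one per line, sorted.
# Assumes jq is installed and the profile uses the "syscalls[].names" layout.
allowed_syscalls() {
    jq -r '.syscalls[] | select(.action == "SCMP_ACT_ALLOW") | .names[]' "$1" | sort -u
}

# Inspect the profile recorded above, if it exists on this machine.
profile=/tmp/ls.json
if [ -f "$profile" ]; then
    allowed_syscalls "$profile"
fi
```

Piping the result through wc -l gives a quick count of how many syscalls the recorded command actually needed.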
This file is our Seccomp filter, and we can now use it with any container runtime that supports it. Let’s try using the filter with the same containerized ls command that we just ran:
sudo podman run --security-opt seccomp=/tmp/ls.json fedora:35 ls / > /dev/null
There is no output and there are no errors, indicating that the command ran successfully with the Seccomp filter applied. Now comes the fun part. We will add some behavior to the container that wasn’t present when we recorded the syscalls to make our Seccomp filter. All we’re going to do is add the -l flag to our ls command:
sudo podman run --security-opt seccomp=/tmp/ls.json fedora:35 ls -l / > /dev/null
This time we get a bunch of errors telling us that we can’t perform some operation that our command was trying to do. The addition of the -l flag to our ls command added a few new syscalls to the process that weren’t in our Seccomp filter’s allow list. If we generate a new Seccomp filter with the ls -l command, the new filter works because it now has all the required syscalls.
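To see exactly which syscalls the -l flag added, you can compare the two filters. This is a rough sketch, assuming jq 1.5 or later is installed and that the second filter was saved to /tmp/ls-l.json (a path I picked for illustration); added_syscalls is a hypothetical helper name.

```shell
# The second filter could be recorded the same way as the first, e.g.:
#   sudo podman run --annotation io.containers.trace-syscall=of:/tmp/ls-l.json fedora:35 ls -l / > /dev/null

# Print the syscalls allowed by the new profile ($2) but not by the old one ($1).
added_syscalls() {
    jq -rn --slurpfile old "$1" --slurpfile new "$2" '
        def allowed: [.syscalls[] | select(.action == "SCMP_ACT_ALLOW") | .names[]] | unique;
        ($new[0] | allowed) - ($old[0] | allowed) | .[]
    '
}

# Compare the two profiles, if both exist on this machine.
if [ -f /tmp/ls.json ] && [ -f /tmp/ls-l.json ]; then
    added_syscalls /tmp/ls.json /tmp/ls-l.json
fi
```

Swapping the two arguments shows the opposite direction: syscalls that the old filter allowed but the new recording never used.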
Applying Seccomp filters to your containers greatly restricts their capabilities. In a scenario where an attacker can exploit your application, a filter may stop them from doing damage or even prevent exploitation altogether.
By using Red Hat’s OCI hook, you no longer need to have a deep knowledge of the Linux kernel’s syscalls to create a Seccomp filter. You can easily create an application-specific filter that doesn’t allow your container to do anything more than what it needs to be able to do. This is a huge step in bridging the gap between the kernel feature and high-level software development.
Conclusion
As great as oci-seccomp-bpf-hook is, the tool alone doesn’t fully live up to my expectations for integrating Seccomp into a mature software engineering workflow. There is still overhead involved in running the tool, and as a software developer, you don’t want to spend time manually updating your Seccomp filter for every update of your application. To bridge that final gap and make it as easy as possible to use Seccomp in enterprise applications, we need to find a way to automate the generation of Seccomp-BPF filters. Fortunately, when we look at how modern software development happens, there is already a perfect place for this automation to happen: during Continuous Integration (CI).
CI workflows are already a well-established part of a mature software development lifecycle. For those who aren’t familiar with CI, it enables you to do things like automated unit testing and code security scanning every time you commit code to your git repository. There are lots of tools for CI out there, and it’s the perfect place to automate the generation of a Seccomp filter for your containerized application.
We are running out of time for this post, so I’ll be back in another post with a demonstration of how to create a CI workflow that generates a Seccomp filter every time you update your code. Then you will finally be equipped to take advantage of Seccomp’s syscall restriction and secure your applications!