Many organizations worldwide are using containers for modern application deployment. But are these containers safe from cyberattacks? Let’s investigate.
Containers are now widely used for application deployment because they provide isolated, lightweight environments in which processes run smoothly. However, the shift to containers has also increased security risk. Hackers who gain access to these restricted, isolated environments can pose a threat to the host system, and they use many techniques to reach it.
One technique hackers use is manipulating Procfs, a virtual process file system in Linux. Hackers exploit Procfs to bypass security measures like AppArmor, leaving the host system compromised. Hackers particularly target critical components such as the RUNC binary, which is used for managing containers.
The article focuses on hackers’ illicit use of Procfs for container escape. It also discusses measures to increase container security, such as PID namespace isolation, restricting access to Procfs, and fileless storage for the RUNC binary.
1. RUNC: The Tool For Managing Containers
- RUNC is an open-source CLI (Command Line Interface) tool for running and spawning containers compliant with the specifications of the Open Container Initiative (OCI). It is a portable and lightweight foundation for the operations of a container runtime.
- Developed for the Docker Project, RUNC isolates and starts container processes using the host OS. Linux features that RUNC uses include cgroups (control groups), capabilities, and namespaces for creating isolated environments with dedicated networks, filesystems, and process trees.
- It works as a standalone tool and also serves as the low-level runtime underneath Docker and Kubernetes, ensuring consistent management of containers across different environments.
Use the following command to start a container using runc:
runc run -b bundle container-id
Here, -b specifies the path to the bundle directory, and container-id is the unique identifier of the container.
Example:
runc spec # Generate a default OCI spec file (config.json)
runc create -b /mybundle mycontainer # Create a container from the specified bundle
runc start mycontainer # Start a previously created container
runc run -b /mybundle mycontainer # Create and start the container in one step
runc exec mycontainer sh # Execute a command (e.g., sh) inside a running container
runc list # List all containers managed by runc
runc state mycontainer # Show the state of a specific container
runc kill mycontainer SIGKILL # Send a signal (e.g., SIGKILL) to a container
runc delete mycontainer # Delete a container (should be stopped first)
2. Procfs: Linux Virtual File System
- Procfs is a virtual file system in Linux that offers insights into system resources and processes by dynamically exposing kernel data structures as directories and files. Unlike traditional file systems, Procfs exists only in memory: it is mounted at boot time, and its contents are generated dynamically by the kernel. It provides a live and detailed snapshot of the system, revealing memory, network activity, I/O statistics, and CPU activity for debugging and monitoring.
- Procfs is a double-edged sword. It provides transparency for administrators, but at the same time, it offers hackers a target for exploiting privileged kernel-level information.
cat /proc/<pid>/status
This displays detailed information about the process with the given PID.
Example:
cat /proc/1234/status # View status of process with PID 1234
cat /proc/1234/cmdline # Show command line arguments for the process
ls -l /proc/1234/fd # List open file descriptors for the process
cat /proc/cpuinfo # Display CPU information
cat /proc/meminfo # Display memory usage details
cat /proc/uptime # Show system uptime
cat /proc/loadavg # Display system load average
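When only one field of a status file is needed, a small filter can pull it out. The following is a minimal sketch, assuming a POSIX shell with awk; the helper name status_field is hypothetical and not part of any standard tool.

```shell
# status_field KEY: read the contents of a /proc/<pid>/status file on
# stdin and print the value of the field named KEY (hypothetical helper).
# Usage: status_field Name < /proc/1234/status
status_field() {
    awk -v key="$1:" '
        $1 == key {
            $1 = ""            # drop the "Key:" column
            sub(/^ +/, "")     # trim the separator left behind
            print
            exit
        }'
}
```

For example, status_field CapEff < /proc/1234/status prints the effective capability mask of process 1234 as a hexadecimal string.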
4. Security Measures for Containers
The notion of a privileged container needs to be redefined to account for Docker’s filesystem vulnerabilities. Docker containers can be run with user namespaces for enhanced isolation. However, this feature is not enabled by default, so in many containers UID 0 maps directly to UID 0 on the host, making them effectively privileged. This allows potential host system access; therefore, unlike unprivileged LXC containers, they are not truly unprivileged.
LXC maps container UID 0 to a non-root UID on the host, securing unprivileged containers. They are isolated in separate namespaces, and manipulation of the host kernel or file system is prevented.
To reduce such attacks, measures such as the following should be adopted:
- Quarantine the /proc filesystem view: The container’s view of /proc is isolated from the host’s view of /proc. The host’s /proc is then less likely to be directly exploitable from within the container, offering extra security.
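The UID mapping discussed above can be inspected through /proc/<pid>/uid_map, where each line lists a starting UID inside the namespace, the UID it maps to outside, and the length of the range. The sketch below is one possible check, assuming a POSIX shell with awk; the helper name root_maps_to_root is hypothetical. Note that when the file is read from inside the same user namespace, the outside UID is relative to the parent user namespace, which is the host only for a non-nested container.

```shell
# root_maps_to_root: read a uid_map on stdin and succeed (exit 0) if
# UID 0 inside the namespace maps to UID 0 outside it, i.e. container
# root is effectively root in the parent namespace (hypothetical helper).
# Usage: root_maps_to_root < /proc/self/uid_map
root_maps_to_root() {
    awk '$1 == 0 && $2 == 0 { found = 1 } END { exit !found }'
}
```

A mapping such as "0 0 4294967295" makes the check succeed, while a remapped range such as "0 100000 65536" makes it fail.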
5. Practical Examples for Securing Containers Against Procfs Exploitation
5.1 First Use Case: Hackers Exploiting the RUNC Binary
- A hacker gains access to a container through misconfigurations, malicious images, or vulnerabilities. After gaining access, the hacker first targets the RUNC binary, because compromising it compromises the host system and prevents it from working normally. The RUNC binary creates and manages containers by interfacing with Linux kernel features such as cgroups and namespaces. It is the low-level runtime responsible for running and spawning container processes.
- Hackers can replace the original RUNC binary with a malicious version. The host system then runs the hacker’s malicious version instead of the original RUNC binary.
- When a malicious RUNC version is used for managing containers, the hacker can gain complete control of the host system by executing arbitrary code with root privileges.
Solution:
- Read-only mount: Mount /runc using a read-only bind mount.
- Use signed binaries: Use hash/signature for verifying /runc’s integrity.
- AppArmor/SELinux: Restrict what processes can modify or execute /runc.
- Run containers as non-root users (rootless containers).
- Use system immutability: e.g., boot with read-only root filesystem.
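The hash-based integrity check from the list above could look roughly like this. It is a minimal sketch, assuming the sha256sum utility is available; the helper name verify_sha256 and the idea of a separately stored trusted digest are assumptions for illustration.

```shell
# verify_sha256 FILE EXPECTED_HEX: succeed only if the SHA-256 digest of
# FILE equals EXPECTED_HEX (hypothetical helper; the trusted digest must
# come from a source the attacker cannot modify).
verify_sha256() {
    actual=$(sha256sum "$1" 2>/dev/null | awk '{ print $1 }')
    [ -n "$actual" ] && [ "$actual" = "$2" ]
}
```

A call such as verify_sha256 /usr/bin/runc "$TRUSTED_DIGEST" would then fail as soon as the binary on disk no longer matches the recorded digest, where TRUSTED_DIGEST is a placeholder for a value recorded at install time.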
Enhancing Further Security: Limited Access
- Allowing write access only to trusted system users can enhance security further. The container runtime should then be configured to use a fileless (in-memory) copy of the RUNC binary.
- This ensures that hackers cannot overwrite the RUNC binary and that all such attempts are ineffective. With such measures, organizations can increase container security and reduce the risk of container escape while container operations continue to run smoothly.
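Whether write access is really limited can be checked from the file's permission bits. The following sketch assumes a POSIX find; the helper name writable_by_others is hypothetical, and it only looks at mode bits, not at ACLs or at the permissions of parent directories.

```shell
# writable_by_others FILE: succeed if FILE is writable by its group or
# by everyone, which would let non-owners overwrite it (hypothetical
# helper; ignores ACLs and parent-directory permissions).
writable_by_others() {
    [ -n "$(find "$1" -prune \( -perm -020 -o -perm -002 \) -print 2>/dev/null)" ]
}
```

For a binary owned by root with mode 755, the check fails, which is the desired state; with mode 775 or 757 it succeeds and signals that the binary is too widely writable.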
# === Attacker Step (Awareness Only) ===
# Locate and inspect the current runc binary
which runc
ls -l $(which runc)
# Hypothetical: Replace runc with a malicious version (only possible in a misconfigured system)
cp /tmp/malicious-runc /usr/bin/runc
chmod +x /usr/bin/runc
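# === Defender Mitigation: Fileless, Read-Only runc Binary ===
# Create the mount point and mount a writable tmpfs (in-memory) directory;
# noexec is omitted because runc must be executable from this location
mkdir -p /secure/runc
mount -t tmpfs -o nosuid,nodev tmpfs /secure/runc
# Copy original trusted runc binary to secure location and set ownership
cp /usr/bin/runc /secure/runc/runc
chmod 755 /secure/runc/runc
chown root:root /secure/runc/runc
# Remount read-only so the copy cannot be modified afterwards
mount -o remount,ro,nosuid,nodev /secure/runc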
# === Configure Runtime to Use Secure runc Binary ===
# Use environment variable or configuration to run container with the secure binary
/secure/runc/runc run -b /mybundle mycontainer
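One way to confirm that the secure location is actually read-only is to look it up in the mount table. The sketch below assumes the six-column format of /proc/self/mounts and a POSIX shell with awk; the helper name mount_is_ro is hypothetical.

```shell
# mount_is_ro MOUNTPOINT: read a mount table (the /proc/self/mounts
# format) on stdin and succeed only if MOUNTPOINT is listed with the
# "ro" option. The last matching entry wins, since later mounts are
# stacked on top of earlier ones (hypothetical helper).
# Usage: mount_is_ro /secure/runc < /proc/self/mounts
mount_is_ro() {
    awk -v mp="$1" '
        $2 == mp {
            seen = 1; ro = 0
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == "ro") ro = 1
        }
        END { exit !(seen && ro) }'
}
```

The check also fails when the mount point is not listed at all, so a missing tmpfs mount is reported in the same way as a writable one.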
5.2 Second Use Case: Hackers Exploiting /proc/<PID>/maps
Overview
The /proc filesystem provides deep insight into per-process data in Linux. /proc/<PID>/maps exposes a process’s memory layout, including memory regions, their permissions, and associated files. This information is useful for debugging and performance analysis, but hackers can also misuse it.
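To make the layout information concrete: each line of a maps file holds an address range, permission flags, an offset, a device, an inode, and an optional pathname. The following sketch lists only the executable regions, assuming a POSIX shell with awk; the helper name exec_regions is hypothetical, and pathnames containing spaces are shown only up to the first space.

```shell
# exec_regions: read a /proc/<pid>/maps file on stdin and print the
# address range and backing file of every region mapped executable
# (hypothetical helper).
# Usage: exec_regions < /proc/1234/maps
exec_regions() {
    awk '$2 ~ /x/ { print $1, ($6 == "" ? "[anonymous]" : $6) }'
}
```

The output shows how much a single readable file reveals: the exact addresses at which code is loaded, which is the information that layout randomization is meant to hide.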
Security Risk
With local system access, hackers can read /proc/<PID>/maps to identify a process’s memory layout. This enables precise attacks that depend on knowing where critical memory regions are, such as Return-Oriented Programming (ROP). A layout information leak can make memory injection and buffer overflow attacks easier to execute if protections such as ASLR are disabled or weak.
Solution: Restrict Access to Process Metadata
To mitigate this risk:
- Use the hidepid=2 mount option to restrict users’ access to other users’ process data:
  mount -o remount,hidepid=2 /proc
- Use PID namespaces (containers) for isolating processes.
- Enforce MAC policies (AppArmor, SELinux) to limit access to /proc/*.
- Ensure stack protection and ASLR are enabled, and harden memory permissions.
These measures reduce the effectiveness of memory-targeted attacks, limiting what attackers can learn from the process layout.
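Two of these settings can be audited from the shell. The sketch below assumes the documented meaning of /proc/sys/kernel/randomize_va_space (0, 1, or 2) and the six-column format of /proc/self/mounts; the helper names describe_aslr and proc_hidepid are hypothetical. Newer kernels may report hidepid as a name such as "invisible" rather than a number, so the value is printed as found.

```shell
# describe_aslr VALUE: translate the content of
# /proc/sys/kernel/randomize_va_space into a short label
# (hypothetical helper).
# Usage: describe_aslr "$(cat /proc/sys/kernel/randomize_va_space)"
describe_aslr() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "partial" ;;   # stack, mmap base, VDSO, shared libraries
        2) echo "full" ;;      # additionally randomizes the heap
        *) echo "unknown"; return 1 ;;
    esac
}

# proc_hidepid: read a mount table on stdin and print the hidepid value
# of the /proc mount, "off" if the option is absent, or "not mounted"
# if no procfs entry for /proc is found (hypothetical helper).
# Usage: proc_hidepid < /proc/self/mounts
proc_hidepid() {
    awk '
        $2 == "/proc" && $3 == "proc" {
            val = "off"
            n = split($4, o, ",")
            for (i = 1; i <= n; i++)
                if (o[i] ~ /^hidepid=/) val = substr(o[i], 9)
        }
        END { print (val == "" ? "not mounted" : val) }'
}
```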
5.3 Third Use Case: PID Namespace Isolation
Security Risk
If PID namespaces are not appropriately configured, container processes can interact with or see host processes in Linux-based container environments. This helps hackers break container isolation, increasing security risks such as unauthorized signaling (e.g., using kill), privilege escalation, or information leakage using process inspection tools.
Solution: Implement PID Namespace Isolation
PID namespaces provide container processes with their own isolated process ID space. PID 1 refers to the container’s init process from within the container, and host processes are not visible. This protects the containers and the host from malicious or accidental process-level interference.
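Whether two processes share a PID namespace can be checked by comparing the namespace links under /proc/<pid>/ns/pid, which resolve to an identifier of the form pid:[inode]. The following is a minimal sketch, assuming a POSIX shell with readlink; the helper name same_namespace is hypothetical, and reading another process's link generally requires sufficient privileges over that process.

```shell
# same_namespace LINK_A LINK_B: succeed if both namespace links (for
# example /proc/1/ns/pid and /proc/4321/ns/pid) point to the same
# namespace identifier (hypothetical helper).
same_namespace() {
    a=$(readlink "$1") && b=$(readlink "$2") && [ "$a" = "$b" ]
}
```

Run on the host, same_namespace /proc/1/ns/pid /proc/<PID>/ns/pid succeeding for a container process would indicate that the container shares the host PID namespace and can therefore see host processes.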
Additional Hardening Measures
- Seccomp: Limit syscalls (e.g., ptrace) that can manipulate other processes.
- User Namespace: Map container root users to non-root users on the host.
- AppArmor/SELinux: Apply process confinement policies.
- Dropping Capabilities: Keep only essential Linux capabilities. For example, drop CAP_SYS_PTRACE.
These measures maintain container boundaries and reduce the attack surface, ensuring strong process-level isolation.
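Whether a capability such as CAP_SYS_PTRACE was actually dropped can be verified from the CapEff field of /proc/<pid>/status, a hexadecimal bitmask in which bit 19 corresponds to CAP_SYS_PTRACE. The sketch below assumes a shell whose arithmetic expansion accepts hexadecimal constants; the helper name has_cap is hypothetical.

```shell
# has_cap HEXMASK BIT: succeed if capability number BIT is set in the
# hexadecimal capability mask HEXMASK, as found in the CapEff line of
# /proc/<pid>/status (hypothetical helper).
# Example: has_cap 00000000a80425fb 19   (19 = CAP_SYS_PTRACE)
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}
```

With the example mask 00000000a80425fb, the check for bit 19 fails, meaning CAP_SYS_PTRACE is not in the effective set, while the check for bit 18 (CAP_SYS_CHROOT) succeeds.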
6. Conclusion
Container escapes using Procfs pose significant security risks in containerized environments. Hackers can compromise a host system by exploiting runtime vulnerabilities and misconfigurations and by breaking isolation. Organizations should adopt effective security measures, such as those discussed in this article, to secure their containerized environments. Technology is continuously evolving, and so are security risks. As these risks and cyberattacks increase, proper hardening, continuous detection, and vigilance are key to preventing container escapes through Procfs.