strace is your best friend for understanding what your Docker containers are actually doing under the hood, beyond the application logs. It intercepts and records every system call the process makes, giving you a granular view of its interactions with the host kernel. This isn’t just for debugging; it’s for understanding resource contention, security vulnerabilities, and performance bottlenecks.
Let’s see strace in action. Imagine you have a simple Python script that reads a file and prints its contents. We’ll run this inside a container.
First, the Python script, read_file.py:
import sys

try:
    with open('/app/data.txt', 'r') as f:
        content = f.read()
        # data.txt already ends with a newline, so don't let print add another
        print(content, end='')
except FileNotFoundError:
    print("Error: data.txt not found!", file=sys.stderr)
    sys.exit(1)
Now, let’s build a Docker image for it.
Dockerfile:
FROM python:3.9-slim
WORKDIR /app
COPY read_file.py .
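RUN echo "Hello from inside the container!" > data.txt
# strace is not part of the slim image; install it so the container can be traced
RUN apt-get update && apt-get install -y --no-install-recommends strace && rm -rf /var/lib/apt/lists/*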
CMD ["python", "read_file.py"]
Build the image:
docker build -t syscall-reader .
Run the container:
docker run --name reader_test syscall-reader
This will output: Hello from inside the container!
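Now, let’s run the same command under strace. The container from the previous step has already exited, so docker exec cannot be used against it. Instead, remove it and start a new container with the same name, overriding the default command so that strace launches Python. strace has to exist inside the image, and python:3.9-slim does not ship it, so make sure the Dockerfile installs it (for example with apt-get install -y strace) and rebuild before continuing.
docker rm reader_test
docker run -it --name reader_test syscall-reader strace -o /tmp/strace.log python read_file.py
Here, the arguments after the image name replace the image’s CMD. strace -o /tmp/strace.log tells strace to write its output to /tmp/strace.log inside the container’s filesystem, and python read_file.py is the command strace will trace. If strace reports that ptrace is not permitted, add --cap-add=SYS_PTRACE to the docker run command.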
After running this, you can copy the log out of the container:
docker cp reader_test:/tmp/strace.log .
Now, let’s examine strace.log. You’ll see a lot of system calls. The key ones for our script are related to file operations.
execve("/usr/local/bin/python", ["python", "read_file.py"], 0x7ffd41011270 /* 59 vars */) = 0
...
openat(AT_FDCWD, "/app/data.txt", O_RDONLY) = 3
read(3, "Hello from inside the container!\n", 4096) = 33
write(1, "Hello from inside the container!\n", 33) = 33
close(3) = 0
...
exit_group(0) = ?
+++ exited with 0 +++
The mental model here is that your application process (the Python interpreter in this case) doesn’t interact with your disk or network directly. It makes requests to the Linux kernel via system calls, and strace records each request along with its return value. The buffer sizes and the exact sequence in your own log will vary with the Python version and with whether stdout is a terminal.
The openat(AT_FDCWD, "/app/data.txt", O_RDONLY) call asks the kernel to open /app/data.txt for reading. AT_FDCWD tells the kernel to resolve relative paths against the current working directory; because this path is absolute, it has no effect here. The 3 returned is a file descriptor, a handle the process uses for subsequent operations on that file.
read(3, "Hello from inside the container!\n", 4096) reads up to 4096 bytes from file descriptor 3. The return value shows it read 33 bytes: the 32 characters of our message plus the trailing newline.
write(1, "Hello from inside the container!\n", 32) writes to file descriptor 1, which is standard output.
close(3) releases the file descriptor.
exit_group(0) signals the process is exiting cleanly with status code 0.
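To connect the trace back to code, here is a minimal sketch that performs the same open, read, write, and close steps through Python’s os module, which wraps the underlying system calls fairly directly. The function name cat_with_raw_fds and the temporary file are assumptions made for illustration; they are not part of the original script.

```python
import os
import tempfile


def cat_with_raw_fds(path: str, bufsize: int = 4096) -> int:
    """Copy a file to stdout using descriptor-level calls; return bytes read."""
    fd = os.open(path, os.O_RDONLY)       # shows up in strace as open/openat
    try:
        total = 0
        while True:
            chunk = os.read(fd, bufsize)  # read(fd, ..., bufsize)
            if not chunk:                 # empty bytes object means end of file
                break
            # write(1, ...): descriptor 1 is standard output.
            # A production version would loop on short writes.
            os.write(1, chunk)
            total += len(chunk)
        return total
    finally:
        os.close(fd)                      # close(fd)


if __name__ == "__main__":
    # Hypothetical stand-in for /app/data.txt so the sketch runs anywhere.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("Hello from inside the container!\n")
    try:
        count = cat_with_raw_fds(tmp.name)
        print(f"{count} bytes read", flush=True)
    finally:
        os.unlink(tmp.name)
```

Running this sketch under strace should produce a sequence close to the one in the log, without the buffering layers that the built-in open() adds on top.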
The true power of strace in a container context comes when things don’t work. If data.txt wasn’t created, you’d see:
openat(AT_FDCWD, "/app/data.txt", O_RDONLY) = -1 ENOENT (No such file or directory)
write(2, "Error: data.txt not found!\n", 27) = 27
exit_group(1) = ?
+++ exited with 1 +++
Here, ENOENT (Error NO ENTry) is the kernel telling you the file doesn’t exist. The write(2, ...) call shows the error message being sent to standard error (file descriptor 2).
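In a real log the failing call is buried among hundreds of successful ones, so it can help to filter for lines that carry an errno name. The following is a rough sketch of such a filter, assuming the default single-line strace output format shown above. The Syscall tuple, the regular expression, and the function names are all hypothetical helpers, and the pattern will not cope with every strace output mode (for example, unfinished and resumed calls in multi-threaded traces).

```python
import re
from typing import Iterable, Iterator, List, NamedTuple, Optional


class Syscall(NamedTuple):
    name: str
    args: str
    result: str
    errno: Optional[str]


# name(args) = result [ERRNO (description)]
LINE_RE = re.compile(
    r"^(?P<name>\w+)\((?P<args>.*)\)\s*=\s*"
    r"(?P<result>0x[0-9a-f]+|-?\d+|\?)"
    r"(?:\s+(?P<errno>E[A-Z0-9]+)\b.*)?$"
)


def parse_strace(lines: Iterable[str]) -> Iterator[Syscall]:
    """Yield one Syscall per line that looks like a completed call."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # skips '...', '+++ exited +++', signal lines, etc.
            yield Syscall(match["name"], match["args"],
                          match["result"], match["errno"])


def failed_calls(lines: Iterable[str]) -> List[Syscall]:
    """Return only the calls for which the kernel reported an errno."""
    return [call for call in parse_strace(lines) if call.errno is not None]


if __name__ == "__main__":
    sample = [
        'openat(AT_FDCWD, "/app/data.txt", O_RDONLY) = -1 ENOENT (No such file or directory)',
        "close(3) = 0",
        "exit_group(1) = ?",
        "+++ exited with 1 +++",
    ]
    for call in failed_calls(sample):
        print(f"{call.name} failed with {call.errno}: {call.args}")
```

To use it on the log copied out of the container, pass an open file object, for example failed_calls(open("strace.log")).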
The system calls you’ll see most frequently are:
openat, open: Opening files.
read, write: Reading from and writing to files or sockets.
stat, lstat, fstat: Getting file metadata.
close: Closing file descriptors.
mmap, munmap: Memory mapping.
brk: Program break, used for dynamic memory allocation.
socket, bind, connect, listen, accept: Network operations.
futex: Fast Userspace Mutex, used for synchronization.
clone, fork, execve: Process creation and execution.
When debugging performance, you might look for a large number of read or write calls that each move only a few bytes, or for frequent stat calls on the same directories. For security, you’d watch for unexpected openat calls on sensitive paths or attempts to invoke syscalls the application has no reason to use.
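As a rough illustration of that kind of performance check, the sketch below counts calls per syscall name and tallies read and write calls that transferred fewer bytes than a threshold. The 64-byte threshold, the function name summarize, and the sample lines are assumptions for demonstration only; the sample is synthetic, not captured output. strace’s own -c option prints a per-syscall summary as well, though it does not break calls down by transfer size.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Matches completed calls with a decimal return value; lines such as
# 'exit_group(0) = ?' are deliberately skipped.
CALL_RE = re.compile(r"^(\w+)\(.*\)\s*=\s*(-?\d+)")


def summarize(lines: Iterable[str], small: int = 64) -> Tuple[Counter, Counter]:
    """Return (calls per syscall, small read/write calls per syscall)."""
    counts: Counter = Counter()
    small_io: Counter = Counter()
    for line in lines:
        match = CALL_RE.match(line.strip())
        if not match:
            continue
        name, ret = match.group(1), int(match.group(2))
        counts[name] += 1
        # For read/write the return value is the number of bytes moved.
        # Zero (end of file) and -1 (error) are not counted as small transfers.
        if name in ("read", "write") and 0 < ret < small:
            small_io[name] += 1
    return counts, small_io


if __name__ == "__main__":
    # Synthetic lines for illustration only.
    sample = [
        'read(3, "a", 1) = 1',
        'read(3, "b", 1) = 1',
        'read(3, "c", 1) = 1',
        'read(3, "", 1) = 0',
        'write(1, "abc", 3) = 3',
        'stat("/app", {...}) = 0',
    ]
    counts, small_io = summarize(sample)
    for name, total in counts.most_common():
        print(f"{name}: {total} calls, {small_io[name]} small transfers")
```

A high ratio of small transfers to total calls is a hint, not a verdict: it suggests looking at whether the application could buffer its I/O.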
The most surprising thing about strace is how much it reveals about the environment the container runtime sets up around your application. When an application opens a file, the openat syscall goes straight to the host kernel; the Docker daemon is not involved in the call itself. What Docker does is configure the process when the container starts, with namespaces, a seccomp filter, and an AppArmor or SELinux profile, and the kernel then checks every call against that configuration before returning a file descriptor. A denial appears in the trace as a failed call, typically with EACCES or EPERM. strace shows you what happens at the boundary between the application and the kernel, not just what the application chooses to log.
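From the application’s side, a refused call looks the same whatever the reason: a failed return and an errno value. The short sketch below shows how that errno surfaces in Python as an OSError and how to map it back to the symbolic name strace prints. The function name describe_failure and the example path are hypothetical.

```python
import errno
import os
from typing import Optional, Tuple


def describe_failure(path: str) -> Optional[Tuple[str, str]]:
    """Try to open path read-only; return (errno name, message) on failure."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        # exc.errno is the numeric code the kernel returned; errorcode maps
        # it to the symbolic name that appears in strace output.
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        return name, os.strerror(exc.errno)
    os.close(fd)
    return None  # the open succeeded


if __name__ == "__main__":
    # Hypothetical path that should not exist.
    print(describe_failure("/app/definitely-missing.txt"))
```

Comparing the name returned here with the one in the trace is a quick way to confirm that the error your application reports is the one the kernel actually raised.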
The next thing you’ll likely want to explore is using strace to understand network issues, specifically tracing socket, connect, and sendmsg/recvmsg calls to see how your containerized application is communicating over the network.