strace is the Swiss Army knife for understanding what any Linux process is actually doing at the system call level.
Let’s see it in action. Imagine we have a simple C program:
// hello.c
#include <stdio.h>
int main(void) {
    printf("Hello, strace!\n");
    return 0;
}
We compile it: gcc hello.c -o hello
Then we run it under strace:
strace ./hello
The output will look something like this (addresses, sizes, and library paths vary from system to system):
execve("./hello", ["./hello"], 0x7ffd310d06c0 /* 59 vars */) = 0
brk(NULL) = 0x55c0523b6000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=102154, ...}) = 0
mmap(NULL, 102154, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f8a7e2c2000
close(3) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\2\0\3\0\1\0\0\0\340\34\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=2195272, ...}) = 0
mmap(NULL, 2179776, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f8a7e09c000
mprotect(0x7f8a7e09c000, 1740800, PROT_NONE) = 0
mmap(0x7f8a7e09c000, 1474560, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x220000) = 0x7f8a7e09c000
mmap(0x7f8a7e21c000, 450560, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3a0000) = 0x7f8a7e21c000
mmap(0x7f8a7e288000, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f8a7e288000
close(3) = 0
write(1, "Hello, strace!\n", 15Hello, strace!
) = 15
exit_group(0) = ?
+++ exited with 0 +++
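What’s happening here? strace records every system call the process makes, along with its arguments and return value. The first line, execve, is the call that replaces the newly created process with ./hello; it is the point where your program starts running. Most of what follows is the dynamic loader at work, locating and loading the C library: openat (opening files), read (reading file contents), and mmap (mapping memory). Only near the end does the program’s own work show up: write (writing data) and finally exit_group (the program is done; it never returns, which is why its result is printed as ?). The greeting itself appears in the middle of the write line because strace logs to stderr while the program prints to stdout, and both end up on the same terminal.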
The core problem strace solves is the "black box" nature of modern operating systems. You write code, you compile it, and it runs. But how it runs, how it interacts with the kernel, how it accesses files, networks, or other resources – that’s hidden. strace pulls back the curtain.
To build a mental model, think of the Linux kernel as a highly secured vault. Your processes are like customers wanting to access specific items inside. They can’t just grab what they want. They have to go through the vault’s security desk (the system call interface) and make a formal request. strace is like a security camera pointed directly at that desk, recording every request, who made it, and what the desk clerk (the kernel) responded.
Here’s how it works internally: strace uses the ptrace system call. When you run strace -p <PID>, strace asks the kernel to attach it to the process with <PID>; when you give it a command instead, it starts that command as a child that is traced from the beginning. Once tracing is active, the kernel stops the target process each time it enters a system call and notifies strace. strace then inspects the process’s state (registers, memory), works out which system call is about to be made and with what arguments, prints it, and tells the kernel to let the process continue. When the system call completes, the kernel stops the process again so that strace can read the return value. This cycle repeats for every system call, which is also why a traced process runs noticeably slower.
You control strace with various options:
-p <PID>: Attach to an already running process. Crucial for debugging live applications.
-f: Follow forks. If your process spawns child processes, this will trace them too.
-s <size>: Specify the maximum string size to print (default is 32). Useful for seeing full file paths or network data.
-e trace=<syscall_set>: Filter to trace only specific system calls. For example, -e trace=open,read,write will show only those. This is a lifesaver for reducing noise.
-o <file>: Write the output to a file instead of stderr.
Let’s trace a process that’s having trouble opening a file. Suppose a web server can’t find its configuration file.
# Assume the web server process has PID 1234
strace -p 1234 -e trace=open,openat,access -o /tmp/webserver_open.log
After a while, you stop strace (Ctrl+C) and examine /tmp/webserver_open.log. You might see a line like:
openat(AT_FDCWD, "/etc/webserver/config.conf", O_RDONLY) = -1 ENOENT (No such file or directory)
This tells you the process tried to open /etc/webserver/config.conf for reading, and the kernel answered "No such file or directory." The fix? Create that file, or point the web server at the path where its configuration actually lives.
The one thing most people don’t realize is that strace can also show you the data being passed in and out of system calls, not just the names and return codes. When you use -s <size> with a large enough value, you can see the contents of buffers being read from or written to files, or the exact strings being sent over a network socket. This makes strace incredibly powerful for debugging data corruption issues or understanding complex network interactions at a very low level. For instance, if you’re debugging a serialization problem, you might see write calls with binary data that’s clearly malformed, and strace shows you exactly what’s being sent.
Once you’ve mastered strace for system calls, the next logical step is to explore ltrace for library calls.