strace is a diagnostic tool that shows the system calls a process makes. The -e trace= option is your primary tool for narrowing that view to the calls you care about.
# Let's imagine we're debugging a simple 'ls' command
# Without filtering, this is a lot of noise:
strace ls /tmp
# Now, let's focus ONLY on file open operations:
strace -e trace=openat ls /tmp
# You can also show only the calls that failed (strace 5.2+; -Z is shorthand).
# strace filters on success or failure, not on a specific errno such as EACCES:
strace -e trace=openat -e status=failed ls /tmp
# Or trace a whole family of related calls, like all network calls
# (older strace versions spell the family without the %, as trace=network):
strace -e trace=%network ls /tmp
# You can even combine multiple specific calls and families:
strace -e trace=openat,read,write,%network ls /tmp
The real power here isn’t just seeing what a process is doing, but working out why it behaves the way it does. When you see openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3, you know ls opened /etc/passwd for reading and got file descriptor 3 back (AT_FDCWD means a relative path would be resolved against the current directory). If that call instead returns -1 ENOENT (No such file or directory), you know ls can’t find the passwd file, which would be odd.
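If ls is hanging, and you suspect it’s waiting on a network resource, you could attach to it with strace -e trace=%network -p <PID>. Seeing connect(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("192.168.1.100")}, 16) = -1 EINPROGRESS (Operation now in progress) is not an error by itself: it is the normal result of connect on a non-blocking socket and means the connection attempt has started. The process then usually waits for it to complete in poll, select, or epoll_wait, which are not part of the network family, so add them to the -e trace= list. If the trace stops at one of those calls, or at a blocking connect that never returns, the process is waiting on the network or the remote endpoint.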
When you trace openat, read, and write, you’re essentially watching the basic I/O lifecycle of a file. openat returns a file descriptor, read pulls data in through it, and write pushes data out. If a process is slow, add -T to print the time spent in each call: a series of read calls that each take a long time, or a write call that never returns, points you directly to the bottleneck.
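To make the slow calls stand out, the timing that strace -T appends to each line (for example <0.000042>) can be filtered after the fact. The following is a minimal sketch in POSIX sh, assuming the trace was saved with -o; the helper name slow_calls and the file name trace.log are illustrative and not part of strace.

```shell
# Print the syscalls from `strace -T` output that took at least $1 seconds.
# With -T, each completed call ends in its elapsed time, e.g. "<0.000042>".
slow_calls() {
    threshold=$1; shift
    awk -v min="$threshold" '
        match($0, /<[0-9]+\.[0-9]+>$/) {
            # Drop the angle brackets to get the duration in seconds.
            secs = substr($0, RSTART + 1, RLENGTH - 2)
            if (secs + 0 >= min + 0) print
        }
    ' "$@"
}

# Usage (reads the named files, or stdin when none are given):
#   strace -T -e trace=openat,read,write -o trace.log ls /tmp
#   slow_calls 0.1 trace.log
```

A call that never returns has no timing suffix at all, so it will not appear in this list; it shows up instead as the last, incomplete line of the trace.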
It’s not just about seeing successful calls. Tracing failures is often more informative. strace cannot filter on a single error code, but it can show only the calls that failed. If you’re debugging a curl command that fails with "Permission denied," you might try strace -e trace=openat -e status=failed curl https://example.com/file.txt and look for EACCES in the output. If you see openat(AT_FDCWD, "/etc/ssl/certs/ca-certificates.crt", O_RDONLY) = -1 EACCES (Permission denied), you know curl can’t read its certificate store, and that’s a likely culprit. The fix would be to restore read access, for example with chmod a+r /etc/ssl/certs/ca-certificates.crt run as root.
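Because strace selects calls by success or failure rather than by an individual errno, narrowing a trace to one error code is usually done on the saved output. Here is a small sketch in POSIX sh; the helper names trace_failed_opens and only_errno are hypothetical, and -e status=failed assumes strace 5.2 or newer.

```shell
# Run a command under strace and save only its failed openat calls.
# -o keeps the trace separate from the command's own output.
trace_failed_opens() {
    out=$1; shift
    strace -e trace=openat -e status=failed -o "$out" "$@"
}

# Keep only the trace lines that failed with the given errno name.
# strace prints failures as "... = -1 ERRNO (description)".
only_errno() {
    errno_name=$1; shift
    grep -E -- " = -1 ${errno_name} \(" "$@"
}

# Usage:
#   trace_failed_opens trace.log curl https://example.com/file.txt
#   only_errno EACCES trace.log
```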
The network family (%network) covers syscalls that operate on sockets. This includes socket, bind, connect, listen, accept, sendto, recvfrom, sendmsg, recvmsg, and many others. It does not include poll, select, or epoll_wait, which belong to the file descriptor family (%desc), so add those if you need to see where a process waits for socket readiness. If an application is slow to establish a connection, or seems to be stuck waiting for data, tracing these calls will show you which syscall is blocking or returning an error.
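You can also focus on specific file descriptors. Note that -e read=3 and -e write=3 do not filter the trace: they add a full hex and ASCII dump of the data read from or written to file descriptor 3. If you know a process has opened file descriptor 3 and want to see what passes through it, combine them with a syscall filter: strace -e trace=read,write -e read=3 -e write=3 -p <PID>. To restrict the trace itself to calls on that descriptor, recent strace releases offer --trace-fds=3 (check the man page for your version); on older versions, filter the output for the descriptor number. Either way, this is powerful for isolating activity on a specific socket or file.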
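When the installed strace has no built-in way to restrict a trace to one descriptor, the saved output can be filtered on the first argument of each call instead. This is a rough sketch in POSIX sh that handles only plain read and write lines; the helper name on_fd and the file name trace.log are made up for illustration.

```shell
# Keep only read/write lines whose first argument is the given descriptor.
# The "[,<]" also accepts the "3</path/to/file>" form printed with strace -y.
on_fd() {
    fd=$1; shift
    grep -E -- "(^|[[:space:]])(read|write)\(${fd}[,<]" "$@"
}

# Usage (replace <PID> with the process to attach to):
#   strace -e trace=read,write -o trace.log -p <PID>
#   on_fd 3 trace.log
```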
The process family (%process) covers process lifecycle syscalls such as fork, vfork, clone, execve, and wait4, and is useful when debugging multi-process applications or when you suspect issues with process creation or termination. Run strace with -f so that child processes are traced too. If a service is supposed to spawn a worker process but isn’t, tracing these calls will tell you whether the fork succeeded (on Linux it usually appears as clone) and whether execve is being called.
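To check whether a worker is actually being launched, one option is to record the process family for the whole process tree and then list the programs passed to execve. A minimal sketch in POSIX sh follows; trace_procs and attempted_execs are hypothetical helper names, and ./my-service stands in for whatever command you are debugging.

```shell
# Trace process lifecycle calls of a command and all of its children.
# -f follows forks; with -f and -o, each line is prefixed with the PID.
trace_procs() {
    out=$1; shift
    strace -f -e trace=%process -o "$out" "$@"
}

# List every path passed to execve, including attempts that failed
# (for example while the shell searches PATH).
attempted_execs() {
    sed -n 's/.*execve("\([^"]*\)".*/\1/p' "$@"
}

# Usage:
#   trace_procs trace.log ./my-service
#   attempted_execs trace.log
```

If the expected worker binary never appears in that list, the problem lies before the execve, for example in a clone call that failed or was never made.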
This tracing capability is fundamental for understanding how user-space applications interact with the kernel. It’s the closest you can get to seeing the "source code" of an application’s kernel interactions without having the source code itself.
The next thing you’ll want to explore is descriptor-level detail: -e read=FD and -e write=FD to dump the data moving through a specific socket or file, and, in recent strace releases, --trace-fds to limit the trace to that descriptor.