The strace command, typically used to trace system calls and signals for a single process, can actually be used to monitor an entire group of processes simultaneously.
Let’s say you have a parent process and several child processes it spawned, and you want to see all their system call activity in one place. You can achieve this by using the -f (follow forks) and -p (PID) flags together.
Here’s a quick demonstration. First, let’s create a simple Python script that forks a few processes:
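import os
import time

# flush=True keeps output intact when stdout is redirected: os._exit() skips
# the normal buffer flush, and fork() would otherwise copy unflushed text.
print(f"Parent PID: {os.getpid()}", flush=True)

for _ in range(3):
    pid = os.fork()
    if pid == 0:
        print(f"Child PID: {os.getpid()}, Parent: {os.getppid()}", flush=True)
        time.sleep(10)  # Keep children alive for a bit
        os._exit(0)  # Exit child immediately so it never runs the parent's code
    else:
        print(f"Forked child with PID: {pid}", flush=True)

# Parent waits for children (optional, but good for demo)
for _ in range(3):
    os.wait()
print("Parent exiting.")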
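Save this as fork_test.py and run it: python3 fork_test.py. You’ll see output like this (the PIDs and the exact interleaving of lines will vary, and the final line appears only after the children finish their 10-second sleep):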
Parent PID: 12345
Forked child with PID: 12346
Child PID: 12346, Parent: 12345
Forked child with PID: 12347
Child PID: 12347, Parent: 12345
Forked child with PID: 12348
Child PID: 12348, Parent: 12345
Parent exiting.
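Now, let’s trace this parent process and all its children. You’ll need the PID of the parent process from the output (e.g., 12345). One caveat: when attaching with -p, the -f flag follows only children forked after strace attaches. This script forks its three children immediately, so to see them you would either start it under strace (strace -f python3 fork_test.py) or pass one -p per process (strace -f -p 12345 -p 12346 -p 12347 -p 12348) within the 10 seconds the children stay alive.
For a longer-lived parent that is still spawning children, run strace -f -p 12345.
You’ll see a flood of output, interleaving system calls from the parent and all its children, along these lines (the exact calls depend on the program):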
# strace -f -p 12345
strace: Process 12345 attached
...
[pid 12346] futex(0x7f9b2c654c7c, FUTEX_WAIT_PRIVATE, 2, NULL) = ?
[pid 12345] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
[pid 12346] futex(0x7f9b2c654c7c, FUTEX_WAIT_PRIVATE, 2, NULL) = ?
[pid 12347] futex(0x7f9b2c654c7c, FUTEX_WAIT_PRIVATE, 2, NULL) = ?
[pid 12348] futex(0x7f9b2c654c7c, FUTEX_WAIT_PRIVATE, 2, NULL) = ?
[pid 12345] sched_yield() = 0
[pid 12345] read(0, <unfinished ...>
[pid 12346] futex(0x7f9b2c654c7c, FUTEX_WAIT_PRIVATE, 2, NULL) = ?
[pid 12347] futex(0x7f9b2c654c7c, FUTEX_WAIT_PRIVATE, 2, NULL) = ?
[pid 12348] futex(0x7f9b2c654c7c, FUTEX_WAIT_PRIVATE, 2, NULL) = ?
...
The [pid XXXX] prefix is crucial here. It tells you which process made which system call. The -f flag tells strace to follow child processes created by fork(), vfork(), and clone(), and to keep tracing them if they later call execve(). Without -f, you’d only see the system calls of the process you attached to.
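Because every line carries that prefix, the interleaved stream can be split back into one list per process. The following is a minimal sketch, not part of strace itself: the function name group_by_pid and the sample lines are made up for illustration, and it assumes the "[pid N] " prefix format shown in the output above.

```python
import re
from collections import defaultdict

# Matches the "[pid 12346] " prefix strace adds when following several processes.
PID_PREFIX = re.compile(r"^\[pid\s+(\d+)\]\s+(.*)$")


def group_by_pid(lines, default_pid=None):
    """Return {pid: [call text, ...]}; unprefixed lines go under default_pid."""
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        match = PID_PREFIX.match(line)
        if match:
            groups[int(match.group(1))].append(match.group(2))
        elif line and not line.startswith("strace:"):
            # Lines without a prefix (single-process output) are kept together.
            groups[default_pid].append(line)
    return dict(groups)


# Hypothetical sample, shaped like the output above.
sample = [
    "strace: Process 12345 attached",
    "[pid 12345] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0",
    "[pid 12346] futex(0x7f9b2c654c7c, FUTEX_WAIT_PRIVATE, 2, NULL) = ?",
    "[pid 12345] sched_yield() = 0",
]

for pid, calls in group_by_pid(sample).items():
    print(pid, len(calls), "call(s)")
```

To use it on a live session, you could pipe strace's stderr into a script that feeds sys.stdin to group_by_pid.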
This is incredibly powerful for debugging complex applications where multiple processes interact or where a parent process manages a pool of workers. You can see, for instance, if a parent is struggling to fork() new processes, or if children are stuck in futex() calls waiting for resources.
To make the output more manageable, you can combine this with other strace options:
-o output.log: Write all output to a file instead of stderr.
-tt: Print microsecond-precision timestamps for each system call.
-T: Show the time spent in each system call.
-e trace=open,read,write: Filter to show only specific system calls.
For example, to trace system calls related to file operations for the parent and its children, writing to a file with timestamps:
strace -f -p 12345 -o trace.log -tt -e trace=file
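This captures the system calls that take a file name as an argument (open, openat, stat, access, unlink, execve, and so on) from every traced process, so you can see which paths each one touches. Calls that operate on an already-open file descriptor, such as read, write, and close, belong to a different class; use -e trace=desc for those, or -e trace=file,desc to capture both.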
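Once the trace is in a file, a short script can summarize it. The sketch below counts system calls per process. It assumes the line shape strace writes when -f, -tt, and -o are combined (a bare PID, a timestamp, then the call); the function name count_syscalls and the sample lines are hypothetical, so check them against your own trace.log before relying on the pattern.

```python
import re
from collections import Counter

# Assumed shape for `strace -f -tt -o FILE`:
#   "PID HH:MM:SS.micros syscall(args) = result"
LOG_LINE = re.compile(r"^(\d+)\s+(\d{2}:\d{2}:\d{2}\.\d{6})\s+(\w+)\(")


def count_syscalls(lines):
    """Count (pid, syscall) pairs; signal, exit, and 'resumed' lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[(int(match.group(1)), match.group(3))] += 1
    return counts


# Hypothetical log lines for illustration only.
sample_log = [
    '12345 10:15:02.000101 openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC) = 3',
    '12346 10:15:02.000250 access("/tmp/work", F_OK) = 0',
    '12346 10:15:02.000412 openat(AT_FDCWD, "/tmp/work/a.txt", O_RDONLY) = -1 ENOENT (No such file or directory)',
    "12346 10:15:02.000500 +++ exited with 0 +++",
]

for (pid, name), total in sorted(count_syscalls(sample_log).items()):
    print(pid, name, total)
```

For a real log, pass the open file object instead of the sample list, since count_syscalls accepts any iterable of lines.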
The real magic is that strace doesn’t need special configuration on the child processes. With -f, strace sets the ptrace options PTRACE_O_TRACEFORK, PTRACE_O_TRACEVFORK, and PTRACE_O_TRACECLONE on each process it traces, so the kernel automatically attaches any new process created by a traced process to the same strace session.
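That automatic attachment covers only processes created after tracing starts. For workers that already exist, one option is to look them up and give each its own -p option, which strace accepts multiple times. The sketch below is one way to do that under stated assumptions: it relies on the Linux /proc filesystem and the PPid field of /proc/<pid>/status, and the helper names (read_ppid, descendants, strace_argv) are invented for this example.

```python
import os


def read_ppid(status_path):
    """Return the PPid field from a /proc/<pid>/status file, or None."""
    try:
        with open(status_path) as handle:
            for line in handle:
                if line.startswith("PPid:"):
                    return int(line.split()[1])
    except (OSError, ValueError, IndexError):
        return None  # Process vanished or the file is unreadable
    return None


def descendants(root_pid, proc_root="/proc"):
    """List every live descendant of root_pid by following PPid links."""
    children = {}
    for entry in os.listdir(proc_root):
        if entry.isdigit():
            ppid = read_ppid(os.path.join(proc_root, entry, "status"))
            if ppid is not None:
                children.setdefault(ppid, []).append(int(entry))
    found, queue = [], [root_pid]
    while queue:  # Breadth-first walk from the root down
        for child in sorted(children.get(queue.pop(0), [])):
            found.append(child)
            queue.append(child)
    return found


def strace_argv(root_pid, proc_root="/proc"):
    """Build an argv that attaches to root_pid and each existing descendant."""
    argv = ["strace", "-f"]
    for pid in [root_pid] + descendants(root_pid, proc_root):
        argv += ["-p", str(pid)]
    return argv


if os.path.isdir("/proc"):
    # Demo on the current process; substitute the parent PID you care about.
    print(" ".join(strace_argv(os.getpid())))
```

The scan is a snapshot, so a process forked between the scan and the attach can still be missed; -f only closes that gap for children created once strace is attached.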
The next challenge is often correlating the interleaved output from multiple processes to understand the precise sequence of events, especially when dealing with complex inter-process communication.
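One concrete piece of that correlation is handling calls that strace splits in two: when another process makes a call before a blocking call returns, the first half is printed with <unfinished ...> and the result arrives later on a <... name resumed> line. The following is a rough sketch of stitching those halves back together. The function name stitch and the sample lines are hypothetical, and it assumes the "[pid N] " prefix format shown earlier.

```python
import re

LINE = re.compile(r"^\[pid\s+(\d+)\]\s+(.*)$")
UNFINISHED = re.compile(r"^(.*)<unfinished \.\.\.>$")
RESUMED = re.compile(r"^<\.\.\. (\w+) resumed>\s*(.*)$")


def stitch(lines):
    """Yield (pid, complete_call) pairs, merging unfinished/resumed halves."""
    pending = {}  # pid -> first half of a call that has not returned yet
    for raw in lines:
        match = LINE.match(raw.rstrip())
        if not match:
            continue  # Skip strace's own messages and unprefixed lines
        pid, text = int(match.group(1)), match.group(2)
        unfinished = UNFINISHED.match(text)
        resumed = RESUMED.match(text)
        if unfinished:
            pending[pid] = unfinished.group(1)
        elif resumed and pid in pending:
            first, rest = pending.pop(pid), resumed.group(2)
            if rest.startswith(")"):
                first = first.rstrip()  # Avoid "close(3 )" style spacing
            yield pid, first + rest
        else:
            yield pid, text


# Hypothetical sample for illustration only.
sample = [
    "[pid 12345] read(0, <unfinished ...>",
    "[pid 12346] getpid() = 12346",
    '[pid 12345] <... read resumed>"hi\\n", 1024) = 3',
]

for pid, call in stitch(sample):
    print(pid, call)
```

A stitched call is emitted when it completes, so the output is ordered by return time rather than by entry time; combining this with -tt timestamps would let you sort either way.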