strace is a debugging tool that intercepts and records system calls made by a process.

Here’s how to observe strace in action and understand its impact.

Let’s trace a simple ls command.

$ strace -c ls /tmp

You’ll see output like this:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 83.33    0.000010           1        10           read
 16.67    0.000002           2         1           openat
  0.00    0.000000           0         1           fstat
  0.00    0.000000           0         2           close
  0.00    0.000000           0         1           lseek
  0.00    0.000000           0         1           newfstatat
  0.00    0.000000           0         1           getdents64
  0.00    0.000000           0         1           write
------ ----------- ----------- --------- --------- ----------------
100.00    0.000012                    18 total

This output summarizes the system calls made by ls. The % time column shows each call's share of the total time spent in system calls (not overall CPU time). Notice that read dominates here.

Now, let’s run strace without the -c flag, so that every individual system call is printed as it happens.

$ strace -f ls /tmp

You’ll see a flood of output:

execve("/bin/ls", ["ls", "/tmp"], 0x7ffc8f4b7d90 /* 48 vars */) = 0
brk(NULL)                               = 0x55cc44a43000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=140049, ...}) = 0
mmap(NULL, 140049, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f46f611e000
close(3)                                = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
... (many more lines) ...
readlinkat(AT_FDCWD, "/proc/self/exe", "/bin/ls", 4096) = 7
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(1, 3), ...}) = 0
write(1, "file1\nfile2\nfile3\n", 18)      = 18
close(1)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++

Each line represents a single system call, showing the call name, its arguments, and its return value. The -f flag tells strace to follow child processes, which is crucial for multi-threaded or multi-process applications.

The core problem with strace in production is its instrumentation overhead. strace is built on ptrace: the kernel stops the traced process at every system call entry and exit and context-switches to the strace process, which inspects the tracee’s registers and memory before resuming it. These extra stops and switches are inherently expensive. For a high-throughput application, they accumulate rapidly, significantly slowing down the application or even causing it to time out.

When using strace in production, even for brief periods, you’re trading visibility for performance. System calls are the fundamental interface between user-space applications and the kernel, and strace sits at that boundary, acting as a middleman for every interaction.

Consider a web server handling thousands of requests per second. Each request might involve numerous system calls for reading request headers, writing responses, accessing files, and network operations. If strace is attached, every single one of those calls is intercepted. The overhead isn’t just the kernel transition the syscall already requires, but also the cost of stopping the process and waking strace at every call boundary.
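You can get a rough feel for this overhead on your own machine by timing a syscall-heavy command with and without strace attached. This is a sketch, not a benchmark: the directory is arbitrary (any file-heavy path works) and the numbers will vary by machine.

```shell
# Baseline: a syscall-heavy command, untraced.
time ls -R /etc > /dev/null 2>&1

# The same command under strace, with the trace discarded so we measure
# mostly the interception cost rather than the cost of writing the log.
time strace -f -o /dev/null ls -R /etc > /dev/null 2>&1
```

Even for a short listing, the traced run is usually several times slower, and the gap grows with the syscall rate.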

To minimize this impact, several strategies are employed. The most straightforward is to limit the scope of tracing.

1. Trace only specific system calls: Instead of tracing all system calls, use the -e trace= option to focus on the ones you’re interested in.

$ strace -e trace=openat,read,write -p <PID>

This drastically reduces the number of interceptions, and thus the overhead. It works because strace only performs the interception and logging for the specified system calls.
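Reasonably recent strace versions also ship predefined syscall classes, so you don’t have to enumerate calls by hand; for example, %file expands to the whole open/stat/access family. Shown here on a fresh ls invocation; use -p <PID> to attach to a running process instead.

```shell
# Trace only file-related system calls using strace's built-in %file class.
strace -e trace=%file ls /tmp
```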

2. Limit the duration of tracing: Attach strace for only a short, targeted period.

$ strace -p <PID> -o trace.log -T -tt -f &
$ STRACE_PID=$!
$ sleep 30
$ kill $STRACE_PID

The -o trace.log writes output to a file, reducing the immediate performance hit of printing to the console. The -T option shows the time spent in each system call, and -tt provides microsecond timestamps, invaluable for correlating events. Attaching and detaching (kill) allows you to capture a specific window of activity.
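The same window can be captured with timeout(1), which saves you from tracking the background strace PID by hand; when timeout sends SIGTERM, strace detaches cleanly and the target process keeps running. The PID here is a placeholder.

```shell
# Trace process <PID> for 30 seconds, then detach automatically.
timeout 30 strace -p <PID> -o trace.log -T -tt -f
```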

3. Use strace’s summary mode (-c): As seen in the first example, -c provides a statistical summary of system calls.

$ strace -c -p <PID>

This attaches and accumulates counters until you detach with Ctrl-C (or the process exits), then prints the summary. It has much less overhead than tracing every call, since each event only increments a counter rather than being formatted and logged. The interception cost itself is still present, but the total impact is significantly reduced, especially if the application makes many repetitive calls.
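The summary’s sort order is adjustable with -S; accepted values include time, calls, and name. For example:

```shell
# Summarize a command's syscalls, sorted by call count instead of time spent.
strace -c -S calls ls /tmp
```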

4. Filter strace output: If you must trace everything but want to reduce the load, filter the output after it’s generated, or even pipe it through grep in real-time (though this adds its own overhead).

$ strace -f -p <PID> 2>&1 | grep -E "openat|read|write" > filtered_trace.log

strace writes its trace to stderr, so the 2>&1 is crucial: it merges that output onto stdout, where grep can filter it, keeping only lines that mention the chosen system calls. Note that the pattern is approximate; it also matches related calls such as pread64 and readlinkat.

5. Use perf for lower-overhead tracing: For many performance analysis tasks, the perf tool (specifically perf trace) offers significantly lower overhead than strace. strace relies on ptrace, which stops the traced process twice per system call; perf instead consumes kernel tracepoint events through a shared ring buffer, without stopping the target.

$ sudo perf trace -p <PID>

perf trace can provide similar information to strace at a fraction of the overhead, because it reads syscall tracepoints (and, where needed, kprobes) from a buffer rather than stopping the process at every system call.

6. Trace only specific PIDs and TIDs: If your application has multiple threads or processes, use -p <PID> to attach to a specific process, and -f to follow children. Because ptrace operates on individual tasks, -p also accepts a thread ID, so you can attach to a single thread: find its TID under /proc/<PID>/task and pass that to -p.
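A quick way to find thread IDs: every task of a process appears as a directory under /proc/<PID>/task. Using the current shell as a stand-in target:

```shell
# List the thread IDs (tasks) of the current shell.
ls /proc/$$/task

# Attach strace to a single thread by passing its TID to -p (placeholder):
# strace -p <TID>
```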

The most impactful aspect of strace overhead is the sheer number of extra context switches it forces. A system call already involves a user-to-kernel transition, but under strace the kernel additionally stops the tracee at both syscall entry and exit and schedules the strace process to inspect it, twice per call. For an application making millions of system calls per second, this constant switching becomes the bottleneck.

One often overlooked aspect is the performance penalty of writing the trace output itself. If strace is printing to the console, the terminal driver and TTY subsystem are involved, adding further overhead. Redirecting output to a file (-o) bypasses much of this, but disk I/O for writing large trace logs can still be a factor.

The ultimate next step after identifying performance issues with strace is often to move to more specialized, lower-overhead tracing tools like perf or even eBPF-based solutions for deep, production-safe performance analysis.
