strace -ff doesn’t produce one interleaved log for a multi-threaded program. Combined with -o, it writes each traced thread’s system calls to a separate file. strace itself remains a single process throughout. This has significant implications for how you interpret the output.

Let’s see this in action. Imagine a simple multi-threaded C program:

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

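void *thread_function(void *arg) {
    (void)arg; /* the index passed from main is not used */
    /* pthread_t is an unsigned long on Linux/glibc, hence the cast and %lu */
    printf("Thread %lu: starting\n", (unsigned long)pthread_self());
    sleep(2);
    printf("Thread %lu: finishing\n", (unsigned long)pthread_self());
    return NULL;
}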

int main() {
    pthread_t threads[2];
    long i;

    printf("Main thread: starting\n");

    for (i = 0; i < 2; i++) {
        pthread_create(&threads[i], NULL, thread_function, (void *)i);
    }

    for (i = 0; i < 2; i++) {
        pthread_join(threads[i], NULL);
    }

    printf("Main thread: finishing\n");
    return 0;
}

Compile it: gcc -o multithreaded multithreaded.c -pthread

Now, let’s run strace -ff -o trace_output ./multithreaded.

You’ll find files named trace_output.1234, trace_output.1235, and trace_output.1236 (the numbers will differ on your system). Here 1234 is the PID of ./multithreaded, which is also the thread ID of its main thread, and 1235 and 1236 are the thread IDs (TIDs) of the two worker threads.

Each of these files contains the system calls made by one specific thread.

The problem strace -ff solves is readability, not visibility. Linux threads are tasks that share an address space. Each has its own thread ID (TID), and glibc creates them with the clone family of system calls (clone or clone3, depending on the version). Without -f, strace traces only the task it started or attached to, so system calls made by the worker threads never show up. The -f flag tells strace to follow every task the tracee creates via fork, vfork, or clone, and that includes threads. With -f alone, everything goes into one stream: each line is prefixed with the ID of the task that made the call, and a call that blocks in one thread while another thread runs is split into an "<unfinished ...>" line and a later "<... resumed>" line.

The -ff option, therefore, doesn’t change what is traced. It changes where the output goes. A single strace process still traces every thread, but when -o is given it writes each task’s system calls to its own file, named with the -o prefix plus the task’s ID. This is why you get multiple output files, one per thread. The per-file split only happens when -o is given.

This means the output is not a single, chronologically ordered stream of all system calls across all threads. You have separate logs: trace_output.1234 holds the main thread’s system calls, trace_output.1235 typically those of the first worker thread, and so on.

The "thread group" concept in Linux means that all threads of a process share one thread group ID (TGID), which is what getpid() returns and what tools like ps show as the PID. Each thread also has its own unique thread ID (TID), returned by gettid(). For the main thread, which is the thread group leader, the TID equals the TGID. The kernel and strace refer to the TID as a "pid", which is why strace -ff uses it as the file name suffix.
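The difference between the shared process ID and the per-thread ID can be observed from inside a program. The following is a minimal sketch, assuming Linux with glibc. The names collect_thread_ids, print_thread_ids, and struct thread_ids are hypothetical helpers invented for this illustration and are not part of strace or pthreads. It calls the gettid system call through syscall(), since older glibc versions have no gettid() wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_THREADS 16

/* IDs observed from inside one thread. */
struct thread_ids {
    pid_t tgid; /* getpid(): the same in every thread of the process */
    pid_t tid;  /* gettid(): unique per thread */
};

/* Gate that keeps every worker alive until all of them have been created,
 * so no TID can be reused while we are still collecting. */
static pthread_mutex_t gate_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gate_cond = PTHREAD_COND_INITIALIZER;
static int gate_open;

static void set_gate(int open) {
    pthread_mutex_lock(&gate_lock);
    gate_open = open;
    pthread_cond_broadcast(&gate_cond);
    pthread_mutex_unlock(&gate_lock);
}

static void *record_ids(void *arg) {
    struct thread_ids *ids = arg;
    ids->tgid = getpid();
    /* Raw syscall: works even where glibc has no gettid() wrapper. */
    ids->tid = (pid_t)syscall(SYS_gettid);
    pthread_mutex_lock(&gate_lock);
    while (!gate_open)
        pthread_cond_wait(&gate_cond, &gate_lock);
    pthread_mutex_unlock(&gate_lock);
    return NULL;
}

/* Hypothetical helper: start n threads and store each one's IDs in out[].
 * Returns 0 on success, -1 if n is out of range or a thread failed to start. */
int collect_thread_ids(struct thread_ids *out, int n) {
    pthread_t threads[MAX_THREADS];
    int started = 0;

    assert(out != NULL);
    if (n < 0 || n > MAX_THREADS)
        return -1;
    set_gate(0);
    while (started < n &&
           pthread_create(&threads[started], NULL, record_ids, &out[started]) == 0)
        started++;
    set_gate(1); /* all workers exist now; let them finish */
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return started == n ? 0 : -1;
}

/* Print one line per worker, for comparison with the -ff file names. */
void print_thread_ids(const struct thread_ids *ids, int n) {
    for (int i = 0; i < n; i++)
        printf("worker %d: tgid=%ld tid=%ld\n", i, (long)ids[i].tgid, (long)ids[i].tid);
}
```

Calling collect_thread_ids from main with n = 2 and passing the result to print_thread_ids should show one shared tgid and two different tid values. Run under strace -ff -o, those tid values are the suffixes of the workers’ output files.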

The most surprising true thing about strace -ff is that it doesn’t produce a single interleaved log, but rather a series of independent logs, one per thread. This is a direct consequence of what -ff asks for: strace tracks each thread by its own ID and opens a separate output file for each one it sees.

The core problem strace -ff attempts to solve is debugging race conditions or interactions between threads that involve system calls. Without -f or -ff, you’d only see system calls from the initial thread. With plain -f, you see every thread, but interleaved in one log. With -ff, you can read what each thread is doing at the syscall level on its own.

The key levers you control are the output filename prefix with -o and the ability to filter system calls with -e. For example, strace -ff -e trace=futex -o trace_futex ./multithreaded would create separate files for each thread, but only log futex calls.

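When you’re dealing with strace -ff output, the temptation is to look for a single, unified timeline. The reality is that you’ll need to correlate events across files. Timestamps are the main tool, but strace only records them if you ask, with -t, -tt (microseconds), or -ttt (seconds since the epoch, with microseconds). Because one strace process on one machine stamps every line, the timestamps are directly comparable across files. Thread creation is another anchor: the clone (or clone3) call in the creating thread’s file returns the new thread’s TID, which is the suffix of the file where that thread’s calls appear. execve, by contrast, shows up only in the main thread’s file here. The pthread_self() value in the printf statements is a user-space identifier that doesn’t map to a TID. However, when stdout is a terminal, each printf surfaces as a write call on file descriptor 1 in the calling thread’s file, so the message text tells you which file belongs to which thread.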

The real challenge isn’t just collecting the traces; it’s stitching them back together mentally or with custom tooling to understand the sequence of events across threads. You’re essentially debugging a distributed system, albeit one where the "nodes" (threads) share memory.
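As an illustration of what such custom tooling might look like, here is a minimal sketch of a merger for per-thread files. It assumes the traces were collected with -ttt, so that every line starts with a timestamp of the form SECONDS.MICROSECONDS with exactly six fractional digits, and that the file name ends in the thread ID. All names (struct event, tid_from_path, parse_event, compare_events, load_events, print_merged) and the 512-byte line limit are hypothetical choices for this example, not part of strace.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TEXT 512

struct event {
    long long usec;      /* timestamp, microseconds since the epoch */
    long tid;            /* thread ID taken from the file name suffix */
    size_t seq;          /* position within its file, used to break ties */
    char text[MAX_TEXT]; /* the line with the timestamp removed */
};

/* "trace_output.1235" -> 1235; returns -1 if the name contains no dot. */
long tid_from_path(const char *path) {
    const char *dot = strrchr(path, '.');
    return dot != NULL ? strtol(dot + 1, NULL, 10) : -1;
}

/* Parse one -ttt line of the form "SECONDS.MICROS rest". Returns 1 on
 * success and 0 for lines that do not start with such a timestamp. */
int parse_event(const char *line, long tid, struct event *ev) {
    long long sec = 0, frac = 0;
    int dot = 0, end = 0;

    assert(line != NULL && ev != NULL);
    if (sscanf(line, "%lld.%n%6lld%n", &sec, &dot, &frac, &end) != 2 ||
        end - dot != 6 || !isdigit((unsigned char)line[dot]))
        return 0;
    ev->usec = sec * 1000000LL + frac;
    ev->tid = tid;
    ev->seq = 0;
    const char *rest = line + end;
    while (*rest == ' ')
        rest++;
    snprintf(ev->text, sizeof ev->text, "%s", rest);
    ev->text[strcspn(ev->text, "\n")] = '\0'; /* drop the trailing newline */
    return 1;
}

/* Order by time, then thread, then original position within the file. */
int compare_events(const void *a, const void *b) {
    const struct event *x = a, *y = b;
    if (x->usec != y->usec)
        return x->usec < y->usec ? -1 : 1;
    if (x->tid != y->tid)
        return x->tid < y->tid ? -1 : 1;
    return (x->seq > y->seq) - (x->seq < y->seq);
}

/* Store the parsable lines of one per-thread file in evs[]; returns how many
 * were stored. Over-long lines are split by fgets and their tails are
 * usually skipped because they do not start with a timestamp. */
size_t load_events(const char *path, struct event *evs, size_t cap) {
    char line[MAX_TEXT + 64];
    size_t n = 0;
    long tid = tid_from_path(path);
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return 0;
    while (n < cap && fgets(line, sizeof line, fp) != NULL) {
        if (parse_event(line, tid, &evs[n])) {
            evs[n].seq = n;
            n++;
        }
    }
    fclose(fp);
    return n;
}

/* Sort events from all files and print them as one timeline. */
void print_merged(struct event *evs, size_t n) {
    qsort(evs, n, sizeof evs[0], compare_events);
    for (size_t i = 0; i < n; i++)
        printf("%lld.%06lld [%ld] %s\n", evs[i].usec / 1000000,
               evs[i].usec % 1000000, evs[i].tid, evs[i].text);
}
```

A caller would allocate one array, call load_events once per file (passing evs + count and the remaining capacity each time), and then hand the filled portion to print_merged. The sequence number matters because qsort is not guaranteed to be stable: without it, two calls from the same thread with identical timestamps could swap places.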

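After fixing your strace -ff related issues, you’ll likely still find it hard to order near-simultaneous events across the different output files. strace stamps each call when it sees the call start, not when the kernel services it, so scheduling jitter and tracing overhead blur the ordering of calls that happen close together. For the merging itself, strace ships a helper script, strace-log-merge, which combines -ff -tt output into one time-ordered log.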
