strace is your debugger for when Python itself seems like a black box, letting you see exactly how your Python code is talking to the operating system.

Let’s dive into how a Python script interacts with the kernel using strace. Imagine this simple Python script:

import sys

with open("my_file.txt", "w") as f:
    f.write("Hello, strace!")

print("Done!")

Now, let’s run this script under strace:

strace -f -e trace=openat,write,close,fstat,lseek,read,exit_group python your_script.py

Here’s a snippet of what you might see:

...
[pid 12345] openat(AT_FDCWD, "my_file.txt", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
[pid 12345] write(3, "Hello, strace!", 14) = 14
[pid 12345] close(3)                    = 0
[pid 12345] write(1, "Done!\n", 6)        = 6
[pid 12345] exit_group(0)               = ?
...

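This output is a transcript of the system calls the interpreter made, in order. The [pid 12345] prefix only shows up once strace is following more than one process or thread; for a single-process script the lines usually start directly with the syscall name.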

  • openat(AT_FDCWD, "my_file.txt", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3: This is Python opening my_file.txt for writing. AT_FDCWD means a relative path is resolved against the current working directory. O_WRONLY|O_CREAT|O_TRUNC are flags indicating write-only, create if it doesn’t exist, and truncate (empty) it if it does; on Python 3 you will usually see O_CLOEXEC in the list as well. 0666 is the requested permission mode, which the kernel masks with the process umask (typically yielding 0644). The = 3 is the file descriptor, a small integer the OS uses to refer to this open file.
  • write(3, "Hello, strace!", 14) = 14: This is Python writing the string "Hello, strace!" (14 bytes) to the file descriptor 3. The = 14 means all 14 bytes were successfully written.
  • close(3) = 0: This is Python closing the file descriptor 3. A 0 return value means success.
  • write(1, "Done!\n", 6) = 6: This is Python writing "Done!\n" (6 bytes) to file descriptor 1. File descriptor 1 is standard output (stdout).
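  • exit_group(0) = ?: This is the Python interpreter exiting, with an exit code of 0 (success). The ? is there because exit_group never returns to the caller, so there is no return value to print.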
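
The file-related calls in the transcript map almost one-to-one onto Python’s low-level os functions. The following is a minimal sketch, not a description of what the built-in open() does internally: the helper name write_via_raw_syscalls is made up for illustration, and it writes into a temporary directory rather than the current one.

```python
import os
import tempfile


def write_via_raw_syscalls(path: str, payload: bytes) -> int:
    """Write payload with the same open/write/close sequence seen in the trace."""
    # Same flags and mode as the openat() line; the kernel applies the umask to 0o666.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
    try:
        # os.write() issues a write() system call and returns the byte count.
        written = os.write(fd, payload)
    finally:
        # Corresponds to the close(3) line in the trace.
        os.close(fd)
    return written


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "my_file.txt")
    bytes_written = write_via_raw_syscalls(target, b"Hello, strace!")
    with open(target, "rb") as f:
        contents = f.read()
```

Running this under strace should produce the same openat, write, close pattern, with the return value of os.write matching the = 14 in the trace.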

strace’s power comes from its ability to show you the low-level operations your Python code initiates. When you’re debugging performance issues, unexpected file behavior, or network problems, strace can reveal if the problem lies in your Python code’s logic or in how it’s interacting with the OS.

The -f flag is crucial for multi-threaded or multi-process Python applications, as it traces all child processes and threads. The -e trace=syscall_list option lets you filter for specific system calls, making the output much more manageable. For file I/O, you’d often want to trace openat, read, write, close, fstat, and lseek. For network operations, socket, bind, listen, accept, connect, sendto, and recvfrom are key.
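
If you switch between these syscall sets often, it can help to assemble the command line in code. This is only a sketch: build_strace_command and the two list names are hypothetical, and the function just builds an argument list, which you could hand to subprocess.run on a machine where strace is installed.

```python
import shlex

# Syscall groups taken from the paragraph above.
FILE_IO_SYSCALLS = ["openat", "read", "write", "close", "fstat", "lseek"]
NETWORK_SYSCALLS = ["socket", "bind", "listen", "accept", "connect", "sendto", "recvfrom"]


def build_strace_command(script, syscalls, follow_children=True):
    """Return an argv list that runs `script` under strace with a syscall filter."""
    cmd = ["strace"]
    if follow_children:
        cmd.append("-f")  # also trace child processes and threads
    cmd += ["-e", "trace=" + ",".join(syscalls)]
    cmd += ["python", script]
    return cmd


file_cmd = build_strace_command("your_script.py", FILE_IO_SYSCALLS)
net_cmd = build_strace_command("your_script.py", NETWORK_SYSCALLS, follow_children=False)
print(shlex.join(file_cmd))
```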

Consider a common scenario: a Python script that’s slow because it’s repeatedly opening and closing files unnecessarily. Without strace, you might be looking at Python loops, but strace would plainly show a close() followed by an open() in rapid succession, highlighting the inefficiency. Similarly, if a network request seems to hang, strace might show a recvfrom call that’s stuck waiting for data, or it might reveal that the connect call itself failed with an error code that Python is then mishandling.
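
Spotting repeated opens by eye gets tedious on long traces, so a few lines of Python can tally them. The sketch below assumes the trace was saved as text (for example with strace’s -o option); count_opens and the sample lines are invented for illustration and are not real output.

```python
import re
from collections import Counter

# Matches open("path", ...) and openat(AT_FDCWD, "path", ...) and captures the path.
OPEN_RE = re.compile(r'\bopen(?:at)?\((?:AT_FDCWD, )?"([^"]+)"')


def count_opens(trace_text):
    """Count how many times each path is opened in a saved strace transcript."""
    return Counter(OPEN_RE.findall(trace_text))


# Hypothetical trace of a script that reopens its log file for every line.
sample_lines = [
    'openat(AT_FDCWD, "app.log", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3',
    r'write(3, "line 1\n", 7) = 7',
    'close(3) = 0',
    'openat(AT_FDCWD, "app.log", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3',
    r'write(3, "line 2\n", 7) = 7',
    'close(3) = 0',
    'openat(AT_FDCWD, "app.log", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3',
    r'write(3, "line 3\n", 7) = 7',
    'close(3) = 0',
    'openat(AT_FDCWD, "config.ini", O_RDONLY) = 3',
]

counts = count_opens("\n".join(sample_lines))
# Paths opened more than once are candidates for "open once, reuse the handle".
repeated = {path: n for path, n in counts.items() if n > 1}
```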

One thing strace makes visible is how much real I/O hides behind calls your Python code treats as instantaneous. f.write() on a buffered file usually just copies data into Python’s userspace buffer; the write() system call is issued when that buffer fills, is flushed, or the file is closed, which is why write(3, ...) appears right before close(3) in the trace above. If that write() blocks because a pipe or socket buffer is full or the network is congested, strace shows the call sitting unfinished until the kernel returns. On a non-blocking descriptor you can instead see a read() that returns -1 EAGAIN (EWOULDBLOCK is the same value on Linux); Python surfaces this as BlockingIOError, and event loops treat it as "try again later".
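
The EAGAIN case is easy to reproduce. The following is a minimal sketch, assuming a Unix-like system: it makes the read end of an empty pipe non-blocking and reads from it. The function name read_nonblocking_empty_pipe is made up for illustration.

```python
import errno
import os


def read_nonblocking_empty_pipe():
    """Read from an empty non-blocking pipe and return the resulting errno."""
    read_fd, write_fd = os.pipe()
    try:
        os.set_blocking(read_fd, False)
        try:
            # Nothing has been written, so the kernel cannot satisfy this read.
            os.read(read_fd, 1)
        except BlockingIOError as exc:
            return exc.errno
        return None
    finally:
        os.close(read_fd)
        os.close(write_fd)


observed_errno = read_nonblocking_empty_pipe()
print(errno.errorcode[observed_errno])
```

Under strace, the failing call should appear as a read(...) line ending in = -1 EAGAIN (Resource temporarily unavailable), while Python only ever shows you the BlockingIOError.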

When you’re debugging a Python application that’s hanging or behaving erratically, and you’ve ruled out obvious Python logic errors, strace is your next step to understand the OS-level interactions. Attaching with strace -p <pid> shows the system call the process is currently blocked in. You can also watch signal delivery: strace -p <pid> -e signal=SIGINT does not send anything itself, it only restricts the signals strace reports to SIGINT. Send the signal from another terminal with kill -INT <pid>, and strace prints the delivery as a --- SIGINT {...} --- line along with the call it interrupted, which is useful for debugging signal handling.
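
A small target script makes it easier to watch signal delivery in a trace. The sketch below is illustrative only: the handler name on_sigint and the helper wait_for_sigint are made up, and the script sends SIGINT to itself with os.kill as a stand-in for a signal arriving from outside.

```python
import os
import signal
import time

received = []


def on_sigint(signum, frame):
    """Record the signal instead of raising KeyboardInterrupt."""
    received.append(signum)


def wait_for_sigint(timeout=2.0):
    """Sleep in short steps until the handler has run or the timeout expires."""
    deadline = time.monotonic() + timeout
    while not received and time.monotonic() < deadline:
        time.sleep(0.01)
    return bool(received)


previous_handler = signal.signal(signal.SIGINT, on_sigint)
try:
    # Stand-in for running `kill -INT <pid>` from another terminal.
    os.kill(os.getpid(), signal.SIGINT)
    got_signal = wait_for_sigint()
finally:
    # Restore whatever handler was installed before the experiment.
    signal.signal(signal.SIGINT, previous_handler)
```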

The next thing you’ll want to explore is how strace can be combined with other tools, like lsof, to correlate file descriptors seen in strace with actual open files and network connections.
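
On Linux, the descriptor-to-file mapping that lsof reports is also exposed as symlinks under /proc/<pid>/fd. Here is a minimal sketch, assuming a Linux system with /proc mounted; describe_fd is a hypothetical helper name, and it returns None where /proc is not available.

```python
import os
import tempfile


def describe_fd(fd, pid="self"):
    """Return the target an open descriptor refers to, or None if /proc is unavailable."""
    try:
        # Each entry in /proc/<pid>/fd is a symlink to the open file, pipe, or socket.
        return os.readlink(f"/proc/{pid}/fd/{fd}")
    except OSError:
        return None


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.realpath(os.path.join(tmp, "my_file.txt"))
    with open(path, "w") as f:
        # f.fileno() is the same small integer strace prints after `= `.
        resolved = describe_fd(f.fileno())
```

Given the pid and a descriptor number copied from a trace, the same lookup tells you which file or socket the call was operating on.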
