strace output often looks like a jumbled mess of numbers and seemingly random characters, especially when binary data is involved. This isn’t because strace is broken: it is showing you the raw bytes passed between the process and the kernel, and those bytes aren’t always printable ASCII.

Let’s say you’re debugging a program that reads a configuration file containing binary settings, and strace shows you something like this for a read call:

read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 32) = 32

Or maybe it’s a network packet:

recvfrom(4, "\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00", 16, 0, {sa_family=AF_INET, sin_port=htons(12345), sin_addr=inet_addr("192.168.1.100")}, [16]) = 16

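The \0 and \x01 are just representations of byte values. By default, strace prints bytes it can’t show as printable ASCII using C-style escapes such as \n and \t, or octal escapes (\ooo, usually with leading zeros dropped, so a null byte appears as \0). Hexadecimal escapes (\xhh), as in the recvfrom example, appear when strace is run with -x (hex for strings that contain non-printable bytes) or -xx (hex for every string).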

Here’s how you can decode this and understand what’s actually going on.

Understanding the strace Output

When strace shows data, it’s in one of two formats:

  1. Quoted String: The default. Printable ASCII characters are shown directly, enclosed in double quotes, and non-printable characters are escaped.
  2. Hex Dump: A full hexadecimal and ASCII dump of the data. strace prints this only when you request it for specific file descriptors with -e read=fd or -e write=fd.

The example read call above shows a string of 32 null bytes. The \0 is the escape sequence for a null character (byte value 0). The read(3, ...) means the process read 32 bytes from file descriptor 3, and "\0\0\0\0..." is the content of those 32 bytes.

The recvfrom example shows bytes like \x01 and \x00. \x01 represents the byte with hexadecimal value 01 (decimal 1), and \x00 represents the byte with hexadecimal value 00 (decimal 0).
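To see the byte values behind an escaped string, the escapes can be fed to printf and dumped with od. This is a minimal sketch assuming a POSIX shell; show_hex is a hypothetical helper name, and it handles only octal escapes (\ooo), the form strace prints by default:

```shell
# show_hex ESCAPED: print the bytes of an octal-escaped string as hex digits.
# Hypothetical helper. The argument is used as a printf format string, so a
# literal % in the data would have to be written as %%.
show_hex() {
  printf "$1" | od -An -v -tx1 | tr -d ' \n'
  echo
}

show_hex '\0\0\0\0'        # prints 00000000
show_hex '\1\0\0\0\377'    # prints 01000000ff
```

Each pair of hex digits in the output is one byte, so \377 is the single byte 0xff.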

Decoding Binary Data with strace and xxd

The most effective way to decode this raw binary data is to convert the escaped string back into real bytes with a tool such as xxd, which can turn hexadecimal text into binary.

Let’s refine our strace command to make it easier to pipe:

strace -s 1024 -e trace=read,write,sendto,recvfrom your_program
  • -s 1024: This raises the maximum string length strace will print. The default is 32 bytes, which is usually too short to see meaningful chunks of binary data.
  • -e trace=read,write,sendto,recvfrom: This limits the trace to these four I/O system calls, which are the ones most likely to carry binary data.

Now, let’s say our program is supposed to write a specific binary header to standard output. With hexadecimal output enabled (-x or -xx), strace might show:

write(1, "\x01\x02\x03\x04\x00\x00\x00\x00\xff\xfe\x00\x00", 12) = 12

To decode this, we can pipe it to xxd:

strace -xx -s 1024 -e trace=write your_program 2>&1 >/dev/null | grep 'write(1, ' | sed -e 's/write(1, "\([^"]*\)".*/\1/' -e 's/\\x//g' | xxd -r -p

Let’s break down this pipeline:

  1. strace -xx -s 1024 -e trace=write your_program 2>&1 >/dev/null: Runs strace with -xx so every string is printed as \xhh escapes, sends stderr (where strace output goes) into the pipe, and discards the program’s own stdout so its raw bytes don’t mix with the trace.
  2. grep 'write(1, ': Filters for lines containing write to file descriptor 1 (stdout).
  3. sed -e 's/write(1, "\([^"]*\)".*/\1/' -e 's/\\x//g': This is crucial. It extracts the quoted data from the strace line and reduces it to plain hex digits.
    • s/ ... / ... /: The substitution command.
    • write(1, ": Matches the literal start of the call, up to and including the opening quote.
    • \([^"]*\): This is a capturing group (\( and \)). It captures zero or more (*) characters that are not a double quote ([^"]), which is everything between the quotes.
    • ".*: Matches the closing quote and the rest of the line.
    • \1: Replaces the entire match with just the content of the first capturing group (the escaped data).
    • s/\\x//g: Deletes every \x, turning \x01\x02 into 0102.
  4. xxd -r -p: This is the decoder.
    • -r: Revert mode. It takes hex dump input and converts it back to binary.
    • -p: Plain hex dump style. This tells xxd to expect input like 0102030400000000fffe0000 rather than the default hexdump format. The second sed expression produces exactly this format.

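The output of this command is the 12 raw bytes themselves. Most of them are unprintable, so they are written here in escape notation (append | xxd to the pipeline to view them as a hex dump):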

\x01\x02\x03\x04\x00\x00\x00\x00\xff\xfe\x00\x00

If your input were already a plain hex string (e.g., 0102030400000000fffe0000 without quotes or escapes), you’d only need xxd -r -p. However, strace prints string arguments quoted and escaped, hence the sed step.
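The extraction step can be checked offline against a saved trace line, without rerunning the program. This is a minimal sketch, assuming the line was captured with strace -xx so that every byte appears as a \xhh escape; the variable names are illustrative:

```shell
# A saved line in the form strace -xx prints (same bytes as the write example above).
line='write(1, "\x01\x02\x03\x04\x00\x00\x00\x00\xff\xfe\x00\x00", 12) = 12'

# Keep only what sits between the first pair of double quotes, then drop
# every \x so that plain hex digits remain.
hex=$(printf '%s\n' "$line" | sed -e 's/^[^"]*"\([^"]*\)".*/\1/' -e 's/\\x//g')
printf '%s\n' "$hex"    # prints 0102030400000000fffe0000

# Optional: if xxd is installed, convert to raw bytes and dump them for viewing.
if command -v xxd >/dev/null 2>&1; then
  printf '%s' "$hex" | xxd -r -p | xxd
fi
```

Working from a saved line makes it easy to adjust the sed expressions before pointing them at a live trace.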

Common Binary Data Scenarios

  1. Configuration Files: Programs often read binary configuration files. strace will show read calls with data such as \x01\x00\x00\x00 (for example, the integer 1 stored in little-endian byte order) or other format-specific byte patterns.

    • Diagnosis: Use the pipeline above with read and write system calls.
    • Fix: Analyze the output of xxd -r -p. If the bytes are incorrect, the issue is likely in how the file was generated or how your program is interpreting it.
  2. Network Packets: When dealing with sockets, recvfrom and sendto will show raw packet data. This is frequently binary.

    • Diagnosis: Pipe strace output for recvfrom/sendto to xxd -r -p using the sed trick to extract the data.
    • Fix: The decoded bytes are the actual network payload. If it’s not what you expect, the sender or your program’s packet parsing logic is at fault.
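  3. IPC (Inter-Process Communication): Pipes and message queues transfer binary data through read, write, msgrcv, and msgsnd calls, which strace will show. Data exchanged through shared memory never passes through a system call, so strace cannot display it.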

    • Diagnosis: Similar to configuration files, focus on I/O calls.
    • Fix: The decoded bytes represent the message or data transferred. Ensure both sender and receiver agree on the binary format.
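The three scenarios differ only in which system call and file descriptor to look at, so the extraction can be wrapped in one small function. This is a sketch, not a robust parser: extract_hex is a hypothetical helper, it assumes the trace was captured with strace -xx, and it assumes the data is the first quoted argument of the call. The sample lines stand in for a saved trace file.

```shell
# extract_hex SYSCALL FD: read strace -xx output on stdin and print the first
# string argument of each matching call as plain hex, one call per line.
# Hypothetical helper; grep -F matches the text literally, as a substring.
extract_hex() {
  grep -F "$1($2, \"" | sed -e 's/^[^"]*"\([^"]*\)".*/\1/' -e 's/\\x//g'
}

# Sample lines standing in for a saved trace file.
trace='recvfrom(4, "\x01\x00\x00\x00", 4, 0, NULL, NULL) = 4
write(1, "\x68\x69\x0a", 3) = 3'

printf '%s\n' "$trace" | extract_hex recvfrom 4   # prints 01000000
printf '%s\n' "$trace" | extract_hex write 1      # prints 68690a
```

Piping either result into xxd -r -p yields the raw bytes for that call.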

The "Why it Works"

strace captures the raw bytes as they are passed to or from the kernel. When these bytes are not printable ASCII characters, strace has to represent them somehow. It uses escape sequences: octal (\ooo) by default, or hexadecimal (\xhh) when run with -x or -xx.

xxd -r -p is the inverse operation. It takes a string of hexadecimal characters (like 01020304) and converts each pair of hex digits into a single byte. The sed command is simply a pre-processor to isolate the hexadecimal escape sequences from the strace output and format them into a plain string that xxd -r -p can consume.
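The pair-by-pair conversion can be sketched in plain shell to make it concrete. This is an illustration of the idea under simple assumptions (clean, even-length hex input), not a description of how xxd is implemented; hex_to_bytes is a hypothetical name:

```shell
# hex_to_bytes HEX: write the bytes named by a plain hex string to stdout.
# Hypothetical helper that mimics what xxd -r -p does for clean input.
hex_to_bytes() {
  hex=$1
  # Reject anything that is not an even-length string of hex digits.
  case $hex in *[!0-9a-fA-F]*) return 1 ;; esac
  [ $(( ${#hex} % 2 )) -eq 0 ] || return 1
  while [ -n "$hex" ]; do
    rest=${hex#??}         # everything after the first two digits
    pair=${hex%"$rest"}    # the first two digits
    hex=$rest
    # Convert the pair to octal, then let printf emit that single byte.
    printf "\\$(printf '%03o' "0x$pair")"
  done
}

hex_to_bytes 48656c6c6f0a                 # prints Hello
hex_to_bytes 0102fffe | od -An -v -tx1    # dumps the four bytes 01 02 ff fe
```

In practice xxd -r -p is the better tool; the loop only shows that two hex digits always become exactly one byte.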

The Next Step

Once you can reliably decode the binary data, your next challenge will be understanding the structure of that binary data. This often involves consulting documentation for the specific protocol or file format, or reverse-engineering the structure based on known fields and expected values.
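As a small first step in that direction, a field at a known offset can be decoded by hand. This sketch assumes, purely for illustration, that the first four bytes hold a little-endian unsigned 32-bit integer; the real layout depends on the format in question, and le32 is a hypothetical helper name:

```shell
# le32 HEX: interpret the first four bytes (eight hex digits) of a plain hex
# string as a little-endian unsigned 32-bit integer. Hypothetical helper.
le32() {
  b0=$(printf '%s' "$1" | cut -c1-2)
  b1=$(printf '%s' "$1" | cut -c3-4)
  b2=$(printf '%s' "$1" | cut -c5-6)
  b3=$(printf '%s' "$1" | cut -c7-8)
  # In little-endian order the last byte in the stream is the most significant.
  printf '%d\n' "0x$b3$b2$b1$b0"
}

le32 01000000   # prints 1
le32 39300000   # prints 12345
```

So the bytes \x01\x00\x00\x00 from the earlier examples would read as the integer 1 under this assumption.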
