Extracting The Python Stack Using Ptrace

I recently wrote a program that I think is pretty nifty, called Pystack. Pystack lets you attach to an arbitrary running Python process with ptrace(2) and see the currently executing stack trace.

[Image: pystack in action]

I'm going to use this blog post to explain in detail how it works, because I think it's pretty neat.

Ptrace

Ptrace has the following interface:

long ptrace(enum __ptrace_request request, pid_t pid,
            void *addr, void *data);

The first argument is the action to perform (e.g. PTRACE_ATTACH to attach to a process, or PTRACE_GETREGS to read the registers of a process) and the second argument is the process ID. The next two arguments are optional and interpreted based on the request type. Typically addr specifies a memory address in the remote process, and data holds the data to write for commands that have write semantics. If addr or data are not needed for the request type, they are set to 0 by convention.

The full ptrace(2) interface is outside the scope of this article, but that's OK because Pystack only uses three of the request types, which I describe here.

The PTRACE_ATTACH request attaches the calling process (called the tracer) to a "remote" process (called the tracee). The tracee is sent SIGSTOP as part of attaching, so it stops just as it would for any other SIGSTOP, and the tracer should wait for that stop with waitpid(2) before doing anything else. After that the tracer can run more interesting ptrace requests against the tracee.

The PTRACE_DETACH request is the inverse of PTRACE_ATTACH: it releases the tracee back to a regular untraced state. Detaching explicitly is actually optional, because if the tracer dies for any reason, the tracee will be automatically detached.
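As a rough sketch of how attaching and detaching fit together, the tracer attaches, waits for the tracee to report its stop, and later detaches. The helper names attach_and_wait and detach_tracee are invented for this example and are not taken from the Pystack source.

```c
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: attach to pid and wait until it has stopped.
 * Returns 0 on success, -1 on failure. */
int attach_and_wait(pid_t pid) {
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1) {
        return -1;  /* e.g. EPERM (not allowed) or ESRCH (no such process) */
    }
    int status = 0;
    /* The attach only requests a stop; waitpid() reports when it happened. */
    if (waitpid(pid, &status, 0) == -1 || !WIFSTOPPED(status)) {
        return -1;
    }
    return 0;
}

/* Hypothetical helper: let the tracee run untraced again. */
int detach_tracee(pid_t pid) {
    return ptrace(PTRACE_DETACH, pid, NULL, NULL) == -1 ? -1 : 0;
}
```

Everything that reads the tracee's memory would happen between these two calls, while the tracee is stopped.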

The PTRACE_PEEKDATA request allows reading a word of memory from the tracee. The GNU libc interface is a little unusual. You use it like this:

long data = ptrace(PTRACE_PEEKDATA, pid, addr, 0);

On error, -1 will be returned. However, the data stored at addr could actually be the literal value -1. Therefore, to correctly distinguish the error case, you need to first clear the global errno value and then test it if the return result is -1:

errno = 0;
long data = ptrace(PTRACE_PEEKDATA, pid, addr, 0);
if (data == -1 && errno != 0) {
    /* handle error here */
}
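Since each PTRACE_PEEKDATA call returns only one word, reading anything larger means looping. Here is a minimal sketch of such a loop, followed by a small usage example that forks a child and reads back a buffer the child inherited at the same address. The names read_remote and demo_read_child are invented for illustration, and the demo assumes the system allows a process to ptrace its own child.

```c
#include <errno.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: copy len bytes from addr in the (stopped) tracee.
 * The final word read may extend past addr + len, which can fail if the
 * range ends right at the edge of a mapping. Returns 0 on success. */
int read_remote(pid_t pid, uintptr_t addr, void *out, size_t len) {
    unsigned char *dst = (unsigned char *)out;
    while (len > 0) {
        errno = 0;
        long word = ptrace(PTRACE_PEEKDATA, pid, (void *)addr, NULL);
        if (word == -1 && errno != 0) {
            return -1;
        }
        size_t n = len < sizeof(word) ? len : sizeof(word);
        memcpy(dst, &word, n);
        dst += n;
        addr += n;
        len -= n;
    }
    return 0;
}

/* Usage sketch: returns 0 if the bytes read from the child match. */
int demo_read_child(void) {
    static const char secret[] = "hello from the tracee";
    char buf[sizeof(secret)];
    int rc = -1;

    pid_t child = fork();
    if (child < 0) {
        return -1;
    }
    if (child == 0) {
        for (;;) {
            pause();  /* the child just waits to be inspected */
        }
    }
    if (ptrace(PTRACE_ATTACH, child, NULL, NULL) == 0 &&
        waitpid(child, NULL, 0) == child) {
        if (read_remote(child, (uintptr_t)secret, buf, sizeof(buf)) == 0 &&
            memcmp(buf, secret, sizeof(buf)) == 0) {
            rc = 0;
        }
        ptrace(PTRACE_DETACH, child, NULL, NULL);
    }
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return rc;
}
```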

These three requests are the only ones used in Pystack. If you want to look at a slightly more complicated program that uses more of the ptrace(2) interfaces, a while back I wrote something called setrlimit that's a bit more exotic.

Getting The Python Stack Trace

As you may know, Python has a global interpreter lock, a.k.a. the GIL. While this is the bane of Pythonistas everywhere, it makes the C programming interface really easy to use. What matters here isn't the lock itself but its effect: only one Python thread can run at a time. The information about which thread is currently running is stored in a variable in the Python interpreter called _PyThreadState_Current. From the thread information you can get the active frame object for that thread, and from the active frame you can extract the full stack trace.

Normally _PyThreadState_Current isn't directly exposed in the C API since it's not exported in the Python headers. However with ptrace(2) you can access any memory address, so if we can find the correct address we can read the current thread information.

Finding _PyThreadState_Current

There are a few ways that we could potentially find the thread state. The easiest way would be using a command like nm(1) or readelf(1). I wanted to find it directly in C++ though, without invoking external programs.

First we need to understand a bit about ELF files. ELF is the format used on Linux (and pretty much every other Unix, except OS X) for both executables and libraries. We typically think of executables and libraries as different things, but they're not really. They both contain machine code that can be run. A library is just a kind of partial executable, i.e. it has machine code that can be run and is split up into subroutines, but it doesn't have the complete information necessary to make a whole program.

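Unix systems ship with a header file called elf.h that contains the necessary struct declarations to parse an ELF file. Two of the things you can get from parsing an ELF file are the list of shared libraries the file links against and the symbol table, which maps symbol names to addresses.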

There are a bunch of other things in ELF files, but these are the two important things for Pystack.

The Python interpreter itself comes in two different build modes. In one mode you get a fat executable that contains all of the Python symbols, including _PyThreadState_Current, all in the executable itself. This is the default build mode. The other mode is a "dynamic" build where the Python interpreter is just a little executable that contains the symbol Py_Main which calls other things to bootstrap the interpreter. The other symbols will then be in libpython. On my system libpython is called libpython2.7.so for the Python 2.7 interpreter, or libpython3.5m.so for Python 3.5. In the dynamic build mode _PyThreadState_Current will be in libpython, not in the interpreter itself.

So the first step that Pystack does is to figure out which kind of Python it has. It does this by looking at what Python links against. This gives much the same information as running ldd(1) against the interpreter, although ldd(1) obtains it by asking the dynamic loader rather than by parsing the ELF file itself.
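One way to check what an ELF file links against is to walk the DT_NEEDED entries of its dynamic section. The sketch below does that for a 64-bit ELF file and reports whether any needed library name starts with a given prefix such as "libpython". The function name needs_library is invented for this example, I am not claiming this is how Pystack does it, and the code trusts the file without bounds-checking its offsets.

```c
#include <elf.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the 64-bit ELF file at path has a
 * DT_NEEDED entry starting with prefix, 0 if not, -1 on error. */
int needs_library(const char *path, const char *prefix) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    struct stat st;
    if (fstat(fd, &st) < 0 || (size_t)st.st_size < sizeof(Elf64_Ehdr)) {
        close(fd);
        return -1;
    }
    size_t size = (size_t)st.st_size;
    unsigned char *base =
        (unsigned char *)mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (base == MAP_FAILED) return -1;

    int found = -1;
    const Elf64_Ehdr *eh = (const Elf64_Ehdr *)base;
    if (memcmp(eh->e_ident, ELFMAG, SELFMAG) == 0 &&
        eh->e_ident[EI_CLASS] == ELFCLASS64) {
        const Elf64_Shdr *sh = (const Elf64_Shdr *)(base + eh->e_shoff);
        found = 0;
        for (int i = 0; i < eh->e_shnum && !found; i++) {
            if (sh[i].sh_type != SHT_DYNAMIC) continue;
            /* sh_link names the string table used by this section. */
            const char *strtab =
                (const char *)(base + sh[sh[i].sh_link].sh_offset);
            const Elf64_Dyn *dyn = (const Elf64_Dyn *)(base + sh[i].sh_offset);
            size_t n = sh[i].sh_size / sizeof(Elf64_Dyn);
            for (size_t j = 0; j < n && dyn[j].d_tag != DT_NULL; j++) {
                if (dyn[j].d_tag == DT_NEEDED &&
                    strncmp(strtab + dyn[j].d_un.d_val, prefix,
                            strlen(prefix)) == 0) {
                    found = 1;
                    break;
                }
            }
        }
    }
    munmap(base, size);
    return found;
}
```

A result of 1 for the prefix "libpython" would suggest a dynamic build, and 0 a static one.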

Static builds are simplest. If it's a static build, all you need to do is find the address for _PyThreadState_Current in the executable. This just involves enumerating all of the symbols in the symbol table until the right one is found. The address found this way will be the correct address (unless it's a PIE, but we won't worry about that here).
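Enumerating the symbol table can be sketched as follows. This version only handles 64-bit ELF files, looks in both .symtab and .dynsym, and does no bounds checking on the file contents. The name find_symbol is invented for this example rather than taken from Pystack.

```c
#include <elf.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: return the st_value recorded for a defined symbol
 * called name, or 0 if it is not found or the file cannot be parsed. */
uint64_t find_symbol(const char *path, const char *name) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return 0;
    struct stat st;
    if (fstat(fd, &st) < 0 || (size_t)st.st_size < sizeof(Elf64_Ehdr)) {
        close(fd);
        return 0;
    }
    size_t size = (size_t)st.st_size;
    unsigned char *base =
        (unsigned char *)mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (base == MAP_FAILED) return 0;

    uint64_t value = 0;
    const Elf64_Ehdr *eh = (const Elf64_Ehdr *)base;
    if (memcmp(eh->e_ident, ELFMAG, SELFMAG) == 0 &&
        eh->e_ident[EI_CLASS] == ELFCLASS64) {
        const Elf64_Shdr *sh = (const Elf64_Shdr *)(base + eh->e_shoff);
        for (int i = 0; i < eh->e_shnum && value == 0; i++) {
            if (sh[i].sh_type != SHT_SYMTAB && sh[i].sh_type != SHT_DYNSYM) {
                continue;
            }
            /* sh_link names the string table holding the symbol names. */
            const char *strtab =
                (const char *)(base + sh[sh[i].sh_link].sh_offset);
            const Elf64_Sym *sym = (const Elf64_Sym *)(base + sh[i].sh_offset);
            size_t n = sh[i].sh_size / sizeof(Elf64_Sym);
            for (size_t j = 0; j < n; j++) {
                if (sym[j].st_shndx != SHN_UNDEF &&
                    strcmp(strtab + sym[j].st_name, name) == 0) {
                    value = sym[j].st_value;
                    break;
                }
            }
        }
    }
    munmap(base, size);
    return value;
}
```

For Pystack's purposes the call would look something like find_symbol(path_to_python, "_PyThreadState_Current"), where path_to_python stands for whichever file was found to hold the symbol.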

If it's a dynamic build, the symbol will instead be in libpython. That's not a huge problem. You just do the same ELF parsing step on libpython and enumerate its symbol table instead. There's a big stumbling block here though. Linux implements address space layout randomization (ASLR), a technique that loads code and data at unpredictable addresses. Essentially it means that in the Python interpreter process, libpython will be loaded at a random, unpredictable base address.

Bypassing ASLR on Linux is possible if you have permission to read /proc/<PID>/maps for the process. This file lets you figure out the base address for the library you're interested in. Then you add this base address to the addresses found in the ELF file, and that gets you the true address. Typically you will have this permission if you are running as the same user ID as the other process.
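Finding the base address can be sketched like this. Each line of the maps file starts with a start-end address range in hex and ends with the path of the mapped file, and the lines are sorted by address, so the first line mentioning the library gives its lowest mapping. The function names are invented for this example, and the sketch assumes that this first mapping corresponds to the start of the library file.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: return the start address of the first mapping in
 * maps_path whose line contains needle, or 0 if there is none. */
uintptr_t find_base_in_maps(const char *maps_path, const char *needle) {
    FILE *fp = fopen(maps_path, "r");
    if (fp == NULL) return 0;
    char line[4096];
    uintptr_t base = 0;
    while (fgets(line, sizeof(line), fp) != NULL) {
        unsigned long start = 0, end = 0;
        if (strstr(line, needle) != NULL &&
            sscanf(line, "%lx-%lx", &start, &end) == 2) {
            base = (uintptr_t)start;  /* lines are sorted by address */
            break;
        }
    }
    fclose(fp);
    return base;
}

/* Convenience wrapper that builds the /proc/<PID>/maps path. */
uintptr_t find_base(pid_t pid, const char *needle) {
    char path[64];
    snprintf(path, sizeof(path), "/proc/%ld/maps", (long)pid);
    return find_base_in_maps(path, needle);
}
```

With a base from find_base(pid, "libpython") and a symbol value read from the libpython ELF file, the address to read in the tracee would be base plus that value.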

Extracting The Stack

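Once _PyThreadState_Current has been located, all you have to do is extract the stack. This thread state variable has a pointer to a Python frame object. The Python frame object has a bunch of fields, but these are the important ones: f_back, a pointer to the frame of the caller; f_code, a pointer to the code object being executed; and f_lasti, the index of the last bytecode instruction executed in that frame.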

By following f_back we can get all of the stack frames. We can follow this pointer using PTRACE_PEEKDATA.
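Walking the chain of frames is an ordinary linked-list traversal, except that every pointer lives in the tracee and has to be fetched with PTRACE_PEEKDATA. A minimal sketch follows. The names peek_ptr and walk_frames are invented for illustration, and f_back_offset is assumed to come from something like offsetof(PyFrameObject, f_back) using the headers of the Python version being traced.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Read one pointer-sized word from the tracee. Returns 0 on success. */
static int peek_ptr(pid_t pid, uintptr_t addr, uintptr_t *out) {
    errno = 0;
    long word = ptrace(PTRACE_PEEKDATA, pid, (void *)addr, NULL);
    if (word == -1 && errno != 0) {
        return -1;
    }
    *out = (uintptr_t)word;
    return 0;
}

/* Hypothetical helper: starting at the remote address frame, store up to
 * max frame addresses (newest first) by following f_back until it is NULL.
 * Returns the number of frames stored, or -1 if a remote read fails. */
long walk_frames(pid_t pid, uintptr_t frame, size_t f_back_offset,
                 uintptr_t *frames, size_t max) {
    size_t n = 0;
    while (frame != 0 && n < max) {
        frames[n++] = frame;
        if (peek_ptr(pid, frame + f_back_offset, &frame) != 0) {
            return -1;
        }
    }
    return (long)n;
}
```

The same peek_ptr pattern applies to f_code and to any other pointer field that needs to be followed.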

Together f_code and f_lasti allow you to get the frame filename and line number. The C code to get the filename looks like:

frame->f_code->co_filename;  // points to a PyStringObject

For each pointer dereference above we use PTRACE_PEEKDATA to follow the pointer. Python stores string data "inline" in the string object, so once we have the PyStringObject we can get the C string data by reading at the appropriate offset.

Computing the line number is a lot more complicated. The code object doesn't store a line number per instruction. Instead it has a compressed table (co_lnotab), a byte buffer holding pairs of (bytecode offset increment, line number increment), and you recover the line for f_lasti by summing increments until you pass it. The exact algorithm is explained here. This was definitely the trickiest part of the program for me, but fortunately I found the Python interpreter code to be pretty easy to follow.
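The table walk itself is short once the bytes are in hand. The sketch below mirrors the approach of the interpreter's own address-to-line lookup for the Python 2.7 and 3.5 table layout, where co_lnotab is a sequence of unsigned byte pairs (bytecode offset increment, line increment). Later Python versions changed the table format, so treat this as specific to those versions. The function name addr_to_line is invented for this example, and in Pystack's setting the table bytes and co_firstlineno would first have to be read out of the tracee.

```c
#include <stddef.h>

/* Hypothetical helper: map a bytecode offset (f_lasti) to a line number.
 * lnotab holds len bytes of (offset increment, line increment) pairs and
 * first_line is the code object's co_firstlineno. */
int addr_to_line(const unsigned char *lnotab, size_t len,
                 int first_line, int lasti) {
    int line = first_line;
    int addr = 0;
    for (size_t i = 0; i + 1 < len; i += 2) {
        addr += lnotab[i];
        if (addr > lasti) {
            break;  /* this entry starts after the instruction we want */
        }
        line += lnotab[i + 1];
    }
    return line;
}
```

For example, with a table of {6, 1, 8, 2} and a first line of 10, offsets 0 to 5 map to line 10, offsets 6 to 13 map to line 11, and offsets from 14 onward map to line 13.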

Supporting Python 3

I'm still in the dark ages and use Python 2 for pretty much everything. I wanted to see if I could support Python 3 though.

It turns out that the struct fields are all called the same thing in Python 3, but the offsets differ. So after I figured out the appropriate autoconf voodoo, I was able to get the build to use the right Python.h based on which Python the build was configured for.

File names in Python 3 are always Unicode objects. Unicode objects in Python 3 are pretty different from string objects in Python 2. The internal representation is a lot more complicated. The encoding scheme used differs based on the actual stored characters, so getting valid string data out of a Python 3 string object requires some relatively complicated decoding. Fortunately, if the string data is just ASCII text, Python 3 will store the data internally as ASCII. For most code projects in English this isn't a problem, since most people aren't in the habit of using non-ASCII characters in filesystem paths used for code. Therefore I just implemented the ASCII case. The code will fall apart if a non-ASCII path is actually used, so that's a to-do item for me in the future.

Conclusions

I thought this project was a pretty neat and tidy way to demonstrate some core concepts regarding ptrace(2) and ELF files. It's also legitimately useful, unlike some other examples I've seen.

I'm hoping this project can be a jumping-off point for others who want to learn more about some of these low-level concepts. I also think Pystack could be the basis of an interesting Python profiler. Check out the code on GitHub (I've tried to keep it well commented) and let me know what you think.