Today I released version 1.6.0 of Pyflame, which implements a new feature I've wanted to add for a while: the ability to dump the stacks of all running threads. This functionality is roughly equivalent to thread apply all bt with GDB, except the stack traces you get are Python stacks, not C stacks. This is not just useful for multi-threaded programs: it's also incredibly useful for debugging single-threaded programs.
Here's what it looks like in action. In this example, I'm asking Pyflame to dump the stacks for all threads belonging to process 9558:
$ pyflame -d -p 9558
140302210512640:
tests/dijkstra.py:generate:58
tests/dijkstra.py:run:154
tests/dijkstra.py:run_times:169
/usr/lib64/python3.6/threading.py:run:864
/usr/lib64/python3.6/threading.py:_bootstrap_inner:916
/usr/lib64/python3.6/threading.py:_bootstrap:884
140302218905344*:
/usr/lib64/python3.6/random.py:_randbelow:235
/usr/lib64/python3.6/random.py:randrange:196
/usr/lib64/python3.6/random.py:randint:220
tests/dijkstra.py:random_unfilled:80
tests/dijkstra.py:generate:52
tests/dijkstra.py:run:154
tests/dijkstra.py:run_times:169
/usr/lib64/python3.6/threading.py:run:864
/usr/lib64/python3.6/threading.py:_bootstrap_inner:916
/usr/lib64/python3.6/threading.py:_bootstrap:884
140302382565184:
/usr/lib64/python3.6/threading.py:_wait_for_tstate_lock:1072
/usr/lib64/python3.6/threading.py:join:1056
tests/dijkstra.py:main:200
tests/dijkstra.py:<module>:204
As usual, the program being traced has no explicit Pyflame instrumentation; it's just a regular Python program. I ran it with three threads total: two worker threads, plus the main thread. The currently executing thread is marked in the dump output with an asterisk, and in this example it's thread 140302218905344. The thread id listed in the dump is the same id returned by _thread.get_ident(). On Linux systems this will also be the same as the integer returned by pthread_self(), since CPython threads are implemented using pthreads.
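If you want to check that correlation yourself, here's a minimal sketch of my own, not the tests/dijkstra.py script from the Pyflame test suite. It starts two busy worker threads and prints each thread's ident (using threading.get_ident(), which returns the same value as _thread.get_ident()); attaching pyflame to the running process should show those same ids in the dump:

import threading
import time

def busy_work():
    # Print this thread's ident so it can be matched against the
    # ids shown in pyflame's dump output.
    print("worker ident:", threading.get_ident())
    while True:
        sum(i * i for i in range(10000))

workers = [threading.Thread(target=busy_work, daemon=True) for _ in range(2)]
for w in workers:
    w.start()

print("main ident:", threading.get_ident())
time.sleep(600)  # keep the process alive long enough to attach pyflame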
For really gnarly threading issues you may need to use a "real" debugger like Pyrasite or GDB. But in my experience, getting a stack trace from all running threads is enough to debug many (if not most) real-world deadlock issues, including some that have nothing to do with threading at all. For instance, say you have a Python process that is stuck in an unresponsive state, and you want to know why. It could be some kind of threading/deadlock issue, but it could also be something as simple as an unresponsive I/O request blocking the entire process. In both cases, knowing where in the code the process is stuck is often sufficient to debug the problem. This is true whether the process is stuck because locks were acquired in the wrong order, or because a blocking HTTP request is missing a timeout.
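That last case tends to look something like the sketch below (the host name is made up for illustration): a single-threaded program that issues a request with no timeout. A pyflame dump of a process stuck like this would show one thread whose topmost Python frame is the recv() line, which is usually all you need to know.

import socket

def fetch(host, port=80):
    # No timeout is set on the socket, so if the server accepts the
    # connection but never sends a response, recv() blocks forever and
    # the whole (single-threaded) process appears hung.
    s = socket.create_connection((host, port))
    s.sendall(b"GET / HTTP/1.1\r\nHost: " + host.encode() + b"\r\n\r\n")
    return s.recv(65536)

# "unresponsive.internal" is a made-up host name for this example.
fetch("unresponsive.internal")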
Adding this feature was really straightforward, thanks to the work of Max Bolingbroke and James Pickering, who contributed multi-threading support to Pyflame earlier this year. If you find this feature useful, or have any feedback, feel free to shoot me an email or leave an issue on GitHub.