Strace is one of my favorite commands. I use it probably almost every day at work, and regularly use it on my home computers to debug all manner of things. People using OS X or BSD like to talk about how great DTrace is, but I think that for 99% of the things I want to do strace is vastly simpler and therefore is actually much more useful, despite being more limited in nature.
How To Use It
Previously I wrote
an article about setting up eCryptfs on Fedora. When I
was first setting this up, I didn't correctly have my user in the ecryptfs
group and after running ecryptfs-setup-private
I got an error like this:
/usr/bin/ecryptfs-setup-private: line 450: /sbin/mount.ecryptfs_private: Permission denied
ERROR: Could not mount private ecryptfs directory
What this error was actually telling me was that my ecryptfs directory had been
successfully created but couldn't be mounted (which would have been solved by
just restarting my login session). If you know this is what's going on the error
message makes sense, but it doesn't necessarily make sense if you don't know
that up front. So after restarting my session I tried running
ecryptfs-setup-private
again and instead got this error:
ERROR: wrapped-passphrase file already exists, use --force to overwrite.
This is equally cryptic if you don't know what's going on. It's kind of clear that there's some sort of file somewhere that exists, and the setup script is trying to not overwrite that file. But it's not at all clear where that file is since the error message doesn't actually include the path.
This is an example of the kind of problem that you can debug in a minute or so once you know how to use strace. Running
strace ecryptfs-setup-private
and scanning the output will show that there aren't any suspicious failed system
calls or access checks, but you will see that the command calls clone(2)
and
receives a SIGCHLD
. This indicates that the ecryptfs-setup-private
comand is
creating child processes or threads that are doing the actual work, so we need
to trace those:
strace -f ecrypts-setup-private 2>&1 | less
This will cause the child processes/threads to be traced as well and will send
all the output to less
. This yields nearly 3000 lines of output on my
computer, which is a lot to scan through, but we can guess that since the error
message says that the "file already exists" we probably need to look for a
system call that returns EEXIST
. When I searched for this, it immediately took
me to this section of the strace output:
[pid 4798] mkdir("/home", 0777) = -1 EEXIST (File exists)
[pid 4798] chdir("/home") = 0
[pid 4798] mkdir("evan", 0777) = -1 EEXIST (File exists)
[pid 4798] chdir("evan") = 0
[pid 4798] mkdir(".ecryptfs", 0700) = -1 EEXIST (File exists)
[pid 4798] stat(".ecryptfs", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
[pid 4798] close(1) = 0
[pid 4798] close(2) = 0
[pid 4798] exit_group(0) = ?
[pid 4798] +++ exited with 0 +++
and then about ten lines below that I see
stat("/home/evan/.ecryptfs/wrapped-passphrase", {st_mode=S_IFREG|0600, st_size=58, ...}) = 0
This is pretty obviously the issue that the error message is talking about. I
got a little lucky here by searching for EEXIST
which wasn't actually the root
of the error but was close to where the error occurs. I could have also searched
for wrapped-passphrase
which was mentioned as the filename in the error
message. After doing this I poked around in a bash shell looking at the
~/.ecryptfs
directory and figured out what was going on.
Again I got lucky because I had a good guess of what the actual error was; that said, this is the type of thing you'll quickly develop an intuition for if you use strace regularly.
If you don't get as lucky with guessing the error code or filename to look for,
usually the right thing to do is to look backwards in the strace output for
calls to exit(2)
or exit_group(2)
. Shortly before the exit_group(2)
call
typically you will see what the program was doing when it decided to bail.
Another technique that works is looking for system calls that have negative
return codes (again, typically you want to search backwards since you want to
look for failures near the end of the program run). Another technique is to look
for the write call to stdout or stderr that prints the error message, and try to
work backwards from there. In any case when you get stuck the rule of thumb is
to try to look at the end of the output first, since most programs exit (or at
least print an error) right after encountering a problem.
If the program is really complicated or does a lot of stuff before exiting then
pipeing the output to less
will be unwieldly. In this case you want to do
something like:
strace -ff -o x ecryptfs-mount-private
This will create a bunch of files named x.<PID>
with the actual PID
substituted. Afterwards you can usally use grep
to figure out which files you
want to investigate further, and once you've narrowed it down you can analyze
the files with less
, vi
, etc. This technique works well when you have a lot
of child processes to trace, since isolating the output of each process makes
following the trace of each of the processes simpler as you don't have to deal
with interleaved output.
There are a couple of other options you'll want to know. To trace an already running process, use:
strace -p <pid>
To get timing information, you'll want to -tt
to get absolute timestamps
and/or -T
to get elapsed time. I frequently use them together, e.g.
strace -tt -T ls
The leading portion here will show the timestamp and the trailing part will show how long the system call took. This is useful for debugging slow programs, e.g. you can look for system calls that take more than one second easily by grepping the output of this.
By default strings will be truncated after only 32 characters which isn't that
many; to get more characters use strace -s <number
.
If you know what system calls you're interested in you can filter with -e
. For
instance, let's say I want to know how the ping
program creates a socket for
sending ICMP packets; I might look at this with
strace -e socket ping www.google.com
You can specify mulitple system calls this way, e.g.
strace -e socket,setsockopt curl -s www.google.com >/dev/null
There are a myriad of other options that strace will accept, but the ones I've listed here are the ones I most frequently use.
When To Use It
A lot of people know about strace, but in my opinion fail to use it effectively because they don't use it at the right time.
Any time a program is crashing right after you start it, you can probably figure
out what's going on with strace. The most common situation here is that the
program is making a failed system call and then exiting after the failed system
call. Frequently this will be something like a missing file or a permissions
error (e.g. ENOENT
or EPERM
), but it could be almost any failed system call.
As I said before, try to work backwards from the end of the strace output.
Any time you have a program that appears to be hung, you should use strace -p
to see what system call it's stuck on (or what sequence of system calls it's
stuck on). Any time you have a program that appears to be running really slowly,
you should use strace
or strace -p
and see if there are system calls that
are taking a long time.
In both of the cases I just listed, there are a few common patterns that you'll see:
- Programs that are stuck in a
wait*
call are waiting for a child process to complete, proceed by tracing the child processes. - Programs that are stuck in
select(2)
orepoll_wait(2)
calls or loops without making other system calls are typically waiting forever for network data. To debug, try to figure out what file descriptors are in the select or epoll (you can typically figure this out with more strace, by looking in/proc
, or usinglsof
). This is potentially kind of tricky to do with edge-triggered epoll loops, you may need to use gdb or restart the program if you don't have sufficient context. - Programs that are stuck in calls like
connect(2)
,read(2)
,write(2)
,sendto(2)
,recvfrom(2)
, etc. are also stuck doing I/O, and once again you should use something likelsof
to figure out what the other end of the file descriptor is. - Programs that are stuck in the
futex(2)
system call typically have some sort of multi-threading related problem as this is the primitive that Pthreads uses on Linux. You'll see this also in higher level programs written in languages like Python, since Python threads are implemented on Linux using Pthreads. This can't easily be debugged with strace but often knowing that it's a threading problem is sufficient to point you in the right direction. If you need to do more low level debugging, proceed with gdb (I would recommend starting by attaching to the process and getting a backtrace).
Another really common use case for me is programs that aren't loading the right
files. Here's one that comes up a lot. I frequently see people who have a Python
program that has an import statement that is either loading the wrong module, or
that is failing even though the developer thinks that the module should be able
to be imported. How can you figure out where Python is attempting to load the
module from? You can use strace and then look for calls to open(2)
(or perhaps
acces(2)
), and then grep for the module name in question. Quickly you'll see
what files the Python interpreter is actually opening (or attempting to open),
and that will dispell any misconceptions about what may or may not be in your
sys.path
, will identify stale bytecode files, etc.
If there aren't any slow system calls or the system calls don't appear to be looping then the process is likely stuck on some sort of userspace thing. For instance, if the program is computing the value of pi to trillions of decimal places it will probably not be doing any system calls (or doing very few system calls that complete quickly). In this case you need to use a userspace debugger like gdb.
One final piece of advice I have is to try to make an effort to learn as many system calls as you can and figure out why programs are making them. If you've never used strace before the output is going to be really overwhelming and confusing, particularly if you don't have much Unix C programming experience. However, there actually aren't that many system calls that are regularly used or that are useful when debugging these issues. There are also common sequences or patterns to system calls (e.g. for things like program initialization, memory allocation, network I/O, etc.) that you can pick up quickly that will make scanning strace output really fast. All of the system calls you see will have man pages available that explain all of the arguments and return codes, so you don't even need to leave your terminal or have an internet connection to figure out what's going on.
Happy debugging, and may the strace gods bless you with quick results.