Earlier this week I thought I had found a bug in ptrace(2), specifically when
using the PTRACE_ATTACH request. I have a program that uses ptrace(2), and I've
written a stress tester to verify its correctness. The stress tester will
ptrace(2) a target process every 100 milliseconds. The target that it is
tracing is a Python process that runs for a while and then exits. The stress
tester arranges for the Python process to be restarted after it exits.
I noticed that on the last ptrace(2) attach before seeing the Python process
exit, I would unexpectedly get EPERM. This was weird: I was expecting to get
ESRCH, since that's the error code returned when you try to PTRACE_ATTACH to a
process that doesn't exist.
I looked at the kernel source code and found the code responsible for this. You
can see it in ptrace_attach() in kernel/ptrace.c. The code looks like this:
    retval = -EPERM;
    if (unlikely(task->exit_state))
        goto unlock_tasklist;
So if the target task has "exited", the return value you get will be EPERM, not
ESRCH. What's interesting is that this is not the same as the behavior of
kill(2). If you send a signal via the kill(2) system call to an exited task
(one that has terminated but has not yet been reaped, so it still occupies an
entry in the process table), and you would normally have permission to send
signals to that process, then you will not get EPERM. You won't get ESRCH
either: the kill(2) request will be successful.
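Here is a minimal sketch of how the difference could be reproduced in C on Linux with glibc. It is an independent illustration written under a few assumptions, not the demonstration code from the repository: the helper names make_zombie(), attach_errno(), and kill_errno() are hypothetical, and the "exited" task is created by forking a child and waiting for it with waitid(2) and WNOWAIT, which reports the exit without reaping the child.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits immediately, then block
 * until it has exited WITHOUT reaping it, so that it remains a zombie.
 * Returns the child's PID, or -1 on failure. */
pid_t make_zombie(void) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(0); /* child: exit right away */

    siginfo_t info;
    /* WEXITED waits for termination; WNOWAIT leaves the child waitable,
     * so its task still exists with a nonzero exit_state. */
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) != 0)
        return -1;
    return pid;
}

/* Hypothetical helper: returns 0 if PTRACE_ATTACH succeeds, otherwise
 * the errno value that ptrace(2) set. */
int attach_errno(pid_t pid) {
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return errno;
    return 0;
}

/* Hypothetical helper: returns 0 if kill(pid, 0) succeeds, otherwise
 * the errno value that kill(2) set. Signal 0 only performs the
 * existence and permission checks; nothing is delivered. */
int kill_errno(pid_t pid) {
    if (kill(pid, 0) == -1)
        return errno;
    return 0;
}
```

Calling make_zombie() and passing the result to both helpers should show kill_errno() returning 0 while attach_errno() returns EPERM. Once the child has been reaped with waitpid(2), the PID no longer refers to any process, and both calls would be expected to report ESRCH instead (barring PID reuse or a sandbox that blocks ptrace(2) outright).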
I put some demonstration code up on GitHub at eklitzke/ptrace-idiosyncrasy that shows the situation more clearly.
The current behavior is not documented, and I intend to notify the maintainers of the Linux man pages.