Earlier this week I thought I had found a bug in ptrace(2), specifically
in the PTRACE_ATTACH request. I have a program that uses ptrace(2), and
I've written a stress tester to verify its correctness.
The stress tester attaches to a target process with ptrace(2) every 100
milliseconds. The target is a Python process that runs for a while and then
exits. The stress tester arranges for the Python process to be restarted after
it exits.
I noticed that the last ptrace(2) attach before the Python process exited
would unexpectedly fail with EPERM. This is weird. I was expecting
to get ESRCH, since that's the error code that's returned when you try to
PTRACE_ATTACH to a process that doesn't exist.
I looked at the kernel source code and found the code responsible for this
in ptrace_attach() in kernel/ptrace.c. The code looks like this:
retval = -EPERM;
if (unlikely(task->exit_state))
    goto unlock_tasklist;
So if the target task has "exited" (its exit_state is nonzero, as it is for a
zombie that has not been reaped yet), then the return value you get will be
EPERM, not ESRCH. What's interesting is that this is not the same behavior as
kill(2). If you send a signal via the kill(2) system call to an exited task,
and you would normally have permission to send signals to that process, then
you will not get EPERM. You won't get ESRCH either: the kill(2) request will
succeed.
I put some code up on GitHub at eklitzke/ptrace-idiosyncrasy that demonstrates the situation more clearly.
The current behavior is not documented, and I intend to notify the maintainers of the Linux man pages.