If you've done any weird low-level ELF debugging, you're probably familiar with
readelf, and perhaps some other tools I don't know about.
Well, what if you want to read the symbol table for an ELF executable or library
programmatically? Linux systems come with a header called <elf.h>, and there's
this whole man 5 elf thing that explains in obtuse terms how to use <elf.h> to
decode an ELF executable.
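To give a feel for what working with <elf.h> looks like, here is a minimal sketch of the very first step: checking that a buffer starts with a 64-bit ELF header. The helper name is_elf64 is made up for illustration and is not part of <elf.h>; the constants ELFMAG, SELFMAG, EI_CLASS and ELFCLASS64 are the ones the header defines.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of <elf.h>): returns 1 if buf looks like
 * the start of a 64-bit ELF image, 0 otherwise. */
static int is_elf64(const unsigned char *buf, size_t len)
{
    /* The ELF header sits at offset 0, so we need at least that many bytes. */
    if (len < sizeof(Elf64_Ehdr))
        return 0;
    /* ELFMAG is the magic string "\177ELF"; SELFMAG is its length (4). */
    if (memcmp(buf, ELFMAG, SELFMAG) != 0)
        return 0;
    /* e_ident[EI_CLASS] distinguishes 32-bit from 64-bit files. */
    return buf[EI_CLASS] == ELFCLASS64;
}
```

Everything else in the file is found by following offsets stored in that header, which is why the check comes first.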
In practice, I found it pretty difficult to figure out how to decode the symbol
table. My goal was to decode the symbol table for a "statically" built
/usr/bin/python (which is the default way that Debian/Ubuntu compile python).
Using readelf I saw:
$ readelf -a $(which python) | grep PyObject_Malloc
   619: 0000000000499750   539 FUNC    GLOBAL DEFAULT   13 PyObject_Malloc
So I already knew that PyObject_Malloc was to be found at entry 619 in the
symbol table, and that it should be loaded into memory at the address readelf
reports in the second column, 0x499750.
I wrote a program that can actually decode $(which python) and give the same
output. In particular, I see:
$ ./parse_elf $(which python)
...
SYMBOL TABLE ENTRY 619
st_name = 18442 (PyObject_Malloc)
st_info = 18
st_other = 0
st_shndx = 13
st_value = 0x499750
st_size = 539
...
This matches in all of the relevant fields: you can see that it correctly finds
PyObject_Malloc at symbol table entry 619, that st_shndx (which holds the index
of the section header for the section the symbol is defined in) is 13, and that
the size of the object code is correctly identified as 539 bytes.
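The one field that doesn't obviously match is st_info, which packs the symbol's binding and type into a single byte. As a small sketch, the ELF64_ST_BIND and ELF64_ST_TYPE macros from <elf.h> split it apart; the two helper functions below are hypothetical names added here to map the result to readelf-style labels, and they only cover the most common values.

```c
#include <elf.h>

/* Hypothetical helper: readelf-style label for the binding (high 4 bits). */
static const char *sym_bind_name(unsigned char st_info)
{
    switch (ELF64_ST_BIND(st_info)) {
    case STB_LOCAL:  return "LOCAL";
    case STB_GLOBAL: return "GLOBAL";
    case STB_WEAK:   return "WEAK";
    default:         return "OTHER";
    }
}

/* Hypothetical helper: readelf-style label for the type (low 4 bits). */
static const char *sym_type_name(unsigned char st_info)
{
    switch (ELF64_ST_TYPE(st_info)) {
    case STT_NOTYPE:  return "NOTYPE";
    case STT_OBJECT:  return "OBJECT";
    case STT_FUNC:    return "FUNC";
    case STT_SECTION: return "SECTION";
    case STT_FILE:    return "FILE";
    default:          return "OTHER";
    }
}
```

For st_info = 18 (0x12), the high nibble is 1 (STB_GLOBAL) and the low nibble is 2 (STT_FUNC), which lines up with the FUNC and GLOBAL columns that readelf printed.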
You can find the code on GitHub at eklitzke/parse-elf. Again, this isn't code I'm super proud of, but if you're in a similar situation to me, this should at least help you figure out how to decode all of the relative offsets, which should be enough to get you started.
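For a rough idea of what that chain of offsets looks like, here is a minimal sketch. It is not the code from the repository. It assumes a 64-bit ELF file in the host's byte order that has already been read or mmap'ed into memory, and it ignores extended section numbering. The names find_symbol and in_bounds are hypothetical; the structs and constants come from <elf.h>.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does [off, off + len) fit inside an image of `size` bytes? */
static int in_bounds(size_t off, size_t len, size_t size)
{
    return off <= size && len <= size - off;
}

/* Hypothetical helper: look up `name` in the SHT_SYMTAB section of an
 * in-memory ELF64 image. Returns NULL if nothing matches. On success the
 * entry's index is stored in *index_out (if non-NULL). Validation is kept
 * minimal; real code should be more careful with untrusted files. */
static const Elf64_Sym *find_symbol(const unsigned char *image, size_t size,
                                    const char *name, size_t *index_out)
{
    if (size < sizeof(Elf64_Ehdr))
        return NULL;
    const Elf64_Ehdr *ehdr = (const Elf64_Ehdr *)image;
    if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0 ||
        ehdr->e_ident[EI_CLASS] != ELFCLASS64)
        return NULL;

    /* Step 1: e_shoff is the file offset of the section header table. */
    if (!in_bounds(ehdr->e_shoff, (size_t)ehdr->e_shnum * sizeof(Elf64_Shdr), size))
        return NULL;
    const Elf64_Shdr *shdrs = (const Elf64_Shdr *)(image + ehdr->e_shoff);

    for (size_t i = 0; i < ehdr->e_shnum; i++) {
        const Elf64_Shdr *symtab = &shdrs[i];
        if (symtab->sh_type != SHT_SYMTAB || symtab->sh_link >= ehdr->e_shnum)
            continue;

        /* Step 2: a symbol table's sh_link is the index of its string table. */
        const Elf64_Shdr *strtab = &shdrs[symtab->sh_link];
        if (!in_bounds(symtab->sh_offset, symtab->sh_size, size) ||
            !in_bounds(strtab->sh_offset, strtab->sh_size, size))
            continue;

        const Elf64_Sym *syms = (const Elf64_Sym *)(image + symtab->sh_offset);
        const char *strings = (const char *)(image + strtab->sh_offset);
        size_t count = symtab->sh_size / sizeof(Elf64_Sym);

        /* Step 3: st_name is a byte offset into the string table. */
        for (size_t j = 0; j < count; j++) {
            if (syms[j].st_name < strtab->sh_size &&
                strcmp(strings + syms[j].st_name, name) == 0) {
                if (index_out)
                    *index_out = j;
                return &syms[j];
            }
        }
    }
    return NULL;
}
```

The shape is the useful part: the ELF header points at the section headers, the symbol table's section header points at both the symbol entries and (through sh_link) the string table, and each entry's st_name is an offset into that string table.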