Most Unix systems, including Linux, use the ELF format for executables and
object files. Normally the details of ELF files are invisible to developers, but
certain tasks can call for one to peer into their inscrutable depths. One reason
that you might need to parse ELF files is when trying to find symbols in another
process using the ptrace(2) system call. In particular, by resolving the
symbols in the ELF file for a remote process you can do things like figure out
which symbols are available in the remote process and where in memory they can
actually be found.
Another reason you might explore this route is to do hacky things with ELF executables, which is what I intend to describe in this article. While going down this road I learned some interesting and poorly documented things that I hope to shed some light on for other developers out there.
Quick Aside: Parsing ELF Files
ELF files are composed of a bunch of sections and headers that can be expressed
as C structs. That means that you can use something like mmap(2) to directly
map the on-disk representation of an ELF file into C data structures. The ELF
specification goes into great detail about how all of these structures work. As
it turns out, GNU libc (a.k.a. glibc) ships with a header file called elf.h
that contains the C struct definitions for you.
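As a minimal sketch of the mmap(2) approach, the function below maps a file and overlays the Elf64_Ehdr struct from elf.h on its first bytes. The helper name is_elf64 is made up for illustration, and the code assumes a Linux system and a 64-bit ELF file.

```c
#include <assert.h>
#include <elf.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if `path` is a 64-bit ELF file, 0 if it
 * is some other kind of file, and -1 if it cannot be opened or mapped. */
int is_elf64(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if ((size_t)st.st_size < sizeof(Elf64_Ehdr)) {
        close(fd);
        return 0; /* too small to hold an ELF header */
    }

    size_t size = (size_t)st.st_size;
    void *map = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED)
        return -1;

    /* The ELF header sits at offset 0, so the mapping can be read as a struct. */
    const Elf64_Ehdr *ehdr = map;
    int ok = memcmp(ehdr->e_ident, ELFMAG, SELFMAG) == 0 &&
             ehdr->e_ident[EI_CLASS] == ELFCLASS64;

    munmap(map, size);
    return ok;
}
```

On Linux, calling is_elf64("/proc/self/exe") inspects the running program's own binary, which is a convenient way to try it out.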
GNU libc is licensed under the LGPL which means that if you aren't making
modifications to it, you can dynamically link against it in your applications
without having it affect the licensing terms of your own code. This means you
can use elf.h (and the rest of glibc) freely in your own code. However, you
may find that this approach is rather low level, and if you do take this
approach you will have to learn a lot of the intricacies of the ELF format to
find your way around the various sections and tables.
The details of elf.h are documented in elf(5), meaning that you can read the
documentation with the invocation man 5 elf.
If you have the option, I strongly encourage you to instead look at
GNU BFD, the obscurely named GNU
"Binary File Descriptor" library. BFD provides the basis for
GNU binutils, the standard
command-line utilities for working with object files. In particular, GNU BFD is
the library that is actually used by such tools as ld (the GNU linker), as
(the GNU assembler), gdb (the GNU debugger), and other tools you may have
heard of like nm and objdump. (readelf is a notable exception: it parses ELF
files itself and does not depend on BFD.) This means that you're using the
exact same library that is actually used by the linker/debugger on your machine.
Additionally, BFD provides a high-level abstraction so you can just open a file
and do things like get symbols without worrying about the low-level details of
the ELF format.
There is one catch with BFD, which is that it is licensed under the terms of the
GPL. This means that if you use GNU BFD and you want to distribute your work you
will have to license your own code under the GPL, which is not the case if you
include elf.h when building your application.
ELF File Types
In the header for every ELF file there's a field called e_type which indicates
what type of ELF file it is. The ones you should expect to see (besides
ET_NONE) are:
- ET_REL for "relocatable files"; this is the type for a .o object file
- ET_EXEC for executables; this means that you can actually run the executable, and this would be the type of files you find in /usr/bin/ or for C programs that define a main() function
- ET_DYN for "shared objects"; this is the type for .so library files
- ET_CORE for core dumps
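The e_type values above can be read straight out of the header. The following is a small sketch, assuming the file uses the host's byte order; elf_type_name and read_elf_type are hypothetical helper names, not part of elf.h.

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Map an e_type value to the constant name that elf.h uses for it. */
const char *elf_type_name(unsigned e_type)
{
    switch (e_type) {
    case ET_NONE: return "ET_NONE";
    case ET_REL:  return "ET_REL";
    case ET_EXEC: return "ET_EXEC";
    case ET_DYN:  return "ET_DYN";
    case ET_CORE: return "ET_CORE";
    default:      return "unknown";
    }
}

/* Hypothetical helper: return the e_type of the file at `path`, or -1 if
 * the file cannot be read or does not start with the ELF magic bytes.
 * e_type sits at the same offset in 32-bit and 64-bit headers, so reading
 * it through Elf64_Ehdr works for either class. */
int read_elf_type(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    Elf64_Ehdr ehdr;
    size_t n = fread(&ehdr, sizeof ehdr, 1, f);
    fclose(f);

    if (n != 1 || memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;
    return ehdr.e_type;
}
```

Passing the result of read_elf_type to elf_type_name gives a printable name. Note that a position independent executable reports ET_DYN rather than ET_EXEC, so seeing ET_DYN for something in /usr/bin/ is not necessarily a mistake.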
Symbol Tables
Within the ELF file there will be a number of "sections", and among the section
types elf.h defines two that hold symbol tables: SHT_SYMTAB and SHT_DYNSYM.
An ELF file can have more than one symbol table.
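To see how many symbol tables a file has, you can walk the section header table and look at each sh_type. This is a rough sketch for 64-bit ELF files on Linux; count_symbol_tables is a hypothetical name, and the bounds checking is only enough for well-formed files.

```c
#include <assert.h>
#include <elf.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: count the sections of type SHT_SYMTAB or SHT_DYNSYM
 * in a 64-bit ELF file. Returns -1 if the file cannot be read or parsed. */
int count_symbol_tables(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || (size_t)st.st_size < sizeof(Elf64_Ehdr)) {
        close(fd);
        return -1;
    }

    size_t size = (size_t)st.st_size;
    unsigned char *base = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (base == MAP_FAILED)
        return -1;

    int count = -1;
    const Elf64_Ehdr *ehdr = (const Elf64_Ehdr *)base;

    /* Only trust e_shoff and e_shnum if the whole table fits in the file. */
    if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) == 0 &&
        ehdr->e_ident[EI_CLASS] == ELFCLASS64 &&
        ehdr->e_shoff <= size &&
        (size - ehdr->e_shoff) / sizeof(Elf64_Shdr) >= ehdr->e_shnum) {
        const Elf64_Shdr *shdr = (const Elf64_Shdr *)(base + ehdr->e_shoff);
        count = 0;
        for (unsigned i = 0; i < ehdr->e_shnum; i++) {
            if (shdr[i].sh_type == SHT_SYMTAB || shdr[i].sh_type == SHT_DYNSYM)
                count++;
        }
    }

    munmap(base, size);
    return count;
}
```

For a typical dynamically linked, unstripped executable I would expect this to report two tables, but that depends on how the file was built.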
By convention if there is a symbol table called .dynsym it holds relocatable
symbols. Relocatable symbols are symbols built in such a way that they can be
relocated to arbitrary virtual memory addresses. Typically if you look at a
.so shared object file you'll see that all of the visible symbols are put into
.dynsym.
By convention if there is a symbol table called .symtab it holds
non-relocatable or non-allocatable symbols.
One of the fields in the ELF header holds the "entry point" for the file, which
is the virtual memory address that the system will transfer control to when
starting an executable. That is, the entry point holds the virtual memory
address for the code that bootstraps start up of the process. You can think of
this as your main() function although in fact the compiler will generate some
stub code that gets invoked before main().
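The entry point is stored in the e_entry field of the ELF header. Here is a short sketch that reads it, assuming a 64-bit ELF file in the host's byte order; read_entry_point is a hypothetical helper name.

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: store the e_entry value of the 64-bit ELF file at
 * `path` in *entry. Returns 0 on success and -1 on any error. */
int read_entry_point(const char *path, Elf64_Addr *entry)
{
    assert(path != NULL && entry != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    Elf64_Ehdr ehdr;
    size_t n = fread(&ehdr, sizeof ehdr, 1, f);
    fclose(f);

    if (n != 1 ||
        memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
        ehdr.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;

    *entry = ehdr.e_entry;
    return 0;
}
```

The value can be printed with something like printf("%#lx\n", (unsigned long)entry) and compared against the "Entry point address" line that readelf -h prints for the same file.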
When you create an executable all of the code that you write will typically be
put into non-relocatable addresses. The way this works is the linker decides
that it's going to put the entry point at an address like 0x400410 and then
other symbols will be put at nearby memory locations. So you might end up with
your function foo at 0x400800 and your function bar at 0x400896. If foo calls
bar the static linker can emit machine code that literally calls the code at
address 0x400896, which is simple and fast.
When you create a shared library you don't know up front what memory addresses
will be available to you. Your library might want to put a symbol at 0x400800,
but if the program loading your library already has the memory address mapped
then things won't work right. You can imagine what would happen---you'd end up
with either a situation where the library jumped into arbitrary positions in the
program, or the program would jump into arbitrary positions in the library code.
Either case would lead to an unpredictable program that would quickly crash with
a segfault or illegal instruction. There are different techniques to solve this
problem, but the short version is that when you have a shared library the
generated machine code won't use hard-coded memory addresses. Instead, the
generated machine code will be such that it looks up the real address of the
target function at runtime. It is the job of the dynamic linker to make sure
that everything is set up properly when these symbols are loaded into memory,
so that any stub code has the correct addresses.
There is some overhead for relocating code like this, so in general it's faster to use hard-coded addresses, but this technique cannot be applied to shared objects.
Things like debugging symbols can also be put into the symbol tables. Since debugging symbols do not need to be mapped into virtual memory at runtime these symbols are called non-allocatable.
To recap:
- non-relocatable code symbols will go into the .symtab table
- non-allocatable symbols will go into the .symtab table
- relocatable code goes into the .dynsym table
When the dynamic linker loads a shared object it will only look at the .dynsym
table. The same is true of dlopen(3) and dlsym(3), which will only find symbols
that are defined in the .dynsym table.
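To check what the dynamic linker would be able to see, you can search the .dynsym table yourself. The sketch below looks for a name in the SHT_DYNSYM section of a 64-bit ELF file. The helper names are hypothetical, the code assumes Linux and a well-formed file, and it matches undefined (imported) entries as well as defined ones.

```c
#include <assert.h>
#include <elf.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return 1 if the byte range [off, off + len) lies inside the file. */
static int in_file(size_t size, Elf64_Off off, Elf64_Xword len)
{
    return off <= size && len <= size - off;
}

/* Search a mapped 64-bit ELF image for `name` in its .dynsym table.
 * Returns 1 if found, 0 if not, -1 if the image does not parse. */
static int scan_dynsym(const unsigned char *base, size_t size, const char *name)
{
    const Elf64_Ehdr *ehdr = (const Elf64_Ehdr *)base;
    if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0 ||
        ehdr->e_ident[EI_CLASS] != ELFCLASS64 ||
        !in_file(size, ehdr->e_shoff,
                 (Elf64_Xword)ehdr->e_shnum * sizeof(Elf64_Shdr)))
        return -1;

    const Elf64_Shdr *shdr = (const Elf64_Shdr *)(base + ehdr->e_shoff);
    for (unsigned i = 0; i < ehdr->e_shnum; i++) {
        if (shdr[i].sh_type != SHT_DYNSYM || shdr[i].sh_link >= ehdr->e_shnum)
            continue;

        /* sh_link is the index of the string table holding the names. */
        const Elf64_Shdr *str = &shdr[shdr[i].sh_link];
        if (!in_file(size, shdr[i].sh_offset, shdr[i].sh_size) ||
            !in_file(size, str->sh_offset, str->sh_size))
            continue;

        const Elf64_Sym *sym = (const Elf64_Sym *)(base + shdr[i].sh_offset);
        const char *names = (const char *)(base + str->sh_offset);
        size_t nsyms = shdr[i].sh_size / sizeof(Elf64_Sym);

        for (size_t j = 0; j < nsyms; j++) {
            if (sym[j].st_name < str->sh_size &&
                strcmp(names + sym[j].st_name, name) == 0)
                return 1;
        }
    }
    return 0;
}

/* Hypothetical helper: does the file at `path` list `name` in .dynsym? */
int dynsym_has(const char *path, const char *name)
{
    assert(path != NULL && name != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || (size_t)st.st_size < sizeof(Elf64_Ehdr)) {
        close(fd);
        return -1;
    }

    size_t size = (size_t)st.st_size;
    unsigned char *base = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (base == MAP_FAILED)
        return -1;

    int found = scan_dynsym(base, size, name);
    munmap(base, size);
    return found;
}
```

A symbol that only appears in .symtab will make dynsym_has return 0, which matches the case where a lookup through the dynamic linker fails.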
Position Independent Executables
You can compile with the -fpie or -fPIE flag and link with the -pie flag to have
GCC create what is called a
"position independent executable" (a.k.a. PIE). When you do this GCC will only
generate relocatable code. There is still an entry point for the program with a
hard-coded address, but all the entry point does is set things up to run the
relocatable code.
The main use case for this is ASLR
hardening. When a PIE binary starts up with ASLR enabled the kernel picks a random virtual
memory address to load all code other than the entry point stub at. This makes
it harder to exploit a large class of security vulnerabilities common to C/C++
programs. Most Linux distributions do not typically compile binaries with this
option because there is real, measurable overhead to invoking relocatable
functions. Distributions like Debian and Ubuntu only compile particularly
security sensitive binaries (e.g. ssh) as PIEs. (Traditional non-PIE binaries
will still use ASLR on Linux, but only for loading dynamic libraries.)
There is another interesting use case here though. As I mentioned earlier, the
dynamic linker and dlopen(3) can find symbols in the .dynsym table but not
symbols in the .symtab table. However, by default executables don't put their
symbols into .dynsym. If an executable is created as a PIE the linker has the
option to put the symbols into the .dynsym table. If this is done then the symbols
will be available both to the executable as well as to the dynamic linker and
dlopen.
By default GNU ld will not put symbols into .dynsym for PIEs, even though the
symbols are relocatable. However, by invoking ld -E you can ask the linker to
export the symbols as dynamic symbols. This doesn't change the generated code,
it simply adds the symbols to the .dynsym table which takes up a small amount
of additional disk space. If you want to export only certain symbols you can use
the --dynamic-list option to control the exported symbols.
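One way to observe the effect of -E is to have a program look up one of its own functions by name. This is only a sketch: answer and find_dynamic_symbol are made-up names, and it relies on dlopen(3) returning a handle for the main program when given a NULL path.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical function that we would like to be findable by name. */
int answer(void)
{
    return 42;
}

/* Hypothetical helper: look `name` up through the dynamic linker,
 * starting from the main program. Returns NULL if the symbol is not
 * visible as a dynamic symbol. */
void *find_dynamic_symbol(const char *name)
{
    assert(name != NULL);

    /* A NULL path asks for a handle to the main program itself. */
    void *self = dlopen(NULL, RTLD_NOW);
    if (self == NULL)
        return NULL;

    void *addr = dlsym(self, name);
    dlclose(self);
    return addr;
}
```

Built with a plain gcc invocation, I would expect find_dynamic_symbol("answer") to return NULL, because answer only lands in .symtab. Built with -Wl,-E it should return the address of answer. On older glibc versions you also need to link with -ldl.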
By doing this you can create an ELF executable that can both be run on the
command line and be loaded dynamically by dlopen(3). There are a lot of
strange things you can use this for. For instance, I am using this technique to
create an executable that is also a Python module. This is mostly for fun---I
could just as easily set up the build system to create an executable and a
shared object separately---but I think it's pretty neat.
I have put a simple example demonstrating
the concept on GitHub. If you modify the Makefile to not use -Wl,-E you will
see that the dl program will fail to load the symbols.