Recently I had the idea of trying to combine two of my favorite things: bash and
assembly. My original idea was to try to make a way to inline assembly code into
bash scripts. While I was researching this project I discovered Tavis Ormandy's
ctypes.sh. This is a really clever project that makes use of the nearly
undocumented enable -f feature in bash to add new shell builtins that expose the
functionality provided by dlopen(3). The project is
extremely ingenious, and I would recommend browsing the code to anyone who
is interested in these kinds of low-level things.
At first I wanted to try to extend bash in the same way that Tavis did to
provide new builtins for inlining assembly code. It occurred to me, however,
that it's not actually necessary to do this as long as the assembly can be built
into a shared object. Furthermore, I realized that because neither assembly nor
bash has a real type system, actually inlining assembly into bash would be
really difficult for all but the most trivial functions. This is because bash
primarily interacts with string data, and doing string manipulation and memory
allocation in x86 assembly is not fun. Tavis' code simplifies a lot of this
already by implementing a type marshaling system. Plus, it's already possible to
inline assembly into C via the asm keyword, so by simply coming up with a way to
inline C code I'd be able to inline assembly as well.
Fortunately bash already has a syntax feature that makes writing "inline" code
feasible: here documents.
Here documents are a useful way to write multi-line literal statements without
having to do lots of escaping. Using this feature we can add "inline" C code to
our bash script, and then generate a DSO (dynamic shared object) with a C
compiler (both GCC and Clang should work). The DSO can be loaded by ctypes.sh,
and then our code will run.
I have an example demonstrating this here: github.com/eklitzke/c.sh. In this example I have written a "hello world" function in C and a trivial implementation of popcnt (which counts the set bits in a word) in inline assembly. Hopefully the code is pretty straightforward to follow.
I have some more ambitious ideas for this project. I'd like to try to embed the Python interpreter into a bash process. This would allow one to write "inline" Python code in a bash script, and the Python functions defined there would then be callable from the bash script.