A while back I learned a curious mathematical fact: 666 is the sum of the first 36 natural numbers. Formally,
$$\sum_{i=1}^{37} i = 666$$
I'm a dutiful servant of the Dark Lord, so I made my Twitter bio the Python equivalent:
sum(range(37))
This code is straightforward and easily understood. The range(37)
expression
creates a generator yielding the numbers from 0 to 36, and sum()
of course
sums them. But, it's a tad banal for my taste.
Somehow I got the idea to reproduce the series as a program written for
[dc](https://en.wikipedia.org/wiki/Dc_(computer_program)). If you're fortunate
enough not to be familiar with dc
, it's a "desk calculator" program that's old
enough to predate Unix and C. It's also programmable, by way of an esoteric
macro language.
Here's the simplest dc
program I could come up with to compute the series:
# Sum the numbers 1 to 36 (inclusive).
dc -e '36[d1-d1<F+]dsFxp'
How It Works
The dc -e
invocation just means that the code to evaluate is passed as the
next argument (instead of reading input interactively from stdin). So the only
part we need to worry about is the actual dc
code itself, which is:
36[d1-d1<F+]dsFxp
Since dc
is a reverse-polish calculator, the program works with a stack of
values, and operators that manipulate the stack. As you might guess, the first
expression (literally 36
) pushes 36 onto the stack. Also as you might guess,
this is pushed onto the stack because we intend to count down from that value
when accumulating.
The next part of the expression is the bracketed term [d1-d1<F+]
. Anything
between square brackets is considered to be a string literal. As is often the
case in dc
programs, the string literal is itself a dc
expression that will
be evaluated later as a macro; we'll revisit this in a moment. For now, just
know that a bracketed expression pushes that string value onto the stack. After
parsing this term, the stack has two values: the integer 36 followed by the
string d1-d1<F+
.
Next is the character d
. This simply duplicates the value on the top of the
stack. After encountering this term, the stack contains 36 and then the string
d1-d1<F+
twice.
dc
has 256 registers, each named by a single character. The s
command pops
the top of the stack into a register named by the next character, which is F
in this case. So after evaluating this term, our string d1-d1<F+
will be in
the F
register (and also still on top of the stack). Note that by convention
I've chosen to use a register whose name doesn't collide with a dc
command,
but that's not a requirement. Mercifully, I didn't use d
register, which would
have yielded the equivalent (but even more obfuscated) program
36[d1-d1<d+]dsdxp
.
The x
term is next, and it evaluates the macro that is currently on the top of
the stack. More on this in a moment, since the final term is simple enough: p
just prints the top value on the stack. Which brings us back to x
: what
happens when it's evaluated?
The Macro
The macro, which as you should recall has also been stored into the F
register, is the heart of the program. It looks arcane, but it's pretty simple.
We've already encountered d
before: it duplicates the value on top of the
stack. The next expression is the literal 1
, which simply pushes the value 1
on top of the stack. Therefore the first time the macro is evaluated after
reaching the first d1
the stack will contain: 36, 36, 1.
The -
term pops the top two values from the stack, calculates the subtraction,
and pushes that value onto the stack. In this case that will be 36 - 1, so after
evaluating -
the stack will contain: 36, 35.
Now we get another d1
term, which once again duplicates the top value and then
pushes 1. The stack will now contain: 36, 35, 35, 1.
The term <F
pops the two values at the top of the stack and then evaluates the
macro if the less-than condition is satisfied. In this case 1 is less than 35,
so F will be invoked recursively. When F
is invoked this time, the stack
contains: 36, 35.
The final term is +
which adds two top two stack values. But before the
addition can happen, F
has to be evaluated again.
The first time through the loop we started with 36 on the stack and ended with 36, 35. As you can imagine, the next time through the stack ends up as 36, 35, 34. This continues until the base condition is hit, at which case the stack will contain the values from the full range, with 36 at the bottom of the stack and 1 at the top.
After the base condition is hit the stack unwinds, with +
being evaluated for
each pair of terms. This is a "right" fold through the stack, somewhat like
foldr
in Haskell or reduce()
in Python 2 (which is actually a "left" fold).
Observations
After stumbling across arcane code, one hopes that there's some purpose behind it. Sometimes code is obtuse because it has inherent complexity. Other times there's a performance argument for writing code in a strange way (or using a strange language).
None of those apply here. dc
is just senseless violence. In fact, the dc
program here is less efficient than a straightforward loop written in
basically any other programming language. Consider the equivalent C program
written using a for
loop:
int sum = 0;
for (int i = 1; i < 37; i++)
sum += i;
An expression like this has $O(n)$ running time, but only $O(1)$ memory/storage.
Likewise, in Python 3 the expression sum(range(37))
evaluates the sum of a
generator expression, which also takes a fixed amount of storage. But that's
not the case with our dc
program, which has to build up the stack before
evaluating it. The dc
program is strictly worse, as it takes $O(n)$ running
time and $O(n)$ memory.
Exercise for the reader: Rewrite the program to use a fixed amount of stack space. There are a couple of ways to do this, but unfortunately at the cost of some human complexity.