C Functions Without Arguments

A little-known fact about C is that the compiler does not treat the following two function definitions as equivalent:

void hello1() { puts("Hello, world!"); }
void hello2(void) { puts("Hello, world!"); }

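The empty parentheses in hello1() leave the parameter list unspecified (in C standards before C23), so the C compiler will treat calls to hello1() as if it were a variadic function. It will consider hello2() to be a function that takes no arguments.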

Now what's interesting here is that if you don't use the macros va_start() and va_arg(), the compiler won't actually generate different code for the two functions. In other words, because hello1() is not actually variadic, and we didn't put any variadic argument unpacking code into it, the object code for these two will be the same. However, the code generated at each call site is affected.

Looking At Some Object Code

Consider the following program:

#include <stdio.h>

void hello1() { puts("Hello, world!"); }

void hello2(void) { puts("Hello, world!"); }

int main(int argc, char **argv) {
  hello1();
  hello2();
  return 0;
}

I'm going to analyze what happens with gcc -O1, using GCC 5.3.1. (Note: at -O2 and above GCC gets too smart and will just directly embed puts() calls into main(), so that's why using -O1 is necessary here.)

We get the following in the disassembled output:

0000000000400536 <hello1>:
  400536:       48 83 ec 08             sub    $0x8,%rsp
  40053a:       bf 10 06 40 00          mov    $0x400610,%edi
  40053f:       e8 cc fe ff ff          callq  400410 <puts@plt>
  400544:       48 83 c4 08             add    $0x8,%rsp
  400548:       c3                      retq

0000000000400549 <hello2>:
  400549:       48 83 ec 08             sub    $0x8,%rsp
  40054d:       bf 10 06 40 00          mov    $0x400610,%edi
  400552:       e8 b9 fe ff ff          callq  400410 <puts@plt>
  400557:       48 83 c4 08             add    $0x8,%rsp
  40055b:       c3                      retq

000000000040055c <main>:
  40055c:       48 83 ec 08             sub    $0x8,%rsp
  400560:       b8 00 00 00 00          mov    $0x0,%eax
  400565:       e8 cc ff ff ff          callq  400536 <hello1>
  40056a:       e8 da ff ff ff          callq  400549 <hello2>
  40056f:       b8 00 00 00 00          mov    $0x0,%eax
  400574:       48 83 c4 08             add    $0x8,%rsp
  400578:       c3                      retq
  400579:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)

This isn't that interesting. As you can see, the object code for hello1 and hello2 is identical (other than the relative offset encoded in each CALL instruction), since neither is truly variadic nor takes any arguments.

But what if we change the order we call them in? If we change the definition of main() like this:

int main(int argc, char **argv) {
  hello2();
  hello1();
  return 0;
}

Then the generated code for main() will be slightly different. The new code is like this:

000000000040055c <main>:
  40055c:       48 83 ec 08             sub    $0x8,%rsp
  400560:       e8 e4 ff ff ff          callq  400549 <hello2>
  400565:       b8 00 00 00 00          mov    $0x0,%eax
  40056a:       e8 c7 ff ff ff          callq  400536 <hello1>
  40056f:       b8 00 00 00 00          mov    $0x0,%eax
  400574:       48 83 c4 08             add    $0x8,%rsp
  400578:       c3                      retq
  400579:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)

The difference is in where %eax gets cleared. Previously the calls looked like:

  400560:       b8 00 00 00 00          mov    $0x0,%eax
  400565:       e8 cc ff ff ff          callq  400536 <hello1>
  40056a:       e8 da ff ff ff          callq  400549 <hello2>

But now it looks like:

  400560:       e8 e4 ff ff ff          callq  400549 <hello2>
  400565:       b8 00 00 00 00          mov    $0x0,%eax
  40056a:       e8 c7 ff ff ff          callq  400536 <hello1>

What's interesting here is that we can see that %eax is being cleared before the call to hello1(). But it's not cleared before the call to hello2().

If we remove the call to hello1() altogether, it's even more obvious what's happening. Our new main() is

int main(int argc, char **argv) {
  hello2();
  return 0;
}

And the new generated code is:

000000000040055c <main>:
  40055c:       48 83 ec 08             sub    $0x8,%rsp
  400560:       e8 e4 ff ff ff          callq  400549 <hello2>
  400565:       b8 00 00 00 00          mov    $0x0,%eax
  40056a:       48 83 c4 08             add    $0x8,%rsp
  40056e:       c3                      retq
  40056f:       90                      nop

As you can see here %eax is never cleared at all before calling hello2().

This happens because on x86-64, the System V calling ABI specifies that for a call that may be variadic, %al (i.e. the lowest byte of %eax) holds an upper bound on the number of vector registers used to pass arguments. The vector registers are used when the arguments you pass include floating point numbers. So even though the generated code for hello1() is exactly the same as the generated code for hello2(), and never inspects the contents of %al, the caller of hello1() must still set %al (here, by clearing %eax) before calling hello1().

This is an extra instruction that has to happen every time you call a function like this. In practice, the overhead of this is so small that you probably wouldn't be able to measure it. In fact, modern Intel CPUs can execute multiple integer operations in a single clock cycle (as long as there are no register dependencies), so there's a good chance that in some of these cases there would literally be no overhead. For instance, in the first example we saw the instruction stream

  40055c:       48 83 ec 08             sub    $0x8,%rsp
  400560:       b8 00 00 00 00          mov    $0x0,%eax

A modern Intel CPU can execute both of these instructions in the same clock cycle, since there is no data dependency between %rsp and %eax.

Making a "Variadic" Call

However, what is scarier is that because of this rule if you accidentally pass arguments to hello1() the compiler won't generate an error or warning, even if you compile with -Wall! For instance, if you compile this program:

#include <stdio.h>

void hello1() { puts("Hello, world!\n"); }

int main(int argc, char **argv) {
  hello1(1);
  return 0;
}

Then gcc -Wall will not emit a warning, even though this is clearly a mistake. The code generated is interesting in this example:

0000000000400536 <hello1>:
  400536:       48 83 ec 08             sub    $0x8,%rsp
  40053a:       bf 00 06 40 00          mov    $0x400600,%edi
  40053f:       e8 cc fe ff ff          callq  400410 <puts@plt>
  400544:       48 83 c4 08             add    $0x8,%rsp
  400548:       c3                      retq

0000000000400549 <main>:
  400549:       48 83 ec 08             sub    $0x8,%rsp
  40054d:       bf 01 00 00 00          mov    $0x1,%edi
  400552:       b8 00 00 00 00          mov    $0x0,%eax
  400557:       e8 da ff ff ff          callq  400536 <hello1>
  40055c:       b8 00 00 00 00          mov    $0x0,%eax
  400561:       48 83 c4 08             add    $0x8,%rsp
  400565:       c3                      retq
  400566:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  40056d:       00 00 00

As you can see, the compiler will load 1 into %edi even though hello1() doesn't actually read that value; in fact, hello1() immediately clobbers %edi with the pointer to the string literal that is passed to puts().

Making a Floating Point Variadic Call

Here's what happens if we use a vector register, this time by calling hello1() with a floating point number:

#include <stdio.h>

void hello1() { puts("Hello, world!\n"); }

int main(int argc, char **argv) {
  hello1(1.0f);
  return 0;
}

And the generated code is:

0000000000400536 <hello1>:
  400536:       48 83 ec 08             sub    $0x8,%rsp
  40053a:       bf 00 06 40 00          mov    $0x400600,%edi
  40053f:       e8 cc fe ff ff          callq  400410 <puts@plt>
  400544:       48 83 c4 08             add    $0x8,%rsp
  400548:       c3                      retq

0000000000400549 <main>:
  400549:       48 83 ec 08             sub    $0x8,%rsp
  40054d:       f2 0f 10 05 bb 00 00    movsd  0xbb(%rip),%xmm0        # 400610 <__dso_handle+0x18>
  400554:       00
  400555:       b8 01 00 00 00          mov    $0x1,%eax
  40055a:       e8 d7 ff ff ff          callq  400536 <hello1>
  40055f:       b8 00 00 00 00          mov    $0x0,%eax
  400564:       48 83 c4 08             add    $0x8,%rsp
  400568:       c3                      retq
  400569:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)

As you can see, here the literal value 1 is stored into %eax to indicate that one vector register is in use: the argument is passed in %xmm0.

What Makes This Really Weird

This is already kind of strange as it is. But here's what's really interesting. You'll notice that in the example where we invoked hello1(1) that the argument was stored into %edi, but nothing else changed. That's because in C, when you call a variadic function the calling ABI doesn't actually tell you how many arguments you got!

So when you make a call like this:

printf("Hello from %d\n", getpid());

The way that printf() knows that it was passed a single argument is by actually scanning the format string for % formatters (%d in this case), and then counting those. This is why calling printf() with too few arguments (which is undefined behavior) can print unexpected things to stdout: printf() prints whatever happens to be in the registers or stack slots where it looks for the missing argument.

In the very uncommon case when you write a variadic function that doesn't have something like a format string, the convention typically used is that one of the fixed arguments to the function holds the number of arguments to expect. For instance, you might have a function declared like:

void magic(int num_args, ...);

And then to actually call it you'd have to use an invocation like:

magic(3, 1.0f, "foo", 42);

Then the implementation of magic() will have to use the va_start() and va_arg() macros and know to stop calling va_arg() purely based on inspecting num_args, since there is literally no other way for magic() to know how many variadic arguments you passed in.

This means that considering a function declaration like:

void magic();

as variadic makes almost no sense, because there's no a priori way for the implementation of this version of magic() to know how many variadic arguments it was passed. In fact, to use the va_start() macro you are required (at least before C23) to pass it the last non-variadic argument in the function signature. That means that even if you did come up with a weird protocol for figuring out when va_arg() should stop being called, you'd have no way to initialize your va_list using the va_start() macro. Really the only way to do this at all would be to write inline assembler in the C function, which wouldn't make a lot of sense.

C is a rather beautiful language, but it definitely has its warts.