Don't use div(3)

September 16, 2021

The C math library has functions div(3) and relatives ldiv(3) and lldiv(3) that calculate division and modulus operations in one function. That is, instead of some code like:

int quot = a / b;
int rem = a % b;
// Do something with quot and rem

You could instead write:

div_t result = div(a, b);
// Do something with result.quot and result.rem

This might seem kind of useless, but maybe libc does something smart to make the div(3) function more optimized? For example, maybe it computes both the quotient and the remainder at once. It's a nice idea, but this isn't what actually happens. Here's the glibc implementation of div:

div_t div (int numer, int denom) {
  div_t result;
  result.quot = numer / denom;
  result.rem = numer % denom;
  return result;
}

So it turns out that div(3) is kind of useless, because it doesn't save anything over using the regular / and % operators, except perhaps being one line of code shorter. On x86 the div instruction calculates both the quotient and the remainder at once, so even though the implementation above looks like it does two divisions (one for / and the other for %), the compiler will actually just emit a single div instruction that does both operations. When I was playing with this Clang actually generated two branches, each doing one div, and only one of which is taken; but that's basically the same thing.

In fact, things are even worse if the division is by a constant. It's well known that integer division where the denominator is a constant known at compile-time can be implemented using multiply/shift/add operations instead of using a bona fide div instruction. This means that if you use div(3) where the denominator is a constant, the compiler will needlessly implement the division using an actual div instruction on x86, rather than the multiply/shift trick.

Here's a simplified example showing the problem (Godbolt link). The goal here is to take a 64-bit hash of a string and turn it into a string short code with len characters:

#include <array>
#include <cmath>
#include <string>

#define USE_DIV 0

std::string Shorten(uint64_t hash, size_t len) {
  constexpr std::array<char, 26> letters{
      'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
      'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'};

  std::string s;
  s.reserve(len);
  for (size_t i = 0; i < len; i++) {
#if USE_DIV
    auto result = std::lldiv(hash, letters.size());
    s.push_back(result.rem);
    hash = result.quot;
#else
    s.push_back(letters[hash % letters.size()]);
    hash /= letters.size();
#endif
  }
  return s;
}

The .size() method on a std::array is constexpr, so the compiler will replace letters.size() with the integer literal 26. When we have #define USE_DIV 0 then the compiler will recognize that the division and modulus operations are by the same constant value, and will implement both operations using a single instance of the shift/multiply trick. If you instead change the code to #define USE_DIV 1 then the compiler will instead use an actual function call to div(3), which will then use the (expensive) x86 div instruction.