How To Use std::lower_bound and std::upper_bound

January 4, 2021

Two of my favorite algorithms in the C++ <algorithm> header are std::lower_bound and std::upper_bound, which are methods that do a binary search on sorted input. They are useful because they don't just return a true/false value (as std::binary_search does); they return an iterator in the input range, so they can be used to as the basis of higher-level operations on sorted inputs.

These methods are very similar but have slightly different conditions on the return value, so it's understandably difficult to remember how they differ and when to use each. I think that remembering when to use each is easier than remembering the formal conditions for their return values, so that's what I'll present here. This also has the nice property once you know when each, it's easy to work backwards exactly which element would be returned by each for some input.

When To Use These Methods

First a quick digression: why would you want to use any of these methods at all, instead of using a proper associative container like std::map or std::set? After all, these containers also provide logarithmic access to elements using binary search, and can also be updated in log time.

In general, if you have an immutable collection you might consider storing it in a sorted std::vector (or another linear collection like std::array) instead of a map/set because vectors store all their data in one contiguous memory region. This means using a sorted vector will use less memory than a map/set, and the vector will have better data locality.

You can also use these methods if you have a container that is mostly immutable, but which still needs to support occasional insertions. In this situation the inserts into the vector will be less efficient than inserting into a std::map or std::set, but if the data access pattern is mostly reads then it might still be more beneficial to use a vector.

Both of these patterns come up quite often. For example, when using the STL set methods (e.g. std::set_intersection) you could use a std::set as your output iterator, but if you plan on doing read-only operations with the intersection it will be more efficient to output to a std::vector. What if you want to compute a set intersection, but you sometimes also want to insert an extra item into the the intersection? This is an example of a pattern that is mostly immutable, because you're planning on only doing one extra write after creating the collection. It turns out that this pattern can be supported efficiently using std::upper_bound as will be described later.

One final (but very important) note: even though binary search has logarithmic run time, in practice it is faster to do linear scans for small collections (e.g. a vector of up to 100 ints). There are a number of reasons for this. The first is that modern CPUs have a hardware that detects linear memory scans, and the CPU can prefetch memory when this linear access pattern is detected. When this works properly it means that data will already be loaded into CPU caches by the time it's needed, avoiding stalls when loading data from main memory. Modern CPUs also have branch prediction hardware, and branching patterns that are highly predictable can be greatly accelerated by pipelining the predicted of path a branch. Binary search is an anathema to both prefetching and branch prediction, since data is accessed in an essentially random pattern, and at each level in the binary search the search will branch left or right in an unpredictable manner. Finally, in some cases compilers can automatically vectorize linear searches, which improves the IPC of searching code. If you're interested in this topic, Andrei Alexandrescu touches on this topic in his excellent CppCon 2019 talk Speed Is Found In The Minds Of People.

`std::binary_search`

Since I've already mentioned it in passing, I'll first recap std::binary_search. This is a method that takes an input iterator, output iterator, and search value, and using a binary search it returns true if the value is found in the input range and false if the value was not found. The most common reason to use this would be if you have a sorted non-associative container like std::vector, and you want to use binary search to check if it contains some value. This is conceptually similar to the new contains() method added to certain associative containers in C++20.

`std::lower_bound`

Next I would suggest learning std::lower_bound, since you'll probably use this more frequently than std::upper_bound. Formally, it returns the lower bound of a search value, i.e. the first element in the provided range that is not less than the search value. The return value could be to the end of the range, or to an iterator that compares greater than the search value, or an iterator that compares equal to the search input; but it will never return an iterator less than the search value.

Instead of remembering how the function works by this formal definition, I think it's easier to remember that you should use std::lower_bound when you want to access the item found by a binary search. For example, you might want to search a vector for some item, and if it's found then you want to do something with the item:

// Find item by binary search.
auto it = std::lower_bound(vec.begin(), vec.end(), item);

// Must check that the iterator returned actually equals item before use.
if (it != vec.end() && *it == item) {
    it->SomeMethod();  // found the item, do something with it
}

In fact, cppreference suggests the following as a possible implementation of std::binary_search:

// Possible implementation of std::binary_search.
template<class ForwardIt, class T>
bool binary_search(ForwardIt first, ForwardIt last, const T& value)
{
    first = std::lower_bound(first, last, value);
    return (!(first == last) && !(value < *first));
}

If the input container contains multiple values equal to the search item, the iterator returned by std::lower_bound will be the first item equal to the search input. Therefore if you want to do some work with all items matching your search, you simply increment the returned iterator until you reach the end of the container or an item not equal to the search input:

// Call a method for all items that match the search item.
for (auto it = std::lower_bound(vec.begin(), vec.end(), item);
     it != vec.end() && *it == item; ++it) {
    it->SomeMethod();
}

`std::upper_bound`

Finally we get to std::upper_bound. Formally, it returns the first item strictly greater than the search input (or end if no item is greater). The return value could be to the end of the range, or to an iterator that compares greater than the search value, but it will never return an iterator equal to the search value or less than the search value. If you look at the previous code example that uses std::lower_bound with a for-loop, std::upper_bound on the same input range would return the value of it when the for-loop terminates. In fact, we could implement std::upper_bound this way if we wanted to:

// Possible implementation of std::upper_bound.
template <typename ForwardIt, typename T>
upper_bound(ForwardIt begin, ForwardIt end, const T &value) {
    ForwardIt it = std::lower_bound(begin, end, value);
    while (it != end && *it == value) {
      ++it;
    }
    return it;
}

Since std::upper_bound will never return an iterator equal to the search value, you might wonder why anyone would want to use this method. The reason for the existence of std::upper_bound is that it returns the iterator that can be used as the insertion position for an insertion sort, meaning you should use std::upper_bound when you want to insert an item into an already-sorted container. It has this property because the insert() method of std::vector and other STL containers inserts an item before the iterator provided to insert, and std::upper_bound returns an iterator after the search item.

I usually use std::upper_bound when I have a sorted collection that is mostly immutable, but when I still want to rarely insert into the collection. As already discussed, if you're doing a lot of insertions, a tree-based associative container like std::set or absl::btree_set will provide sorted semantics and will be much more efficient than a vector, but for a read-mostly access pattern using a vector may be more efficient.

Here are two methods that take an already-sorted vector, and insert a single item into the vector, maintaining the sorted property of the vector (i.e. after calling this method, calling std::is_sorted on the vector will still return true). This first method uses the naive approach of appending + sorting, the second method uses std::upper_bound to insert directly into the appropriate position in the vector:

// XXX: Bad way to insert into a sorted vector.
template <typename T>
void BadInsertIntoSortedVector(std::vector<T> &vec, const T &value) {
    vec.emplace_back(value);
    std::sort(vec.begin(), vec.end());
}

// Good way to insert into a sorted vector.
template <typename T>
void InsertIntoSortedVector(std::vector<T> &vec, const T &value) {
    auto it = std::upper_bound(vec.begin(), vec.end(), value);
    vec.insert(it, value);
}

In fact, for small collections (up to 1000 elements or so) pretty much all STL implementations of std::sort use insertion sort, so for a small collection the two approaches above are almost equivalent. However, emplace_back/push_back followed by std::sort is still worse than using std::upper_bound, since std::sort still has to traverse the entire input range to find the first non-sorted position (the item that was just appended).