Two of my favorite algorithms in the C++
`<algorithm>`

header are
`std::lower_bound`

and
`std::upper_bound`

,
which are methods that do a binary search on sorted input. They are useful
because they don't just return a true/false value (as
`std::binary_search`

does); they return an iterator in the input range, so they can be used to as the
basis of higher-level operations on sorted inputs.

These methods are very similar but have slightly different conditions on the
return value, so it's understandably difficult to remember how they differ and
when to use each. I think that remembering **when** to use each is easier than
remembering the formal conditions for their return values, so that's what I'll
present here. This also has the nice property once you know when each, it's easy
to work backwards exactly which element would be returned by each for some
input.

### When To Use These Methods

First a quick digression: why would you want to use any of these methods at all,
instead of using a proper associative container like `std::map`

or `std::set`

?
After all, these containers also provide logarithmic access to elements using
binary search, and can also be updated in log time.

In general, if you have an **immutable** collection you might consider storing
it in a sorted `std::vector`

(or another linear collection like `std::array`

)
instead of a map/set because vectors store all their data in one contiguous
memory region. This means using a sorted vector will use less memory than a
map/set, and the vector will have better data locality.

You can also use these methods if you have a container that is **mostly
immutable**, but which still needs to support occasional insertions. In this
situation the inserts into the vector will be less efficient than inserting into
a `std::map`

or `std::set`

, but if the data access pattern is mostly reads then
it might still be more beneficial to use a vector.

Both of these patterns come up quite often. For example, when using the STL set
methods (e.g.
`std::set_intersection`

)
you could use a `std::set`

as your output iterator, but if you plan on doing
read-only operations with the intersection it will be more efficient to output
to a `std::vector`

. What if you want to compute a set intersection, but you
sometimes also want to insert an extra item into the the intersection? This is
an example of a pattern that is **mostly immutable**, because you're planning on
only doing one extra write after creating the collection. It turns out that this
pattern can be supported efficiently using `std::upper_bound`

as will be
described later.

One final (but very important) note: even though binary search has logarithmic
run time, **in practice it is faster to do linear scans for small collections**
(e.g. a vector of up to 100
ints). There are a number of
reasons for this. The first is that modern CPUs have a hardware that detects
linear memory scans, and the CPU can prefetch memory when this linear access
pattern is detected. When this works properly it means that data will already be
loaded into CPU caches by the time it's needed, avoiding stalls when loading
data from main memory. Modern CPUs also have branch prediction hardware, and
branching patterns that are highly predictable can be greatly accelerated by
pipelining the predicted of path a branch. Binary search is an anathema to both
prefetching and branch prediction, since data is accessed in an essentially
random pattern, and at each level in the binary search the search will branch
left or right in an unpredictable manner. Finally, in some cases compilers can
automatically vectorize linear searches, which improves the IPC of searching
code. If you're interested in this topic, Andrei Alexandrescu touches on this
topic in his excellent CppCon 2019 talk Speed Is Found In The Minds Of
People.

`std::binary_search`

Since I've already mentioned it in passing, I'll first recap
`std::binary_search`

.
This is a method that takes an input iterator, output iterator, and search
value, and using a binary search it returns true if the value is found in the
input range and false if the value was not found. The most common reason to use
this would be if you have a sorted non-associative container like `std::vector`

,
and you want to use binary search to check if it contains some value. This is
conceptually similar to the new
`contains()`

method
added to certain associative containers in C++20.

`std::lower_bound`

Next I would suggest learning
`std::lower_bound`

,
since you'll probably use this more frequently than `std::upper_bound`

.
Formally, it returns the lower bound of a search value, i.e. the first element
in the provided range that is *not less* than the search value. The return value
could be to the end of the range, or to an iterator that compares greater than
the search value, or an iterator that compares equal to the search input; but it
will *never* return an iterator less than the search value.

Instead of remembering how the function works by this formal definition, I think
it's easier to remember that you should **use std::lower_bound when you want
to access the item found by a binary search**. For example, you might want to
search a vector for some item, and if it's found then you want to do something
with the item:

```
// Find item by binary search.
auto it = std::lower_bound(vec.begin(), vec.end(), item);
// Must check that the iterator returned actually equals item before use.
if (it != vec.end() && *it == item) {
it->SomeMethod(); // found the item, do something with it
}
```

In fact, cppreference suggests the following as a possible implementation of
`std::binary_search`

:

```
// Possible implementation of std::binary_search.
template<class ForwardIt, class T>
bool binary_search(ForwardIt first, ForwardIt last, const T& value)
{
first = std::lower_bound(first, last, value);
return (!(first == last) && !(value < *first));
}
```

If the input container contains multiple values equal to the search item, the
iterator returned by `std::lower_bound`

will be the *first* item equal to the
search input. Therefore if you want to do some work with all items matching your
search, you simply increment the returned iterator until you reach the end of
the container or an item not equal to the search input:

```
// Call a method for all items that match the search item.
for (auto it = std::lower_bound(vec.begin(), vec.end(), item);
it != vec.end() && *it == item; ++it) {
it->SomeMethod();
}
```

`std::upper_bound`

Finally we get to
`std::upper_bound`

.
Formally, it returns the first item *strictly greater* than the search input (or
`end`

if no item is greater). The return value could be to the end of the range,
or to an iterator that compares greater than the search value, but it will
*never* return an iterator equal to the search value or less than the search
value. If you look at the previous code example that uses `std::lower_bound`

with a for-loop, `std::upper_bound`

on the same input range would return the
value of `it`

when the for-loop terminates. In fact, we could implement
`std::upper_bound`

this way if we wanted to:

```
// Possible implementation of std::upper_bound.
template <typename ForwardIt, typename T>
upper_bound(ForwardIt begin, ForwardIt end, const T &value) {
ForwardIt it = std::lower_bound(begin, end, value);
while (it != end && *it == value) {
++it;
}
return it;
}
```

Since `std::upper_bound`

will never return an iterator equal to the search
value, you might wonder why anyone would want to use this method. The reason for
the existence of `std::upper_bound`

is that it returns the iterator that can be
used as the insertion position for an insertion
sort, meaning you should **use
std::upper_bound when you want to insert an item into an already-sorted
container**. It has this property because the

`insert()`

method
of `std::vector`

and other STL containers inserts an item **before**the iterator provided to insert, and

`std::upper_bound`

returns an iterator
**after**the search item.

I usually use `std::upper_bound`

when I have a sorted collection that is
**mostly immutable**, but when I still want to rarely insert into the
collection. As already discussed, if you're doing a lot of insertions, a
tree-based associative container like `std::set`

or `absl::btree_set`

will
provide sorted semantics and will be much more efficient than a vector, but for
a read-mostly access pattern using a vector may be more efficient.

Here are two methods that take an already-sorted vector, and insert a single
item into the vector, maintaining the sorted property of the vector (i.e. after
calling this method, calling
`std::is_sorted`

on the
vector will still return true). This first method uses the naive approach of
appending + sorting, the second method uses `std::upper_bound`

to insert
directly into the appropriate position in the vector:

```
// XXX: Bad way to insert into a sorted vector.
template <typename T>
void BadInsertIntoSortedVector(std::vector<T> &vec, const T &value) {
vec.emplace_back(value);
std::sort(vec.begin(), vec.end());
}
// Good way to insert into a sorted vector.
template <typename T>
void InsertIntoSortedVector(std::vector<T> &vec, const T &value) {
auto it = std::upper_bound(vec.begin(), vec.end(), value);
vec.insert(it, value);
}
```

In fact, for small collections (up to 1000 elements or so) pretty much all STL
implementations of `std::sort`

use insertion sort, so for a small collection the
two approaches above are almost equivalent. However, `emplace_back/push_back`

followed by `std::sort`

is *still* worse than using `std::upper_bound`

, since
`std::sort`

still has to traverse the entire input range to find the first
non-sorted position (the item that was just appended).