Two of my favorite algorithms in the C++
<algorithm>
header are
std::lower_bound
and
std::upper_bound
,
which are methods that do a binary search on sorted input. They are useful
because they don't just return a true/false value (as
std::binary_search
does); they return an iterator in the input range, so they can be used to as the
basis of higher-level operations on sorted inputs.
These methods are very similar but have slightly different conditions on the return value, so it's understandably difficult to remember how they differ and when to use each. I think that remembering when to use each is easier than remembering the formal conditions for their return values, so that's what I'll present here. This also has the nice property once you know when each, it's easy to work backwards exactly which element would be returned by each for some input.
When To Use These Methods
First a quick digression: why would you want to use any of these methods at all,
instead of using a proper associative container like std::map
or std::set
?
After all, these containers also provide logarithmic access to elements using
binary search, and can also be updated in log time.
In general, if you have an immutable collection you might consider storing
it in a sorted std::vector
(or another linear collection like std::array
)
instead of a map/set because vectors store all their data in one contiguous
memory region. This means using a sorted vector will use less memory than a
map/set, and the vector will have better data locality.
You can also use these methods if you have a container that is mostly
immutable, but which still needs to support occasional insertions. In this
situation the inserts into the vector will be less efficient than inserting into
a std::map
or std::set
, but if the data access pattern is mostly reads then
it might still be more beneficial to use a vector.
Both of these patterns come up quite often. For example, when using the STL set
methods (e.g.
std::set_intersection
)
you could use a std::set
as your output iterator, but if you plan on doing
read-only operations with the intersection it will be more efficient to output
to a std::vector
. What if you want to compute a set intersection, but you
sometimes also want to insert an extra item into the the intersection? This is
an example of a pattern that is mostly immutable, because you're planning on
only doing one extra write after creating the collection. It turns out that this
pattern can be supported efficiently using std::upper_bound
as will be
described later.
One final (but very important) note: even though binary search has logarithmic run time, in practice it is faster to do linear scans for small collections (e.g. a vector of up to 100 ints). There are a number of reasons for this. The first is that modern CPUs have a hardware that detects linear memory scans, and the CPU can prefetch memory when this linear access pattern is detected. When this works properly it means that data will already be loaded into CPU caches by the time it's needed, avoiding stalls when loading data from main memory. Modern CPUs also have branch prediction hardware, and branching patterns that are highly predictable can be greatly accelerated by pipelining the predicted of path a branch. Binary search is an anathema to both prefetching and branch prediction, since data is accessed in an essentially random pattern, and at each level in the binary search the search will branch left or right in an unpredictable manner. Finally, in some cases compilers can automatically vectorize linear searches, which improves the IPC of searching code. If you're interested in this topic, Andrei Alexandrescu touches on this topic in his excellent CppCon 2019 talk Speed Is Found In The Minds Of People.
std::binary_search
Since I've already mentioned it in passing, I'll first recap
std::binary_search
.
This is a method that takes an input iterator, output iterator, and search
value, and using a binary search it returns true if the value is found in the
input range and false if the value was not found. The most common reason to use
this would be if you have a sorted non-associative container like std::vector
,
and you want to use binary search to check if it contains some value. This is
conceptually similar to the new
contains()
method
added to certain associative containers in C++20.
std::lower_bound
Next I would suggest learning
std::lower_bound
,
since you'll probably use this more frequently than std::upper_bound
.
Formally, it returns the lower bound of a search value, i.e. the first element
in the provided range that is not less than the search value. The return value
could be to the end of the range, or to an iterator that compares greater than
the search value, or an iterator that compares equal to the search input; but it
will never return an iterator less than the search value.
Instead of remembering how the function works by this formal definition, I think
it's easier to remember that you should use std::lower_bound
when you want
to access the item found by a binary search. For example, you might want to
search a vector for some item, and if it's found then you want to do something
with the item:
// Find item by binary search.
auto it = std::lower_bound(vec.begin(), vec.end(), item);
// Must check that the iterator returned actually equals item before use.
if (it != vec.end() && *it == item) {
it->SomeMethod(); // found the item, do something with it
}
In fact, cppreference suggests the following as a possible implementation of
std::binary_search
:
// Possible implementation of std::binary_search.
template<class ForwardIt, class T>
bool binary_search(ForwardIt first, ForwardIt last, const T& value)
{
first = std::lower_bound(first, last, value);
return (!(first == last) && !(value < *first));
}
If the input container contains multiple values equal to the search item, the
iterator returned by std::lower_bound
will be the first item equal to the
search input. Therefore if you want to do some work with all items matching your
search, you simply increment the returned iterator until you reach the end of
the container or an item not equal to the search input:
// Call a method for all items that match the search item.
for (auto it = std::lower_bound(vec.begin(), vec.end(), item);
it != vec.end() && *it == item; ++it) {
it->SomeMethod();
}
std::upper_bound
Finally we get to
std::upper_bound
.
Formally, it returns the first item strictly greater than the search input (or
end
if no item is greater). The return value could be to the end of the range,
or to an iterator that compares greater than the search value, but it will
never return an iterator equal to the search value or less than the search
value. If you look at the previous code example that uses std::lower_bound
with a for-loop, std::upper_bound
on the same input range would return the
value of it
when the for-loop terminates. In fact, we could implement
std::upper_bound
this way if we wanted to:
// Possible implementation of std::upper_bound.
template <typename ForwardIt, typename T>
upper_bound(ForwardIt begin, ForwardIt end, const T &value) {
ForwardIt it = std::lower_bound(begin, end, value);
while (it != end && *it == value) {
++it;
}
return it;
}
Since std::upper_bound
will never return an iterator equal to the search
value, you might wonder why anyone would want to use this method. The reason for
the existence of std::upper_bound
is that it returns the iterator that can be
used as the insertion position for an insertion
sort, meaning you should use
std::upper_bound
when you want to insert an item into an already-sorted
container. It has this property because the
insert()
method
of std::vector
and other STL containers inserts an item before the
iterator provided to insert, and std::upper_bound
returns an iterator
after the search item.
I usually use std::upper_bound
when I have a sorted collection that is
mostly immutable, but when I still want to rarely insert into the
collection. As already discussed, if you're doing a lot of insertions, a
tree-based associative container like std::set
or absl::btree_set
will
provide sorted semantics and will be much more efficient than a vector, but for
a read-mostly access pattern using a vector may be more efficient.
Here are two methods that take an already-sorted vector, and insert a single
item into the vector, maintaining the sorted property of the vector (i.e. after
calling this method, calling
std::is_sorted
on the
vector will still return true). This first method uses the naive approach of
appending + sorting, the second method uses std::upper_bound
to insert
directly into the appropriate position in the vector:
// XXX: Bad way to insert into a sorted vector.
template <typename T>
void BadInsertIntoSortedVector(std::vector<T> &vec, const T &value) {
vec.emplace_back(value);
std::sort(vec.begin(), vec.end());
}
// Good way to insert into a sorted vector.
template <typename T>
void InsertIntoSortedVector(std::vector<T> &vec, const T &value) {
auto it = std::upper_bound(vec.begin(), vec.end(), value);
vec.insert(it, value);
}
In fact, for small collections (up to 1000 elements or so) pretty much all STL
implementations of std::sort
use insertion sort, so for a small collection the
two approaches above are almost equivalent. However, emplace_back/push_back
followed by std::sort
is still worse than using std::upper_bound
, since
std::sort
still has to traverse the entire input range to find the first
non-sorted position (the item that was just appended).