University of Michigan at Ann Arbor
Last Edit Date: 04/10/2023
Disclaimer and Term of Use:
We do not guarantee the accuracy and completeness of the summary content. Some of the course material may not be included, and some of the content in the summary may not be correct. You should use this file properly and legally. We are not responsible for any results from using this file
This personal note is adapted from Professor Amir Kamil, Andrew DeOrio, James Juett, Sofia Saleem, and Saquib Razak. Please contact us to delete this file if you think your rights have been violated.
This work is licensed under a Creative Commons Attribution 4.0 International License.
A map is a data structure that maps keys to values. It is thus an associative data structure, since it associates a value with each key. As an example, we may want to associate populations (in million) with countires:
Russia | 143.4 |
Canada | 38.25 |
China | 1412 |
United States | 331.9 |
We can store these data in a standard-library map:
1 std::map<std::string, double> pop;
2 pop["Russia"] = 143.4;
3 pop["Canada"] = 38.25;
4 pop["China"] = 1412;
5 pop["United States"] = 331.9;
6 cout << pop["China"] << endl; // prints 1412
7 cout << pop["United States"] << endl; // prints 331.9
The map
class template has two required template parameters, the first for the key type and the second for the value type.
We are mapping string names to number of populations, so we use a map<string, double>
. We can then use the subscript operator with a key to insert a key-value pair into the map, or to look up the value associated with a key.
A key feature of the standard-library map is that it is not erroneous to look up a non-existent key. Instead, the map inserts and returns a value-initialized datum for the key. For primitive types, value initialization produces the zero value for the type. (This is distinct from default initialization, which produces an undefined value.)
cout << pop["Belarus"] << endl; // prints 0
Try it out:
Here is another example implemented by a map:
1 std::map<std::string, int> scores;
2 scores["aliceywu"] = 100;
3 scores["akamil"] = 23;
4 scores["taligoro"] = 100;
5 scores["jjuett"] = 73;
A map is similar to a set in that the keys in the map are unique; each key is associated with at most one value. However, the values themselves are not unique. The map above has two keys that are associated with the value 100.
Given the similarity with a set, we can also implement a map using a binary search tree. However, rather than using the key-value pair for ordering and uniqueness, we need to use just the key, and the value merely tags along for the ride, as shown below.
To make this work, we need a heterogeneous data type for the datum
member of a node, so that it can store separate key and value items. We can define our own struct or class, or we can use the pair
template in the standard <utility>
library.
1 std::pair<int, bool> p1; // default constructor value initializes both items
2 p1.first -= 5; // first now has value -5
3 p1.second = false;
4
5 std::pair<string, int> p2 = { "hello", 4 }; // explicit initialization
6 cout << p2.first << ": " << p2.second << endl; // prints hello: 4
A pair stores a first and second item, which can be of different types. The pair
template is parameterized by the types of these items.
Try it out:
We can use a pair and a BST to implement a map:
1 template <typename Key_type, typename Value_type>
2 class Map {
3 public:
4 // EFFECTS: Returns whether this map is empty.
5 bool empty() const;
6
7 // EFFECTS: Returns the number of key-value pairs in this map.
8 size_t size() const;
9
10 // EFFECTS: Returns the value associated with the given key. If
11 // the key is not in the map, inserts the key,
12 // associating it with a value-initialized object of type
13 // Value_type, and returns the newly inserted value.
14 Value_type& operator[](const Key_type& key);
15
16 private:
17 // Type alias for a key-value pair.
18 using Pair_type = std::pair<Key_type, Value_type>;
19
20 // Comparator that compares pairs by their key components.
21 class PairLess {
22 public:
23 bool operator()(...) const;
24 };
25
26 // Data representation.
27 BinarySearchTree<Pair_type, PairLess> entries;
28 };
For this to work, we need a BinarySearchTree
that can take a custom comparator, to compare key-value pairs by just the value component. This comparator can be defaulted to std::less
, which compares elements according to the <
operator:
template <typename T, typename Compare=std::less<T>>
class BinarySearchTree {
...
};
Suppose we wanted to write a function template that prints out the elements of sequence. We could write it to work with itertors:
1 template <typename Iter_type>
2 void print_all(Iter_type begin, Iter_type end) {
3 for (Iter_type it = begin; it != end; ++it) {
4 cout << *it << endl;
5 }
6 }
On the other hand, we desire an interface that works directly on a sequence container:
1 template <typename Sequence>
2 void print_all(const Sequence &sequence) {
3 for (typename Sequence::Iterator it = sequence.begin();
4 it != sequence.end(); ++it) {
5 cout << *it << endl;
6 }
7 }
Declaring the iterator type is very verbose, as it consists of both a qualified name and the typename
keyword, since the qualified name is a dependent type. Furthermore, the declaration makes the assumption that the iterator type is named Iterator
, which is not true for standard-library containers that use lower-case type names.
Rather than writing out the type of it
explicitly, we can ask the compiler to deduce it for us by using the auto
keyword:
1 template <typename Sequence>
2 void print_all(const Sequence &sequence) {
3 for (auto it = sequence.begin(); it != sequence.end(); ++it) {
4 cout << *it << endl;
5 }
6 }
The compiler deduces the type from its initialization. If the return type of sequence.begin()
is List<int>::Iterator
, then the type of it
is deduced to have type List<int>::Iterator
. On the other hand, if the return type is vector<Duck>::iterator
, then the type of it
is deduced to be vector<Duck>::iterator
. The return type need not be a nested type; if it is char *
, then it
will be deduced to have type char *
. Thus, not only does type deduction save us keystrokes for complicated types, it also abstracts over how the types are defined.
A range-based for loop is a special syntax for iterating over sequences that support traversal by iterator. Rather than writing out each piece of the traversal, we can have the compiler generate it for us:
1 vector<int> vec = { 1, 2, 3, 4, 5 };
2 for (int item : vec) {
3 cout << item << endl;
4 }
Try it out:
The syntax of a range-based for loop is:
for (<type> <variable> : <sequence>)
<body>
In the example above, the variable is named item
and has type int
, and vec
is the sequence over which the loop iterates. The compiler automatically converts this into a traversal by iterator:
1 for (auto it = vec.begin(); it != vec.end(); ++it) {
2 int item = *it;
3 cout << item << endl;
4 }
The variable item
is initialized in each iteration from an element in the sequence. Then the code in the body of the range-based for loop follows.
Try it out:
The following loop attempts to set every element in the vector to the value 42:
1 for (int item : vec) {
2 item = 42; // attempt to set element to 42
3 }
However, this loop does not actually modify any elements. To understand why, let us take a look at its translation into a traversal by iterator:
1 for (auto it = vec.begin(); it != vec.end(); ++it) {
2 int item = *it;
3 item = 42;
4 }
Try it out:
The loop is actually modifying a copy of each element rather than an element itself. To avoid making a copy, we can declare the loop variable to be a reference:
1 for (int &item : vec) {
2 item = 42; // actually set element to 42
3 }
This translates to:
1 for (auto it = vec.begin(); it != vec.end(); ++it) {
2 int &item = *it;
3 item = 42;
4 }
This successfully modifies each element to have the value 42.
Try it out:
Range-based for loops are often used in combination with auto, as in the following:
1 vector<int> vec = { 1, 2, 3, 4, 5 };
2
3 for (auto &item : vec) {
4 item = 42;
5 }
6
7 for (auto item : vec) {
8 cout << item << endl;
9 }
Try it out:
The first loop declares item
as a reference, so that it aliases an element in the sequence. The second loop does not declare item
as a reference, so it produces a copy of an element in the sequence. The following is the translation of both loops:
1 vector<int> vec = { 1, 2, 3, 4, 5 };
2
3 for (auto it = vec.begin(); it != vec.end(); ++it) {
4 int &item = *it; // alias
5 item = 42;
6 }
7
8 for (auto it = vec.begin(); it != vec.end(); ++it) {
9 int item = *it; // copy
10 item = 42;
11 }
With a range-based for loop, we can simplify print_all()
:
1 template <typename Sequence>
2 void print_all(const Sequence &sequence) {
3 for (auto &item : sequence) {
4 cout << item << endl;
5 }
6 }
Try it out:
Iterating over a Map
The standard-library map defines an iterator that produces key-value pairs upon dereference. The following is an example that constructs a map, counting how many times each unique word occurs in a vector. It then iterates over the map to print out the counts.
1 void print_word_counts(const std::vector<std::string> &words) {
2 std::map<std::string, int> word_counts;
3
4 // Each time a word is seen, add 1 to its entry in the map
5 for (const auto &word : words) {
6 word_counts[word] += 1;
7 }
8
9 // Print out the results by iterating through the map.
10 for (const auto &key_value : word_counts) { // key-value pairs
11 const auto &word = key_value.first;
12 const auto &count = key_value.second;
13 cout << word << "occurred " << count << " times." << endl;
14 }
15 }
When incrementing a word’s count, we do not need to check whether the word is in the map; if not, the map automatically inserts the word into the map with a value-initialized count of zero, which we then increment to one.
We use range-based for loops to iterate over both the vector and the map, declaring a type-deduced reference to each element. A reference avoids making a copy, which is nontrivial for strings. The iteration over the map produces key-value pairs (std::pair<std::string, int>
), and we access the first
and second
members to obtain the word and count, respectively.