Content
Ordered and Sorted Containers Binary Search and Bounds Representing Sets Union-Find <- Go BackUniversity of Michigan at Ann Arbor
Last Edit Date: 06/07/2023
Disclaimer and Term of Use:
We do not guarantee the accuracy and completeness of the summary content. Some of the course material may not be included, and some of the content in the summary may not be correct. You should use this file properly and legally. We are not responsible for any results from using this file
This personal note is adapted from Professor Marcus Darden and David Paoletti. Please contact us to delete this file if you think your rights have been violated.
This work is licensed under a Creative Commons Attribution 4.0 International License.
i. Container Review
Object storing a variable number of other objects
Allow for control / protection of data
Can copy / edit / sort / move many objects at once
Examples: vector, deque, stack, map, list, array
ii. Types of Containers
Type | Distinctive interfaces (not all methods listed) |
---|---|
Container | Supports add() and remove() operations |
Searchable Container | Adds find() operation |
Ordered Container | Sequential container which maintains current order. Can arbitrarily insert new elements anywhere. Example: Books in a shelf |
Sorted Container | Sequential container with pre-defined order. Cannot arbitrarily insert elements. Example: Students sorted by ID |
iii. Ordered Clarification
Ordered container elements maintain thier "relative" position unless they are removed
Example: If A comes before Q, and then Z is inserted between them, their relative ordering has not changed
If Q then removed, the container is still ordered
iv. Implemented Sorted and Ordered Containers
Preferred implementation dependent upon requirements of application
Study multiple implementations
v. Asymptototic Complexities
Ordered Container:
Operation | Array | Linked List |
---|---|---|
addElement(val) |
$$O(1)$$ | $$O(1)$$ |
remove(val) |
$$O(n)$$ | $$O(n)$$ |
remove(iterator) |
$$O(n)$$ | $$O(n)~\text{or}~O(1)$$ |
find(val) |
$$O(n)$$ | $$O(n)$$ |
Iterator::operator*() |
$$O(1)$$ | $$O(1)$$ |
operator[](unsigned) |
$$O(1)$$ | $$O(n)$$ |
insertAfter(iterator, val) |
$$O(n)$$ | $$O(1)$$ |
insertBefore(iterator, val) |
$$O(n)$$ | $$O(n)~\text{or}~O(1)$$ |
Sorted Container:
Operation | Array | Linked List |
---|---|---|
addElement(val) |
$$O(n)$$ | $$O(n)$$ |
remove(val) |
$$O(n)$$ | $$O(n)$$ |
remove(iterator) |
$$O(n)$$ | $$O(n)~\text{or}~O(1)$$ |
find(val) |
$$O(\log n)$$ | $$O(n)$$ |
Iterator::operator*() |
$$O(1)$$ | $$O(1)$$ |
operator[](unsigned) |
$$O(1)$$ | $$O(n)$$ |
insertAfter(iterator, val) |
||
insertBefore(iterator, val) |
Note: Binary search requires the data to be sorted.
Sample code:
1 int bsearch(double a[], double val,
2 int left, int right) {
3 while (right > left) {
4 int mid = left + (right - left) / 2;
5 if (val == a[mid])
6 return mid;
7 if (val < a[mid])
8 right = mid;
9 else
10 left = mid + 1;
11 } // while
12 return -1; // val not found
13 } // bsearch()
Comparators (Function Objects)
Given elements a and b, tell if a < b.
1 struct Point {
2 int x, y;
3 Point() : x(0), y(0) {}
4 Point(int xx, int yy) : x(xx), y(yy) {};
5 };
1 struct CompareByX {
2 bool operator() (const Point &a, const Point &b) const {
3 return a.x > b.x; // Dscending sort by X
4 }
5 };
1 struct CompareByY {
2 bool operator() (const Point &a, const Point &b) const {
3 return a.y > b.y; // Ascending sort by Y
4 }
5 };
Speeding up Binary Search
Speed-up idea: ==
rarely triggers
check if <
first
else if >
else must be ==
More radical idea: move the ==
check out of the loop
Find a sharp lower bound for the sought element first
First item >=
what I am looking for
Check for the value ==
after the loop
2 Comparasions Half the Time:
1 int bsearch(double a[], double val,
2 int left, int right) {
3 while (right > left) { // ONE
4 int mid = left + (right - left) / 2;
5
6 if (a[mid] < val) // TWO: check < not ==
7 left = mid + 1;
8 else if (val < a[mid]) // THREE
9 right = mid;
10 else // (val == a[mid])
11 return mid;
12 } // while
13
14 return -1; // val not found
15 } // bsearch()
Always 2 Comparasions / Loop
1 int lower_bound(double a[], double val,
2 int left, int right) {
3 while (right > left) { // ONE
4 int mid = left + (right - left) / 2;
5
6 if (a[mid] < val) // TWO
7 left = mid + 1;
8 else // (a[mid] >= val)
9 right = mid;
10 } // while
11
12 return left;
13 } // lower_bound()
Binary Search in STL
binary_search()
returns a bool
, not the location.
To find the locations (iterators), use
lower_bound()
First item not less than target
upper_bound()
First item greater than target
equal_range()
Pair(lower_bound, upper_bound), all items equal to target
Try it out:
i. Searchable Containers as Sets
A set is well-defined if you can tell if any given element is in the set.
Set Operations (SLT implements many of these):
Union ($\cup$): in one set or the other (OR)
Intersection ($\cap$): in both sets (AND)
Set Difference (\): in one and not the other (AND-NOT)
Symmetric Difference ($\div$): in only one (XOR)
addElement ($+$)
isElement ($\in$)
isEmpty
set_union()
Example
Set 1 and Set 2 are sorted ranges. Set 3 is a union of Set 1 and Set 2.
set_union
Example Code
1 template<class ForIterator1, class ForIterator2, class OutIterator,
2 class Compare>
3 OutIterator set_union(ForIterator1 first1, ForIterator1 last1,
4 ForIterator2 first2, ForIterator2 last2,
5 OutIterator result, Compare compare) {
6 while (first1 != last1 && first2 != last2) {
7 if (compare(*first1, *first2))
8 *result++ = *first1++; // set1 element less than set2 element
9 else if (compare(*first2, *first1))
10 *result++ = *first2++; // set2 element less than set1 element
11 else {
12 *result++ = *first1++; // set1 element == set2 element
13 ++first2;
14 } // else
15 } // while
16 while (first1 != last1)
17 *result++ = *first1++; // Remaining elements
18 while (first2 != last2)
19 *result++ = *first2++; // Remaining elements
20 return result; // returns end of sorted union of set1 and set2
21 } // set_union()
Implementing [sub]sets with ranges
Method | Asymptotic Complexity |
---|---|
initialize() |
$$O(1)\text{ or }O(n \log n)$$ |
clear() |
$$O(1)\text{ or }O(n)$$ |
isMember() |
$$O(\log n)$$ |
copy() |
$$O(n)$$ |
set_union() |
$$O(n)$$ |
set_intersection() |
$$O(n)$$ |
Universe: set of all elements that may be in a set.
i. Representing Disjoint Sets
Sets are disjoint if they do not share any elements (i.e. an element may not only belong to one set)
Many applications require representing and operating on disjoint sets.
Example:
$A = \{1, 9, 3, 2, 10\},~B = \{23, 13, 25\},~C = \{14, 15, 27, 6, 16, 17\}$ are disjoint.
In this context, only two set operations make sense:
Find: check if an element belongs to a particular set (or if two belong to the same set)
Union: merge two sets together into a single set
Consequently, data structures for representing disjoint sets are often referred to as "union-field".
Example 1: Separate Containers
Time complexity: Union()
- $O(n)$; Find()
- $O(n)$
Union-Fild Data Structure
Idea 1: every disjiont set should have its unique representive (selected element)
Therefore: to tell if j and m are in the same set, compare their representatives
Find()
operation becomes quites fastExample 2: Representatives
Making Union-Find Faster
Idea 2: When performing union of two sets, update the smaller set (less work)
Measure complexity of all unions throughout the lifecycle (together)
We call Union()
exactly n - 1 times
If we connect to a disjoint element every time, it will take n time total (best case)
Merging large sets, say n / 2 and n / 2 elements, will take $O(n)$ time for one Union()
- too slow
Smarter Union-Find
Idea 3: No need to store the actual representative for each element, as long as you can find it quickly
Each element knows someone who knows the representative (may need more steps)
Union()
becomes very fast: one of representatives will need to know the other
Find()
becomes slower
Example 3: Hierarchical Representatives
Path Compression
So far, Find()
was read-only
For element j, finds the representative k
Traverses other elements on the way (for which k is also the representative)
Idea 4: We can tell j that its representative is now k
Same for other elements on path from j to k
Doubles workload of Find()
, but same Big-O
Complexity with Path Compression
Must use amortized analysis over the life cycle of union-find (starting with n disjoint sets, and merging until there is one set containing all elements)
Result is surprising
$O(n \times \alpha(n))$, where $\alpha()$ grows very slowly
$\alpha()$ is the inverse-Ackerman function
In practice, almost-linear-time performance
Details taught in more advanced courses
Example 4: Path Compression