Contents
Introduction to Complexity Analysis
Metrics of Algorithm Complexity
Counting Steps
Big-O Notation
Big-O, Big-Theta, and Big-Omega
Amortized Complexity
University of Michigan at Ann Arbor
Last Edit Date: 05/09/2023
Disclaimer and Terms of Use:
We do not guarantee the accuracy or completeness of the summary content. Some of the course material may not be included, and some of the content in the summary may not be correct. You should use this file properly and legally. We are not responsible for any results from using this file.
This personal note is adapted from Professor Marcus Darden and David Paoletti. Please contact us to delete this file if you think your rights have been violated.
This work is licensed under a Creative Commons Attribution 4.0 International License.
1. What is it?
Given an algorithm and input size $n$, how many steps are needed?
Each step should take $O(1)$ time
As input size grows, how does the number of steps change? We need to focus on the trend
2. How do we measure complexity?
Express the rate of growth as function $f(n)$
Use the big-O notation
3. Why is this important?
Tells how well an algorithm scales to larger inputs
Given two algorithms, we can compare performance before implementation
Example: Find an element in an array. The red square represents the best case; the green triangle represents the average case; the blue oval represents the worst case.
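As a concrete sketch of the search being described (this code is ours, not from the course; the name findElement is hypothetical):

```cpp
#include <cstddef>
#include <vector>

// Linear search: returns the index of target, or -1 if absent.
// Best case: target sits at index 0 (1 comparison).
// Average case: target sits near the middle (about n/2 comparisons).
// Worst case: target is last or absent (n comparisons).
int findElement(const std::vector<int>& v, int target) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (v[i] == target) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```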
What Affects Runtime?
The algorithm
Implementation details
CPU speed / memory speed
Compiler (options used)
g++ -g3
(for debugging, highest level of information)
g++ -O3
(optimization level 3 for speed)
Other programs running in parallel
Amount of data processed (input size)
Input Size vs Runtime
Rate of growth independent of most factors (CPU speed, compiler, etc.)
We measure input size using the following:
Number of bits: int (32 bits), double (64 bits)
Number of items
Notation and terminology
Common Orders of Functions
Order | Name |
---|---|
$$O(1)$$ | Constant |
$$O(\log n)$$ | Logarithmic |
$$O(n)$$ | Linear |
$$O(n \log n)$$ | Loglinear |
$$O(n^2)$$ | Quadratic |
$$O(n^3)$$ | Cubic |
$$O(2^n)$$ | Exponential |
Example:
For graph algorithms, input size is measured by both $V$ (the number of vertices) and $E$ (the number of edges); use them to determine which contributes more to the total number of steps. Big-O examples: $E \log V$, $EV$, $V^2 \log E$.
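A sketch of one such comparison (assuming a simple graph, where $E \le V^2$ and hence $\log E \le 2 \log V$):

$$\log V \le V \quad\Rightarrow\quad E \log V \le E \cdot V \quad\Rightarrow\quad E \log V = O(EV)$$

so an $E \log V$ algorithm never does asymptotically more work than an $EV$ one. Similarly, $V^2 \log E \le 2 V^2 \log V$, so $V^2 \log E = O(V^2 \log V)$.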
From Analysis to Application
Algorithm comparisons are independent of hardware, compilers, and implementation tweaks.
Predict which algorithms will eventually be faster
Constants can often be ignored because they do not affect asymptotic comparisons
The following can each be counted as one step in a program: an assignment, an arithmetic or logical operation, a comparison, and a function return.
Note: The runtime of one step is independent of the input size.
1. for Loop Step Counting
The basic form of a for-loop: for (initialization; test; update)
The initialization is performed once ($1$)
The test is performed every time the body of the loop runs, plus once for when the loop ends ($n + 1$)
The update is performed every time the body of the loop runs ($n$)
Example:
1 int func1(int n) {
2 int sum = 0;
3 for (int i = 0; i < n; ++i) {
4 sum += 1;
5 }
6 return sum;
7 }
In the function above, line 2 is an assignment, which takes 1 step.
On line 3, the initialization takes 1 step, the test takes $n + 1$ steps, and the update takes $n$ steps.
Line 4 is a primitive operation, which takes 1 step. Line 6 is a function return, which takes 1 step.
Because the loop body executes $n$ times, the total number of steps is $3n + 4$.
2. Polynomial Step Counting
Example:
1 int func2(int n) {
2 int sum = 0;
3 for (int i = 0; i < n; ++i) {
4 for (int j = 0; j < n; ++j) {
5 ++sum;
6 }
7 }
8 for (int k = 0; k < n; ++k) {
9 --sum;
10 }
11 return sum;
12 }
In the function above, line 2 is an assignment, which takes 1 step.
On line 3, the initialization takes 1 step, the test takes $n + 1$ steps, and the update takes $n$ steps.
On line 4, the initialization takes 1 step, the test takes $n + 1$ steps, and the update takes $n$ steps. However, this loop is nested inside the for loop on line 3, so we have to multiply by $n$ when calculating the total steps.
On line 5, it is an arithmetic operation, which takes 1 step. However, it is inside two for loops, so it actually runs $n \times n = n^2$ times in total.
On line 8, the initialization takes 1 step, the test takes $n + 1$ steps, and the update takes $n$ steps.
On line 9, it is an arithmetic operation, which takes 1 step. However, it is inside the for loop on line 8, so we have to multiply by $n$ when calculating the total steps.
On line 11, it is the function return, which takes 1 step.
Adding all these up, we will get the total steps: $1 + (2 + 2n) + (2 + 2n) \times n + n^2 + (2 + 2n) + n + 1 = 3n^2 + 7n + 6$
3. Logarithmic Step Counting
Example:
1 int func3(int n) {
2 int sum = 0;
3 for (int i = n; i > 1; i /= 2) {
4 sum += 1;
5 }
6 return sum;
7 }
In the function above, line 2 is an assignment, which takes 1 step.
On line 3, the initialization takes 1 step, the test takes about $\log n + 1$ steps, and the update takes about $\log n$ steps.
On line 4, it is an arithmetic operation, which takes 1 step. However, it is in a for loop that runs about $\log n$ times, so we have to multiply by $\log n$ when counting the total steps.
On line 6, it is a function return, which takes 1 step.
Adding all these up, we will get the total steps: $4 + 3\text{log}~n$
Note: Why $\log n$? Because $i$ is halved on every iteration, the loop stops after $k$ iterations, where $\frac{n}{2^k} \le 1$, i.e., $k \approx \log_2 n$. Concretely, for $n = 8$ the body runs for $i = 8, 4, 2$, which is $3 = \log_2 8$ iterations.
1. Definition
$f(n) = O(g(n))$ if and only if there are constants
$ \begin{equation} \begin{cases} c > 0\\ n_0 \ge 0\\ \end{cases} \end{equation} $, such that $f(n) \le c \cdot g(n)$ whenever $n \ge n_0$.
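For instance (a worked instantiation of the definition; the constants $c = 7$ and $n_0 = 1$ are one valid choice among many), the $3n + 4$ step count of func1 above is $O(n)$:

$$3n + 4 \le 3n + 4n = 7n = c \cdot n \quad \text{whenever}~n \ge n_0 = 1$$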
2. Sufficient (but not necessary) Condition
If $\lim_{n \rightarrow \infty} \left( \frac{f(n)}{g(n)} \right) = d < \infty$, then $f(n)$ is $O(g(n))$.
Example 1:
Is $\log_2n = O(2n)$?
$f(n) = \log_2n$
$g(n) = 2n$
$\Rightarrow \lim_{n \rightarrow \infty} \left(\frac{\log_2n}{2n}\right) = \lim_{n \rightarrow \infty} \left(\frac{\frac{d}{dn}\log_2n}{\frac{d}{dn}2n}\right) = \lim_{n \rightarrow \infty} \left(\frac{1}{2 \ln(2)\, n}\right) = 0 < \infty$
As a result, $\log_2n = O(2n)$.
Example 2:
Is $\sin \left(\frac{n}{100}\right) = O(100)$?
$f(n) = \sin \left(\frac{n}{100}\right)$
$g(n) = 100$
$\Rightarrow \lim_{n \rightarrow \infty} \left(\frac{\sin \left(\frac{n}{100}\right)}{100}\right)$ does not exist (the ratio oscillates), so the limit test is inconclusive; nevertheless it is true that $f(n) = O(g(n))$, since $\left|\sin \left(\frac{n}{100}\right)\right| \le 1 \le 100$ for all $n$.
This shows the condition is sufficient but not necessary. Also, constant factors can be dropped: $O(100)$ is the same as $O(1)$.
Log Identities:
Identity | Example |
---|---|
$$\log_a \left(xy\right) = \log_a (x) + \log_a (y)$$ | $$\log_2 \left(3 \times 4 \right)$$ |
$$\log_a (\frac{x}{y}) = \log_a (x) - \log_a (y)$$ | $$\log_2 (\frac{4}{3})$$ |
$$\log_a (x^r) = r\log_a (x)$$ | $$\log_2 (x^3)$$ |
$$\log_a (\frac{1}{x}) = -\log_a (x)$$ | $$\log_2 (\frac{1}{3})$$ |
$$\log_a x = \frac{\log (x)}{\log (a)} = \frac{\ln (x)}{\ln (a)}$$ | $$\log_7 9$$ |
$$\log_a a = 1,~\log_a 1 = 0$$ | |
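For example, the change-of-base identity is what lets a calculator that only provides $\ln$ evaluate the table's last example:

$$\log_7 9 = \frac{\ln 9}{\ln 7} \approx \frac{2.197}{1.946} \approx 1.129$$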
Power Identities:
Identity | Example |
---|---|
$$a^{(m+n)} = a^m \cdot a^n$$ | $$2^{2+3}$$ |
$$a^{(m-n)} = \frac{a^m}{a^n}$$ | $$2^{3 - 2}$$ |
$$(a^m)^n = a^{mn}$$ | $$(2^2)^3$$ |
$$a^{-n} = \frac{1}{a^n}$$ | $$2^{-4}$$ |
$$a^{-1} = \frac{1}{a},~a^0=1,~a^1=a$$ | |
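For example, the first identity's example evaluates as:

$$2^{2+3} = 2^2 \cdot 2^3 = 4 \times 8 = 32 = 2^5$$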
| | Big-O ($O$) | Big-Theta ($\Theta$) | Big-Omega ($\Omega$) |
---|---|---|---|
Defines | Asymptotic upper bound | Asymptotic tight bound | Asymptotic lower bound |
Definition | $f(n) = O(g(n))$ if and only if there exists an integer $n_0$ and a real number $c$ such that for all $n \ge n_0$, $f(n) \le c \cdot g(n)$ | $f(n) = \Theta (g(n))$ if and only if there exists an integer $n_0$ and real constants $c_1$ and $c_2$ such that for all $n \ge n_0$: $c_1 \cdot g(n) \le f(n) \le c_2 \cdot g(n)$ | $f(n) = \Omega(g(n))$ if and only if there exists an integer $n_0$ and a real number $c$ such that for all $n \ge n_0$, $f(n) \ge c \cdot g(n)$ |
Mathematical Definition | $\exists n_0 \in \mathbb{Z},~\exists c \in \mathbb{R}:~\forall n \ge n_0,~f(n) \le c \cdot g(n)$ | $\Theta(f(n)) = O(f(n)) \cap \Omega(f(n))$ | $\exists n_0 \in \mathbb{Z},~\exists c \in \mathbb{R}:~\forall n \ge n_0,~f(n) \ge c \cdot g(n)$ |
Examples:
$f_1(n) = 2n + 1$: $f_1(n) = O(n)$, $f_1(n) = \Omega(n)$, and therefore $f_1(n) = \Theta(n)$.
$f_2(n) = n^2 + n + 5$: $f_2(n) = O(n^2)$, $f_2(n) = \Omega(n^2)$, and therefore $f_2(n) = \Theta(n^2)$.
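A sketch of how the bounds for $f_1$ follow from the definitions (the constants here are one valid choice, not the only one):

$$2n + 1 \le 3n~\text{for}~n \ge 1 \Rightarrow f_1(n) = O(n), \qquad 2n + 1 \ge 2n~\text{for}~n \ge 0 \Rightarrow f_1(n) = \Omega(n)$$

Together, with $c_1 = 2$, $c_2 = 3$, and $n_0 = 1$, this gives $f_1(n) = \Theta(n)$.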
Amortized Complexity
A type of worst-case complexity
Used when the work / time profile is "spiky" (sometimes it is very expensive, but most times it is a small expense)
Analysis is performed over a sequence of operations
Considers the average cost of one operation over a sequence of operations
Key to understanding expandable arrays and STL vectors, priority queues, and hash tables
Example 1: Cell Phone Bill (assume unlimited calls / texts)
Pay $100 once per month; each call and text has no (added) cost
If you make 1000 calls / texts, each one effectively costs $0.10
The rate at which money leaves your pocket is very "spiky"
But each call or text appears to have a basically constant cost: the amortized cost per call or text is $O(1)$
Example 2:
The asymptotic runtime complexity of the push operation for a stack implemented using an array / vector is amortized $O(1)$: most pushes write a single element, but a push that fills the underlying array forces a reallocation that copies all $n$ existing elements.
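A minimal sketch of why (a hand-rolled doubling array for illustration; std::vector's actual growth factor is implementation-defined):

```cpp
#include <cstddef>

// Illustrative expandable array of ints; push is O(1) amortized.
// Doubling the capacity means a copy of n elements happens only
// after roughly n cheap pushes, so the copying cost averages out.
class IntStack {
    int* data = nullptr;
    std::size_t size_ = 0;
    std::size_t capacity_ = 0;
public:
    void push(int value) {
        if (size_ == capacity_) {  // full: the rare, expensive case
            std::size_t newCap = (capacity_ == 0) ? 1 : 2 * capacity_;
            int* bigger = new int[newCap];
            for (std::size_t i = 0; i < size_; ++i) {
                bigger[i] = data[i];  // copy all existing elements: O(n)
            }
            delete[] data;
            data = bigger;
            capacity_ = newCap;
        }
        data[size_++] = value;  // the common case: O(1)
    }
    ~IntStack() { delete[] data; }
};
```

Over $n$ pushes, the total copying work is $1 + 2 + 4 + \cdots + n < 2n$, so the whole sequence costs $O(n)$ and each push costs amortized $O(1)$.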
Container Growth Options