Content
Quick Sort Basics Quick Sort Analysis & Performance Quick Sort Advantages & Disadvantages Improvements <- Go BackUniversity of Michigan at Ann Arbor
Last Edit Date: 06/14/2023
Disclaimer and Term of Use:
We do not guarantee the accuracy and completeness of the summary content. Some of the course material may not be included, and some of the content in the summary may not be correct. You should use this file properly and legally. We are not responsible for any results from using this file
This personal note is adapted from Professor Marcus Darden and David Paoletti. Please contact us to delete this file if you think your rights have been violated.
This work is licensed under a Creative Commons Attribution 4.0 International License.
i. Two Problems with Simple Sorts
They might compare every pair of elements
Learn only one puece of information / comparison
Contrast with binary search: learns n / 2 pieces of inofrmation with first comparisons
They often move elements one place at a time (bubble and insertion)
Even if the element is "far" out of place
Contrast with selection sort: moves each element exactly to its final place
Faster sorts attack these two problems
ii. Quick Sort: Background
"Easy" to implement
Works well with variety of input data
In-place sort (no extra memory for data)
Additional memory for stack frames
iii. Quick Sort: Divide and Conquer
Base case:
Inductive step
Guess an element elt
to partition the array
Form array of [LHS] elt
[RHS] (divide)
$\forall x \in LHS,~x \le elt$
$\forall y \in RHS,~y \ge elt$
Recursively sort [LHS] and [RHS] (conquer)
iv. Quick Sort with Simple Partition
1. Code
1 void quicksort(Item a[], size_t left, size_t right) {
2 if (left + 1 >= right) {
3 return;
4 }
5 size_t pivot = partition(a, left, right);
6 quicksort(a, left, pivot);
7 quicksort(a, pivot + 1, right);
8 }
Range is [left, right)
If base case, return
Else divide (partition and find pivot) and conquer (recursively quick sort)
The following code is for simple partition
1 size_t partition(Item a[], size_t left, size_t right) {
2 size_t pivot = --right;
3 while (true) {
4 while (a[left] < a[pivot])
5 ++left;
6 while (left < right && a[right - 1] >= a[pivot])
7 --right;
8 if (left >= right)
9 break;
10 swap(a[left], a[right - 1]);
11 }
12 swap(a[left], a[pivot]);
13 return left;
14 }
Choose last item as pivot
Scan
from left for >= pivot
from right for < pivot
Swap left & right pairs and continue scan until left & right cross
Move pivot to middle
The following is another partition
1 size_t partition(Item a[], size_t left, size_t right) {
2 size_t pivot = left + (right - left) / 2;
3 swap(a[pivot], a[--right]);
4 pivot = right;
5 while (true) {
6 while (a[left] < a[pivot])
7 left++;
8 while (left < right && a[right - 1] >= a[pivot])
9 right++;
10 if (left >= right)
11 break;
12 swap(a[left], a[right - 1]);
13 }
14 swap(a[left], a[pivot]);
15 return left;
16 }
Choose middle item as a pivot
Swap it with the right end
Repeat as before
i. Time Analysis
Cost of partitioning n elements: $O(n)$
Worst case: pivot always leaves one side empty
$T(n) = n + T(n - 1) + T(0)$
$T(n) = n + T(n - 1) + c$
$T(n) \sim n^2 / 2 \Rightarrow O(n^2)$
Best case: pivot divides elements equally
$T(n) = n + T(n / 2) + T(n / 2)$
$T(n) = n + 2T(n / 2) = n + 2(n / 2) + 4(n / 4) + ... + O(1)$
$T(n) \sim n \log n \Rightarrow O(n \log n)$
Average case: tricky
ii. Memory Analysis
Requires stack space for recursive calls
The first recursive call is NOT tail
The second recursive call IS tail recursive, which reuses the current stack frame
When pivoting is going terribly:
$O(n)$ stack frames if split is $(n - 1)$, pivot, $(0)$
$O(1)$ stack frames if split is $(0)$, pivot, $(n - 1)$
i. Advantages
On average, $n \log n$ time to sort n items
Short inner loop $O(n)$
Efficient memory usage
Thoroughly analyzed and understood
ii. Disadvantages
Worst case, $n^2$ time to sort n items
Not stable, making it stable sacrifices time and / or memory
Partitioning is fragile (simple mistable will eighter segfault or not sort propertly)
i. Improving Split
Key to perfomance: a "good" split
Any single choice could always be worst one
Too expensive to actually compute best one (median)
Rather than compute median, sample it
Simple way: pick three elements, take their medians
Very likely to give you better performance
Sampling is a very powerful technique
Fixed Sampling | Random Sampling |
---|---|
ii. Other Improvements
Divide and conquer
Most sorts are "small" regions
Lots of recursive calls to small regions
Reduce the cost of sorting small regions
Insertion sort is faster than quick sort on small arrays
Bail out of quick sort when size < k
Insertion sort each small sub array