Introduction to Linked List¶

Sequential Containers

A sequential container is a container that holds elements and allows them to be accessed in sequential order. The simplest sequential container is a build-in array, which stores its elements contiguously in memory.

Traverse by pointer

Traverse by index

const int SIZE = 5;
int array[SIZE] = { 1, 2, 3, 4, 5 };
for (int *ptr = array; ptr != array + SIZE; ++ptr) {
  cout << *ptr < endl;
}

const int SIZE = 5;
int array[SIZE] = { 1, 2, 3, 4, 5 };
for (int i = 0; i < SIZE; ++i) {
  cout << array[i] < endl;
}

A vector has the same property, which stores elements contiguously in memory. It grows as needed by swapping out an old array for a new one.

Sequence abstractions based on contiguous memory have the drawback of inefficient insertion at the beginning or middle of the sequence. For instance, to insert an item at a particular position, we need to first move the elements at that position and later out of the way. Shown as the following picture.

Thus, insertion at the beginning or middle is a linear-time operation.

On the other hand, if we give up the abstraction of contiguous memory, we can place a new item anywhere in memory.

Since the elements are no longer stored contiguously, we cannot move to the next element by just incrementing an address.

Instead, we have to explicitly keep track of where the next element is through a pointer, which just involves changing the value of a pointer.

Node

For each element in our sequence, we need to keep track of both the datum itself as well as a pointer to the next piece of the sequence.

This is heterogeneous data, so we use a node class-type object to keep track of the two values:

 1     struct Node {
 2       int datum;
 3       Node *next;
 4     };

Here the Node is a plain-old data, using the struct keyword and can access its members directly.
int is they type of the element in a node, here the element is called datum.
The remaining Node member, which is *next, is a pointer to the next Node in out sequence.

Take the following sequence (1, 2, 3, 4) as an example:

Note: We use a null pointer as a sentinel denoting the end of the sequence, storing that in the next pointer of the last node. This is called a linked list, which consists of a list of items linked together by pointers.

We can also define a class called IntList to internally maintains a sequence of nodes.

 1     class IntList {
 2       ...
 3     private:
 4       struct Node {
 5         int datum;
 6         Node *next;
 7       };
 8       Node *first;
 9     };

Since the Node objects are stored indirectly from the IntList object, they must be in dynamic memory, and IntList must ensure that thier memory is properly managed.

The following is the interface of the IntList class:

 1     class IntList {
 2     public:
 3       // EFFECTS: Constructs an empty list.
 4       IntList();
 5
 6       // EFFECTS:  Returns true if the list is empty.
 7       bool empty() const;
 8
 9       // REQUIRES: the list is not empty
10       // EFFECTS:  Returns (by reference) the first element.
11       int & front();
12    
13       // EFFECTS:  Inserts datum at the front of the list.
14       void push_front(int datum);
15     
16       // REQUIRES: the list is not empty
17       // EFFECTS:  Removes the first element.
18       void pop_front();
19  
20       // MODIFIES: os
21       // EFFECTS:  Prints the items in the list, each followed by a space.
22       void print(std::ostream &os) const;
23
24       private:
25       struct Node {
26         int datum;
27         Node *next;
28       };
29       Node *first;
30     };

If we want to have an empty linked list, we just let first store a null pointer.

The following are some invariants for the IntList object:

first is either null or a pointer to a valid Node
the next member of all but the last node points to another valid Node
the next member of the last is null

Implementation

The constructor must ensure that the list is initialized to satisfy the representation invariants. Since the default constructor makes the list empty, it must initialize first to be null:

 1     IntList::IntList()
 2       : first(nullptr) {}

The empty() function just needs to check whether first is null:

 1     bool IntList::empty() const {
 2       return first == nullptr;      // return first; also works
 3     }

For front(), we will first assert the REQUIRES clause that the list not be empty. If that is satisfied, the representation invariants tell us that first is pointing to a valid node. Its datum member holds the first element.

 1     int & IntList::front() {
 2       assert(!empty());
 3       return first->datum;
 4     }

For push_front(), we need to consider two cases:

The list is empty, in which case first is null
The list is not empty, in which case first points to a valid node

 1     void IntList::push_front(int datum) {
 2       Node *p = new Node;
 3       p->datum = datum;
 4       p->next = first;
 5       first = p;
 6     }

We can also use a initialization list to achieve this:

 1     void IntList::push_front(int datum) {
 2       first = new Node{ datum, first };
 3     }

In pop_front(), we again assert the REQUIRES clause. We should then consider two possible cases:

The list has one item, in which case it will end up as empty
The list has more than one item, and it will not end up empty

The following two implementations are both correct:

 1     void IntList::pop_front() {
 2       assert(!empty());
 3       Node *victim = first;           // temporary keeps track of old first
 4       first = first->next;
 5       delete victim;
 6     }

 1     void IntList::pop_front() {
 2       assert(!empty());
 3       Node *new_first = first->next;  // temporary keeps track of new first
 4       delete first;
 5       first = new_first;
 6     }

The following is the test code:

 1     int main() {
 2       IntList list;             // ( )
 3       list.push_front(1);       // ( 1 )
 4       list.push_front(2);       // ( 2 1 )
 5       list.push_front(3);       // ( 3 2 1 )
 6
 7       cout << list.front();     // 3
 8
 9       list.front() = 4;         // ( 4 2 1 )
10
11       list.print(cout);         // 4 2 1
12     
13       list.pop_front();         // ( 2 1 )
14       list.pop_front();         // ( 1 )
15       list.pop_front();         // ( )
16
17       cout << list.empty();     // true (or 1)
18     }

Try it out:

Open in new window

Traversing a Linked List¶

You may notice we have a print() function inside the class. It prints out the elements linked from the first to the end.

Iterating over a linked list’s elements from outside the class requires an iterator abstraction.

The Node struct is private, so external code cannot use Nodes to iterate through the list. Code within IntList, on the other hand, does have access to Node, so it can traverse the list by starting at the first member and following each node’s next pointer until reaching the null sentinel.

The following print() member function uses this strategy:

 1     void print(std::ostream &os) const {
 2       for (Node *node_ptr = first; node_ptr; node_ptr = node_ptr->next) {
 3         os << node_ptr->datum << " ";
 4       }
 5     }

The loop initializes node_ptr as a copy of first.

If the list is empty, first is null, so node_ptr will be initialized to null as well.

The truth value of a null pointer is false, so the condition of the loop will evaluate to false and the loop will exit. (Alternatively, node_ptr can be compared to nullptr instead: node_ptr != nullptr.)

If the list is not empty, node_ptr will not be initially null, and its truth value will be true.

The body then executes, and it uses the -> operator to access the datum member of the node that node_ptr points to.

The loop update then copies the node’s next pointer into node_ptr, moving node_ptr to point to the next node.

When the iteration reaches the last node in the list, its next is null, so the update sets node_ptr to be null.

This results in the loop condition being false, so the loop exits.

Linked List Big Three¶

Destructor

Free resources - pop_all()

Copy Constructor

Copy regular members from other
Deep copy resources from other - push_all()

Assignment Operator

Check for self-assignment
Free resources - pop_all()
Copy regular members from right hand side
Deep copy resources from right hand side - push_all()
return *this

We can add pop_all() and push_all() as private members in the class. The big three for a linked list class should look like the following:

 1     IntList::IntList(const IntList &other) : IntList() {
 2       push_all(other);
 3     }
 4
 5     IntList & IntList::operator=(const IntList &rhs) {
 6       if (this != &rhs) {
 7         pop_all();
 8         push_all(rhs);
 9       }
10       return *this;
11     }
12
13     IntList::~IntList() {
14       pop_all();
15     }

pop_all()

 1     void IntList::pop_all() {
 2       while (!empty()) {
 3         pop_front();
 4       }
 5     }

push_all()

 1     void IntList::push_all(const IntList &other) {
 2       for (Node *node_ptr = other.first; node_ptr; node_ptr = node_ptr->next) {
 3         push_back(node_ptr->datum); 
 4       }
 5     }

In order to complete push_all() we have to define push_back(). If we use push_front(), we add elements in a reverse order. The following is a possible implementation of push_back(), but it is very inefficient.

 1     void IntList::push_back(int datum) {
 2       Node *new_node = new Node{ datum, nullptr };
 3       if (empty()) {
 4         first = new_node;
 5       } else {
 6         Node *node_ptr = first;
 7         for (; node_ptr->next; node_ptr = node_ptr->next);  // find last node
 8         node_ptr->next = new_node;        // set last node's next to new_node
 9       }
10     }

Insertion and Removal at the End¶

In order to make push_back() more efficient, we can change our list representation to keep track of the last node.

class IntList {
  ...
private:
  Node *first;
  Node *last;
};

We can then rewrite push_back() as follows:

 1     void IntList::push_back(int datum) {
 2       Node *new_node = new Node{ datum, nullptr };
 3       if (empty()) {
 4         first = last = new_node;
 5       } else {
 6         last = last->next = new_node;
 7       }
 8     }

In the case of inserting into an empty list, we need to set both first and last pointing at the new node. (We would need to modify pop_front() to do this as well.)

When inserting at the end of a non-empty list, we have to set the next pointer of what used to be the last node to point to the new node, and we must also update the last member of the IntList so that it points to the new node.

However, this interface is still not that efficient. The problem is that our nodes only have a next pointer, so they allow us to traverse in the forward direction but not in reverse. A doubly linked list can help solve this issue.

Doubly Linked List

class IntList {
  ...
private:
  struct Node {
    int datum;
    Node *prev;
    Node *next;
  };

  Node *first;
  Node *last;
};

This is now a doubly linked list, since each node has two links, one to the previous node and one to the next node. Our original list is a singly linked list.

The standard library provides both implementations: std::forward_list is a singly linked list, while std::list is a doubly linked list. The former uses less space than the latter, since the nodes don’t have a pointer to the previous node, but it does not provide the ability to iterate backwards over the list.

With a doubly linked list, we can easily implement pop_back():

 1     void IntList::pop_back() {
 2       assert(!empty());
 3       if (last->prev) {
 4         last->prev->next = nullptr;
 5       }
 6       Node* temp = last;
 7       last = last->prev;
 8       delete temp;
 9     }

List Template¶

We can generalize the IntList class to hold objects of other types by making it a template. Each instantiation is still homogeneous: List<int> only holds ints, List<string> only holds strings, and so on.

template <typename T>
class List {
public:
  List();

  void empty() const;

  T & front();

  void push_front(const T &datum);

  void pop_front();

  void push_back(const T &datum);

  void pop_back();

  ...

private:
  struct Node {
    T datum;
    Node *prev;
    Node *next;
  };

  Node *first;
  Node *last;
};

By placing the Node struct inside the List template, each instantiation of List will have its own Node type; List<int>::Node, List<string>::Node, and List<Duck>::Node are all distinct types.

Now that our container is a template, we pass elements by reference. For List<int>, the elements are small enough to copy, but for class-type elements such as in List<string> or List<Duck>, we want to avoid making unnecessary copies. A good rule of thumb is to pass a function parameter by reference if the function parameter’s type is a template parameter, since the latter may be instantiated with a large class type.

EECS 280: Programming and Introductory Data Structures

Section 16

Linked List

Introduction to Linked List¶

Traversing a Linked List¶

Linked List Big Three¶

Insertion and Removal at the End¶

List Template¶