Singly Linked Lists Idea and Simple Implementation video (40 minutes) (Spring 2021)
The idea behind a singly linked list is that you store each element/datum/value in the list with a pointer to the next element in the list. We call this data+pointer combo a node.
A singly linked list is dynamic data structure with unlimited size that includes linear collections or groups of nodes. Each node contains data and a link/pointer to the next node. The last node of the list points to nothing and it is just a nullptr
.
A node might look like this:
template <typename T>
struct LinkedListNode {
T value;
LinkedListNode<T> *next;
LinkedListNode(const T& value) : value(value), next(nullptr) {}
};
So now, if we want to represent a list, we can just represent it as a pointer to the first element, also known as the "head" of the list.
If we want to insert at the end of the list, we can just start at the head and continue following the next
pointers until we get to end and change next
in the last node, also known as the "tail" of the list, to the address of our new node.
LinkedListNode<T> *head = ...;
// insert some new value at the end
if (head == nullptr) {
head = new LinkedListNode<T>(value);
} else {
// Go to the end of the Linked List
for (LinkedListNode<T> * tail = head; tail->next != nullptr; tail = tail->next) {
}
// Insert the node at the end
tail->next = new LinkedListNode<T>(value);
}
The run time of inserting this way is O(n) if there are n nodes already in the list because we have to start at the head and loop once for every node in our list.
Inserting at the front is easier though, all we have to do is make a new node that has the current head as it's next pointer, then update head to our new node, like so:
LinkedListNode<T> *head = ...;
// insert some new value at the front
LinkedListNode<T> *new_node = new LinkedListNode<T>(value);
new_node->next = head;
head = new_node;
We can make the previous linked list a lot better if we keep track of two things instead of one. head
will be used to point to the first node and tail
would be used to point to the last node of the linked list. Since we're keeping track of two things for our Linked List, it makes sense to split the Linked List into two classes, one for the node and one for the whole list.
#include <exception>
template <typename T>
struct LinkedListNode {
T value;
LinkedListNode<T> *next;
LinkedListNode(const T& value) : value(value), next(nullptr) {}
};
template <typename T>
class LinkedList {
LinkedListNode<T> *head, *tail;
class Iterator {
LinkedListNode<T> *cur;
public:
Iterator(LinkedListNode<T> * cur) : cur(cur) {}
// Check for inequality
bool operator!=(const Iterator& other) {
return cur != other.cur;
}
// Increment (e.g. ++it)
Iterator& operator++() {
cur = cur->next;
return *this;
}
// Dereference (e.g. *it)
T& operator*() {
return cur->value;
}
};
public:
LinkedList() : head(nullptr), tail(nullptr) {}
~LinkedList() {
for(LinkedListNode<T> *cur = head, *next; cur != nullptr; cur = next) {
next = cur->next;
delete cur;
}
}
void pop_front() {
// Let's assume head != nullptr
// TODO: Sublcass std::exception and throw an exception with a proper error message.
if (head == nullptr) {
throw std::exception();
}
LinkedListNode<T> *old_head = head;
head = head->next;
delete old_head;
}
void push_back(const T& value) {
LinkedListNode<T> *new_node = new LinkedListNode<T>(value);
if (tail == nullptr) { // if the list is empty
head = tail = new_node;
} else {
tail->next = new_node;
tail = new_node;
}
}
Iterator begin() {
return Iterator(head);
}
Iterator end() {
return Iterator(nullptr);
}
};
singly_linked_list_demo.cpp#include <iostream>
#include <string>
#include "singly_linked_list.h"
using std::cout;
using std::endl;
using std::string;
int main() {
LinkedList<string> words;
words.push_back("apple");
words.push_back("banana");
words.push_back("cherry");
words.pop_front();
for (const string& word : words) {
cout << word << endl;
}
return 0;
}
$ clang++ -pedantic -Wall -lm -std=c++20 -o singly_linked_list_demo singly_linked_list_demo.cpp
$ ./singly_linked_list_demo
banana cherry
When you want to remove the first element of an array or a vector, the vector has to copy everything over one at a time. It becomes very expensive because it is a O(n) operation if I have n things in my array or vector. However, in a linked list it is tremendously efficient to remove an element. All you have to do is to update the pointer pointing to the node that you would like to delete and point it to the node right after. In fact, If you want to delete the node with the value 11, you would have to update the head’s pointer to point to the node containing the value 22 and then you can safely delete the node that you don’t need anymore. This is a O(1) operation because no matter how big the list is (no matter how many nodes are in the list), it takes the same number of constant time steps.
Another advantage of linked lists is that you can slice out an entire section pretty efficiently. The cost of run time is with number of things (number of pointer in this case) that you are deleting and the size of the elements deleted does not matter. Going back to our example, if we want to delete the nodes containing the values 22, 33, 44, all we have to do is to update the pointer of the node containing the value 11 and point it to the node with the value 55. Whereas with arrays and vectors, if you wanted to delete the middle part, the compiler would have to copy over the remaining elements and input it in the gap which becomes expensive because the size matters and the run time would O(n) where n is the size of the elements deleted.
Although a linked list has a better big-O for inserting or removing elements in the list, compared to an array or vector, in practice it is usually not useful. It is useful to understand how a linked list works because other data structures, like trees, are similar to linked lists but more complex.
Thanks to Lorenzo Jiang for creating the draft of these notes based on the lecture video.