Linked lists
1. Problem statement
Implement a linked list library (Aziz et al., 2018, p. 91).
Aziz et al. (2018) do not actually pose this problem as an interview question, but they advise that implementing this is "an excellent way to become comfortable with [linked] lists."
2. Insights
Manipulation of the links between nodes is key. Also, insertion and deletion is easy if we keep track of an initial "head" node.
3. Solution
class LinkedList: linked_list_class_methods
A linked list's node has data (for the current node) and a pointer to the next
node. We name the data part as elt
for element.
def __init__(self, *args): self.elt = None self.next = None
By default the initial (head) node is a placeholder node used solely to keep track of the overall linked list. If there are any arguments, we treat them as individual elements for insertion. But we first reverse the arguments because this is the only way to keep the original order; otherwise the order gets reversed because insertion always happens with respect to the head node (not the tail node). See the discussion about insertion in the next section for more details.
for elt in reversed(args): self.insert(elt)
We reverse args
before insertion because of the way insertion works. Below are
some examples of how initialization behaves.
list = LinkedList() list looks like: HEAD_NODE -> None list = LinkedList(15) list looks like: HEAD_NODE -> 15 -> None list = LinkedList(1, 2, 3) list looks like: HEAD_NODE -> 1 -> 2 -> 3 -> None
3.1. Insertion
Insertion just requires creating a new node with the given data, and inserting it after the current one.
We first make the new node also point to the next element of the current node. At this point both the current and new node point to the same "next" node. We then make the current node point instead to the new node. In other words, we are prepending to the list.
def insert(self, elt: Any) -> None: node_new = LinkedList() node_new.elt = elt node_new.next = self.next self.next = node_new
3.1.1. Complexity
- Time: \(O(1)\).
- Space: \(O(1)\).
3.1.2. What if the head node also stores an element?
We could indeed store data in the head node also.
For example, if the head node has data in it, we just make the new element point
to us and then make this new element's node our new head node. This way, we
still prepend to the list just like before (but without the separate HEAD_NODE
shown above).
One problem with the above (which is how Aziz et al. (2018) implement
linked lists) is that there is no way to represent an empty linked list with 0
nodes (nodes which have data). This is because (1) we must have at least the
head node to represent a linked list, and (2) because the head node itself has
data, a linked list cannot be "empty". Aziz et al. (2018) treat the integer 0
as a
non-data element, but this is awkward at best.
Other than non-emptiness, we face another problem. We cannot easily make the
insert()
function a method of the LinkedList
class, because we have to make
self
(the current head node) become another (new) node object. That is, when
we insert an element, we want this new element's node object to become the new
head node of an existing list.
This is problematic in Python because you aren't really meant to assign a new
object to self
. We could get around it by using copy.deepcopy()
in order to
assign self
to node_new
(overwriting self
in the process). See
https://stackoverflow.com/a/29591356. This copy is expensive though and
reflects the difficulty in trying to reset self
to another object in Python.
For this reason we don't bother with allowing the head node to have data. Still, for sake of completeness we demonstrate how one would still try to use such a scheme for purposes of inserting a new element.
import copy class LinkedList: def insert(self, elt: Any) -> None: node_new = LinkedList() node_new.elt = elt node_new.next = self self.__dict__ = copy.deepcopy(node_new.__dict__)
3.2. Print (__repr__
)
This is useful for debugging. The code here follows the __repr__
example from
CPython's datetime implementation.
def __repr__(self): elts = [] node = self while node.next: elts.append(repr(node.next.elt)) node = node.next if node.next is None: break return "%s.%s(%s)" % (self.__class__.__module__, self.__class__.__qualname__, ", ".join(elts))
3.3. Size
This calculates the total number of nodes in the linked list, not counting the head node if it does not contain any data.
def size(self) -> int: # Skip head node which doesn't hold any element. node = self.next n = 0 while node: n += 1 node = node.next return n
3.4. Equality
equal
compares our current linked list with another linked list, by comparing
every single node in each list. Specifically, we check both that the elements in
each linked list are the same, but also that there are an equal number of nodes.
def __eq__(self, b) -> bool: if not isinstance(b, LinkedList): return NotImplemented a = self while a.next or b.next: if a.next and b.next and a.next.elt == b.next.elt: a = a.next b = b.next continue return False return True
3.5. Lookup
In order to look up an element, we have to traverse through the list until we find it. We skip the head node though because it doesn't hold any data.
Strictly speaking, the head node could also hold data. In languages with directo
pointer access we would use a pointer type for elt and store NULL
(invalid
memory address) for the head node's element; this way, we would know if we were
looking at a head node by comparing with NULL
.
However, Python doesn't really have this concept so we choose to avoid storing data in the head node. One benefit of doing this is that the list behaves like a stack, even for the very first element. See the discussion around insertion below.
def lookup(self, elt: Any) -> Optional[LinkedList]: current_node = self.next while current_node and current_node.elt != elt: current_node = current_node.next # This will be None if we failed to find the given element. return current_node
3.5.1. Complexity
- Time: \(O(n)\) where \(n\) is the number of nodes in the linked list.
- Space: \(O(1)\).
3.6. Deletion
This deletes the node after the current one. In a way, it's the mirror of the
first step of insert()
above where we were in a state where both nodes were
pointing to the same next element. Here, we make the current node point to the
node after the next node.
We take special care to do nothing if the list is empty (as there is nothing to delete).
If we repeatedly delete nodes in the manner described above (by passing in the head node), the effect would be to delete the nodes in the opposite order from the order they were inserted — much like a stack data structure.
def delete_after(self) -> None: if self.next is None: return self.next = self.next.next
3.6.1. Complexity
- Time: \(O(1)\).
- Space: \(O(1)\).
4. Tests
The from __future__ import annotations
line is to enable self-referencing type
annotations. See 5.
from __future__ import annotations from hypothesis import given, strategies as st import unittest import random from .linked_list import LinkedList class Test(unittest.TestCase): test_cases if __name__ == "__main__": unittest.main(exit=False)
4.1. Initialization
Here are some basic checks regarding initialization.
def test_init_empty(self): linked_list = LinkedList() self.assertEqual(linked_list.size(), 0) def test_init_singleton(self): linked_list = LinkedList(1) self.assertEqual(linked_list.size(), 1) def test_init_multiple(self): linked_list = LinkedList(1, 2, 3, 4, 5) self.assertEqual(linked_list.size(), 5) # Head node holds nothing. self.assertEqual(linked_list.elt, None) self.assertEqual(linked_list.next.elt, 1)
4.2. Insertion
We check that insertion results in items being prepended to the list. So the head node always points to the last-inserted node (or is the last-inserted node itself if we are using head nodes that hold data).
def test_insertion_head_without_data(self): linked_list = LinkedList(1, 2, 3) self.assertEqual(linked_list, LinkedList(1, 2, 3)) @given(st.lists(st.integers(min_value=1, max_value=100), min_size=0, max_size=20)) def test_insertion_random(self, given_elts: list[int]): linked_list = LinkedList(*given_elts) self.assertEqual(linked_list.size(), len(given_elts))
4.3. Size
def test_size(self): cases = [ ([], 0), ([1], 1), ([1, 2], 2), ] for elts, expected in cases: linked_list = LinkedList(*elts) self.assertEqual(linked_list.size(), expected, msg=f'{expected=}')
4.4. Equality
def test_equality(self): cases = [ ([], [], True), ([1], [], False), ([1], [1], True), ([1], [1, 2], False), ] for xs, ys, expected in cases: x = LinkedList(*xs) y = LinkedList(*ys) self.assertEqual(x == y, expected, msg=f'{x=} {y=}')
4.5. Lookup
def test_lookup(self): cases = [ ([], None, lambda exp: exp is None), ([None], None, lambda exp: exp.elt is None), ([], 0, lambda exp: exp is None), ([], 1, lambda exp: exp is None), ([1], 1, lambda exp: exp.elt == 1), ([1, 2], 2, lambda exp: exp.elt == 2), ([1, 2], 3, lambda exp: exp is None), ] for elts, key, expected_func in cases: linked_list = LinkedList(*elts) self.assertTrue(expected_func(linked_list.lookup(key)), msg=f'{key=}')
4.6. Deletion
def test_deletion_basic(self): linked_list = LinkedList(1, 2, 3) # If we delete element 1, we're gonna be pointing at 2, then 3. linked_list.delete_after() self.assertEqual(linked_list, LinkedList(2, 3)) @given(st.lists(st.integers(min_value=1, max_value=100), min_size=1, max_size=20)) def test_deletion_random(self, given_elts: list[int]): linked_list = LinkedList(*given_elts) deletions = random.randint(0, len(given_elts)) for _ in range(deletions): linked_list.delete_after() self.assertEqual(linked_list.size(), len(given_elts) - deletions) def test_deletion_nop(self): linked_list = LinkedList() linked_list.delete_after() self.assertEqual(linked_list.size(), 0)
5. Export
This linked list library is used in other problems. So we export it here to a separate reusable file (aka Python module).
The line from __future__ import annotations
is so that the class methods can
refer to itself (the enclosing class). See this discussion.
from __future__ import annotations from typing import Any, Optional code