Linked lists

1. Problem statement

Implement a linked list library (Aziz et al., 2018, p. 91).

Aziz et al. (2018) do not actually pose this problem as an interview question, but they advise that implementing this is "an excellent way to become comfortable with [linked] lists."

2. Insights

Manipulation of the links between nodes is key. Also, insertion and deletion is easy if we keep track of an initial "head" node.

3. Solution

A linked list's node has data (for the current node) and a pointer to the next node. We name the data part as elt for element.

def __init__(self, *args):
    self.elt = None
    self.next = None

By default the initial (head) node is a placeholder node used solely to keep track of the overall linked list. If there are any arguments, we treat them as individual elements for insertion. But we first reverse the arguments because this is the only way to keep the original order; otherwise the order gets reversed because insertion always happens with respect to the head node (not the tail node). See the discussion about insertion in the next section for more details.

    for elt in reversed(args):
        self.insert(elt)

We reverse args before insertion because of the way insertion works. Below are some examples of how initialization behaves.

list = LinkedList()

list looks like:
HEAD_NODE -> None

list = LinkedList(15)

list looks like:
HEAD_NODE -> 15 -> None

list = LinkedList(1, 2, 3)

list looks like:
HEAD_NODE -> 1 -> 2 -> 3 -> None

3.1. Insertion

Insertion just requires creating a new node with the given data, and inserting it after the current one.

We first make the new node also point to the next element of the current node. At this point both the current and new node point to the same "next" node. We then make the current node point instead to the new node. In other words, we are prepending to the list.

def insert(self, elt: Any) -> None:
    node_new = LinkedList()
    node_new.elt = elt
    node_new.next = self.next
    self.next = node_new

3.1.1. Complexity

  • Time: \(O(1)\).
  • Space: \(O(1)\).

3.1.2. What if the head node also stores an element?

We could indeed store data in the head node also.

For example, if the head node has data in it, we just make the new element point to us and then make this new element's node our new head node. This way, we still prepend to the list just like before (but without the separate HEAD_NODE shown above).

One problem with the above (which is how Aziz et al. (2018) implement linked lists) is that there is no way to represent an empty linked list with 0 nodes (nodes which have data). This is because (1) we must have at least the head node to represent a linked list, and (2) because the head node itself has data, a linked list cannot be "empty". Aziz et al. (2018) treat the integer 0 as a non-data element, but this is awkward at best.

Other than non-emptiness, we face another problem. We cannot easily make the insert() function a method of the LinkedList class, because we have to make self (the current head node) become another (new) node object. That is, when we insert an element, we want this new element's node object to become the new head node of an existing list.

This is problematic in Python because you aren't really meant to assign a new object to self. We could get around it by using copy.deepcopy() in order to assign self to node_new (overwriting self in the process). See https://stackoverflow.com/a/29591356. This copy is expensive though and reflects the difficulty in trying to reset self to another object in Python.

For this reason we don't bother with allowing the head node to have data. Still, for sake of completeness we demonstrate how one would still try to use such a scheme for purposes of inserting a new element.

🔗
import copy

class LinkedList:
    def insert(self, elt: Any) -> None:
        node_new = LinkedList()
        node_new.elt = elt
        node_new.next = self
        self.__dict__ = copy.deepcopy(node_new.__dict__)

3.2. Print (__repr__)

This is useful for debugging. The code here follows the __repr__ example from CPython's datetime implementation.

def __repr__(self):
    elts = []
    node = self
    while node.next:
        elts.append(repr(node.next.elt))
        node = node.next
        if node.next is None:
            break
    return "%s.%s(%s)" % (self.__class__.__module__,
                        self.__class__.__qualname__,
                        ", ".join(elts))

3.3. Size

This calculates the total number of nodes in the linked list, not counting the head node if it does not contain any data.

def size(self) -> int:
    # Skip head node which doesn't hold any element.
    node = self.next

    n = 0
    while node:
        n += 1
        node = node.next

    return n

3.4. Equality

equal compares our current linked list with another linked list, by comparing every single node in each list. Specifically, we check both that the elements in each linked list are the same, but also that there are an equal number of nodes.

def __eq__(self, b) -> bool:
    if not isinstance(b, LinkedList):
        return NotImplemented
    a = self
    while a.next or b.next:
        if a.next and b.next and a.next.elt == b.next.elt:
            a = a.next
            b = b.next
            continue
        return False
    return True

3.5. Lookup

In order to look up an element, we have to traverse through the list until we find it. We skip the head node though because it doesn't hold any data.

Strictly speaking, the head node could also hold data. In languages with directo pointer access we would use a pointer type for elt and store NULL (invalid memory address) for the head node's element; this way, we would know if we were looking at a head node by comparing with NULL.

However, Python doesn't really have this concept so we choose to avoid storing data in the head node. One benefit of doing this is that the list behaves like a stack, even for the very first element. See the discussion around insertion below.

def lookup(self, elt: Any) -> Optional[LinkedList]:

    current_node = self.next

    while current_node and current_node.elt != elt:
        current_node = current_node.next

    # This will be None if we failed to find the given element.
    return current_node

3.5.1. Complexity

  • Time: \(O(n)\) where \(n\) is the number of nodes in the linked list.
  • Space: \(O(1)\).

3.6. Deletion

This deletes the node after the current one. In a way, it's the mirror of the first step of insert() above where we were in a state where both nodes were pointing to the same next element. Here, we make the current node point to the node after the next node.

We take special care to do nothing if the list is empty (as there is nothing to delete).

If we repeatedly delete nodes in the manner described above (by passing in the head node), the effect would be to delete the nodes in the opposite order from the order they were inserted — much like a stack data structure.

def delete_after(self) -> None:
    if self.next is None:
        return

    self.next = self.next.next

3.6.1. Complexity

  • Time: \(O(1)\).
  • Space: \(O(1)\).

4. Tests

The from __future__ import annotations line is to enable self-referencing type annotations. See 5.

🔗
from __future__ import annotations
from hypothesis import given, strategies as st
import unittest
import random

from .linked_list import LinkedList

class Test(unittest.TestCase):
    test_cases

if __name__ == "__main__":
    unittest.main(exit=False)

4.1. Initialization

Here are some basic checks regarding initialization.

def test_init_empty(self):
    linked_list = LinkedList()
    self.assertEqual(linked_list.size(), 0)

def test_init_singleton(self):
    linked_list = LinkedList(1)
    self.assertEqual(linked_list.size(), 1)

def test_init_multiple(self):
    linked_list = LinkedList(1, 2, 3, 4, 5)
    self.assertEqual(linked_list.size(), 5)
    # Head node holds nothing.
    self.assertEqual(linked_list.elt, None)
    self.assertEqual(linked_list.next.elt, 1)

4.2. Insertion

We check that insertion results in items being prepended to the list. So the head node always points to the last-inserted node (or is the last-inserted node itself if we are using head nodes that hold data).

def test_insertion_head_without_data(self):
    linked_list = LinkedList(1, 2, 3)
    self.assertEqual(linked_list, LinkedList(1, 2, 3))

@given(st.lists(st.integers(min_value=1, max_value=100),
                min_size=0,
                max_size=20))
def test_insertion_random(self, given_elts: list[int]):
    linked_list = LinkedList(*given_elts)
    self.assertEqual(linked_list.size(), len(given_elts))

4.3. Size

def test_size(self):
    cases = [
        ([], 0),
        ([1], 1),
        ([1, 2], 2),
    ]
    for elts, expected in cases:
        linked_list = LinkedList(*elts)
        self.assertEqual(linked_list.size(), expected,
                         msg=f'{expected=}')

4.4. Equality

def test_equality(self):
    cases = [
        ([], [], True),
        ([1], [], False),
        ([1], [1], True),
        ([1], [1, 2], False),
    ]
    for xs, ys, expected in cases:
        x = LinkedList(*xs)
        y = LinkedList(*ys)
        self.assertEqual(x == y, expected,
                         msg=f'{x=} {y=}')

4.5. Lookup

def test_lookup(self):
    cases = [
        ([], None, lambda exp: exp is None),
        ([None], None, lambda exp: exp.elt is None),
        ([], 0, lambda exp: exp is None),
        ([], 1, lambda exp: exp is None),
        ([1], 1, lambda exp: exp.elt == 1),
        ([1, 2], 2, lambda exp: exp.elt == 2),
        ([1, 2], 3, lambda exp: exp is None),
    ]
    for elts, key, expected_func in cases:
        linked_list = LinkedList(*elts)
        self.assertTrue(expected_func(linked_list.lookup(key)),
                         msg=f'{key=}')

4.6. Deletion

def test_deletion_basic(self):
    linked_list = LinkedList(1, 2, 3)
    # If we delete element 1, we're gonna be pointing at 2, then 3.
    linked_list.delete_after()
    self.assertEqual(linked_list, LinkedList(2, 3))

@given(st.lists(st.integers(min_value=1, max_value=100),
                min_size=1,
                max_size=20))
def test_deletion_random(self, given_elts: list[int]):
    linked_list = LinkedList(*given_elts)
    deletions = random.randint(0, len(given_elts))
    for _ in range(deletions):
        linked_list.delete_after()
    self.assertEqual(linked_list.size(), len(given_elts) - deletions)

def test_deletion_nop(self):
    linked_list = LinkedList()
    linked_list.delete_after()
    self.assertEqual(linked_list.size(), 0)

5. Export

This linked list library is used in other problems. So we export it here to a separate reusable file (aka Python module).

The line from __future__ import annotations is so that the class methods can refer to itself (the enclosing class). See this discussion.

from __future__ import annotations
from typing import Any, Optional
code

6. References

Aziz, A., Lee, T.-H., & Prakash, A. (2018). Elements of Programming Interviews in Python: The Insiders’ Guide. CreateSpace Independent Publishing Platform (25 July. 2018).