Codex
1. Introduction
This project aims to capture my notes for studying various programming problems (data structures and algorithms). Have a look at the problems in Section 2.
For those curious about how these notes were created, check out Section 4.
As for why this project is named Codex, I just think it's a cool word that has the word "code" in it as a substring.
1.1. How to read this document
Please do not use the GitHub-rendered view of this file, as many things like links and citations simply do not work. Instead go to https://funloop.org/codex for the best experience, on a desktop or laptop screen (mobile devices don't work very well, not to mention the missing Table of Contents sidebar).
2. Problems
Every problem gets its own README.org file in its own subfolder. The solutions are all in Python. All solutions are "standalone" in that none of them use any libraries other than what's provided by Python's standard library.
The problems are drawn mainly from [1]. Other reference materials are cited where applicable. Below is a table of every problem, with tags that give a brief description of each one, and references.
2.1. Data structures
Some problems can only be solved in an elegant way if we use particular data structures. See below for introductory discussions about some of these.
Name | Tags | References |
---|---|---|
Linked list | linked list | [1, p. 91] |
Stack (with "max" method) | stack | [1, p. 106] |
Binary tree | tree | [1, p. 123] |
Binary search tree | tree | [4, p. 396] |
Heap | tree, heap, priority queue | [4, p. 308] |
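As a taste of what these discussions cover, here is a minimal sketch of the "stack with max" idea from the table above. It is not the solution from the linked problem page; the class name MaxStack and its storage layout are assumptions made for illustration. Each entry remembers the maximum of everything at or below it, so the maximum can be read off in constant time.

```python
from __future__ import annotations


class MaxStack:
    """Illustrative stack that reports its current maximum in O(1) time."""

    def __init__(self) -> None:
        # Each entry is (value, maximum of the stack up to and including it).
        self._items: list[tuple[int, int]] = []

    def push(self, value: int) -> None:
        # Inside a method, max() still refers to the builtin, not the
        # method of the same name defined below.
        if self._items:
            current_max = max(value, self._items[-1][1])
        else:
            current_max = value
        self._items.append((value, current_max))

    def pop(self) -> int:
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()[0]

    def max(self) -> int:
        if not self._items:
            raise IndexError("max of empty stack")
        return self._items[-1][1]
```

The trade-off is one extra integer of storage per entry in exchange for never having to rescan the stack after a pop.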
2.2. Appendix
- Python tricks
- There are some Python-language-specific tricks available for programming problems. You might want to skim over this if your Python skills are rusty.
- Mathematics
- Some (some would argue all) topics in programming have mathematical underpinnings.
3. Tests
Dependency | Why |
---|---|
Ruff | for linting |
Mypy | for enforcing type hints |
Hypothesis | for property-based tests |
All solutions to the problems are implemented in Python, and tested with basic unit tests and the Hypothesis property-based testing framework. Each problem's discussion comes with its own test suite. All source code samples are also linted with Ruff and Mypy. Testing has been extremely valuable in checking the correctness of the puzzle solutions collected in this work.
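To give a rough idea of what a property-based test checks, here is a sketch that uses only the standard library's random module as a stand-in for Hypothesis (which generates and shrinks inputs far more cleverly). The function under test, insertion_sort, and the helper check_sort_properties are hypothetical and do not come from this project's test suites.

```python
from __future__ import annotations

import random
from collections import Counter


def insertion_sort(items: list[int]) -> list[int]:
    """Hypothetical solution under test: return a sorted copy of items."""
    result: list[int] = []
    for item in items:
        # Skip past every element that is less than or equal to item.
        pos = 0
        while pos < len(result) and result[pos] <= item:
            pos += 1
        result.insert(pos, item)
    return result


def check_sort_properties(trials: int = 200, seed: int = 0) -> int:
    """Check two properties on random inputs; return the trials run."""
    rng = random.Random(seed)
    for _ in range(trials):
        items = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        output = insertion_sort(items)
        # Property 1: the output is in non-decreasing order.
        assert all(a <= b for a, b in zip(output, output[1:]))
        # Property 2: the output is a permutation of the input.
        assert Counter(output) == Counter(items)
    return trials
```

Rather than asserting on a handful of hand-picked inputs, the check states what must hold for any input and lets the generator look for counterexamples.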
4. Literate Programming Build System
We use Lilac for literate programming. To build everything, run make shell to drop into the development environment shell, then run make inside it.
4.1. Weaving (generating the docs)
4.1.1. Use lilac.theme file
First we track Lilac as a submodule within this project, and then symlink the CSS, JS, and theme files into the project root. Symlinking adds a level of indirection: if we ever decide to move the submodule somewhere else, we won't have to update all of our Org files and can instead just update these symlinks.
Then in all of our published Org files, we do
to get the CSS/JS that comes with Lilac.
4.1.2. Custom CSS and HTML <head> content
We tweak Lilac's default CSS a bit.
We make every HTML file we generate try to pull in a local file called codex.css, as well as a custom font.
4.1.2.1. Main CSS file
The default codex.css file has some miscellaneous customizations.
4.1.2.1.1. Title font
This makes the title font bigger and uses a more ornate font for it.
4.1.2.1.2. Tables
4.1.2.2. Secondary CSS file
This is a second CSS file where we customize the title text of the pages for the problems. We just (manually) create a symlink from the problem page's codex.css to this one (because we don't want to use the regular codex.css from above).
4.1.3. Ignore woven HTML from git diff
Typically we only need to look at the rendered HTML output in a web browser, as the raw HTML diff output is extremely difficult for a human to read. So by default we ask Git to exclude HTML files from git diff by treating them as binary data.
In order to still show the HTML textual diff, we can run git diff --text.
4.1.3.1. git add -p
Note that the above setting to treat HTML files as binary data prevents them from being considered for git add -p. In order to add them, use git add -u instead.
4.1.4. gitignore
4.2. Tangling (generating the source code)
Tangling is simply the act of collecting the #+begin_src ... #+end_src blocks and arranging them into the various target (source code) files. Every source code block is given a unique name.
We simply tangle all major *.org files in the toplevel Makefile.
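To make the idea of tangling concrete, here is a toy sketch in Python. It is not how Org mode or Lilac actually tangles: it ignores block names and noweb references entirely, and simply concatenates, per target, the bodies of blocks whose header carries a :tangle argument. The function name tangle and both regular expressions are assumptions made for illustration.

```python
from __future__ import annotations

import re
from collections import defaultdict

# Header of a source block that names a tangle target (case-insensitive).
BEGIN_RE = re.compile(r"^\s*#\+begin_src\b.*?:tangle\s+(\S+)", re.IGNORECASE)
END_RE = re.compile(r"^\s*#\+end_src\b", re.IGNORECASE)


def tangle(org_text: str) -> dict[str, str]:
    """Collect the bodies of tangled source blocks, grouped by target."""
    targets: dict[str, list[str]] = defaultdict(list)
    current: str | None = None  # target of the block we are inside, if any
    for line in org_text.splitlines():
        if current is None:
            match = BEGIN_RE.match(line)
            if match:
                current = match.group(1)
        elif END_RE.match(line):
            current = None
        else:
            targets[current].append(line)
    return {name: "\n".join(lines) + "\n" for name, lines in targets.items()}
```

Blocks without a :tangle argument are skipped, and several blocks that share a target are appended in document order.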
4.3. Linting
4.3.1. Spell checker
We use typos-cli to check for spelling errors. Below we configure it to only check the original source material — Org files.
Here we have the Makefile rules for linting, which include this spellchecker.
4.3.2. Detect long lines
For code we tangle, we want lines to be roughly 80 characters. This limit is actually a bit difficult to enforce because sometimes the source code blocks we edit get placed into an indented location, and from the source code block itself we cannot tell exactly how deep this indentation is. So we set the maximum line length for tangled text to 90 characters, leaving some slack for that indentation.
We have to wrap the find ... invocation with || true because xargs will exit with 123 if the last grep call can't find a match. That is, in our case not finding a match is a good thing but the inner grep doesn't know that.
This code detects which files to look at by looking at lines in Org files that start with #+begin_src and end with :tangle foo, where foo is the last word in the line.
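The detection rule and the 90-character limit can be sketched in Python as follows. This is only an illustration of the logic; the actual Makefile rule is built from find, xargs, and grep, and the names tangle_targets, long_lines, and MAX_LINE_LENGTH are assumptions introduced here.

```python
from __future__ import annotations

import re

MAX_LINE_LENGTH = 90  # the limit stated above for tangled text

# A line that starts with "#+begin_src" and ends with ":tangle <target>".
TANGLE_RE = re.compile(r"^#\+begin_src\b.*:tangle\s+(\S+)\s*$", re.IGNORECASE)


def tangle_targets(org_text: str) -> list[str]:
    """Return the distinct tangle targets named in org_text, in order."""
    seen: dict[str, None] = {}  # a dict keeps insertion order
    for line in org_text.splitlines():
        match = TANGLE_RE.match(line)
        if match:
            seen.setdefault(match.group(1))
    return list(seen)


def long_lines(text: str, limit: int = MAX_LINE_LENGTH) -> list[tuple[int, int]]:
    """Return (line number, length) for every line longer than limit."""
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]
```

Here an empty result from long_lines is the good outcome, which mirrors why the shell version has to tolerate a grep that finds nothing.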
4.3.3. Link checker
4.4. Development environment (Nix shell)
This is taken from https://github.com/tweag/haskell-stack-nix-example/blob/b9383e35416a2b0e21fbc97ed079538f9f395b6a/shell.nix#L1.
This is the main development shell and brings in all of our dependencies to build all of our code. It's great for development and testing things out (such as running "make" to re-run any Python tests that have been updated when adding new problems).
4.4.1. Update Nix dependencies
This is based on Lilac's own code for updating Nix dependencies with niv.