Codex
1. Introduction
This project aims to capture my notes for studying various programming problems (data structures and algorithms). Have a look at the problems in Section 2.
For those curious about how these notes were created, check out Section 4.
As for why this project is named Codex, I just think it's a cool word that has the word "code" in it as a substring.
1.1. How to read this document
Please do not use the GitHub-rendered view of this file, as many things like links and citations simply do not work. Instead go to https://funloop.org/codex for the best experience, on a desktop or laptop screen (mobile devices don't work very well, not to mention the missing Table of Contents sidebar).
2. Problems
Every problem gets its own README.org
file in its own subfolder. The solutions
are all in Python. All solutions are "standalone" in that none of them use any
libraries other than what's provided by Python's standard libraries.
The problems are drawn mainly from [1]. Other reference materials are cited where applicable. Below is a table of every problem, with tags that give a brief description of each one, and references.
2.1. Data structures
Some problems can only be solved in an elegant way if we use particular data structures. See below for introductory discussions about some of these.
Name | Tags | References |
---|---|---|
Linked list | linked list | [1, p. 91] |
Stack (with "max" method) | stack | [1, p. 106] |
Binary tree | tree | [1, p. 123] |
Binary search tree | tree | [4, p. 396] |
Heap | tree, heap, priority queue | [4, p. 308] |
2.2. Appendix
- Python tricks
- There are some Python-language-specific tricks available for programming problems. You might want to skim over this if your Python skills are rusty.
- Mathematics
- Some (some would argue all) topics in programming have mathematical underpinnings.
3. Tests
Dependency | Why |
---|---|
Ruff | for linting |
Mypy | for enforcing type hints |
Hypothesis | for property-based tests |
All solutions to the problems are implemented in Python, and tested with basic unit tests and the Hyothesis property-based testing framework. Each problem's discussion comes with its own test suite. All source code samples are linted as well with ruff and mypy. Testing has been extremely valuable in checking the correctness of the puzzle solutions collected in this work.
4. Literate Programming Build System
We use Lilac for literate programming. To build everything just run make
shell
to drop inside the development environment shell then run make
inside
it.
4.1. Weaving (generating the docs)
PROJ_ROOT := $(shell git rev-parse --show-toplevel) LILAC_ROOT := $(PROJ_ROOT)/deps/elisp/lilac PROCS := $(shell nproc) define run_emacs LILAC_ROOT=$(LILAC_ROOT) emacs $(2) --quick --batch --kill \ --load $(LILAC_ROOT)/lilac.el \ --load $(PROJ_ROOT)/codex.el \ --eval="$(1)" endef define run_emacs_nobatch LILAC_ROOT=$(LILAC_ROOT) emacs $(2) --quick --kill \ --load $(LILAC_ROOT)/lilac.el \ --load $(PROJ_ROOT)/codex.el \ --eval="$(1)" endef src_problem = $(shell find problem/ -type f -name 'README.org') woven_html = \ $(patsubst \ problem/%/README.org, \ problem/%/README.html, \ $(src_problem)) \ appendix/mathematics/README.html \ appendix/python_tricks/README.html \ README.html problem_dirs = $(shell find problem -maxdepth 1 -type d | sort | tail -n+2) problem_dirs_without_prefix = $(subst problem/,,$(problem_dirs)) test_dirs = $(shell find . -type f -name '__init__.py' \ | sed 's|/__init__.py||') # All problem dirs with Tikz-generated images. image_dirs = $(shell find . -type f -name 'images.org' \ | sed -e 's|/[^/]*$$||' -e 's|^problem/||' | sort -u) all_tests_verified = $(patsubst %.py, %.py.verified, \ $(shell find . -type f -name 'test.py')) define weave_org $(1): $(2) build-literate.org \ $(shell find $(3) -type f -name 'images.org' \ | xargs grep 'img.pdf' \ | sed \ -e 's|images.org:#+header: :file ||' \ -e 's|.img.pdf|.svg|') @echo weaving $(2) $(call run_emacs,(lilac-publish),$(2)) endef all: check.verified weave .PHONY: all Makefile-tangle weave: $(woven_html) touch weave README.html: build-literate.org README.org citations.bib $(call run_emacs,(lilac-publish),README.org) $(foreach d,$(problem_dirs_without_prefix),\ $(eval $(call weave_org,\ problem/$(d)/README.html,\ problem/$(d)/README.org,problem))) appendix/mathematics/README.html: appendix/mathematics/README.org \ appendix/mathematics/twos-complement.org $(call run_emacs,(lilac-publish),appendix/mathematics/README.org) appendix/python_tricks/README.html: appendix/python_tricks/README.org $(call run_emacs,(lilac-publish),appendix/python_tricks/README.org) check.verified: lint.verified test touch check.verified test: $(all_tests_verified) touch test lint.verified: \ mypy.verified \ ruff.verified \ spellcheck.verified \ linelength.verified \ linkcheck.verified touch lint.verified mypy.verified: $(all_tests_verified) mypy problem appendix touch mypy.verified ruff.verified: $(all_tests_verified) ruff problem appendix touch ruff.verified Makefile-spellcheck Makefile-linelength Makefile-linkcheck # Enter development environment. shell: nix-shell --pure Makefile-update-deps
4.1.1. Use lilac.theme
file
First we track Lilac as a submodule within this project, and then symlink to the CSS, JS, and theme files to the project root. Symlinking adds a level of indirection such that if we ever decide to move around the submodule location to somewhere else, we won't have to update all of our Org files and can instead just update these symlinks.
Then in all of our published Org files, we do
to get the CSS/JS that comes with Lilac.
4.1.2. Custom CSS and HTML <head> content
We tweak Lilac's default CSS a bit.
Make all HTML files we generate try to pull in a local file called
codex.css
, as well as a custom font.
; See https://stackoverflow.com/a/27285582/437583. (setq lilac-html-head (concat "<link rel=\"stylesheet\" type=\"text/css\" href=\"codex.css\" />\n" "<link rel=\"stylesheet\" href=" "\"https://fonts.googleapis.com/css2" "?family=Bungee+Shade:wght@400" "\">" ))
4.1.2.1. Main CSS file
The default codex.css
file has some miscellaneous customizations.
4.1.2.1.1. Title font
This makes the title font bigger and uses a more ornate font for it.
h1.title { font-family: "Bungee Shade", serif; font-size: 100pt; } @media (any-pointer: coarse) { h1.title { font-family: "Bungee Shade", serif; font-size: 60pt; } }
4.1.2.1.2. Tables
table.monospace-except-header { font-family: monospace; } table.monospace-except-header thead { font-family: var(--font-serif); }
4.1.2.2. Secondary CSS file
This is a second CSS file where we customize the title text of the pages for the
problems. We just (manually) create a symlink from the problem page's
codex.css
to this one (because we don't want to use the regular codex.css
from above).
h1.title { font-family: "Source Serif Pro", serif; } codex-css-tables
4.1.3. Ignore woven HTML from git diff
Typically we only need to look at the rendered HTML output in a web browser as
the raw HTML diff output is extremely difficult to parse as a human. So by
default we ask Git to exclude it from git diff
by treating them as binary
data.
* -diff **/*.org diff **/.gitattributes diff **/.gitmodules diff **/.gitignore diff package/nix/sources.json diff COPYRIGHT diff LICENSE diff
In order to still show the HTML textual diff, we can run git diff --text
.
4.1.3.1. git add -p
Note that the above setting to treat HTML files as binary data prevents them
from being considered for git add -p
. In order to add them, use git add -u
instead.
4.1.4. gitignore
**/__pycache__ **/*.auctex-auto **/*.hypothesis **/*.pdf **/*.log **/*.py.verified check.verified lint.verified linkcheck.verified linelength.verified mypy.verified ruff.verified spellcheck.verified test update-deps weave
4.2. Tangling (generating the source code)
Tangling is simply the act of collecting the #+begin_src ... #+end_src
blocks
and arranging them into the various target (source code) files. Every source
code block is given a unique name.
We simply tangle all major *.org
files in the toplevel Makefile.
build_literate_tangled = \ .gitattributes \ .gitignore \ _typos.toml \ codex.css \ codex.problem.css \ codex.el \ Makefile \ shell.nix $(build_literate_tangled) &: build-literate.org $(call run_emacs,(org-babel-tangle),build-literate.org) touch $(build_literate_tangled) citations.bib: README.org $(call run_emacs,(org-babel-tangle),README.org) touch citations.bib define generate_img_pdfs $(shell grep 'img.pdf' $(1) \ | sed \ -e 's|^|$(1):|' \ -e 's|images.org:#+header: :file ||') &: $(1) @echo generating images from $(1) $(call run_emacs,(org-html-export-to-html),$(1)) rm -f $(patsubst %/images.org, %/images.html, $(1)) endef define generate_img_svgs $(1:.img.pdf=.svg): $(1) @echo generating svg from $(1) @echo $(shell pwd) pdf2svg $(1) $(1).uncropped.svg inkscape \ --export-plain-svg \ --export-margin=5 \ --export-filename=$(1:.img.pdf=.svg) \ --export-area-drawing \ $(1).uncropped.svg rm $(1).uncropped.svg endef $(foreach p,$(image_dirs),\ $(eval $(call generate_img_pdfs,\ $(p)/images.org))) all_img_pdfs = $(shell find . -type f -name 'images.org' \ | xargs grep 'img.pdf' \ | sed 's|images.org:#+header: :file ||') $(foreach img,$(all_img_pdfs),\ $(eval $(call generate_img_svgs,\ $(img)))) define tangle_tests $(1)/__init__.py $(1)/test.py &: $(1)/README.org @echo tangling $(1)/README.org $(call run_emacs,(org-babel-tangle),$(1)/README.org) find $(1) -type f -name '*.py' \ -execdir sed -i 's/[[:blank:]]*$$$$//' {} + endef # See https://stackoverflow.com/a/9694782/437583. $(foreach d,$(test_dirs),\ $(eval $(call tangle_tests,$(d)))) define verify_tests $(1)/test.py.verified: $(1)/test.py python -m unittest discover \ --failfast --start-directory $(1) \ --top $(shell echo $(1) | sed -e 's|./||' -e 's|/.\+||') touch $(1)/test.py.verified endef $(foreach d,$(test_dirs),\ $(eval $(call verify_tests,$(d))))
4.3. Linting
4.3.1. Spell checker
We use typos-cli to check for spelling errors. Below we configure it to only check the original source material — Org files.
[files] extend-exclude = [ "*.html", "deps/*", ]
Here we have the Makefile rules for linting, which include this spellchecker.
ORG_FILES = $(shell find . -type f -name '*.org') spellcheck.verified: $(ORG_FILES) typos touch spellcheck.verified
4.3.2. Detect long lines
For code we tangle, we want lines to be roughly 80 characters. This limit is actually a bit difficult to enforce because sometimes the source code blocks we edit get placed into an indented location, and from the source code block itself we cannot tell how much this indentation is exactly. So set the maximum line length for tangled text to be 90 characters.
We have to wrap the find ...
invocation with || true
because xargs
will
exit with 123
if the last grep
call can't find a match. That is, in our case
not finding a match is a good thing but the inner grep
doesn't know that.
This code detects which files to look at by looking at lines in Org files that
start with #+begin_src
and end with :tangle foo
, where foo
is the last
word in the line.
linelength.verified: $(ORG_FILES) `find . -type f -name '*.org' \ | grep -v '^./deps' \ | xargs grep '^#+begin_src.\+ :tangle ' \ | sed 's,[^/]\+.org.\+:tangle ,,' \ | grep -v citations.bib \ | xargs grep -n '^.\{90\}' > linelength_offenders.log` || true test `wc --bytes linelength_offenders.log | cut -d\ -f1` -eq 0 touch linelength.verified
4.3.3. Link checker
HTML_FILES = $(shell find . -type f -name '*.html' | grep -v '^./deps') linkcheck.verified: $(HTML_FILES) lychee --offline $(HTML_FILES) touch linkcheck.verified
4.4. Development environment (Nix shell)
This is taken from https://github.com/tweag/haskell-stack-nix-example/blob/b9383e35416a2b0e21fbc97ed079538f9f395b6a/shell.nix#L1.
This is the main development shell and brings in all of our dependencies to build all of our code. It's great for development and testing things out (such as running "make" to re-run any Python tests that have been updated when adding new problems).
let # Nixpkgs snapshot. sources = import ./package/nix/sources.nix; # The final "pkgs" attribute with all the bells and whistles of our overlays. pkgs = import sources.nixpkgs {}; # This minimalist latex setup is adapted from https://nixos.wiki/wiki/TexLive. tex_for_orgmode = (pkgs.texlive.combine { # Start with scheme-basic. inherit (pkgs.texlive) scheme-basic # Add in additional TeX packages (think CTAN package names). wrapfig amsmath ulem hyperref capt-of # TikZ. pgf xkeyval fontspec tikz-qtree # Source Sans Pro font. sourcesanspro # Source Code Pro font. sourcecodepro ; }); in # This is our development shell. pkgs.mkShell ({ buildInputs = [ # Tangling and weaving for Literate Programming. pkgs.emacs29-nox # Diagrams. pkgs.inkscape pkgs.pdf2svg tex_for_orgmode # Misc pkgs.git pkgs.less # Update deps (bootstrap). pkgs.niv pkgs.nix pkgs.cacert # Spell checking. pkgs.typos # Link checker. pkgs.lychee # Python testing and linting. pkgs.python3Packages.hypothesis pkgs.python3Packages.mypy pkgs.ruff ]; })
4.4.1. Update Nix dependencies
This is based on Lilac's own code for updating Nix dependencies with niv
.
nixpkgs_stable_channel := nixos-23.11 update-deps: package/nix/sources.json package/nix/sources.nix cd package && niv update nixpkgs --branch $(nixpkgs_stable_channel) cd package && niv update touch update-deps
4.5. Elisp
(setq org-cite-csl-styles-dir (concat (getenv "LILAC_ROOT") "/deps/styles/")) (setq org-latex-pdf-process '("lualatex --shell-escape --interaction nonstopmode --output-directory=%o %f")) codex-html-head