Linus’s Blog (http://funloop.org/atom.xml), by Linus Arver

Bresenham's Circle Drawing Algorithm
http://funloop.org/post/2021-03-15-bresenham-circle-drawing-algorithm.html
2021-03-15
programming, math

Once upon a time I was given the following problem for a technical programming interview:

Write a function draw_circle(r) that draws a circle with radius r. Use the given method draw_pixel(x, y) which takes a 2-dimensional point (x, y) and colors it in on the computer screen.

For the solution, you can either collect all pixels (tuples) of $$x$$ and $$y$$ coordinate pairs, or just call draw_pixel() on them during the “search” for those pixels that must be filled in.

This post goes over several solutions, ultimately arriving at Bresenham’s algorithm. The content of this post is merely a distillation of Section 3.3 from the book “Computer Graphics: Principles and Practice (1996)”.1 The authors of the book state that their implementation results in code “essentially the same as that specified in patent 4,371,933 [a.k.a. Bresenham’s algorithm].”2

I’ve gone all out and translated the “reference” implementations found in the book into Rust and Python. The Python was written first, and I used a text-based drawing system to test the correctness. However, I became dissatisfied with the non-square “aspect ratio” of most monospaced fonts out there, which distorted the circles to look more like ellipses. To fix this, I decided to port the Python code to Rust, and then target WASM so that I could use it to draw on HTML5 <canvas> elements (and eliminate the “aspect ratio” problem). All of the drawings in this document are powered by the Rust code.

Constraints

Drawable canvas

Before we start, let’s define the drawable surface (canvas) of pixels for this problem. The pixels are arranged in a 2-dimensional grid. The important thing here is the grid or coordinate system, with the pixel at the center of the grid having the traditional (0, 0) Cartesian coordinate.

Below is a sample grid to give you a sense of what this will look like. There is a central (0, 0) origin pixel, and 15 pixels to the north, south, east, and west, and everything in-between. Pixels that lie on interesting points of symmetry are highlighted in green.
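A text-based renderer, along the lines of the one mentioned in the introduction, can make this grid concrete. The sketch below is only an assumption about how such a helper might look; the names `render_grid` and `half_width` are hypothetical and do not come from the original code. It maps Cartesian coordinates onto rows of characters with the origin at the center.

```python
def render_grid(points, half_width):
    """Render (x, y) points as text, with (0, 0) at the center of the grid.

    `half_width` is the number of pixels from the origin to each edge
    (15 for the sample grid described above). Hypothetical helper.
    """
    lit = set(points)
    rows = []
    # Walk y from the top (+half_width) to the bottom (-half_width) so that
    # "north" is printed first.
    for y in range(half_width, -half_width - 1, -1):
        row = "".join(
            "#" if (x, y) in lit else "."
            for x in range(-half_width, half_width + 1)
        )
        rows.append(row)
    return "\n".join(rows)


# Light up the origin and the four axis endpoints of a tiny 5x5 grid.
print(render_grid([(0, 0), (0, 2), (0, -2), (2, 0), (-2, 0)], 2))
```

As the introduction notes, most monospaced fonts are taller than they are wide, so text output like this looks vertically stretched.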

Mathematical definitions

The exact definition of a circle (given infinite precision, as on the traditional Cartesian plane) centered at the origin is

$\begin{equation} \label{eq:circle} x^2 + y^2 = r^2. \end{equation}$

This resembles the Pythagorean Theorem

$a^2 + b^2 = c^2,$

for any right-angled triangle with sides $$a$$ and $$b$$ and hypotenuse $$c$$. The resemblance is not a coincidence, because an infinite number of such triangles exists within the top right quadrant of the plane (that is, Quadrant I3, or the part of the plane such that $$x \geq 0$$ and $$y \geq 0$$); in Quadrant I, for every point $$(x,y)$$ on this portion (or arc) of the circle, the radius drawn to that point is the hypotenuse of a right triangle whose sides are $$x$$ and $$y$$. Later in this post, this will become relevant again when we discuss Pythagorean Triples.

Anyway, solving for $$y$$ in Equation $$\ref{eq:circle}$$ gives

$\begin{equation} \label{eq:circle-y} y = \pm\sqrt{r^2 - x^2} \end{equation}$

to get 2 functions for the top-half and bottom-half of the circle (that’s what the $$\pm$$ symbol means). Consider the function $$y = x$$. This function has slope 1 and is a diagonal line where all values of $$x = y$$. Now consider how this line intersects the quarter-arc of the circle in Quadrant I. This intersection point evenly divides the arc into 2 halves, and is where

$x = y = \tfrac{r}{\sqrt{2}},$

or simply the point

$\begin{equation} (\tfrac{r}{\sqrt{2}}, \tfrac{r}{\sqrt{2}}). \end{equation}$

This is because if $$x = y$$, then Equation $$\ref{eq:circle}$$ becomes

\begin{align} x^2 + y^2 &= r^2 \\ x^2 + x^2 &= r^2 \\ 2x^2 &= r^2 \\ \tfrac{2x^2}{2} &= \tfrac{r^2}{2} \\ x^2 &= \tfrac{r^2}{2} \\ \sqrt{x^2} &= \tfrac{\sqrt{r^2}}{\sqrt{2}} \\ x &= \tfrac{r}{\sqrt{2}}. \label{eq:arc-intersection} \end{align}

This is not that interesting for purposes of the algorithms in this post, but is something that is glossed over in the book.
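For readers who prefer a numeric sanity check over algebra, here is a small sketch that confirms the intersection point lies on both the line $$y = x$$ and the circle. The function name `arc_midpoint` is made up for this illustration.

```python
from math import isclose, sqrt


def arc_midpoint(r):
    """Return the point where the line y = x crosses the Quadrant I arc."""
    x = r / sqrt(2)
    return (x, x)


x, y = arc_midpoint(15)
# The point satisfies x^2 + y^2 = r^2, up to floating-point rounding.
print(x, y, isclose(x * x + y * y, 15 * 15))
```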

Symmetry

Because of symmetry, we can mirror the solution $$(x,y)$$ pairs we get in Quadrant I into the other quadrants. This gives us 4-way symmetry because there are 4 quadrants.

def mirror_points_4(x, y):
    """ Return 4-way symmetry of points. """
    return [( x,  y),
            (-x,  y),
            ( x, -y),
            (-x, -y)]

Note, however, that there is actually 8-way symmetry at hand because (1) we can swap $$x$$ and $$y$$, and (2) because of the way we can distribute the negative sign:

#  Point    Quadrant
1  ( x, y)  I
2  ( y, x)  I
3  (-x, y)  II
4  (-y, x)  II
5  (-x,-y)  III
6  (-y,-x)  III
7  ( x,-y)  IV
8  ( y,-x)  IV

def mirror_points_8(x, y):
    """ Return 8-way symmetry of points. """
    return [( x,  y),
            ( y,  x),
            (-x,  y),
            (-y,  x),
            ( x, -y),
            ( y, -x),
            (-x, -y),
            (-y, -x)]

Fun fact: the exact point at which $$x$$ and $$y$$ get “swapped” in Quadrant I is when $$x = y = \tfrac{r}{\sqrt{2}}$$ (Equation $$\ref{eq:arc-intersection}$$).
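One consequence of 8-way mirroring is that points on an axis or on the diagonal are mirrored onto themselves, so some of the 8 results coincide. The sketch below repeats `mirror_points_8()` so that it runs on its own; `unique_mirrors` is a hypothetical helper added only for this illustration.

```python
def mirror_points_8(x, y):
    """Same 8-way mirroring as above, repeated so this block runs alone."""
    return [( x,  y), ( y,  x), (-x,  y), (-y,  x),
            ( x, -y), ( y, -x), (-x, -y), (-y, -x)]


def unique_mirrors(x, y):
    """Return the distinct points produced by 8-way mirroring."""
    return sorted(set(mirror_points_8(x, y)))


print(len(unique_mirrors(3, 4)))  # a general point: all 8 are distinct
print(len(unique_mirrors(0, 5)))  # a point on an axis: only 4 are distinct
print(len(unique_mirrors(2, 2)))  # a point on the diagonal: only 4 are distinct
```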

Naive solutions

When in doubt, brute force is always a great answer, because at least it gets you started on something that works given enough time and/or memory.4 Because we already have clear mathematical definitions, we can just translate them (albeit mechanically) to code.

from math import isqrt

def get_circle_points_naive_4(r):
    """ Draw a circle by pairing up each X value with the Y value that lies on
    a circle with radius 'r'. This has a bug because some Y values get skipped.
    Can you see why?
    """
    points = []
    for x in range(r + 1):
        # isqrt() gets the integer square root.
        y = isqrt((r * r) - (x * x))
        points.extend(mirror_points_4(x, y))
    return points

get_circle_points_naive_4() is the simplest translation, although there is a bug, which is obvious when we visualize it (in this case, for $$r = 15$$):

The get_circle_points_naive_4() function is based on Equation $$\ref{eq:circle-y}$$. We iterate $$x$$ from $$0$$ to $$r$$ 5, and at each $$x$$ try to find the best value for $$y$$. The problem is that we solve for only 1 $$y$$ value per $$x$$ value. Near the left and right sides of the circle the arc is steep, so we would need more than 1 $$y$$ value for a single $$x$$ to keep the outline contiguous.6
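To see the skipped values without a drawing, we can list the $$y$$ chosen for each $$x$$ in Quadrant I and check which $$y$$ values never appear. This is a small sketch; `first_quadrant_ys` and `skipped_ys` are hypothetical helper names, not part of the reference code.

```python
from math import isqrt


def first_quadrant_ys(r):
    """Return the y chosen for each x in 0..r by the naive approach."""
    return [isqrt(r * r - x * x) for x in range(r + 1)]


def skipped_ys(r):
    """Return the y values in 0..r that no x column ever selects."""
    return sorted(set(range(r + 1)) - set(first_quadrant_ys(r)))


print(first_quadrant_ys(15))
print(skipped_ys(15))  # -> [1, 2, 3, 4, 6, 8]
```

All of the skipped rows lie in the lower part of the quadrant, which is the steep portion of the arc near the right side of the circle.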

The get_circle_points_naive_8() function gets around this $$y$$-skip bug by invoking 8-way symmetry instead:

def get_circle_points_naive_8(r):
    """ Better than get_circle_points_naive_4, but wastes CPU cycles because
    the 8-way symmetry overcorrects and we draw some pixels more than once.
    """
    points = []
    for x in range(r + 1):
        y = isqrt((r * r) - (x * x))
        points.extend(mirror_points_8(x, y))
    return points

However the downside is that it results in multiple points that will be drawn 2 times, wasting CPU cycles.7 To be more precise, all points around the gappy area in Quadrant I are redundant because that part of the arc is already mirrored nicely by the contiguous points from $$x = 0$$ to $$x = y$$.
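The amount of redundancy is easy to measure by comparing the number of generated points against the number of distinct ones. The following is a minimal sketch; `count_redundant_points` is a hypothetical name, and the mirroring helper is repeated so the block runs on its own.

```python
from math import isqrt


def mirror_points_8(x, y):
    """Same 8-way mirroring as shown earlier, repeated so this runs alone."""
    return [( x,  y), ( y,  x), (-x,  y), (-y,  x),
            ( x, -y), ( y, -x), (-x, -y), (-y, -x)]


def count_redundant_points(r):
    """Return (total, unique) point counts for the naive 8-way approach."""
    points = []
    for x in range(r + 1):
        y = isqrt(r * r - x * x)
        points.extend(mirror_points_8(x, y))
    return len(points), len(set(points))


total, unique = count_redundant_points(15)
print(f"r=15: {total} points generated, {unique} distinct")
```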

The get_circle_points_naive_8_faster() function avoids drawing the gappy areas by just breaking the loop when $$x > y$$, but is otherwise the same:

def get_circle_points_naive_8_faster(r):
    """ Slightly faster than get_circle_points_naive_8, because of the break
    condition at the middle of the arc. However this is still inefficient due
    to the square root calculation with isqrt().
    """
    points = []
    for x in range(r + 1):
        y = isqrt((r * r) - (x * x))
        # When we cross the middle of the arc, stop, because we're already
        # invoking 8-way symmetry.
        if x > y:
            break
        points.extend(mirror_points_8(x, y))
    return points

This is the best we can do with the simple mathematical translations to code. Note that in all of these implementations we are still forced to calculate square roots in every iteration, which is certainly suboptimal.

Bresenham’s Algorithm

This is also known as the “Midpoint Circle Algorithm,” where the name “midpoint” comes from the mathematical calculations that are done by considering the midpoint between pixels. The gist of the algorithm is that, instead of using Equation $$\ref{eq:circle-y}$$ to calculate $$y$$ for every $$x$$, you move along the arc of the circle, pixel by pixel, staying as close as possible to the true arc:

1. Start out from the top of the circle (color in pixel $$(0, r)$$). Note that because of symmetry, we could start out from $$(0, -r)$$, $$(r, 0)$$, or even $$(-r, 0)$$ as Bresenham did in his paper.8
2. Move right (east (E)) or down-right (southeast (SE)), whichever is closer to the circle.
3. Stop when $$x \geq y$$, that is, when we reach or cross the middle of the arc (just like in get_circle_points_naive_8_faster()).

The hard part is Step 2, where we just need to figure out which direction to move (E or SE) from the current pixel. The brute force way here is to just calculate the distance away from the center of the circle for the E and SE pixels (using Euclidean distance, which is just a variation of Equation $$\ref{eq:circle}$$ or the Pythagorean Theorem), and just choose the pixel that is closest to the arc of the circle. This makes sense, but with the power of mathematics, we can do better.
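The brute-force choice described above might look like the following sketch. The function `closer_step` is hypothetical and is shown only for contrast with the midpoint technique developed next; note that it needs two square roots per step.

```python
from math import hypot


def closer_step(x, y, r):
    """Return 'E' or 'SE', whichever neighbor of (x, y) is closer to the arc.

    Distances are measured as |distance from origin - r|, which costs one
    square root (inside hypot) per candidate pixel.
    """
    dist_e = abs(hypot(x + 1, y) - r)
    dist_se = abs(hypot(x + 1, y - 1) - r)
    return "E" if dist_e < dist_se else "SE"


print(closer_step(0, 5, 5))  # from the top of the circle, E is closer
print(closer_step(3, 4, 5))  # SE lands on (4, 3), which is exactly on the circle
```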

Inside, on, or outside the circle?

Whether some point $$(x, y)$$ is inside, on, or outside of the circle follows from the definition of the circle in Equation $$\ref{eq:circle}$$. We can rewrite it as a function of any $$(x, y)$$ pair:

$\begin{equation} \label{eq:error-margin} F(x,y) = x^2 + y^2 - r^2 = \text{signed error from the true circle line}. \end{equation}$

Note that if $$F(x,y) = 0$$, then the point $$(x, y)$$ is exactly on the circle. If $$F(x,y) > 0$$, then the point is outside of the circle, and if $$F(x,y) < 0$$ then the point is inside of it. In other words, given any point $$(x, y)$$, the sign of $$F(x, y)$$ tells us which side of the true circle line the point is on, and its magnitude grows as the point moves away from that line. (Strictly speaking, $$F$$ is a difference of squared distances from the origin rather than a Euclidean distance, but only its sign matters for the algorithm.)

Choosing between E or SE

Let’s remind ourselves that we’ll always be moving E or SE. One critical (pragmatic) property here is that we’re dealing with a pixel grid with integer increments. There is a very high chance that neither the E nor the SE pixel we’re moving to is exactly on the circle. This is because, away from the axes, the only time that the point $$(x,y)$$ will exactly be on the line of the circle is if the $$x$$, $$y$$, and $$r$$ values (as integers) form a so-called Pythagorean Triple. For $$r < 100$$, there are only 50 such triples:

( 3, 4, 5)  (18,24,30)  (24,45,51)  (16,63,65)  (51,68,85)
( 6, 8,10)  (16,30,34)  (20,48,52)  (32,60,68)  (40,75,85)
( 5,12,13)  (21,28,35)  (28,45,53)  (42,56,70)  (36,77,85)
( 9,12,15)  (12,35,37)  (33,44,55)  (48,55,73)  (13,84,85)
( 8,15,17)  (15,36,39)  (40,42,58)  (24,70,74)  (60,63,87)
(12,16,20)  (24,32,40)  (36,48,60)  (45,60,75)  (39,80,89)
(15,20,25)  ( 9,40,41)  (11,60,61)  (21,72,75)  (54,72,90)
( 7,24,25)  (27,36,45)  (39,52,65)  (30,72,78)  (35,84,91)
(10,24,26)  (30,40,50)  (33,56,65)  (48,64,80)  (57,76,95)
(20,21,29)  (14,48,50)  (25,60,65)  (18,80,82)  (65,72,97)
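The list above can be reproduced by brute force. The sketch below is one straightforward way to do it (the name `pythagorean_triples` is chosen for this illustration); it requires $$a < b$$ so that each triple is counted once.

```python
from math import isqrt


def pythagorean_triples(limit):
    """Return all (a, b, c) with a < b < c < limit and a^2 + b^2 = c^2."""
    triples = []
    for c in range(1, limit):
        for a in range(1, c):
            b_squared = c * c - a * a
            b = isqrt(b_squared)
            # Keep only perfect squares, and require a < b to avoid
            # counting (3, 4, 5) and (4, 3, 5) separately.
            if b > a and b * b == b_squared:
                triples.append((a, b, c))
    return triples


triples = pythagorean_triples(100)
print(len(triples))  # -> 50
```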

In other words, for all practical purposes, there will always be some error and we’ll always be outside or inside the circle and never directly on it. It’s sort of like driving a car and trying to stay within your designated lane: if you think you’re moving too much to the right, you turn your wheel left to stay “within” the lane (or some acceptable amount within the lane), and vice versa.

The idea is the same for moving along the circle: if we think we’re moving too far outside the circle, we try to move into it. On the other hand, if we think we’re moving into the circle, we move out of it. And so imagine yourself standing on point $$(0, r)$$, our starting point. The line of the circle is our “lane” we want to stay “on” as much as possible. Choosing to go E is the same as turning “left”. Choosing to go SE is the same as turning “right”. Using this metaphor, if we were not to turn at all (go “straight”), we would be heading to the virtual “in-between” pixel between E and SE, the midpoint between them.

And so here’s the basic idea behind choosing E or SE:

1. If going “straight” would mean going into the circle (i.e., we’re currently veering too much to the right!), we course-correct by turning left (E).
2. Conversely, if going “straight” would mean going outside the circle (i.e., we’re currently veering too much to the left), we course-correct by turning right (SE).
3. Lastly, if going “straight” would mean staying exactly on the circle (we hit a Pythagorean Triple), we turn SE (from an engineering perspective it doesn’t really matter which way we turn in this case, as both E and SE result in some amount of error — although see “Final tweaks” below for a note on aesthetics).

Let’s convert this idea into pseudocode:

Let M be the midpoint (going "straight").

Then, F(M) tells us what direction we're headed relative to the true circle line.

If F(M) is < 0, we're moving "into" the circle (veering right), so turn left by moving E.

Otherwise move SE.

Note that we only have to calculate $$F(...)$$ for the midpoint $$M$$. Isn’t this cool? It is much better than calculating $$F(E)$$ and $$F(SE)$$ and having to compare them!

# This F() function is the same as the mathematical F(...) function
# discussed above (Equation 11).
def F(x, y, r):
    return (x * x) + (y * y) - (r * r)

def get_circle_points_bresenham_WIP1(r):
    points = []
    x = 0
    y = r
    # Calculate F(M) for the very first time. That is, if we were to go
    # "straight" from (0, r), would we be inside or outside the circle?
    # M is the midpoint between the E and SE pixels.
    xE, yE = (1, r)
    xSE, ySE = (1, r - 1)
    xM, yM = (1, r - 0.5)
    F_M = F(xM, yM, r)
    points.extend(mirror_points_8(x, y))
    while x < y:
        # If going straight would go "into" the circle (too much to the
        # right), try to move out of it by turning left by moving E.
        if F_M < 0:
            x += 1
            # Evaluate F at the next midpoint, (x + 1, y - 0.5), not at the
            # pixel we just moved to.
            F_M = F(x + 1, y - 0.5, r)
        # Otherwise move SE.
        else:
            x += 1
            y -= 1
            F_M = F(x + 1, y - 0.5, r)
        points.extend(mirror_points_8(x, y))
    return points

We can refactor the above slightly. We can simplify the initial calculation of F_M to avoid calling F(), and also move out some of the redundant bits. The very first midpoint we have to consider is $$(1, r - \tfrac{1}{2})$$; plugging this into $$F()$$ gets us

\begin{align} F(1, r - \tfrac{1}{2}) &= 1^2 + (r - \tfrac{1}{2})^2 - r^2 \\ &= 1 + (r^2 - r + \tfrac{1}{4}) - r^2 \\ &= 1 + r^2 - r^2 - r + \tfrac{1}{4} \\ &= 1 - r + \tfrac{1}{4} \\ &= \tfrac{5}{4} - r. \end{align}
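The algebra above can be double-checked with exact rational arithmetic. This is a small sketch using Python's `fractions` module; `first_midpoint_value` is a hypothetical name introduced here.

```python
from fractions import Fraction


def F(x, y, r):
    """The circle function F(x, y) = x^2 + y^2 - r^2."""
    return x * x + y * y - r * r


def first_midpoint_value(r):
    """Evaluate F at the very first midpoint, (1, r - 1/2), exactly."""
    return F(1, r - Fraction(1, 2), r)


for r in (1, 5, 15, 60):
    # Matches the closed form 5/4 - r derived above.
    assert first_midpoint_value(r) == Fraction(5, 4) - r
    print(r, first_midpoint_value(r))
```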

With that said, we can get this:

def get_circle_points_bresenham_WIP2(r):
    points = []
    x = 0
    y = r
    F_M = 5/4 - r
    points.extend(mirror_points_8(x, y))
    while x < y:
        # If going straight would go "into" the circle (too much to the
        # right), try to move out of it by turning left by moving E.
        if F_M < 0:
            pass
        # Otherwise move SE.
        else:
            y -= 1
        x += 1
        # Evaluate F at the next midpoint, (x + 1, y - 0.5).
        F_M = F(x + 1, y - 0.5, r)
        points.extend(mirror_points_8(x, y))
    return points

The annoying bit is the call to F(). Surprisingly, the call to F() can be eliminated entirely, because we can calculate it once, and then merely adjust it thereafter.

Calculate once, adjust thereafter

We can just calculate $$F(M)$$ once when we start out at $$(0, r)$$, and then just adjust it depending on whether we move E or SE. The key is that this “adjustment” computation is cheaper than calculating the full $$F(x,y)$$ function all over again.

Let $$M$$ be the midpoint $$(x + 1, y - \tfrac{1}{2})$$ between the E $$(x + 1, y)$$ and SE $$(x + 1, y - 1)$$ pixels. Then $$F(M)$$ is the result of going “straight” and tells us the direction we’re veering off from the circle line:

$\begin{equation} F(M) = F(x + 1, y - \tfrac{1}{2}) = (x + 1)^2 + (y - \tfrac{1}{2})^2 - r^2. \end{equation}$

The values for $$x$$ and $$y$$ are unknown; however, they change in only 2 possible ways: by moving E or SE!

If we move E, then $$M$$ will change from $$(x + 1, y - \tfrac{1}{2})$$ to $$(x + 2, y - \tfrac{1}{2})$$ because we add 1 to $$x$$ to move 1 pixel east; the new value of $$F(M)$$ at this pixel, which we can call $$F(M_E)$$, will then be:

$\begin{equation} F(M_{E}) = F(x + 2, y - \tfrac{1}{2}) = (x + 2)^2 + (y - \tfrac{1}{2})^2 - r^2. \end{equation}$

Now we can take the difference between these two full calculations. That is, if we were to move E, how would $$F(M)$$ change? Simple, we just look at the change in $$x$$ ($$\Delta_{x}$$) (we don’t care about the change in $$y$$ or $$r$$, because they stay constant in this case).

\begin{align} \Delta_{E} &= F(M_{E}) - F(M) \\ &= [(x + 2)^2 + (y - \tfrac{1}{2})^2 - r^2] - [(x + 1)^2 + (y - \tfrac{1}{2})^2 - r^2] \\ &= \Delta_{x} \\ &= (x + 2)^2 - (x + 1)^2 \label{eq:de1} \\ &= (x^2 + 4x + 4) - (x^2 + 2x + 1) \\ &= x^2 + 4x + 4 - x^2 - 2x - 1 \\ &= x^2 - x^2 + 4x - 2x + 4 - 1 \\ &= 2x + 3. \label{eq:de2} \end{align}

So at any given point $$(x, y)$$, if we move E, $$F(M)$$ will always change by $$2x + 3$$.

How about for moving SE? If we move SE, the new value of $$M$$ will change from $$(x + 1, y - \tfrac{1}{2})$$ to $$(x + 2, y - \tfrac{3}{2})$$ because we add 1 to $$x$$ and subtract 1 from $$y$$ to move 1 pixel southeast; the new value of $$F(M)$$ for this case, which we call $$F(M_{SE})$$, will then be:

$\begin{equation} F(M_{SE}) = F(x + 2, y - \tfrac{3}{2}) = (x + 2)^2 + (y - \tfrac{3}{2})^2 - r^2. \end{equation}$

We can do the same difference analysis here, but with the addition that we have to consider the change in $$y$$ ($$\Delta_{y}$$) as well (because of the 1 we subtracted from $$y$$):

\begin{align} \Delta_{SE} &= F(M_{SE}) - F(M) \\ &= [(x + 2)^2 + (y - \tfrac{3}{2})^2 - r^2] - [(x + 1)^2 + (y - \tfrac{1}{2})^2 - r^2] \\ &= \Delta_{x} + \Delta_{y} \\ &= [(x + 2)^2 - (x + 1)^2] + [(y - \tfrac{3}{2})^2 - (y - \tfrac{1}{2})^2] \\ &= (2x + 3) + [(y^2 - \tfrac{6y}{2} + \tfrac{9}{4}) - (y^2 - y + \tfrac{1}{4})] \\ &= (2x + 3) + (y^2 - 3y + \tfrac{9}{4} - y^2 + y - \tfrac{1}{4}) \\ &= (2x + 3) + (y^2 - y^2 - 3y + y + \tfrac{9}{4} - \tfrac{1}{4}) \\ &= (2x + 3) + (- 2y + \tfrac{8}{4}) \\ &= (2x + 3) + (-2y + 2) \\ &= 2x + 3 - 2y + 2 \\ &= 2x - 2y + 5 \\ &= 2(x - y) + 5. \label{eq:se1} \end{align}

And so when moving SE, the new $$F(M)$$ must change by $$2(x - y) + 5$$.
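Both first-order differences can be confirmed numerically by evaluating $$F$$ at the old and new midpoints and subtracting. The sketch below uses exact fractions to avoid any rounding questions; the helper names `F_mid`, `delta_e`, and `delta_se` are made up for this check.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def F(x, y, r):
    """The circle function F(x, y) = x^2 + y^2 - r^2."""
    return x * x + y * y - r * r


def F_mid(x, y, r):
    """F at the midpoint M between the E and SE neighbors of pixel (x, y)."""
    return F(x + 1, y - HALF, r)


def delta_e(x, y, r):
    """Change in F(M) after moving E, from (x, y) to (x + 1, y)."""
    return F_mid(x + 1, y, r) - F_mid(x, y, r)


def delta_se(x, y, r):
    """Change in F(M) after moving SE, from (x, y) to (x + 1, y - 1)."""
    return F_mid(x + 1, y - 1, r) - F_mid(x, y, r)


r = 15
for x in range(r + 1):
    for y in range(r + 1):
        assert delta_e(x, y, r) == 2 * x + 3
        assert delta_se(x, y, r) == 2 * (x - y) + 5
print("both differences match their closed forms")
```

Notice that neither difference depends on $$r$$, which is why the radius only appears in the initial value of $$F(M)$$.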

Now we have all the pieces to derive the complete algorithm!

def get_circle_points_bresenham_float_ese(r):
    """ Draw a circle using a floating point variable, F_M. Draw by moving E or
    SE."""
    points = []
    x = 0
    y = r
    # F_M is a float.
    F_M = 5 / 4 - r
    points.extend(mirror_points_8(x, y))
    while x < y:
        if F_M < 0:
            F_M += 2.0 * x + 3.0
        else:
            F_M += 2.0 * (x - y) + 5.0
            y -= 1
        x += 1
        points.extend(mirror_points_8(x, y))
    return points

Integer-only optimization

The initial value of F_M ($$F(M)$$) is $$\tfrac{5}{4} - r$$. Notice how this is the only place where we have to perform division in the whole algorithm. We can avoid this initial division (and subsequent floating point arithmetic) by initializing it to $$1 - r$$ instead, which is a difference of $$\tfrac{1}{4}$$ vs the original.

Because we tweaked the initialization by $$\tfrac{1}{4}$$, and every later adjustment is an integer, the variable F_M stays exactly $$\tfrac{1}{4}$$ below the true $$F(M)$$ from then on. That is, the comparison $$F(M) < 0$$ actually becomes a comparison of F_M against $$-\tfrac{1}{4}$$. However, this fractional comparison is unnecessary: F_M is now always an integer, and an integer is less than $$-\tfrac{1}{4}$$ exactly when it is less than $$0$$, so we can just keep the same comparison against $$0$$ as before. In other words, our algorithm only cares about whole numbers, so worrying about this extra $$\tfrac{1}{4}$$ difference is meaningless.

def get_circle_points_bresenham_integer_ese(r):
    """ Like get_circle_points_bresenham_float_ese, but F_M is an integer
    variable.
    """
    points = []
    x = 0
    y = r
    # F_M is an integer!
    F_M = 1 - r
    points.extend(mirror_points_8(x, y))
    while x < y:
        if F_M < 0:
            # We can use a bit-shift safely because 2*n is the same as n << 1
            # in binary, and also because F_M is an integer.
            F_M += (x << 1) + 3
        else:
            F_M += ((x - y) << 1) + 5
            y -= 1
        x += 1
        points.extend(mirror_points_8(x, y))
    return points
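If the argument about dropping the $$\tfrac{1}{4}$$ feels too hand-wavy, an empirical check is cheap: run the float and integer variants side by side and compare their output. The following is a compact sketch with shortened, hypothetical names (`points_float`, `points_integer`); it restates both loops so that it runs on its own.

```python
def mirror_points_8(x, y):
    """Same 8-way mirroring as shown earlier, repeated so this runs alone."""
    return [( x,  y), ( y,  x), (-x,  y), (-y,  x),
            ( x, -y), ( y, -x), (-x, -y), (-y, -x)]


def points_float(r):
    """Midpoint algorithm with F_M initialized to 5/4 - r (a float)."""
    x, y, f_m = 0, r, 5 / 4 - r
    points = mirror_points_8(x, y)
    while x < y:
        if f_m < 0:
            f_m += 2.0 * x + 3.0
        else:
            f_m += 2.0 * (x - y) + 5.0
            y -= 1
        x += 1
        points.extend(mirror_points_8(x, y))
    return points


def points_integer(r):
    """Same loop, but with F_M initialized to 1 - r (an integer)."""
    x, y, f_m = 0, r, 1 - r
    points = mirror_points_8(x, y)
    while x < y:
        if f_m < 0:
            f_m += 2 * x + 3
        else:
            f_m += 2 * (x - y) + 5
            y -= 1
        x += 1
        points.extend(mirror_points_8(x, y))
    return points


# Both variants should make the same E/SE decision at every step.
print(all(points_float(r) == points_integer(r) for r in range(100)))
```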

Second-order differences

There is a final optimization we can do.9 In the “Calculate once, adjust thereafter” section we avoided calculating $$F(M)$$ from scratch on every iteration. We can do the same thing for the differences themselves!

That is, we can avoid calculating $$\Delta_{E} = (2x + 3)$$ and $$\Delta_{SE} = 2(x - y) + 5$$ on every iteration, and instead just calculate them once and make adjustments to them, just like we did earlier for $$F(M)$$.

Let’s first consider how $$\Delta_{E} = 2x + 3$$ changes. First, we initialize $$\Delta_{E}$$ by plugging in $$(0, r)$$ into Equation $$\ref{eq:de2}$$, our starting point. Because there is no $$y$$ variable in here, we get an initial value of

$\begin{equation} \label{eq:de-2ord-initial} 2(0) + 3 = 3. \end{equation}$

If we go E, $$\Delta_{E}$$ changes like this:

\begin{align} \Delta_{E_{new}} = \Delta_{E_{(x+1,y)}} - \Delta_{E_{(x,y)}} &= [2(x+1) + 3] - (2x + 3) \label{eq:de-2ord-e} \\ &= 2x + 2 + 3 - 2x - 3 \\ &= 2x - 2x + 3 - 3 + 2 \\ &= 2. \label{eq:e2ord} \end{align}

If we go SE, $$\Delta_{E}$$ changes in the exact same way, because even though our new point is at $$(x+1, y-1)$$, there is no $$y$$ in $$\Delta_{E} = 2x + 3$$, so it doesn’t matter and $$\Delta_{E_{new}} = 2$$ again.

Now let’s consider how $$\Delta_{SE}$$ changes. For the initial value, we again plug in $$(0, r)$$ into $$2(x-y) + 5$$, to get

$\begin{equation} \label{eq:dse-2ord-initial} 2(0-r) + 5 = -2r + 5. \end{equation}$

If we go E, $$\Delta_{SE}$$ changes like this:

\begin{align} \Delta_{SE_{new}} = \Delta_{SE_{(x+1,y)}} - \Delta_{SE_{(x,y)}} &= [2((x + 1)-y) + 5] - [2(x - y) + 5] \label{eq:dse-2ord-e} \\ &= (2x + 2 - 2y + 5) - (2x - 2y + 5) \\ &= 2x - 2y + 7 - 2x + 2y - 5 \\ &= 2x - 2x + 2y - 2y + 7 - 5 \\ &= 2. \label{eq:se2ord1} \end{align}

If we go SE, $$\Delta_{SE}$$ changes like this:

\begin{align} \Delta_{SE_{new}} = \Delta_{SE_{(x+1,y-1)}} - \Delta_{SE_{(x,y)}} &= [2((x + 1)-(y - 1)) + 5] - [2(x - y) + 5] \label{eq:dse-2ord-se} \\ &= [2(x + 1 - y + 1) + 5] - (2x - 2y + 5) \\ &= (2x + 2 - 2y + 2 + 5) - 2x + 2y - 5 \\ &= 2x - 2x + 2y - 2y + 5 - 5 + 2 + 2 \\ &= 2 + 2 \\ &= 4. \label{eq:se2ord2} \end{align}
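The four second-order results (2, 2, 2, and 4) can be checked mechanically by differencing the first-order formulas directly. This is a minimal sketch; `second_order_steps` is a hypothetical helper.

```python
def delta_e(x, y):
    """First-order difference for an E move: 2x + 3 (y is unused)."""
    return 2 * x + 3


def delta_se(x, y):
    """First-order difference for an SE move: 2(x - y) + 5."""
    return 2 * (x - y) + 5


def second_order_steps(x, y):
    """Return how delta_e and delta_se change after an E or an SE move."""
    return (
        delta_e(x + 1, y) - delta_e(x, y),        # delta_e after moving E
        delta_e(x + 1, y - 1) - delta_e(x, y),    # delta_e after moving SE
        delta_se(x + 1, y) - delta_se(x, y),      # delta_se after moving E
        delta_se(x + 1, y - 1) - delta_se(x, y),  # delta_se after moving SE
    )


# The adjustments are constants: they do not depend on x or y at all.
print({second_order_steps(x, y) for x in range(20) for y in range(20)})
```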

The code should then look like this:

def get_circle_points_bresenham_2order(r):
    points = []
    x = 0
    y = r
    F_M = 1 - r
    d_e = 3 # Equation 40
    d_se = -(2 * r) + 5 # Equation 45
    points.extend(mirror_points_8(x, y))
    while x < y:
        if F_M < 0:
            F_M += d_e
            d_e += 2  # Equation 44
            d_se += 2 # Equation 50
        else:
            F_M += d_se
            d_e += 2  # Equation 44
            d_se += 4 # Equation 56
            y -= 1
        x += 1
        points.extend(mirror_points_8(x, y))
    return points

With a little refactoring, we can arrive at a slightly simpler version:

def get_circle_points_bresenham_integer_ese_2order(r):
    """ Like get_circle_points_bresenham_integer_ese, but use 2nd-order
    differences to remove multiplication from the inner loop. """
    points = []
    x = 0
    y = r
    F_M = 1 - r
    # Initial value for (0,r) for 2x + 3 = 0x + 3 = 3.
    d_e = 3
    # Initial value for (0,r) for 2(x - y) + 5 = 0 - 2y + 5 = -2r + 5.
    d_se = -(r << 1) + 5
    points.extend(mirror_points_8(x, y))
    while x < y:
        if F_M < 0:
            F_M += d_e
        else:
            F_M += d_se
            # Increment d_se by 2 (total 4) if we go southeast.
            d_se += 2
            y -= 1
        # Always increment d_e and d_se by 2!
        d_e += 2
        d_se += 2
        x += 1
        points.extend(mirror_points_8(x, y))
    return points

The “purist” in me felt that the decrementing of $$y$$ stood out like a sore thumb, and so I created a tweaked version that moves E and NE, starting out from $$(0, -r)$$ instead. The mathematical techniques are the same, and due to symmetry the behavior of the algorithm does not change.

def get_circle_points_bresenham_integer_ene_2order(r):
    """ Like get_circle_points_bresenham_integer_ese_2order, but start from
    (0, -r) and move E or NE. Notice how we only need the addition instruction
    in the while loop (y is incremented, not decremented). """
    points = []
    x = 0
    y = -r
    F_M = 1 - r
    # Initial value for (0,-r) for 2x + 3 = 0x + 3 = 3.
    d_e = 3
    # Initial value for (0,-r) for 2(x + y) + 5 = 2(0 - r) + 5 = -2r + 5.
    d_ne = -(r << 1) + 5
    points.extend(mirror_points_8(x, y))
    while x < -y:
        if F_M < 0:
            F_M += d_e
        else:
            F_M += d_ne
            d_ne += 2
            y += 1
        d_e += 2
        d_ne += 2
        x += 1
        points.extend(mirror_points_8(x, y))
    return points
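The claim that the E/NE variant behaves the same as the E/SE variant can be spot-checked by comparing the sets of pixels they produce. The sketch below restates both loops in compact form under hypothetical names (`circle_ese`, `circle_ene`) so that it runs on its own.

```python
def mirror_points_8(x, y):
    """Same 8-way mirroring as shown earlier, repeated so this runs alone."""
    return [( x,  y), ( y,  x), (-x,  y), (-y,  x),
            ( x, -y), ( y, -x), (-x, -y), (-y, -x)]


def circle_ese(r):
    """Start at (0, r) and move E or SE, using second-order differences."""
    x, y, f_m, d_e, d_se = 0, r, 1 - r, 3, -(r << 1) + 5
    points = mirror_points_8(x, y)
    while x < y:
        if f_m < 0:
            f_m += d_e
        else:
            f_m += d_se
            d_se += 2
            y -= 1
        d_e += 2
        d_se += 2
        x += 1
        points.extend(mirror_points_8(x, y))
    return points


def circle_ene(r):
    """Start at (0, -r) and move E or NE; y only ever increases."""
    x, y, f_m, d_e, d_ne = 0, -r, 1 - r, 3, -(r << 1) + 5
    points = mirror_points_8(x, y)
    while x < -y:
        if f_m < 0:
            f_m += d_e
        else:
            f_m += d_ne
            d_ne += 2
            y += 1
        d_e += 2
        d_ne += 2
        x += 1
        points.extend(mirror_points_8(x, y))
    return points


# The traversal order differs, but the set of lit pixels should not.
print(all(set(circle_ese(r)) == set(circle_ene(r)) for r in range(50)))
```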

Here are a couple drawings using Bresenham’s algorithm. This one is for $$r = 15$$:

And for $$r = 60$$:

Comparisons vs naive algorithm

Here are some side-by-side comparisons for $$0 \leq r \leq 10$$.

[Figure: side-by-side renderings of the naive and Bresenham algorithms, one row for each radius from 0 to 10.]

Final tweaks

It has been kindly pointed out that the naive algorithm is aesthetically more pleasing if the calculations involving $$r$$ are done with $$r + \tfrac{1}{2}$$ instead of just $$r$$ itself, like this:

from math import floor, sqrt

# The function name here is a stand-in; the original signature line is missing.
def get_circle_points_naive_8_faster_tweaked(r):
    """ This is much closer to Bresenham's algorithm aesthetically, by simply
    using 'r + 0.5' for the square root calculation instead of 'r' directly.
    """
    points = []
    # In the square root calculation, we just use (r + 0.5) instead of just r.
    # This is more pleasing to the eye and makes the lines a bit smoother.
    r_tweaked = r + 0.5
    for x in range(r + 1):
        y = sqrt((r_tweaked * r_tweaked) - (x * x))
        if x > y:
            break
        points.extend(mirror_points_8(x, floor(y)))
    return points

Indeed, the small tweak seems to do wonders to the output for low values of $$r$$.

At the same time, there is a tweak we can do as well for the Bresenham algorithm. Instead of turning E (“left”, or away from the circle) when $$F(M) < 0$$, we can do so when $$F(M) \leq 0$$.

def get_circle_points_bresenham_integer_ene_2order_leq(r):
    """ Like get_circle_points_bresenham_integer_ene_2order, but use
    'F_M <= 0'.
    """
    points = []
    x = 0
    y = -r
    F_M = 1 - r
    d_e = 3
    d_ne = -(r << 1) + 5
    points.extend(mirror_points_8(x, y))
    while x < -y:
        if F_M <= 0:
            F_M += d_e
        else:
            F_M += d_ne
            d_ne += 2
            y += 1
        d_e += 2
        d_ne += 2
        x += 1
        points.extend(mirror_points_8(x, y))
    return points

This makes us turn “left” slightly more often, and intuitively, should give us a slightly larger circle.
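The smallest case where the two conditionals disagree is $$r = 1$$, which makes for an easy comparison. The sketch below folds both variants into one function; the name `circle_points` and the flag `turn_east_on_tie` are hypothetical and exist only for this illustration.

```python
def mirror_points_8(x, y):
    """Same 8-way mirroring as shown earlier, repeated so this runs alone."""
    return [( x,  y), ( y,  x), (-x,  y), (-y,  x),
            ( x, -y), ( y, -x), (-x, -y), (-y, -x)]


def circle_points(r, turn_east_on_tie):
    """E/NE midpoint circle; the flag selects 'F_M <= 0' over 'F_M < 0'."""
    x, y, f_m, d_e, d_ne = 0, -r, 1 - r, 3, -(r << 1) + 5
    points = set(mirror_points_8(x, y))
    while x < -y:
        go_east = f_m <= 0 if turn_east_on_tie else f_m < 0
        if go_east:
            f_m += d_e
        else:
            f_m += d_ne
            d_ne += 2
            y += 1
        d_e += 2
        d_ne += 2
        x += 1
        points.update(mirror_points_8(x, y))
    return points


# For r = 1, the strict version lights a 4-pixel "plus" shape, while the
# tweaked version also lights the 4 diagonal neighbors of the origin.
print(len(circle_points(1, turn_east_on_tie=False)))  # -> 4
print(len(circle_points(1, turn_east_on_tie=True)))   # -> 8
```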

Anyway, see for yourself how the tweaks play out for $$0 \leq r \leq 10$$:

[Figure: renderings for each radius from 0 to 10, with columns including “Bresenham” and “Bresenham (tweaked conditional)”.]

It appears to me that the most aesthetically pleasing algorithm is the tweaked version of the Bresenham algorithm.10 When given equally bad choices (the case where $$F(M) = 0$$), this version draws a pixel away from the origin by choosing to go E, thereby drawing a slightly bigger circle. You can see this play out in the above table for when $$r = 6$$ and especially $$r = 1$$. It’s a bit unfortunate that the authors of the book did not choose this version, as it seems to do a better job for small values of $$r$$.

We can carry the same intuition over to the tweak that increases $$r$$ by $$\tfrac{1}{2}$$ for the naive algorithm: increasing $$r$$ should result in a larger value of $$y$$, thereby resulting in drawing a larger circle (and in the process improving the aesthetics). Neat!

Conclusion

To me, Bresenham’s algorithm is interesting because it does not try to be “perfect”. Instead it merely does its best to reduce the amount of error, and in doing so, gets the job done remarkably well.

The technique of avoiding the full polynomial calculation behind $$F(M)$$ (referred to by the book as finding the first and second-order differences) took some time to get used to, but is intuitive enough in the end. You just need to consider differences in terms of variables. There’s also a connection to calculus, because we’re dealing in terms of differences to “cut down” on the polynomial degrees: we go from the squares in Equation $$\ref{eq:circle}$$ to just linear functions in Equations $$\ref{eq:de2}$$ and $$\ref{eq:se1}$$, and again go one more step to just constant functions in Equations $$\ref{eq:e2ord}$$, $$\ref{eq:se2ord1}$$, and $$\ref{eq:se2ord2}$$.

I hope you learned something!

Happy hacking!

1. Foley, J. D., van Dam, A., Feiner, S. K., Hughes, J. F. (1996). Basic Raster Graphics Algorithms for Drawing 2D Primitives, Scan Converting Circles. Computer Graphics: Principles and Practice (pp. 81–87). Addison-Wesley. ISBN: 0201848406↩︎

2. Bresenham, J.E., D.G. Grice, and S.C. Pi, “Bi-Directional Display of Circular Arcs,” US Patent 4,371,933. February 1, 1983. Note: unfortunately, trying to understand the original text of the patent is perhaps equally as difficult as inventing the algorithm on your own from scratch. Hence this blog post.↩︎

3. There are 4 such quadrants: I, II, III, and IV.↩︎

4. In some sense, all great algorithms are mere optimizations of brute force approaches.↩︎

5. In code, we have to write range(r + 1) because the range() function does not include the last integer. Such “fence-post” or “off by one” logic is the bane of computer programmers.↩︎

6. Mathematically, this is because the slope of the arc in Equation $$\ref{eq:circle-y}$$ approaches positive and negative infinity around these areas.↩︎

7. In the Rust WASM implementation that is used for the graphics in this blog post, we actually use a bitmap such that we only draw a particular pixel just once. However, we still end up setting a pixel as “on” more than once.↩︎

8. Bresenham, Jack. “A Linear Algorithm for Incremental Digital Display of Circular Arcs.” Communications of the ACM, vol. 20, no. 2, 1977, pp. 100–106., doi:10.1145/359423.359432.↩︎

9. It is not clear to me if this change runs faster on modern CPUs, because I recall reading that multiplication can sometimes be faster than adding. But it’s still interesting.↩︎

10. This version looks slightly better than the tweaked naive one for $$r = 8$$.↩︎

Using MPD for ReplayGain
http://funloop.org/post/2021-01-05-mpd-and-replaygain.html
2021-01-05
linux, audio

About 10 years ago, there was no easy way to apply ReplayGain to various audio files with different formats (e.g., flac vs mp3). Over the holiday break I discovered r128gain, which is exactly the tool I had wanted for many years. You just run

r128gain -r <folder>

and it will recursively tag all music files with ReplayGain information — in parallel, no less!

The only downside is that neither cmus nor mpv currently supports the R128_TRACK_GAIN tag that r128gain generates (at least for *.opus files).1 However, I discovered that MPD (Music Player Daemon) supports R128_TRACK_GAIN.2 MPD is easy to start up and the basic minimal configuration was small enough:

music_directory     "~/Music"
# Automatically prune new/deleted files in the music_directory.
auto_update         "yes"

# Allow saving playlists from vimpc.
playlist_directory  "~/Music/playlists"

audio_output {
    type            "pulse"
    name            "My Pulse Output"
}

# Enable replay gain.
replaygain          "track"

As far as actually controlling MPD, I settled on vimpc — because Vi-like keybindings are too good to pass up.

Cheers!

1. To be precise, cmus has a commit in master that adds support, and mpv has an open issue for it. And I’m too lazy to compile cmus from source.↩︎

2. I had actually used MPD back in the day, but switched to cmus because it was simpler. And because cross-format ReplayGain tagging software was not available, MPD’s support for ReplayGain wasn’t very meaningful for me.↩︎


When I was a child growing up in Korea, I remember seeing newspapers with pictures of Baduk positions with professional commentary. Unfortunately, no one in my immediate family had any interest in the game, so my initial curiosity about the game came and went.2

Later in the United States, I picked up chess during high school. Chess had the advantage that there were many more people who already knew how to play the game. And also, carrying around a lightweight board with some pieces wasn’t difficult at all. I recall whipping out my chess board for a quick game during lunch, recess, and any other time I could find an opponent.

I took a step back from chess during college and later years. I started to lose interest after I kept playing the same openings.

Around 2016 I re-learned how to play Baduk. One reason I picked it up again was that at this time, Baduk was still played best by humans (a computer AI had not yet defeated a master human player on an even game). This was just before the great “AlphaGo” match with Lee Sedol, and at the time most people believed that AI supremacy in this game was still another decade away.

I started playing and losing a ton of games against the AI on 9x9 at online-go.com, especially around 2019 when sometimes I played marathon 9x9 games, over and over against the computer. The most memorable game from this period is this one where I won by 0.5 points against a 7-stone handicap. That victory was a pleasant surprise, but it also left me with a sense of obligation to study the game with a little more seriousness.

After a brief pause, I returned to the game in the fall of 2020. During this time I watched some videos from this YouTube channel, which I was able to roughly understand thanks to my knowledge of Korean. I began to realize large gaps in my style of play, and it was only after this realization that I started improving my results.

Having a working knowledge of both chess and Baduk, I would have to say that the biggest difference between them for me is that there is far, far more room for strategy in Baduk. This is because you can ignore threats and play for bigger moves elsewhere on the board much more often than in chess, especially in the opening and middlegame. There is more wiggle room for creativity!

Speaking of openings, because the Baduk board has 4 symmetrical corners, there are actually 4 areas of openings in each game. You can have 4 different “openings” in each game. In chess there is only 1 “center” of the board where the vast majority of opening theory takes place.

The handicap system is far more elegant in Baduk as well. In chess, the handicap is usually to remove a pawn (or two), but this drastically alters the nature of the game. Not so for Baduk! The weaker player gets up to 9 extra stones on the board before the start of the game. This way you can play with opponents at different levels without getting completely crushed from the very beginning.

Perhaps the best part of the game is that each game is decided by a score (where the score is the amount of “territory” you control). A win is technically a win, sure, but the “wins” can be judged against each other by their score.

Finally, the game is more forgiving in terms of errors. In chess if you lose your queen (without adequate compensation), the game is pretty much over. In Baduk, even if you lose a sizable group, you can still come back. Actually, the bigger your group of stones, the harder it is to get them captured outright, and so there is a natural, automatic tendency for your strongest “pieces”, if you will, to resist capture.3 Brilliant!

Conclusion

If you haven’t learned the game yet, I strongly recommend this game! I used the book “Go: A Complete Introduction to the Game” to get a nice overview.4

Have fun!

1. I use the Korean word “Baduk” (바둑) because the usual Japanese loanword “Go” overlaps with the name “Go” of the Go programming language.↩︎

2. Years later I learned that my uncle is an amateur 5-dan.↩︎

3. In Baduk as long as a group gets 2 “eyes”, it becomes uncapturable — and the bigger the group, the easier it is to make such eyes.↩︎

4. The author of this book is Cho Chikun, one of the top players of the 20th century.↩︎

]]>
The Two Sum Problem Explained http://funloop.org/post/2020-12-05-twosum-problem-explained.html 2020-12-05T00:00:00Z 2020-12-05T00:00:00Z 2020-12-05
algorithms, math

Just over three years ago, I watched this video that goes over the so-called “Two Sum” problem for the first time. The problem statement is as follows:

Given a sorted list of integers (unimaginatively called numbers), determine if any 2 integers in the list sum up to a number N.

To be honest I did not understand why the proposed optimal solution that uses 2 pointers works the way it does, without missing any possible pairs. The explanation given by the interview candidate in the video always struck me as way too hand-wavy for my tastes.

And to be really honest I never bothered to convince myself that the 2-pointer approach is correct. Until today. This post is about the correctness behind the 2-pointer method, because I have yet to see a clear explanation about this topic.

Brute force approach

First let’s look at the brute-force solution. The brute-force solution looks at every single possible pair of numbers by using a double-for-loop. This is a very common pattern (nested looping) whenever one wants to consider all possible combinations, where each for-loop corresponds to a single “dimension” we want to exhaustively pore over. In this case there are 2 dimensions because there are 2 numbers we need to look at, so we must use 2 for-loops.

Here is the basic pseudocode 1:

for i in numbers:
    for j in numbers:
        if i + j == N:
            return i, j

I think even beginner programmers can see that the brute force approach works. We just look at every possible 2-number combination to see if they will add up to N, and if so, we stop the search. Simple! 2
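For readers who want something they can run, here is a minimal Python sketch of the brute-force idea. The function name two_sum_brute_force and the sample list are my own, chosen only for illustration. Unlike the pseudocode, it uses itertools.combinations so that an element is never paired with itself, which is one of the edge cases the pseudocode glosses over.

```python
from itertools import combinations


def two_sum_brute_force(numbers, target):
    """Return the first pair of elements that sums to target, or None."""
    # combinations() yields every pair of distinct positions exactly once,
    # so an element is never paired with itself.
    for a, b in combinations(numbers, 2):
        if a + b == target:
            return a, b
    return None


print(two_sum_brute_force([1, 3, 4, 8, 11], 12))  # (1, 11)
print(two_sum_brute_force([1, 3, 4, 8, 11], 6))   # None
```

The second call returns None because the only way to reach 6 would be to use 3 twice.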

2-pointer method

The 2-pointer method boils down to the following observation:

Remove numbers from the pairwise search if they cannot be used (with any other number) to sum up to N.

Although the solutions posted in countless places online involve pointers, it is more intuitive to think of modifying the list after each pairwise inspection. Below is the algorithm in plain English:

1. Construct a pair of numbers (a, b) such that a is the smallest number and b is the biggest number in the list. That is, these are the leftmost and rightmost ends of the sorted list, respectively.
2. If the sum of a + b is equal to N, of course we’re done.
3. If the sum of a + b is bigger than N, delete b from the list. Go back to Step 1.
4. If the sum of a + b is smaller than N, delete a from the list. Go back to Step 1.
5. If the list becomes smaller than 2 elements, stop (obviously, because there are no more pairs to consider). Optionally return an error.

The algorithm described above can be equivalently coded with pointers, so there is no material difference to discuss in terms of implementation.
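The plain-English steps can also be followed literally, deleting elements instead of moving pointers. The sketch below is one way to do that in Python; the name two_sum_by_deletion is hypothetical, and the use of collections.deque is my own choice (it makes removing from either end cheap), not something the usual online solutions prescribe.

```python
from collections import deque


def two_sum_by_deletion(numbers, target):
    """Follow Steps 1-5 literally on a sorted list; return a pair or None."""
    remaining = deque(numbers)  # work on a copy so the caller's list survives
    # Step 5: stop once fewer than 2 elements are left.
    while len(remaining) >= 2:
        # Step 1: pair the smallest and biggest remaining numbers.
        a, b = remaining[0], remaining[-1]
        if a + b == target:      # Step 2
            return a, b
        if a + b > target:       # Step 3: b is too big for every partner
            remaining.pop()
        else:                    # Step 4: a is too small for every partner
            remaining.popleft()
    return None


print(two_sum_by_deletion([1, 3, 4, 8, 11], 7))  # (3, 4)
```

In this run the list shrinks from the right twice (11, then 8), then from the left once (1), before 3 + 4 hits the target.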

Anyway, we just need to make sense of the critical Steps, namely Steps 3, 4, and 5, and that should be enough to quell any worries about correctness.

Step 3

This is the step that removes the largest element b in the list from consideration for all future iterations. How can this be correct?

Well, let’s consider an example. If N is 50 but a + b is 85, we must look for a smaller sum. This much is obvious.

We just used a and b to get to 85, but because we must get to a smaller sum, we would like to swap out either a or b (or both, eventually) with another number from the list. The question is, which one do we swap out?

We can’t replace a with the next bigger number (or any other number between a and b), because doing so will result in a sum that is at least as big as 85 (or bigger). So a has to stay — we can’t rule out other combinations of a with some number other than b (maybe a and its next-biggest neighbor, etc).

That leaves us with b. We throw out b and replace it with the next biggest number, which is guaranteed to be less than or equal to the just-thrown-out b, because the list is sorted.

In other words, every pair made of b and any other element in the list already sums up to 85 or more. So b is a red herring that’s leading us astray. We must throw it out.

Step 4

This is the “mirror” of Step 3. Here we throw the smallest number out of future pairwise searches, because we know that a, no matter which number it is paired with (even with the biggest one, b), is too small to meet the target N. In other words, a fails to give enough of a “boost” to any other number to reach N. It is very much useless to the remaining other candidates, and so we throw it out.

Step 5

This Step’s analogy when using pointers is to consider the condition when the pointers “cross”. The pointers “crossing”, in and of itself, doesn’t seem particularly significant. However when we view this condition by looking at the dwindling size of the overall list (by chopping off the left and right ends in Steps 4 and 3), the point becomes obvious. We must stop when the list becomes so small that Step 1 (namely, the construction of the pair (a, b)) is impossible to fulfill, because there aren’t enough elements left in the list (singleton or empty list).

2-pointer method, in pseudocode

For sake of completeness, here is the pseudocode for the same algorithm. You will see how using pointers (instead of deleting elements outright as described in Steps 3 and 4) doesn’t change the algorithm at all.

# Partial implementation of Step 5. Early exit if list is too small to begin with.
if length(numbers) < 2:
    return error

# Step 1.
a_idx = 0
b_idx = length(numbers) - 1
sum = numbers[a_idx] + numbers[b_idx]

# Begin search, but only if we have to search.
while sum != N:
    # Step 3
    if sum > N:
        b_idx -= 1
    # Step 4
    elif sum < N:
        a_idx += 1

    # Step 5
    if a_idx == b_idx:
        return error

    # Step 1 (again, because we didn't find a match above).
    sum = numbers[a_idx] + numbers[b_idx]

# Step 2
return numbers[a_idx], numbers[b_idx]

It may be of interest to readers who are fairly new to programming that Step 2 comes in at the very end. Getting the “feel” for converting plain-English algorithms into actual code is something that requires experience, and can only be acquired with practice over time.
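As a sanity check on the correctness argument, here is a runnable Python sketch of the 2-pointer search, compared against an exhaustive search on random sorted lists. The names two_sum_pointers and pair_exists are mine, and the loop is restructured slightly from the pseudocode: the condition a_idx < b_idx plays the role of Step 5 and also covers lists with fewer than 2 elements. Passing random tests is evidence, not a proof, but it is reassuring.

```python
import random
from itertools import combinations


def two_sum_pointers(numbers, target):
    """Two-pointer search over a sorted list; return a pair or None."""
    a_idx, b_idx = 0, len(numbers) - 1
    # Step 5 is the loop condition: stop when fewer than 2 candidates remain.
    while a_idx < b_idx:
        total = numbers[a_idx] + numbers[b_idx]
        if total == target:      # Step 2
            return numbers[a_idx], numbers[b_idx]
        if total > target:       # Step 3
            b_idx -= 1
        else:                    # Step 4
            a_idx += 1
    return None


def pair_exists(numbers, target):
    """Exhaustive check used only as a reference answer."""
    return any(a + b == target for a, b in combinations(numbers, 2))


random.seed(0)
for _ in range(1000):
    numbers = sorted(random.randint(-20, 20) for _ in range(random.randint(0, 8)))
    target = random.randint(-40, 40)
    found = two_sum_pointers(numbers, target)
    # The 2-pointer search finds a pair exactly when one exists.
    assert (found is not None) == pair_exists(numbers, target)
    if found is not None:
        assert sum(found) == target
```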

Do the pointers ever skip over each other?

It is worth pointing out that the condition a_idx == b_idx is well-formed. That is, there will never be a case where a_idx and b_idx will somehow “skip over” each other, rendering the if-condition useless. This is because we only ever increment a_idx or decrement b_idx, exclusively — that is, we never modify both of them within the same iteration. So, the variables only ever change by ±1, and at some point, if the search goes on long enough, the indices are bound to converge at the same numerical value.
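One way to see this convergence concretely is to record the gap b_idx - a_idx after every iteration. The sketch below does that; pointer_gaps is a hypothetical helper written just for this demonstration, and it assumes the list has at least 2 elements.

```python
def pointer_gaps(numbers, target):
    """Return the successive values of b_idx - a_idx during the search."""
    a_idx, b_idx = 0, len(numbers) - 1
    gaps = [b_idx - a_idx]
    while a_idx != b_idx and numbers[a_idx] + numbers[b_idx] != target:
        # Exactly one of the two indices moves, and only by 1.
        if numbers[a_idx] + numbers[b_idx] > target:
            b_idx -= 1
        else:
            a_idx += 1
        gaps.append(b_idx - a_idx)
    return gaps


# No pair in this list sums to 100, so the search runs until the indices meet.
print(pointer_gaps([1, 3, 4, 8, 11], 100))  # [4, 3, 2, 1, 0]
```

The gap shrinks by exactly 1 per iteration, so it has to pass through 0 and can never jump to a negative value.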

Conclusion

I think the beauty of this problem is that it’s so simple, and yet it is also a very cool way of looking at the problem of search. Steps 3 and 4 are essentially very aggressive (and correct!) eliminations of bad search paths. There’s just something refreshing about eliminating entire branches of a search tree to speed things up.

If you compare the 2-pointer method with the brute force approach, it is in essence doing the same logical thing, with fewer steps. Whereas the brute force approach performs a pairwise comparison across all possible combinations, the 2-pointer method preemptively discards many combinations by removing elements outright from future consideration. That’s the kind of power you need to go from $$O(n^2)$$ to $$O(n)$$!

Hope this helps!

1. Of course, this pseudocode ignores edge-cases, but I didn’t want to clutter the code listing with non-essential ideas.↩︎

2. As an added benefit, the brute-force approach works even if the input list is not sorted.↩︎

]]>
The Esrille Nisse: Three Years Later http://funloop.org/post/2019-11-13-esrille-nisse-three-years-later.html 2019-11-13T00:00:00Z 2019-11-13T00:00:00Z 2019-11-13
hardware, esrille, cherry mx

Over three years ago, I wrote a post describing the Esrille Nisse keyboard. This post is a reflection on the keyboard, more than 3 years later.

Layout

Ultimately I settled on a different layout than the one described in the old blog post. This was a result of many hands-on trial-and-error sessions over a period of weeks which turned into many months. In my old post I described writing a program to help find the optimal layout. This proved very difficult in practice, because encoding the optimization rules turned out to be non-trivial. One aspect that was particularly difficult was that the actual physical shape of my own fingers played a part (some fingers were not as versatile as others, for example the pinky finger, and so the key-distance for certain fingers had to have different “weights”, and this was too much to translate into code).

Anyway, I read this post by Peter Norvig forwards and backwards, and used the values there to guide the design of my layout. One big realization after actual usage was that I could not let go of the QWERTY hjkl keys on the home row. There was just so much muscle memory built into these four keys (the only other key I could not let go of was the spacebar key that I used my left thumb for), that I had to “fix” them on the layout first. I then focused on getting the commonly-used keys right.

All that being said, here is my current layout.

LEFT-SIDE     RIGHT-SIDE
---------------------------
□ □ □ □ □ □     □ □ □ □ □ □
□ □ 0 □ □         □ □ 0 □ 1
□ □ □ y o p z 1     2 f d t r □ □ □
2 / a i e u w ;     " h j k l n 4 : <--- Home row
3 . x q v '         b m g c s 3
8 5 6 7 4     5 , 6 7 8 <--------- Thumb row

Left-side legend
0) Escape
1) PgDn
2) Enter
3) Shift
4) Control
5) Super (Windows key)
6) Space
7) Caps Lock (remapped with xmodmap to Hyper key)
8) Right Alt (aka "AltGr" for US International Layout)

Right-side legend
0) Tab
1) Delete
2) PgUp
3) Shift
4) Backspace
5) FN2
6) FN
7) Alt
8) Right Alt (aka "AltGr" for US International Layout)

The main thing to note is the reduced number of keys that are mapped at all. I like this aspect a lot (not having to move my fingers around much at all) — I never have to reach for a particular key because everything is just so close.

I also dedicated a key just for the colon symbol (as a “Shift + semicolon” macro), because it comes up often enough in programming.

I should also note that the function keys (F1-F12) are situated on the topmost row, left-to-right. I just didn’t bother adding them to the legend because of symbol space constraints.

FN layer.

LEFT-SIDE     RIGHT-SIDE
---------------------------
□ □ □ □ □ □     □ □ □ □ □ □
□ □ a □ □         □ □ □ □ □
□ □ □ 7 8 9 □ □     □ □ \ _ = □ □ □
□ □ 0 4 5 6 □ b     b - { ( ) } a : <--- Home row
c . 1 2 3          □ [ < > ] c
□ □ □ □ □     □ □ □ □ □ <--------- Thumb row

Left-side legend
a) ~/ (a macro that inserts the string "~/")
b) End
c) Shift

Right-side legend
a) Backspace
b) Home
c) Shift

Git Book

I started writing a short (informal) book on Git. I am using LuaTeX to write it; I started in March 2018 but have yet to cross the halfway mark. Hopefully I’ll get it done before March 2019 rolls around.

Back in 2016’s status update I said that I still planned to finish the Haskell book I was working on. That project is definitely dead. One reason is that due to the rising popularity of the language, I feel that other people have already said what I had meant to say in my book.

Shen, Rust, Erlang, Idris, Factor

I’ve grown interested in these languages because well, I feel like they are important. My hope is to find some interesting problems that can be solved idiomatically in each language. That might take years, but, it is my hope that in the future I’ll be able to write about these languages.

HTTPS for this site

Apparently HTTPS support for custom domains on GitHub has been a thing since earlier this year. I never got around to it but thanks to this post I finally enabled it.

]]>
Useful Manpages http://funloop.org/post/2017-11-11-useful-manpages.html 2017-11-11T00:00:00Z 2017-11-11T00:00:00Z 2017-11-11
linux, git

A while ago I discovered that there is a manpage for the ASCII character set. It got a bunch of upvotes, and since then I wondered what other manpages were worth knowing about. Below is a small table of manpages that I found interesting.

Manpage               Description
ascii(7)              the ASCII character set (in octal, decimal, and hex)
units(7)              megabytes vs mebibytes, etc.
hier(7)               traditional filesystem hierarchy (e.g., /bin vs /usr/bin)
file-hierarchy(7)     (systemd) filesystem hierarchy
operator(7)           C operator precedence rules (listed in descending order)
console_codes(4)      Linux console escape and control sequences
terminal-colors.d(5)  among other things, ANSI color sequences
boot(7)               UNIX System V Release 4 bootup process
daemon(7)             (systemd) how to write/package daemons
proc(5)               proc filesystem (/proc)
ip(7)                 Linux IPv4 protocol implementation (a bit low-level, but still useful)
ipv6(7)               Linux IPv6 protocol implementation
socket(7)             Linux socket interface
unix(7)               UNIX domain sockets
fifo(7)               named pipes

Note that you need to run

sudo mandb

to be able to invoke apropos <SEARCH_TERM> or man -k <SEARCH_TERM> (man -k is equivalent to apropos — see man(1)).

Git-specific

You probably knew already that Git has many manpages dedicated to each of its subcommands, such as git-clone(1) or git-commit(1), but did you know that it also comes with a suite of tutorials? Behold!

Manpage              Description
giteveryday(7)       the top ~20 useful git commands you should know
gitglossary(7)       a glossary of all git concepts (blob object, working tree, etc.)
gittutorial(7)       a high-level view of using git
gittutorial-2(7)     explains the object database and index file (git architecture internals)
gitcore-tutorial(7)  like gittutorial-2(7), but much more detailed
gitworkflows(7)      recommended workflows, esp. branching strategies for maintainers

Happy hacking!

]]>
The Math Behind the Tower of Hanoi Problem http://funloop.org/post/2017-05-13-tower-of-hanoi.html 2017-05-13T00:00:00Z 2017-05-13T00:00:00Z 2017-05-13
math

In the very first chapter of the book Concrete Mathematics 2ed there is a discussion about the Tower of Hanoi. This post is a distillation of that discussion.

The Problem

There are 3 rods, with 8 discs (with holes) resting on one rod; the discs are sorted in size like a pyramid, with the smallest disc on top. We want to move all discs to another rod, but with the following rules: (1) a move consists of moving a single disc onto a rod; (2) you may never place a bigger disc on top of a smaller one. A question arises: how many steps are required to move the entire tower of discs onto another rod?

Finding the Recurrence

First consider the simplest case, without any discs. Because there are no discs to move, we cannot make any moves, and so the number of steps required is 0. We can write this as

$S_0 = 0$

with $$S$$ meaning the number of steps and the subscript representing the number of discs in the tower.

Now let’s consider how the problem scales. With 1 disc, the answer is a single step since the one disc is itself the entire tower. With 2 discs, the answer is three steps — one step to move the top (small) disc to another rod, one step to move the big disc to the destination rod, and lastly one step to move the small disc on top of the big disc. With 3 discs, the answer is seven steps — the insight here is that we treat the top two discs exactly the same as the previous problem; so we need 3 moves to move the top two to another rod, then one move to move the biggest disc to the destination rod, then again 3 moves to move the 2-disc sub-tower to the destination rod.

The example with 3 discs is quite telling. We can use the insights gained there to set an upper bound to the number of steps required for the general case of $$n$$ discs; if we take more steps than this upper bound, we would know that we made mistakes. For a tower of size $$n$$, we require $$S_{n - 1}$$ steps to move all discs except the biggest one, then move the biggest disc, then move the sub-tower on top of that disc with (again) $$S_{n - 1}$$ steps. So the upper bound is

$\begin{equation} \label{eq:recurrence} S_n = \begin{cases} 0 & \text{if } n = 0 \\ 2 * (S_{n - 1}) + 1 & \text{if } n > 0. \end{cases} \end{equation}$

If that’s the upper bound, then is there a separate formula for the lower bound (optimal solution)? Nope! It’s because there must come a time in solving the puzzle where we move the biggest disc to the destination rod. To get to the biggest disc, we must have moved all discs on top of it to another rod (the sub-tower); and, after having moved the biggest disc, we must move this sub-tower back on top of that rod (back onto the biggest disc). Because of these constraints stemming from the definition of the puzzle itself, we know that for $$n > 0$$ we must take at least $$2 * (S_{n - 1}) + 1$$ steps.

The upper and lower bounds agree in their formulation, and this formulation (Equation $$\ref{eq:recurrence}$$) is our recurrence. In mathematics, a recurrence relation is basically a recursively-defined equation, where a base case in the recurrence defines the starting point. In Equation $$\ref{eq:recurrence}$$, the base case is $$n = 0$$; for $$n > 0$$, we define the number of steps required in a recursive manner.

In our discussion of finding the upper and lower bounds, there were two key concepts — the need to move the biggest disc, and the need to move the sub-tower twice (before and after moving the biggest disc). Our recurrence clearly agrees with these two concepts. The “$$+ 1$$” in the non-base case is the step of moving the biggest disc, whereas the $$2 * (S_{n - 1})$$ is the number of steps required to move the sub-tower twice.
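The two concepts map directly onto a recursive procedure. Below is a small Python sketch, not taken from the book: hanoi generates the actual moves, steps transcribes the recurrence, and the rod names "A", "B", "C" are arbitrary labels I picked for illustration. The loop at the end confirms that the number of generated moves equals $$S_n$$ for small towers.

```python
def hanoi(n, source="A", spare="B", target="C"):
    """Yield the moves (disc, from_rod, to_rod) that solve an n-disc tower."""
    if n == 0:
        return
    # Move the (n - 1)-disc sub-tower out of the way: S_{n-1} steps.
    yield from hanoi(n - 1, source, target, spare)
    # Move the biggest disc: the "+ 1".
    yield (n, source, target)
    # Move the sub-tower back on top of the biggest disc: S_{n-1} steps again.
    yield from hanoi(n - 1, spare, source, target)


def steps(n):
    """The recurrence S_n, transcribed directly."""
    return 0 if n == 0 else 2 * steps(n - 1) + 1


for n in range(9):
    assert len(list(hanoi(n))) == steps(n)
print(steps(8))  # 255
```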

Simplifying the Recurrence

Recurrences are great, but they are painful to compute. For example, it’s not immediately clear what $$S_{11}$$ or $$S_{54}$$ evaluates to. It would be really nice if we could avoid defining $$S_n$$ recursively.

And this is where math meets science. In the scientific method, we have to come up with a hypothesis and then test that hypothesis with one or more experiments. We can do the same thing here by trying to guess the solution to the recurrence.

For one thing, we know that $$S_n$$ grows as $$n$$ grows (it will never be the case that $$S_n$$ somehow plateaus or decreases down the road). The more discs there are, the more work we have to do, right? So let’s look at small cases to see how the numbers grow, and see if there is a pattern to the growth rate of $$S_n$$.

$$n$$ $$S_n$$
0 0
1 1
2 3
3 7
4 15
5 31
6 63
7 127
8 255

We don’t have to actually simulate the puzzle to derive these values; using the recurrence Equation $$\ref{eq:recurrence}$$ we start off from the first row (the base case) and then calculate our way down, reusing $$S_n$$ from the previous row as $$S_{n - 1}$$. 1
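In code, calculating our way down the table looks like the sketch below. The function name steps_table is my own; the point is only that each new row is built from the previous one, so no value is ever recomputed.

```python
def steps_table(max_n):
    """Return [S_0, S_1, ..., S_max_n], reusing each row to build the next."""
    table = [0]  # base case: S_0 = 0
    for _ in range(max_n):
        # The last entry is S_{n-1}; the recurrence gives S_n.
        table.append(2 * table[-1] + 1)
    return table


print(steps_table(8))  # [0, 1, 3, 7, 15, 31, 63, 127, 255]
```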

Anyway, the values of $$S_n$$ sure look familiar — especially if we use base 2.

$$n$$ binary($$S_n$$)
0 $$0_2$$
1 $$1_2$$
2 $$11_2$$
3 $$111_2$$
4 $$1111_2$$
5 $$11111_2$$
6 $$111111_2$$
7 $$1111111_2$$
8 $$11111111_2$$

It looks like our recurrence simplifies to just

$\begin{equation} \label{eq:solution} S_n = 2^n - 1 \quad \text{for } n \geq 0, \end{equation}$

except it is no longer a recurrence as there is no need to define a base case. We’ll call it a solution to the recurrence.
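Before attempting a proof, we can at least test the guess on more rows than the table shows. The following Python sketch walks the recurrence up to an arbitrary cutoff of my choosing ($$n = 64$$) and checks both the closed form and the all-ones binary pattern. This is only more experimental evidence, not a proof.

```python
s = 0  # S_0
for n in range(65):
    assert s == 2**n - 1
    # For n > 0 the binary form is a run of n ones.
    assert n == 0 or bin(s) == "0b" + "1" * n
    s = 2 * s + 1  # advance to S_{n+1} via the recurrence

print(bin(2**8 - 1))  # 0b11111111
```

The binary pattern is no accident: doubling a number shifts its bits left by one place, and adding 1 fills the vacated lowest bit, so every step of the recurrence appends one more 1.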

Proving the Solution

Although the empirical evidence looks very good, we have not formally proved that the solution (Equation $$\ref{eq:solution}$$) holds for all $$n$$. It’s one thing to say that something is true for all observed cases (scientific experiment), and quite another to say that something is true for all cases (mathematical proof).

Can we prove it? Yes! Fortunately for us, Equation $$\ref{eq:recurrence}$$ lends itself to proof by induction. Induction requires you to first prove that some proposition $$P$$ holds for a starting number $$k_0$$ (the base case). Then you prove that if $$P$$ holds for $$k$$, it also holds for $$k + 1$$ (the next number); i.e., show that going from $$k$$ to $$k + 1$$ does not break $$P$$. This is the inductive step. In this way, we prove the “totality” of $$P$$ as it applies to all numbers in the range $$[k_0, k_{m}]$$ and we are done. 2

Here we want to prove that Equation $$\ref{eq:solution}$$ holds for all $$n$$ (all natural numbers). 3 For this proof let’s rewrite Equation $$\ref{eq:solution}$$ to use $$k$$ instead of $$n$$:

$\begin{equation} \label{eq:proposition} S_k = 2^k - 1 \quad \text{for } k \geq 0. \end{equation}$

Equation $$\ref{eq:proposition}$$ is our proposition $$P$$. The base case is easy enough to prove: $$S_0 = 0$$ because there are no discs to move, and $$2^0 - 1 = 0$$ as well. For the inductive step, we use the non-base part of our recurrence from Equation $$\ref{eq:recurrence}$$ to get

\begin{align} S_k &= 2 * (S_{k - 1}) + 1 \label{eq:induct1} \end{align}

and rewrite it in terms of $$k + 1$$:

\begin{align} S_{k + 1} &= 2 * (S_{k}) + 1. \label{eq:induct2} \end{align}

Now the critical part: we replace $$S_k$$ with Equation $$\ref{eq:proposition}$$ (our proposition), because we assume that our proposition is true for all steps up to $$k$$ (but not $$k + 1$$, which is what we’re trying to prove):

\begin{align} S_{k + 1} &= 2 * (2^k - 1) + 1. \end{align}

In case you forgot algebra, $$2 * 2^k = 2^1 * 2^k = 2^{k + 1}$$ and we can use this to simplify our equation.

\begin{align} S_{k + 1} &= 2 * (2^k - 1) + 1\\ &= [2 * (2^k - 1)] + 1\\ &= [(2 * 2^k - 2)] + 1\\ &= (2^{k + 1} - 2) + 1\\ &= 2^{k + 1} - 1 \label{eq:induct3}. \end{align}

And now we can see that Equation $$\ref{eq:induct3}$$ (our “evolved” proposition $$P$$, if you will) is the same as our solution (Equation $$\ref{eq:solution}$$), even though we increased $$k$$ to $$k + 1$$! This is because simple substitution allows us to replace “$$k + 1$$” with “$$n$$”. We have completed our proof by induction. 4

Alternate Recurrence and Solution

The book goes on to offer an alternate recurrence to Equation $$\ref{eq:recurrence}$$, by adding 1 to both sides:

\begin{align} (S_n) + 1 &= \begin{cases} 0 + 1 & \text{if } n = 0 \\ 2 * (S_{n - 1}) + 1 + 1 & \text{if } n > 0 \\ \end{cases}\\ &= \begin{cases} 1 & \text{if } n = 0 \\ 2 * (S_{n - 1}) + 2 & \text{if } n > 0. \label{eq:recurrence2} \end{cases} \end{align}

This recurrence is the same as the original, except that it adds 1 to the answer. Now we let $$W_n = (S_n) + 1$$ and $$W_{n - 1} = (S_{n - 1}) + 1$$ and rewrite everything in terms of $$W$$:

\begin{align} W_n &= \begin{cases} 1 & \text{if } n = 0 \\ 2 * (W_{n - 1}) & \text{if } n > 0. \label{eq:recurrence3} \end{cases} \end{align}

Notice how the “$$+ 2$$” in Equation $$\ref{eq:recurrence2}$$ goes away, because the coefficient $$2$$ in Equation $$\ref{eq:recurrence3}$$ will multiply with the “$$+ 1$$” from $$W_{n - 1}$$ to get it back. Using this alternate recurrence, it’s easy to see that the solution is just $$W_n = 2^n$$, because $$W$$ starts at $$1$$ and simply doubles with every step! Hence

\begin{align} W_n = (S_n) + 1 = 2^n \end{align}

and subtracting 1 from all sides gives us

\begin{align} (W_n) - 1 = S_n = 2^n - 1. \end{align}
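For anyone who wants the substitution for $$n > 0$$ spelled out term by term, it uses nothing beyond the definitions $$W_n = (S_n) + 1$$ and $$W_{n - 1} = (S_{n - 1}) + 1$$ given earlier:

\begin{align} W_n &= (S_n) + 1\\ &= 2 * (S_{n - 1}) + 2\\ &= 2 * [(S_{n - 1}) + 1]\\ &= 2 * (W_{n - 1}). \end{align}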

The lesson here is that if it is difficult to find the solution to a recurrence, we can use basic algebra rules to transform the recurrence to something more amenable. In this case, all it took was adding 1 to the original recurrence.

Conclusion

I thoroughly enjoyed figuring this stuff out because possibly for the first time in my life I used my programming experience (recurrence/recursion, memoization) to help myself understand mathematics — not the other way around. The other way around was never enjoyable — calculating what i was in some $$n$$th iteration of a for-loop never really excited me.

I hope this explanation helps you better understand the first few pages of Concrete Mathematics; I had to read that part three times over to really “get it” (never having learned what induction is). And henceforth, I will never look at a string of consecutive 1’s in binary the same way again. 😃

1. In computer science, this process of avoiding the recalculation of previously known values is called memoization and is useful in generating the first N values of a recursive algorithm in $$O(N)$$ (linear) time.↩︎

2. Note that if $$k_0 = 0$$, then $$[k_0, k_{m}]$$ is the set of all natural numbers (zero plus the positive integers).↩︎

3. There is no need to prove the recurrence (Equation $$\ref{eq:recurrence}$$) as we have already proved it in the process of deriving it.↩︎

4. In Concrete Mathematics 2 ed. p. 3 (where the book uses $$T_n$$ instead of $$S_n$$), the proof is simply a one-liner: $T_n = 2(T_{n - 1}) + 1 = 2(2^{n - 1} - 1) + 1 = 2^n - 1.$ But I find it a bit too terse for my tastes.↩︎

]]>
The Fastest Way to Compute the Nth Fibonacci Number: The Doubling Method http://funloop.org/post/2017-04-14-computing-fibonacci-numbers.html 2017-04-14T00:00:00Z 2017-04-14T00:00:00Z 2017-04-14
math, programming, python

Introduction

The Fibonacci Sequence is defined as follows:

\begin{align} \mathrm{F}_{0} = 0\\ \mathrm{F}_{1} = 1\\ \mathrm{F}_{n} = \mathrm{F}_{n - 2} + \mathrm{F}_{n - 1}. \end{align}

That is, each Fibonacci number $$\mathrm{F}_{n}$$ is the sum of the previous two Fibonacci numbers, except the very first two numbers which are defined to be 0 and 1. 1

From the definition above, it appears that computing $$\mathrm{F}_{n}$$ requires one to always compute $$\mathrm{F}_{n - 2}$$ and $$\mathrm{F}_{n - 1}$$. This is false: enter the “doubling method”. 2 3

The Genesis of the Doubling Method

The doubling method uses a couple of mathematical formulas derived from matrix multiplication as it applies to calculating Fibonacci numbers; it can be seen as an improvement over the matrix multiplication method, although it does not use matrix multiplication itself. The matrix multiplication method uses the following formula:

$\begin{equation} \begin{bmatrix} 1 & 1\\ 1 & 0 \end{bmatrix}^n = \begin{bmatrix} \mathrm{F}_{n + 1} & \mathrm{F}_{n}\\ \mathrm{F}_{n} & \mathrm{F}_{n - 1} \end{bmatrix}. \end{equation}$

This result is quite interesting in its own right; to find $$\mathrm{F}_{n}$$ you only need to raise the matrix

$\begin{bmatrix} 1 & 1\\ 1 & 0 \end{bmatrix}$

to the $$n$$th power. To be more precise, this method is matrix exponentiation. The only downside is that much of the answer is wasted — we don’t care about $$\mathrm{F}_{n - 1}$$, not to mention how $$\mathrm{F}_{n}$$ is redundantly computed twice.
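To make the matrix method concrete, here is a minimal Python sketch. The helper names mat_mul and fibonacci_matrix are mine, and raising the matrix to the $$n$$th power by repeated squaring is an assumption on my part about how one would typically do the exponentiation; a plain loop of $$n$$ multiplications would give the same answer, only more slowly.

```python
def mat_mul(x, y):
    """Multiply two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def fibonacci_matrix(n):
    """Return F(n) by raising ((1, 1), (1, 0)) to the nth power."""
    result = ((1, 0), (0, 1))  # identity matrix
    base = ((1, 1), (1, 0))
    while n > 0:
        if n & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)  # square the base for the next bit of n
        n >>= 1
    return result[0][1]  # top-right entry is F(n)


print([fibonacci_matrix(n) for n in range(10)])
# [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Every call to mat_mul computes all four entries, which is exactly the waste described above.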

Thinking in Terms of $$\mathrm{F}_{n}$$

What if we could find $$\mathrm{F}_{n}$$ not by multiplying or adding some numbers, but by multiplying and adding other Fibonacci terms? Of course, we’re not talking about adding $$\mathrm{F}_{n - 2}$$ and $$\mathrm{F}_{n - 1}$$ because that would be too slow. Let’s have a look at the matrix identity again (reversed for easier reading):

$\begin{equation} \begin{bmatrix} \mathrm{F}_{n + 1} & \mathrm{F}_{n}\\ \mathrm{F}_{n} & \mathrm{F}_{n - 1} \end{bmatrix} = \begin{bmatrix} 1 & 1\\ 1 & 0 \end{bmatrix}^n. \end{equation}$

If we substitute in $$2n$$ for $$n$$, we get

\begin{align} \begin{bmatrix} \mathrm{F}_{2n + 1} & \mathrm{F}_{2n}\\ \mathrm{F}_{2n} & \mathrm{F}_{2n - 1} \end{bmatrix} & = \begin{bmatrix} 1 & 1\\ 1 & 0 \end{bmatrix}^{2n} \\ & = \bigg(\begin{bmatrix} 1 & 1\\ 1 & 0 \end{bmatrix}^{n}\bigg)^2 \end{align}

and we can substitute in our matrix identity from above to rewrite this as

\begin{align} & = \bigg(\begin{bmatrix} \mathrm{F}_{n + 1} & \mathrm{F}_{n}\\ \mathrm{F}_{n} & \mathrm{F}_{n - 1} \end{bmatrix}\bigg)^2 \end{align}

and carry out the squaring to get

\begin{align} & = \begin{bmatrix} {{\mathrm{F}_{n + 1}}^2 + {\mathrm{F}_{n}}^2} & {{\mathrm{F}_{n + 1}\mathrm{F}_{n}} + {\mathrm{F}_{n}\mathrm{F}_{n - 1}}}\\ {{\mathrm{F}_{n}\mathrm{F}_{n + 1}} + {\mathrm{F}_{n - 1}\mathrm{F}_{n}}} & {{\mathrm{F}_{n}}^2 + {\mathrm{F}_{n - 1}}^2} \end{bmatrix}. \end{align}

The top right and bottom left terms are identical; we can also rewrite them to be a bit simpler.

\begin{align} {{\mathrm{F}_{n + 1}\mathrm{F}_{n}} + {\mathrm{F}_{n}\mathrm{F}_{n - 1}}} & = \mathrm{F}_{n}(\mathrm{F}_{n + 1} + \mathrm{F}_{n - 1}) \\ & = \mathrm{F}_{n}[\mathrm{F}_{n + 1} + (\mathrm{F}_{n + 1} - \mathrm{F}_{n})] \\ & = \mathrm{F}_{n}(2\mathrm{F}_{n + 1} - \mathrm{F}_{n}). \end{align}

This simplification achieves an important task: it obviates $$\mathrm{F}_{n - 1}$$ by cleverly defining it as $$\mathrm{F}_{n + 1} - \mathrm{F}_{n}$$. Putting everything together, we have

\begin{align} \begin{bmatrix} \mathrm{F}_{2n + 1} & \mathrm{F}_{2n}\\ \mathrm{F}_{2n} & \mathrm{F}_{2n - 1} \end{bmatrix} & = \begin{bmatrix} {{\mathrm{F}_{n + 1}}^2 + {\mathrm{F}_{n}}^2} & {\mathrm{F}_{n}(2\mathrm{F}_{n + 1} - \mathrm{F}_{n})}\\ {\mathrm{F}_{n}(2\mathrm{F}_{n + 1} - \mathrm{F}_{n})} & {{\mathrm{F}_{n}}^2 + {\mathrm{F}_{n - 1}}^2} \end{bmatrix} \end{align}

where the first row (or column) gives us two very useful identities

\begin{align} \mathrm{F}_{2n} & = {\mathrm{F}_{n}(2\mathrm{F}_{n + 1} - \mathrm{F}_{n})} \\ \mathrm{F}_{2n + 1} & = {{\mathrm{F}_{n}}^2 + {\mathrm{F}_{n + 1}}^2}. \end{align}

As these identities form the heart of the doubling method, let’s call them the doubling identities.
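As a quick sanity check, the doubling identities can be compared against Fibonacci numbers computed by plain repeated addition. This is only a sketch; the helper names `fib_naive` and `check_doubling_identities` are made up for illustration, and the code assumes $$\mathrm{F}_{0} = 0$$ and $$\mathrm{F}_{1} = 1$$.

```python
def fib_naive(n):
    """Return F(n) by repeated addition, assuming F(0) = 0 and F(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def check_doubling_identities(limit):
    """Return True if both doubling identities hold for every n below limit."""
    for n in range(limit):
        f_n, f_n1 = fib_naive(n), fib_naive(n + 1)
        # F(2n) = F(n) * (2 * F(n+1) - F(n))
        if fib_naive(2 * n) != f_n * (2 * f_n1 - f_n):
            return False
        # F(2n+1) = F(n)^2 + F(n+1)^2
        if fib_naive(2 * n + 1) != f_n * f_n + f_n1 * f_n1:
            return False
    return True


print(check_doubling_identities(50))
```

A passing check is not a proof, of course; the derivation above is what guarantees the identities for every $$n$$.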

And now we just need one more piece to formulate our doubling method; we need to borrow an idea from number theory. Given any positive integer $$n$$, it is the same as either $$2m$$ (even) or $$2m + 1$$ (odd), where $$m = \lfloor\frac{n}{2}\rfloor$$; for our purposes, let us call this property the “halving property”.

Whereas the doubling identities allow us to “double” our way into bigger numbers, the halving property allows us to halve our way down to smaller and smaller numbers. The marriage of these two concepts gives rise to the doubling method.
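To make the halving property concrete, here is a small sketch that repeatedly halves $$n$$ and records each value it passes through. The function name `halving_chain` is hypothetical and only serves this illustration.

```python
def halving_chain(n):
    """Return n, n // 2, n // 4, ... down to 1 (an empty list for n == 0)."""
    chain = []
    while n:
        m, r = divmod(n, 2)
        # The halving property: n is 2m (r == 0) or 2m + 1 (r == 1).
        assert n == 2 * m + r
        chain.append(n)
        n = m
    return chain


print(halving_chain(10))          # [10, 5, 2, 1]
print(len(halving_chain(10000)))  # 14
```

The chain for 10000 has only 14 entries, which hints at why so few steps are needed to reach a large index.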

The Doubling Method

To compute the $$n$$th Fibonacci term we break $$n$$ itself down into its halves ($$2m$$) recursively, until we go down to $$n = 0$$. At this point we multiply our way back up using the doubling identities. Because halving and doubling by themselves always calculate $$\mathrm{F}_{2m}$$, we have to manually return $$\mathrm{F}_{2m + 1}$$ if our current sequence index number $$n$$ is odd.

def fibonacci_doubling(n):
    """ Calculate the Nth Fibonacci number using the doubling method. """
    return _fibonacci_doubling(n)[0]


def _fibonacci_doubling(n):
    """ Calculate Nth Fibonacci number using the doubling method. Return the
    tuple (F(n), F(n+1))."""
    if n == 0:
        return (0, 1)
    else:
        a, b = _fibonacci_doubling(n >> 1)  # (F(m), F(m+1)) where m = n // 2
        c = a * ((b << 1) - a)  # F(2m)
        d = a * a + b * b  # F(2m+1)
        if n & 1:
            return (d, c + d)  # n is odd: (F(2m+1), F(2m+2))
        else:
            return (c, d)  # n is even: (F(2m), F(2m+1))


if __name__ == "__main__":
    for n in range(20):
        print(fibonacci_doubling(n))
    # As a demonstration of this algorithm's speed, here is a large n.
    print(fibonacci_doubling(10000))

The recursive call on n >> 1 is where we do the halving; we use the right-shift operator to do this. The assignments to c and d are our doubling identities (I use the left-shift operator here because it feels more natural to me). The if-condition that follows returns $$\mathrm{F}_{2m + 1}$$ if $$n$$ was odd, and $$\mathrm{F}_{2m}$$ otherwise, each paired with the next term in the sequence.
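As a worked example (my own illustration of the recursion, with $$n = 6$$ chosen arbitrarily), the halving steps visit 6, 3, 1, and 0. Starting from the base case, each level applies the doubling identities to the pair $$(a, b)$$ returned by the level below it:

\begin{align} n = 0: \quad & (\mathrm{F}_{0}, \mathrm{F}_{1}) = (0, 1) \\ n = 1: \quad & c = 0 \cdot (2 \cdot 1 - 0) = 0, \quad d = 0^2 + 1^2 = 1, \quad \text{odd, so return } (d, c + d) = (1, 1) = (\mathrm{F}_{1}, \mathrm{F}_{2}) \\ n = 3: \quad & c = 1 \cdot (2 \cdot 1 - 1) = 1, \quad d = 1^2 + 1^2 = 2, \quad \text{odd, so return } (d, c + d) = (2, 3) = (\mathrm{F}_{3}, \mathrm{F}_{4}) \\ n = 6: \quad & c = 2 \cdot (2 \cdot 3 - 2) = 8, \quad d = 2^2 + 3^2 = 13, \quad \text{even, so return } (c, d) = (8, 13) = (\mathrm{F}_{6}, \mathrm{F}_{7}) \end{align}

The first element of the final pair is $$\mathrm{F}_{6} = 8$$.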

For comparison, here is an iterative version. On the one hand it avoids Python’s recursion limit, but the downside is a small loss of elegance (we have to loop twice — first to build up the halving/doubling points, and again for the main loop).

def fibonacci_doubling_iter(n):
    """ Calculate Nth Fibonacci number using the doubling method, using
    iteration. """
    # Record n, n // 2, n // 4, ... down to 1.
    ns = []
    while n:
        ns.append(n)
        n >>= 1

    a, b = 0, 1  # (F(0), F(1))

    # Replay the recorded values from smallest to largest.
    while ns:
        n = ns.pop()
        c = a * ((b << 1) - a)
        d = a * a + b * b
        if n & 1:
            a, b = d, c + d
        else:
            a, b = c, d

    return a
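The first loop only exists to remember which halving steps were odd, and that information is already stored in the binary digits of $$n$$. Here is a sketch of a single-loop variant that reads those bits from most significant to least significant. It is a variation on the listing above rather than something from the original code; the name `fibonacci_doubling_bits` is hypothetical, and it relies on Python's built-in `int.bit_length()`.

```python
def fibonacci_doubling_bits(n):
    """Return F(n) by scanning the bits of n from the top down."""
    a, b = 0, 1  # (F(0), F(1))
    for shift in range(n.bit_length() - 1, -1, -1):
        c = a * ((b << 1) - a)  # F(2k), where a == F(k)
        d = a * a + b * b       # F(2k+1)
        if (n >> shift) & 1:
            a, b = d, c + d     # this bit is 1: move to index 2k + 1
        else:
            a, b = c, d         # this bit is 0: move to index 2k
    return a


print([fibonacci_doubling_bits(n) for n in range(10)])
# [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Whether this reads better than the two-loop version is a matter of taste; it trades the explicit list for a little bit-twiddling.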

Conclusion

I hope you enjoyed reading about this method of calculating Fibonacci numbers as much as I enjoyed learning the math behind it. This algorithm can be sped up if it uses a faster multiplication algorithm as a and b get very large (e.g., Karatsuba multiplication).4 Time complexity is $$\Theta(\log{n})$$ if we count each arithmetic operation as a single step; it reminds me of the binary search algorithm, in how the problem space is halved repeatedly. Neat!

1. We can choose to define the first two terms as 1 and 1 instead, but this distinction is needlessly arbitrary.↩︎

2. There is actually a known formula for our purposes, where $\mathrm{F}_{n} = \frac{\varphi^n - (-\varphi)^{-n}}{2\varphi - 1}$ and $$\varphi = \frac{1 + \sqrt{5}}{2} \approx 1.6180339887\cdots$$ (the golden ratio). Unfortunately this requires arbitrary-precision floating point calculations.↩︎

3. For more discussion, see https://www.nayuki.io/page/fast-fibonacci-algorithms.↩︎

4. Python already uses Karatsuba multiplication natively for large integers.↩︎

]]>