Mathematics

1. Introduction

This section discusses some mathematical topics related to programming.

2. Two's Complement

"Complement" here comes from the mathematical notion of complements. The "two" here essentially means base 2 (binary). Two numbers \(a\) and \(b\) are each others' two's complement when adding them equals a power-of-2 number (\(2^N\)):

\[ a + b = 2^N. \]

2.1. Overview

Because examples are worth a million words, let's see some examples of numbers and their two's complements for unsigned 4-bit numbers (the idea extends to any bit width, but we'll limit ourselves to 4 for simplicity). The table below lists numbers in binary and decimal form, and their complements.

Number X Two's Complement of X X + Y
0000 (0) 10000 (16) 10000 (16)
0001 (1) 1111 (15) 10000 (16)
0010 (2) 1110 (14) 10000 (16)
0011 (3) 1101 (13) 10000 (16)
0100 (4) 1100 (12) 10000 (16)
0101 (5) 1011 (11) 10000 (16)
0110 (6) 1010 (10) 10000 (16)
0111 (7) 1001 (9) 10000 (16)
1000 (8) 1000 (8) 10000 (16)
1001 (9) 0111 (7) 10000 (16)
1010 (10) 0110 (6) 10000 (16)
1011 (11) 0101 (5) 10000 (16)
1100 (12) 0100 (4) 10000 (16)
1101 (13) 0011 (3) 10000 (16)
1110 (14) 0010 (2) 10000 (16)
1111 (15) 0001 (1) 10000 (16)

Again, note that the complement goes both ways — 1 is 15's two's complement and vice versa. The table above lists all two's complements for \(2^4\) (4 bit-word).

You may be wondering how to compute the complement. The obvious way is to use a word that is 1 extra bit wide (in the example above, 10000 is 5 bits wide), and then subtract the 4-bit number you have. Hence to compute the two's complement of 0111 (7), you can just do 10000 (16) minus 0111 (7), which is 1001 (9).

But hold on — that's too easy right? So let's now consider the following constraint: what if the maximum word size of your CPU architecture is 4 bits? In this scenario you can no longer load a 5-bit word into a single CPU register because each register is only 4 bits wide. Annoying!

Is there a way to only use 4-bits to still compute the two's complement? Indeed there's a neat bitwise trick to do this, involving 2 steps:

  1. apply binary NOT to the word, and
  2. add 1.

In Python and C, the ~ (tilde) symbol is used to get the binary NOT value of a word. This just means flipping the bits so that they are the opposite of what they are in the original word. So the binary NOT of 0111 (7) is 1000 (8). So if we add 1 to it, we get 8 + 1 = 9, what we got using plain subtraction previously.

You may now be wondering how the above trick even works. It's actually trivial, if we tweak the problem so that we want to get \(2^N - 1\) (one away from our answer) instead of \(2^N\) directly.

For a 4-bit word, 0xf (hex), or 1111 in binary is the value for \(2^4 - 1\). So now if you wanted to arrive at 1111 by adding two binary words together, at least one thing must be true — there must be no carries. That is, wherever there is a 1 bit set in one word, the other word has to have a 0 set in the same bit index, and vice versa. The two words cannot step on each other's toes. This is exactly the property of the binary NOT operation — it gives us 1-bits wherever there are 0-bits. Below is a table of examples.

Number X NOT X = Y X + Y
0000 (0) 1111 (15) 1111 (15)
0001 (1) 1110 (14) 1111 (15)
0010 (2) 1101 (13) 1111 (15)
0011 (3) 1100 (12) 1111 (15)
0100 (4) 1011 (11) 1111 (15)
0101 (5) 1010 (10) 1111 (15)
0110 (6) 1001 (9) 1111 (15)
0111 (7) 1000 (8) 1111 (15)
1000 (8) 0111 (7) 1111 (15)
1001 (9) 0110 (6) 1111 (15)
1010 (10) 0101 (5) 1111 (15)
1011 (11) 0100 (4) 1111 (15)
1100 (12) 0011 (3) 1111 (15)
1101 (13) 0010 (2) 1111 (15)
1110 (14) 0001 (1) 1111 (15)
1111 (15) 0000 (0) 1111 (15)

And now all that remains to get the complement is just adding 1 to NOT X so that the total on the right is 16 instead of 15.

2.2. Using the two's complement to represent negative numbers (signed number system)

In the discussion above, we only talked about unsigned numbers. That is, the decimal equivalent of the binary number assumes that this binary number is an unsigned (always positive) word.

We can use two's complement to represent negative numbers in a super convenient way: just make the leading high-order bit equal to \(-(2^{N - 1})\). So using the 4-bit words above, the leading bit, if on, represents \(-(2^{4 - 1}) = -2^3 = -8\). Then we treat all other bits as their regular "unsigned" part, and add these two pieces together to arrive at our final number.

Below are all possible values of signed 4-bit words.

Signed number Breakdown
0000 (0) N/A
0001 (1) N/A
0010 (2) N/A
0011 (3) N/A
0100 (4) N/A
0101 (5) N/A
0110 (6) N/A
0111 (7) N/A
1000 (-8) 1000 (-8) + 000 (0) = 1000 (-8)
1001 (-7) 1000 (-8) + 001 (1) = 1001 (-7)
1010 (-6) 1000 (-8) + 010 (2) = 1010 (-6)
1011 (-5) 1000 (-8) + 011 (3) = 1011 (-5)
1100 (-4) 1000 (-8) + 100 (4) = 1100 (-4)
1101 (-3) 1000 (-8) + 101 (5) = 1101 (-3)
1110 (-2) 1000 (-8) + 110 (6) = 1110 (-2)
1111 (-1) 1000 (-8) + 111 (7) = 1111 (-1)

This is all well and good, but where/how should we use the two's complement? Technically speaking, we're already using two's complement in the table above, although it's implied. Let's make it explicit in the table below:

Signed number, with unsigned equivalent Two's complement of unsigned equivalent Two's complement in action
0000 (0) 10000 (16) 0 + 16 = 16
0001 (1) 1111 (15) 1 + 15 = 16
0010 (2) 1110 (14) 2 + 14 = 16
0011 (3) 1101 (13) 3 + 13 = 16
0100 (4) 1100 (12) 4 + 12 = 16
0101 (5) 1011 (11) 5 + 11 = 16
0110 (6) 1010 (10) 6 + 10 = 16
0111 (7) 1001 (9) 7 + 9 = 16
1000 (-8, or 8) 1000 (8) 8 + 8 = 16
1001 (-7, or 9) 0111 (7) 9 + 7 = 16
1010 (-6, or 10) 0110 (6) 10 + 6 = 16
1011 (-5, or 11) 0101 (5) 11 + 5 = 16
1100 (-4, or 12) 0100 (4) 12 + 4 = 16
1101 (-3, or 13) 0011 (3) 13 + 3 = 16
1110 (-2, or 14) 0010 (2) 14 + 2 = 16
1111 (-1, or 15) 0001 (1) 15 + 1 = 16

If we pretend we are using just unsigned values, the first two columns are the same as in 1. So if we add them together, we always get 16 (third column). The interesting thing to note is that the negative numbers are exactly that — they are literally the two's complement of the original number, but with a negative sign in front. For example, the unsigned value \(4\) (0100) in the middle column above is represented as \(-4\) via its two's complement 1100 in the first column. Another way to say this is that for any number with a leading high-order bit turned on (as in number 1000 to 1111), we will make the computer interpret it as a negative number, where the value of this negative number is its two's complement. So for 1100 (12), we know it's a negative number because of the leading high-order bit, and to find the value we get the two's complement (0011 plus 1) 0100 (4). So the signed representation of 1100 is \(-4\).

2.2.1. Negating a positive number, or the "paper folding" analogy

In the example above we got some arbitrary word with the leading bit turned on, and tried to figure out its value. What if we already have a value, but just want to make it negative? That is, what if you want to flip the sign so that if you have positive \(x\), you would get \(-x\) (in binary)?

For example, if you have 7 (0111), how would you find the binary representation of -7? You'd simply get the two's complement:

  1. Get the binary NOT of 7, which is 1000 (-8), then
  2. add 1 to get 1001 (-7).

As another example, how about 5 and -5? We have 0101 (5), and binary NOT of 5 is 1010 (-6), and we add 1 to get 1011 (-5). Looks like two's complement works here too!

But now you may be wondering why this works the way it does. You may have also noticed some time earlier that while the positive numbers count up, the negative numbers count "backwards", down from -8 to -1 as we increment our binary representation. Let's see why.

Imagine that the table below is a piece of paper, and that there is a horizontal crease between the rows for 7 and -8.

Signed number
0000 (0)
0001 (1)
0010 (2)
0011 (3)
0100 (4)
0101 (5)
0110 (6)
0111 (7)
CREASE
1000 (-8)
1001 (-7)
1010 (-6)
1011 (-5)
1100 (-4)
1101 (-3)
1110 (-2)
1111 (-1)

If you were to fold this piece of paper on this crease, the rows would overlap one another in pairs, like this:

Number pairs
0000 (0), 1111 (-1)
0001 (1), 1110 (-2)
0010 (2), 1101 (-3)
0011 (3), 1100 (-4)
0100 (4), 1011 (-5)
0101 (5), 1010 (-6)
0110 (6), 1001 (-7)
0111 (7), 1000 (-8)

and you'd notice that the pairs are just the positive number and binary NOT version of the same number (the negative number). For example, 0101 (5) is paired with 1010 (-6). In other words, all of the negative numbers are off by 1 — so if we just add 1 to them (two's complement!), we get:

Number pairs Add 1 to paired number
0000 (0), 1111 (-1) 10000 (16), or 0 in 4-bits
0001 (1), 1110 (-2) 1111 (15), or -1
0010 (2), 1101 (-3) 1110 (14), or -2
0011 (3), 1100 (-4) 1101 (13), or -3
0100 (4), 1011 (-5) 1100 (12), or -4
0101 (5), 1010 (-6) 1011 (11), or -5
0110 (6), 1001 (-7) 1010 (10), or -6
0111 (7), 1000 (-8) 1001 (9), or -7

and we managed to negate the original number into its negative version.

Lastly, the two's complement operation is reversible, such that you can use the same method to get the positive version of a negative number. So if you have \(-7\) (1001), the two's complement is 0110 (6) + 1 or 0111 (7).

2.3. Other notes

This "paper folding" idea presented above is essentially the same idea for figuring out how to add 1 to 100 quickly where you take pairs of number from both ends — e.g. 1 and 100 (sums to 101), 2 and 99 (sums to 101), etc. — and multiply how many pairs you have. This is attributed to the story of Carl Frederich Gauss. The key to Gauss's formula, where you take pairs of numbers that all add up to the same number (101), is strikingly similar to two's complement where all of these complementary pairs add up to the same \(2^N\) number.

You may have noticed that some of the examples above spill over to 5 bits, because of our desire to stay as true to the mathematical notion of complement as possible. For the computer, when we spill over to using 5 bits, the high-order (leftmost) gets discarded. So for example the 2's complement of 0 for a 4-bit word (0000) is 10000 (16), but this just reduced back down to 0000 because the leading bit is necessarily discarded as the word only allows for 4 bits. For this discussion though, it doesn't matter because we only care about taking the two's complement when we want to represent negative numbers, and for 0 there is no need to get the negative of this number.