In an earlier post, I discussed how to remove modulo bias from your C code. I would like to add a different version of removing modulo bias, inspired by the code from the PCG family of random number generators.

# The PCG Version

In the official minimal C implementation of PCG, there is a function called
`pcg32_boundedrand_r`

, which takes a `bound`

variable. The code for that
function is as follows:

```
uint32_t pcg32_boundedrand_r(pcg32_random_t* rng, uint32_t bound)
{
uint32_t threshold = -bound % bound;
for (;;) {
uint32_t r = pcg32_random_r(rng);
if (r >= threshold)
return r % bound;
}
}
```

. If `bound`

is 5, then the function will return a uniform range from 0
to 4. I.e., the return value `r`

is such that \(0 \leq r < bound\).

# Explanation of the PCG Version

Although I generally prefer to explain code in the order they appear,
top to bottom, for this case I would like to explain the
`pcg32_boundedrand_r()`

function by talking about the usual broken
version found in the wild.

## Broken Version

The broken version is like this:

```
r = some_rng();
return r % bound;
```

. Unless `r`

is a multiple of `bound`

, we will incur the wrath of modulo
bias in the above code. Why is this? Well, consider what the modulo
operator does. All it does is chop off any excess range of values that
is not a multiple of `bound`

. Consider the following ASCII diagram,
where the maximum value returned by our `some_rng()`

function is just
11:

```
0 1 2 3 4 5 6 7 8 9 10 11
```

. So, `some_rng()`

has 12 possible unique values, 0 through 11. If our
bound was `4`

, then doing `r % bound`

is perfectly fine, because we can
chop the RNG range into three equal parts of length 4 each, like this:

```
0 1 2 3 | 4 5 6 7 | 8 9 10 11
```

. If each of the 12 values 0 through 11 occurs uniformly^{1}, then we
can assume that each of the values occurs \(\frac{1}{12}\) times. If we
apply `% 4`

to the values, then our output range looks like this:

```
0 1 2 3 | 0 1 2 3 | 0 1 2 3
```

. Can you see how the three subparts are the same? What’s more, 0 will
occur \(\frac{3}{12} = \frac{1}{4}\) times. The same goes for `1`

, `2`

,
and `3`

. Because all 4 possible values, 0 through 4, occur an equal
\(\frac{1}{4}\) times, there is *no modulo bias here*!

### What if the modulus is not a nice number with respect to

`RAND_MAX`

?

This is where the problem occurs. `RAND_MAX`

is the highest value
returned by our RNG, and thus defines the range of the possible values
output by the RNG.^{2} Continuing with our example above, if instead
of `bound = 4`

, we used another value like 5, we will get this instead:

```
0 1 2 3 4 | 5 6 7 8 9 | 10 11
```

— or essentially, these values are output if we just naively use
`r % bound`

:

```
0 1 2 3 4 | 0 1 2 3 4 | 0 1
```

. Can you see how `0`

and `1`

will occur \(\frac{3}{12}\) times, but `2`

,
`3`

, and `4`

will occur only \(\frac{2}{12}\) times?

### The fix — adjust the range!

Now, we can fix the above example by simply throwing out certain values.
The approach I used in the
old blog post was to discard the right hand side values. So, in our
example with `bound = 5`

, where we have

```
0 1 2 3 4 | 0 1 2 3 4 | 0 1
```

, the old example tried to discard the last 2 values, like this:

```
0 1 2 3 4 | 0 1 2 3 4 | x x
```

. In other words, if `r`

fell within the range 10 through 11, then we’d
simply discard it and call `some_rand()`

again.

But it doesn’t have to be this way. Instead of throwing out the values on the right, we can throw out the values on the left! So, instead of

```
0 1 2 3 4 | 0 1 2 3 4 | x x
```

we can instead do

```
x x 2 3 4 | 0 1 2 3 4 | 0 1
```

. Can you see how all of the values 0 through 4 occur exactly 2 times? No more bias!

## PCG’s approach

The approach in PCG is the same; here is the code again:

```
uint32_t pcg32_boundedrand_r(pcg32_random_t* rng, uint32_t bound)
{
uint32_t threshold = -bound % bound;
for (;;) {
uint32_t r = pcg32_random_r(rng);
if (r >= threshold)
return r % bound;
}
}
```

. The `threshold`

value is the initial range that must be discarded. So
we can visualize it like this:

```
x x 2 3 4 | 0 1 2 3 4 | 0 1
// The `x x` here is the initial threshold range, that must be discarded.
```

On line 3, we determine the value of `threshold`

, and then on line 4 we
enter a `for`

loop that repeatedly calls the RNG until we get a value
outside of this threshold range (well, technically,
*greater-than-or-equal-to* the threshold value). If we do get such a
value, then we return the modulo of it. If we visualize it, it’s like
this:

```
0 1 2 3 4 | 5 6 7 8 9 | 10 11
- Discard 0 and 1 (the threshold area).
x x 2 3 4 | 5 6 7 8 9 | 10 11
- Return (r % bound).
x x 2 3 4 | 0 1 2 3 4 | 0 1
```

.

### How is `threshold`

calculated?

The above high-level explanation should be sufficient for you, dear
reader. But if you want to go down to the innards of C, to see how
`pcg32_boundedrand_r`

works, read on.

The cornerstone of the PCG approach is to use a variable called
`threshold`

. The code to calculate `threshold`

is somewhat complicated:

`uint32_t threshold = -bound % bound;`

. Now, let’s remind ourselves that the whole point of `threshold`

is to
be a minimum value that sets a cutoff of values to be discarded — if
the generated value `r`

is too low (lower than our threshold), then we
discard it.

Consider the following diagram, with `RAND_MAX`

set to `11`

, and `bound`

set to 5:

```
0 1 2 3 4 | 5 6 7 8 9 | 10 11
```

. We can visualze the above like this instead:

```
R R R R R | _ _ R R R | R R
```

. The underscores represent the values that must be *skipped* over, in
order to eliminate modulo bias. The question then, is to figure out how
to count the number of underscores. In our case, it is 2, and so as long
as we skip the first two values 0 and 1, we should be fine.

The first step is to count backwards from the right-hand edge:

```
|<---count|
R R R R R | _ _ R R R | R R
t |
\-> RAND_MAX
```

. We end up where `t`

is on the diagram above, by counting backwards
from `RAND_MAX`

. If we then take the modulo of this by `count`

itself,
then we end up with `t`

being the value we want — the number of
underscores. This is the essence of `-bound % bound`

— we first take
`-bound`

which is obtained by counting backwards from `RAND_MAX`

, and
then we take the modulo of this number by `bound`

itself, to get what we
need. Using the ASCII diagram again, we get

```
|<-----count|
0 1 2 3 4 | 5 6 7 8 9 | 10 11
t |
\-> RAND_MAX
```

where `t`

is 7, and now applying % 5, we get:

```
0 1 2 3 4 | 0 1 2 3 4 | 0 1
t
```

`t = 2`

, the correct answer! You can try out different values for
`RAND_MAX`

and `bound`

, but you will get the right answer each time
using `threshold = -bound % bound`

.

### But why is `-bound`

the way it is?

The
C standard says that a negative unsigned value is stored as a positive
value. Without getting too technical, here are the values of `bound`

as
it becomes “negative”:

```
Bound | Actual value
------+-------------
2 | 2
1 | 1
0 | 0
-1 | (2^32) - 1 (same as RAND_MAX)
-2 | (2^32) - 2 (same as RAND_MAX - 1)
-3 | (2^32) - 3 (same as RAND_MAX - 2)
... and so on
```

. In our case, if our `RAND_MAX`

is 11, and `bound`

is 5, then `-bound`

is indeed 7.

```
Bound | Actual value
------+-------------
0 | 0
-1 | 11 - 0 (same as RAND_MAX)
-2 | 11 - 1 (same as RAND_MAX - 1)
-3 | 11 - 2 (same as RAND_MAX - 2)
-4 | 11 - 3 (same as RAND_MAX - 3)
-5 | 11 - 4 (same as RAND_MAX - 4) = 7
... and so on
```

# Conclusion

I thoroughly enjoyed looking at the source code in PCG, only to discover an elegant solution around removing modulo bias. Unfortunately, I do not know the true origin of this approach; it is possible that the authors of the PCG code invented it, but I find this improbable. Meanwhile, I strongly recommend the following code for anyone using a low-level generator that does not come with a bounded version:

```
/*
* Uniformly return an integer from 0 to (bound - 1). We assume that rand()
* returns a 32-bit unsigned integer, so we use uint32_t.
*/
uint32_t bound = some_arbitrary_bound;
uint32_t r;
uint32_t threshold = -bound % bound;
while (r = rand() < threshold) {};
return r % bound;
```

`pcg-style.c`

`[GitHub]`

`[Download]`

Any RNG worth their salt will return a uniformly distributed value, typically from 0 to

`RAND_MAX`

. In the example here, our`RAND_MAX`

is 11.↩︎For a 32-bit unsigned integer RNG, \(2^{32} - 1\) (all 1 bits set) is the highest value that can be returned. That is, our RNG returns a value from 0 (no bits set) to

`RAND_MAX`

(all bits set). This means that our RNG generates everything from all 0s to all 1s and everything in between.↩︎