A Use for Octal: Calculating Modulo 36 from Modulo 9
(I posted an earlier version of this in December 2004 on my old technical blog. A discussion at work last week about 36-bit computers at the Living Computers Museum prompted me to write an updated post with improved explanations and much better typography.)
I’ve been programming in C since 1985 and C++ since 1991, but I’ve never found a use for octal representation until [2004], aside from the permissions argument for chmod. Octal has always seemed as vestigial as a human appendix, a leftover from the early days of computers, when word sizes were often a multiple of three: 6-, 12-, 24-, or 36-bits wide. All modern computers use word sizes that are powers of two—16-, 32-, or 64-bits wide—with 8-bit bytes, so octal is less useful than hex, which evenly subdivides bytes and words. I’ve done a lot of bit twiddling and hexadecimal has always been indispensable, while octal has remained a curiosity.
The other day [in 2004], a mathematician friend described to me a problem that he had solved at a previous company. They were designing hardware that emulated some old 36-bit computers. For backward compatibility, the various shift instructions had to accept an arbitrarily large shift count, k, and shift left or right by (k mod 36). Now, divisions are not cheap to implement in hardware, so they needed to come up with an alternate approach to calculate the modulus.
My friend tried to do something with the factors of 36: 4×9. Four and nine are relatively prime: they have no common factors other than one. By the Chinese Remainder Theorem therefore, the combination of k mod 4 and k mod 9 is enough to uniquely determine k mod 36. By inspection, this is true for the following table of “residues”. All the integers in the range [0, 36) appear exactly once.
4 \ 9 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
---|---|---|---|---|---|---|---|---|---|
0 | 0 | 28 | 20 | 12 | 4 | 32 | 24 | 16 | 8 |
1 | 9 | 1 | 29 | 21 | 13 | 5 | 33 | 25 | 17 |
2 | 18 | 10 | 2 | 30 | 22 | 14 | 6 | 34 | 26 |
3 | 27 | 19 | 11 | 3 | 31 | 23 | 15 | 7 | 35 |
Calculating k mod 4 is easy in hardware: it’s the two least-significant bits.
How to calculate k mod 9 in hardware is not so obvious.
Shifting and Masking
Several programming languages now provide a 0b prefix for binary literals to go along with the 0x prefix for hex literals and the 0o prefix for octal literals. (Older languages, such as C, use a 0 prefix for octal and have no 0b prefix.) See the discussion in Go number literals for more detail on 0b, including a list of languages that now support this notation.
2n, written in binary, looks like 1 followed by n 0s. For example, 23 = 10002. In C-like languages, 2n can be written as 1 << n.
Similarly, 2n − 1, (1 << n) - 1, written in binary, looks like n 1s. For example, 25 − 1 = 3110 = 111112.
We can multiply an unsigned integer, u, by 2n by shifting u left by n bits, u << n, introducing n zeroes as the low-order bits. For example, using 8-bit numbers without loss of generality, written as modern Go/Rust number literals:
0b_0001_0101 << 3 == 0b_1010_1000
Similarly, we can divide u by 2n by shifting u right by n bits, u >>> n, which drops the n low-order bits and introduces n zeroes as the high-order bits.
0b_0101_0110 >>> 3 == 0b_0000_1010
A sign-extending or arithmetic right shift introduces n copies of the sign bit as the high-order bits. In some languages, such as Java and JavaScript, >> means an arithmetic right shift and >>> means a zero-extending right shift. In other languages, including C, C++, and Go, there is only a >> operator and sign-extension generally depends upon the type of the left operand, signed or unsigned. However, sign extension is not guaranteed in C/C++.
0b_0101_0110 >> 2 == 0b_0001_0101 0b_1001_0110 >> 2 == 0b_1110_0101
Finally, we can find the remainder of dividing u by 2n by masking u with 2n − 1, that is, bitwise-and with (1 << n) - 1, to extract the n low-order bits:
0b_0101_0110 & 0b_0000_0111 = 0b_0000_0110
In other words, u % 8 == u & 7 and u / 8 == u >> 3.
Read bitwise operations for more background.
Casting Out Nines
There’s an old trick for checking the results of arithmetic operations, known as casting out nines or dropping nines.
Add up the decimal digits of each number. Apply the arithmetic operation to these digit sums. They should be congruent, modulo 9.
For example, 12, 345×8, 765 = 108, 203, 925.
To check the multiplication, compute the digit sum of each number, by adding up each decimal digit:
and
Take the first two digit sums, modulo 9, and multiply them:
Check against the sum of the digits of the product:
This works because 10n ≡ 1 ( mod 9).
Consider 758:
Dropping the nines from each term leaves the digit sum, which is congruent to the original number modulo nine:
Checking: 758 mod 9 = 2.
Congruences have a number of useful properties.
Casting Out Elevens
Let’s use 11, instead of 9. Since 10 = 11 − 1, then 10n ≡ − 1n (mod 11).
Consider 5234:
Dropping the elevens from each term leaves the alternating digit sum:
It’s more convenient to proceed rightwards from the least significant digit, 4 − 3 + 2 − 5.
Checking: 5234 mod 11 = 9.
To cast out elevens, we calculate the alternating sum from right to left.
Casting out elevens catches some transposition errors, unlike casting out nines. For more, see divisibility rule for 11 and proof for alternating sum.
Modulo 9
At last, we turn to base 8, octal. Nine bears the same relationship to eight in octal, as eleven does to ten in decimal: 910 = 118, base plus one, and 8n ≡ − 1n (mod 9).
We can calculate k mod 9 in base 8 by alternately adding and subtracting the octal digits, from right to left. For example, 12348 mod 9 = 4 − 3 + 2 − 1 = 2. This gives the right answer.
Here’s a simple, albeit incomplete, algorithm in Go. We’re masking and shifting three bits at a time, which is tantamount to working with the octal representation of k.
func Mod9(k uint) uint { var m int = 0 sign := +1 for t := k; 0 != t; t >>= 3 { r := int(t & 7) m += sign * r sign = -sign } return uint(m) }
What about 6178?
And 61728?
Almost there!
Casting out “octal-elevens” (118 = 910) in octal, by an alternating sum of the base-eight digits, computes a small number congruent to the original number number modulo nine.
The algorithm above is calculating numbers that are congruent to the correct answer modulo nine, but which may be outside the desired range. If the intermediate sum dips below zero or rises above eight, we have to add nine or subtract nine respectively to keep the running total in the range [0, 9).
Here’s a complete algorithm for Modulo 9 in Go, computing the alternating sum of the octal digits:
func Mod9(k uint) uint { var m int = 0 var negative bool = false for t := k; 0 != t; t >>= 3 { r := int(t & 7) if negative { m -= r if m < 0 { m += 9 } } else { m += r if m >= 9 { m -= 9 } } // assert(0 <= m && m < 9) negative = !negative } return uint(m) }
Clearly, this algorithm can be implemented in much simpler circuitry than that required to compute a remainder through full-blown division.
Modulo 36
We now have enough to calculate k mod 36 from Mod9 and the Chinese Remainder Theorem:
func Mod36(k uint) uint { Residues := [4][9]uint{ { 0, 28, 20, 12, 4, 32, 24, 16, 8}, { 9, 1, 29, 21, 13, 5, 33, 25, 17}, {18, 10, 2, 30, 22, 14, 6, 34, 26}, {27, 19, 11, 3, 31, 23, 15, 7, 35}, } return Residues[k & 3][Mod9(k)] }
My friend says that he later learned that similar tricks were used in classic 36-bit hardware.
I looked everywhere I could think of to see if I could find this algorithm to calculate modulo 9 described. I found something that hinted at it in Knuth’s Seminumerical Algorithms, §4.4.C, discussing converting octal integers to decimal by hand, where he mentions using casting out nines in octal and in decimal to check the result. There was no mention of it in Warren’s marvelous Hacker’s Delight or in HAKMEM.
I tried to come up with an analytic way to calculate the elements of the 9x4 table. The best that I found is (72 − 8×(k mod 9) + 9×(k mod 4)) mod 36! The inner expression yields a number in the range [0, 99], which can be reduced to [0, 36) by subtracting 36 at most twice. From Concrete Mathematics, mod 36 can be derived from mod 4 and mod 9 by looking at the [0][1] and [1][0] elements of the table: (9×(k mod 4) + 28×(k mod 9)) mod 36. It works, but it’s even worse. A table lookup is clearly more efficient.
Most, if not all, of the computer architectures designed in the last forty years use a word size that is a power of two. Useful relationships like shifting and masking are one big reason why non-power-of-two word sizes have gone out of fashion.
Another big reason is the success of C and Unix, which have a bias towards 8-bit bytes. C doesn’t require 8-bit bytes, but there’s a lot of software which tacitly assumes that char has exactly 8 bits.
On systems with 9-bit bytes, like the 36-bit computers, octal is useful, since a 9-bit byte can hold all values up to 7778 and the word size is a multiple of three.
And there you have it: an unexpected use for octal notation. It’s not exactly an important use, but then 36-bit computers aren’t exactly important any more either.