Heptavintimal Encoding of Ternary Values

Part of http://www.cs.uiowa.edu/~jones/ternary/
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Disclaimer: Nobody but the author endorses the use of the notation described here, but if you need to use base 27, this is as good as any.

Abstract

Users of binary computers have long used octal and hexadecimal to encode groups of 3 or 4 bits into single digits for compact textual representation. In particular, hexadecimal has become a near-universal encoding for the values of 4-bit nybbles. There is a need for a similar encoding for ternary computers. Heptavintimal (base 27) meets this need, offering a natural encoding for 3-trit trybbles. The choice of characters used to represent digits above 9 was not a great problem with hexadecimal, since the letters A through F cannot be easily confused with digits. In the case of heptavintimal, however, the letters I and O pose challenges because they are easily confused with the digits 1 and 0. These and other potentially confusing letters are therefore eliminated from the set of heptavintimal digits.

  1. The Basic Scheme
  2. Examples
  3. The Name
  4. The Digits
  5. A Heptavintimal to Decimal Converter


1. The Basic Scheme

Just as it is easier for people to use octal (base 8) or hexadecimal (base 16) instead of binary, it is easier to use nonary (base 9) or heptavintimal (base 27) instead of ternary (base 3). In base 9, pairs of ternary digits are encoded by single nonary digits. In base 27, triplets of ternary digits, also called trybbles, are encoded by single heptavintimal digits. The following standard heptavintimal encoding will be used:

Heptavintimal trybble encodings
Weight:  0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26
Ternary: 000 001 002 010 011 012 020 021 022 100 101 102 110 111 112 120 121 122 200 201 202 210 211 212 220 221 222
Digits:  0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F   G   H   K   M   N   P   R   T   V   X   Z

Just as people frequently refer to hexadecimal numbers as hex numbers, heptavintimal numbers may be referred to as hept numbers in contexts where polysyllabic names get in the way.

Note that the digits used in this system are a superset of those used to encode hexadecimal numbers. The rationale for the omission of certain letters from this set is discussed later, as is the rationale for the name used.
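
The trybble encoding can be illustrated with a short C sketch. The function name ternary_to_hept and its interface are hypothetical, chosen only for this illustration; the digit string is taken from the table above. The sketch assumes that a ternary string whose length is not a multiple of 3 is padded on the left with zero trits.

```c
#include <stddef.h>
#include <string.h>

/* The 27 heptavintimal digits, indexed by weight, as in the table. */
static const char HEPT_DIGITS[] = "0123456789ABCDEFGHKMNPRTVXZ";

/*
 * ternary_to_hept (hypothetical helper): convert a string of trits
 * ('0', '1', '2') to heptavintimal, one digit per trybble.
 * The input is treated as if padded on the left with zero trits
 * to a multiple of 3 trits.
 * Returns 0 on success, -1 on a bad trit, empty input, or if out
 * is too small.
 */
int ternary_to_hept(const char *trits, char *out, size_t outsize)
{
    size_t n = strlen(trits);
    size_t pad = (3 - n % 3) % 3;        /* leading zero trits needed */
    size_t ndigits = (n + pad) / 3;
    size_t i, pos = 0;

    if (n == 0 || ndigits + 1 > outsize) return -1;

    for (i = 0; i < ndigits; i++) {
        int weight = 0;
        int k;
        for (k = 0; k < 3; k++) {
            size_t idx = i * 3 + k;      /* index into the padded input */
            int trit = 0;
            if (idx >= pad) {
                char c = trits[idx - pad];
                if (c < '0' || c > '2') return -1;
                trit = c - '0';
            }
            weight = weight * 3 + trit;  /* accumulate the trybble value */
        }
        out[pos++] = HEPT_DIGITS[weight];
    }
    out[pos] = '\0';
    return 0;
}
```

For example, the ternary string 1101221 is padded to 001 101 221, and the three trybbles encode as 1, A and X.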

2. Examples

A single ternary (base 3) digit is a trit, which may take on the values 0, 1 and 2. Unsigned integers are written in the usual way. Here are some common values in decimal, ternary, nonary and heptavintimal:

Decimal Ternary Nonary Hept
0 = 0 = 0 = 0
3 = 10 = 3 = 3
9 = 100 = 10 = 9
27 = 1000 = 30 = 10
81 = 10000 = 100 = 30
243 = 100000 = 300 = 90
729 = 1000000 = 1000 = 100

Decimal Ternary Nonary Hept
1 = 1 = 1 = 1
10 = 101 = 11 = A
100 = 10201 = 121 = 3M
1000 = 1101001 = 1331 = 1A1
   
Decimal Ternary Nonary Hept
1 = 1 = 1 = 1
2 = 2 = 2 = 2
4 = 11 = 4 = 4
8 = 22 = 8 = 8
16 = 121 = 17 = G
32 = 1012 = 35 = 15
64 = 2101 = 71 = 2A
128 = 11202 = 152 = 4N
256 = 100111 = 314 = 9D
512 = 200222 = 628 = KZ
1024 = 1101221 = 1357 = 1AX
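
Rows like those in the tables above can be reproduced with a small radix conversion routine. The following is a minimal sketch rather than a definitive implementation; the name to_base and its interface are hypothetical. It uses the heptavintimal digit set for every base up to 27, which gives the usual digits for bases 3 and 9.

```c
#include <stddef.h>

/* The 27 heptavintimal digits, indexed by weight. */
static const char HEPT_DIGITS[] = "0123456789ABCDEFGHKMNPRTVXZ";

/*
 * to_base (hypothetical helper): write value in the given base
 * (2 through 27) using the heptavintimal digit set.
 * Returns 0 on success, -1 if the base is out of range or out
 * is too small.
 */
int to_base(unsigned long value, unsigned base, char *out, size_t outsize)
{
    char tmp[72];                 /* enough for 64 bits in base 2 */
    size_t n = 0, i;

    if (base < 2 || base > 27) return -1;

    /* Generate digits from least to most significant. */
    do {
        tmp[n++] = HEPT_DIGITS[value % base];
        value /= base;
    } while (value != 0);

    if (n + 1 > outsize) return -1;

    /* Reverse into the output buffer. */
    for (i = 0; i < n; i++) out[i] = tmp[n - 1 - i];
    out[n] = '\0';
    return 0;
}
```

Calling to_base with the value 1024 and the bases 3, 9 and 27 yields 1101221, 1357 and 1AX respectively.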

3. The Name

The name heptavintimal is composed of the Greek prefix hepta, meaning seven, followed by the Latin root vinti meaning twenty, with the suffix mal added, to indicate that it is a number base. The mixing of Greek and Latin exactly follows the formation of the word hexadecimal, where the prefix hexa comes from Greek and the root deci is from the Latin. This follows naturally from the form of the word decimal, formed from deci, which is from the Latin for ten.

One could argue that the Latin viginti should have been used instead of vinti. Words like vintner and vintage are awfully similar and refer to wine, not twenty, so the word heptavintimal might be confused with something having to do with seven types of wine. Unfortunately, heptavigintimal is a bit of a tongue twister, and most English speakers are unclear about whether the letter g should be pronounced as a hard g (as in lag) or a soft g (as in age).

A purist might argue that even words like octal, decimal and hexadecimal are improperly formed and ought to be replaced by octonary, denary and senidenary. This naming was used consistently in Alfred B. Taylor's mid 19th century study of alternative number bases. [Taylor 1859] The word hexadecimal has been particularly offensive to linguistic purists because of its free mixture of Latin and Greek roots. In addition to senidenary, purists have proposed such alternatives as sexadecimal and hexadecadic.

Following Taylor's logic, we ought to refer to base-27 numbers as being in the septivicenary system. We reject this! The word hexadecimal provides an adequate precedent for our choice of heptavintimal. Hexadecimal has survived years of challenges from linguistically pure alternatives and has become solidly established as the standard term for base 16. We expect heptavintimal will become equally established as ternary computing rises to its ascendency over binary computing.

4. The Digits

The usual way to extend to number bases above 10 is to use consecutive letters of the Roman alphabet for the consecutive digits above 9. This poses no problem for hexadecimal, where the letters A through F work quite well. For number bases above 18, however, the letter I causes problems. It is very easy to confuse the digit 1 with the capital I and lower case l. In fact, some early typewriters did not include the numeral 1 at all. Typists were expected to use lower-case l as a numeral, relying on context to distinguish between the numeral and letter. In some fonts, the letter J causes similar problems. The same problem applies to the numeral 0 and the capital O.

A second class of problems emerges when numbers are dictated verbally. It is very common to read 501 as "five oh one" instead of "five zero one". This provides a second argument for omitting the letter O from the heptavintimal numerals. When native speakers of different languages interact, there are additional problems with dictating numbers. Germans pronounce the name of the letter W in essentially the same way that English speakers pronounce the name of the letter V.

When the font size is small or the reader has poor eyesight, additional problems emerge. The capital Q begins to look very similar to capital O and the numeral 0, and capital S can look like the numeral 5. In fact, many of the letter-number substitutions that are common in Leet hint at this class of problems. Finally, there is some benefit in omitting enough letters that Z becomes the maximum digit.

In order to minimize problems with misread input, all input routines that accept heptavintimal should fold upper and lower case together and make the following mappings for the omitted letters:

 
Input mappings
input   equivalent
i, j, l, y   1
o, q   0
s   5
u, w   V

As a consequence of these mappings, typing the author's name would be equivalent to "D0VG1A5 10NE5" which represents the two decimal numbers 5,049,536,873 and 546,404.
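
The input mappings can be sketched as a pair of small C functions. The names hept_digit_value and hept_parse are hypothetical, and the sketch assumes the C locale for case folding and does no overflow checking. It is one possible way to apply the mappings, written out explicitly rather than as a lookup table.

```c
#include <ctype.h>

/*
 * hept_digit_value (hypothetical helper): value of one input
 * character, folding case and applying the input mappings for the
 * omitted letters. Returns -1 if the character is not acceptable.
 */
int hept_digit_value(int ch)
{
    static const char digits[] = "0123456789ABCDEFGHKMNPRTVXZ";
    int i;

    ch = toupper((unsigned char)ch);     /* fold lower case to upper */
    switch (ch) {
    case 'I': case 'J': case 'L': case 'Y': return 1;
    case 'O': case 'Q':                     return 0;
    case 'S':                               return 5;
    case 'U': case 'W':                     return 24;  /* same as V */
    }
    for (i = 0; i < 27; i++)
        if (digits[i] == ch) return i;
    return -1;
}

/*
 * hept_parse (hypothetical helper): convert a string of heptavintimal
 * input characters to its value. Returns 0 on success, -1 on an empty
 * string or an illegal character. Overflow is not detected.
 */
int hept_parse(const char *s, unsigned long long *result)
{
    unsigned long long acc = 0;

    if (*s == '\0') return -1;
    for (; *s != '\0'; s++) {
        int d = hept_digit_value((unsigned char)*s);
        if (d < 0) return -1;
        acc = acc * 27 + (unsigned long long)d;
    }
    *result = acc;
    return 0;
}
```

With these definitions, hept_parse gives 5049536873 for both "Douglas" and "D0VG1A5", and 546404 for both "Jones" and "10NE5".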

The decision to omit certain letters from the heptavintimal digit set was inspired, in large part, by the base-32 encoding of [Crockford, 2002] that omits the letters I, L, O and U while folding upper and lower case. Crockford also proposed permitting hyphens within numbers and appending a checksum; these are outside the scope of this discussion, although it is worth noting that his checksum scheme, using the number modulo the next prime larger than the radix, is sound. For base 27, the appropriate modulus is 29, requiring 2 extra characters. Adding the letters W and Y to the set of digits may be the appropriate choice to encode such a checksum.
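
A modulo-29 check symbol of this kind could be computed as in the following sketch. The function name hept_check_symbol is hypothetical, and the assignment of W to 27 and Y to 28 is an assumption made only for illustration; the text above does not fix which letter gets which value. Note also that under the input mappings given earlier, W and Y fold onto V and 1, so a check symbol would have to be examined before those mappings are applied.

```c
#include <string.h>

/* Assumed check alphabet: the 27 digits, then W = 27 and Y = 28. */
static const char CHECK_SYMBOLS[] = "0123456789ABCDEFGHKMNPRTVXZWY";

/*
 * hept_check_symbol (hypothetical helper): check symbol for a string
 * of upper-case heptavintimal digits, that is, the value of the number
 * modulo 29, computed digit by digit so that no large integer is needed.
 * Returns '\0' if s is empty or contains a character outside the
 * 27-digit set.
 */
char hept_check_symbol(const char *s)
{
    unsigned rem = 0;

    if (*s == '\0') return '\0';
    for (; *s != '\0'; s++) {
        /* Search only the first 27 symbols: W and Y are not digits. */
        const char *p = memchr(CHECK_SYMBOLS, *s, 27);
        if (p == NULL) return '\0';
        rem = (rem * 27 + (unsigned)(p - CHECK_SYMBOLS)) % 29;
    }
    return CHECK_SYMBOLS[rem];
}
```

Under these assumptions, the number 10NE5 (decimal 546,404) has remainder 15 modulo 29, so its check symbol is F.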

5. A Heptavintimal to Decimal Converter

In order to encourage people to experiment with heptavintimal encodings for numbers (and various words), the following marginally functional conversion tool is provided, coded in comment-free C:

#include <stdio.h>
int main (void) {
        char digits[128]={
                99,99,99,99, 99,99,99,99, 99,99,99,99, 99,99,99,99,
                99,99,99,99, 99,99,99,99, 99,99,99,99, 99,99,99,99,
                99,99,99,99, 99,99,99,99, 99,99,99,99, 99,99,99,99,
                 0, 1, 2, 3,  4, 5, 6, 7,  8, 9,99,99, 99,99,99,99,
                99,10,11,12, 13,14,15,16, 17, 1, 1,18,  1,19,20, 0,
                21, 0,22, 5, 23,24,24,24, 25, 1,26,99, 99,99,99,99,
                99,10,11,12, 13,14,15,16, 17, 1, 1,18,  1,19,20, 0,
                21, 0,22, 5, 23,24,24,24, 25, 1,26,99, 99,99,99,99,
        };
        long long acc;
        do {
                int ch;
                acc = 0;
                fputs("hept: ", stdout);
                for (;;) {
                        int digit = 99;
                        ch = getchar();
                        if (ch == EOF || ch == '\n') break;
                        if (ch < 0x80) digit = digits[ch];
                        if (digit > 26) {
                                puts("illegal digit");
                                continue;
                        }
                        acc = (acc * 27) + digit;
                }
                printf("= %lld\n",acc);
        } while (acc != 0);
        return 0;
}