Binary Coded Ternary and Its Inverse

Part of the Ternary Manifesto
by Douglas W. Jones
THE UNIVERSITY OF IOWA Department of Computer Science

Work started Jan. 6, 2016.

If binary computers and ternary computers are to communicate, we will need to represent binary data on ternary machines, and visa versa. Binary-coded ternary (BCT) is a representation allowing binary computers to manipulate ternary data, while ternary-coded binary (TCB) allows ternary computers to manipulate binary data.

If ternary computers are to survive in a binary world, they will almost certainly include interfaces to binary peripheral devices. Ternary software that mainpulates such interfaces will almost certainly use some form of ternary-coded binary for both input and output.

During the development of ternary computers, binary computers will almost certainly need to simulate ternary operations. For this, and for an eventual future when binary computers may actually need to communicate with ternary devices, we will need binary-coded ternary.

Contents

Binary-Coded Ternary

In binary, 2 bits encode 4 values, while in ternary, one trit encodes 3 values. As such, if we use 2 bits per binary-encoded ternary trit, there are 4C3 (4 things taken 3 at a time) distinct binary encodings of ternary. Several of these have been explored in the past.

Frieder and Luk, 1975 used an ingenious encoding where each balanced ternary number was represented by a pair of binary numbers [a,b], where the one bits in a represented the positive trits of the ternary number, while the ones in b represented the negative trits. Both balanced and unsigned ternary addition using that representation require 20 arithmetic and logic operations. This representation was used in the TERNAC computer, a ternary architecture implemented by microcode on a conventional binary microengine. In Ternac, each ternary word was composed of a pair of binary words, one holding all of the positive half-trits and one holding all of the negative half-trits. If we undo this packing and place the two bits side by side, this encoding can be describes as follows:

Balanced   BCT  
–1 10
0 00
+1 01

Parhami and McKeown, 2013 They recognized the utility of multiple encodings. The goal of their work was direct implementation of binary-coded ternary operations using binary logic gates, so different encodings could be used at different points in a circuit. Their standard encoding was a superset of the encoding used by Frieder and Luk, where both 00 and 11 were used to represent 0.

There are several binary-coded decimal (BCD) representations. By far the most common uses the ten binary values from 0000 to 1001 to represent the digits from 0 to 9. The second best known, excess-3 BCD, uses the consecutive values 0011 to 1100 in binary order to represent the digits from 0 to 9. Here, we suggest that the natural binary encoding of ternary should use the binary values from 00 to 10, in order, to represent the digits from 0 to 2:

Unsigned Balanced   BCT  
0 –1 00
1 0 01
2 +1 10

Ternary-Coded Binary

Ternary-coded binary (TCB) has been mentioned in passing by several authors, but specific definitions are scarce. There are at least three natural ternary-encodings of binary values, one relating primarily to logical values and two to numeric values. In each of these mappings, each bit of the binary value is encoded as one trit of the ternary value.

For ternary logical values, we map true to true and false to false, without using the intermediate unknown value. For numeric values, we use the ternary digits 0 and 1, noting that the encodings of 0 and 1 in balanced ternary is different from the encoding in unsigned and 3's complement ternary:

TernaryBinary
Unsigned Balanced   Logical Unsigned Balanced 
0 –1 0 0
1 0 1 0
2 +1 1 1

Given a ternary-coded binary value using the unsigned encoding, we can convert it to the logical form by doubling it or adding it to itself using unsigned addition. Similarly, we can convert the value to balanced form by adding one to each trit, using either an unsigned or balanced add.

If we start with a balanced ternary-coded binary number, conversion to unsigned form is easy, just subtract one from each trit. Conversion to logical form with common operators is more awkward using familiar operators. Similarly, if we start with logical form, conversions to the other forms are awkward if we confine ourselves to familiar operators.

The above considerations suggest that ternary-coded binary data should, by default, be introduced in unsigned form and then converted as needed to other forms.

BCT to Binary Conversion

Given an unsigned integer encoded in binary-coded ternary, conversion to binary is straightforward, but the optimal code for this conversion is quite obscure.

The following iterative code assumes a global constant TRITS giving the number of trits in the ternary word to be converted, and it assumes that the default integer size is sufficient to hold both the binary coded ternary value being converted and the result of the conversion.

unsigned int bct_to_bin( unsigned int bct ) {
        unsigned int acc = 0;
        int trit;
        const int shift = 2 * TRITS;
        const int mask = 0x3 << shift;

        for ( trit = 0; trit != TRITS; trit++ ) {
                acc = (acc * 3) + ((bct & mask)) >> shift);
                bct = bct << 2;
        }
        return acc;
}

This code is relatively slow because it processes all of the trits sequentially. If conversions are common, this can become a bottleneck, so it is interesting to look for faster alternatives. The key to faster code is to take advantage of the fact that we have a wide data path in our arithmetic unit that can handle multiple binary-coded ternary trits in parallel. The following code demonstrates this, assuming that unsigned integers hold 32-bits or 16 BCT trits.

unsigned int bct_to_bin( unsigned int bct ) {
        unsigned int high, low, acc;

        high = (bct  >> 2) & 0x33333333; /* high trits */
        low  =  bct        & 0x33333333; /* low trits */
        acc  =  low + (   3 * high);     /* binary-coded base 9 */

        high = (acc  >> 4) & 0x0F0F0F0F; /* high nibbles */
        low  =  acc        & 0x0F0F0F0F; /* low nibbles */
        acc  =  low + (   9 * high);     /* binary-coded base 81 */

        high = (acc  >> 8) & 0x00FF00FF; /* high bytes */
        low  =  acc        & 0x00FF00FF; /* low bytes */
        acc  =  low + (  81 * high);     /* binary-coded base 6561 */

        high = (acc  >> 8) & 0x0000FFFF; /* high bytes */
        low  =  acc        & 0x0000FFFF; /* low bytes */
        acc  =  low + (6561 * high);     /* now in binary */

        return acc;
}

The pattern illustrated above can be continued to higher precisions, with the total number of operations growing as O(log n) for converting n digits, so long as the underlying arithmetic unit can handle operands of that size. The multiplier in each step is the square of the multiplier in the previous step.

The above code is not at the limit of optimization. Massalin, 1987 used an exhaustive search algorithm to find the optimal code sequence to convert binary-coded decimal to binary, and the algorithm "discovered" by his superoptimizer generalizes. Here is the resulting code, assuming we are converting a 16 trit unsigned BCT number to a 32 bit unsigned integer:

unsigned int bct_to_bin( unsigned int bct ) {
        unsigned int acc = bct;

        acc = acc - (((acc >>  2) & 0x33333333) * (    4 -    3));
        acc = acc - (((acc >>  4) & 0x0F0F0F0F) * (   16 -    9));
        acc = acc - (((acc >>  8) & 0x00FF00FF) * (  256 -   81));
        acc = acc - (((acc >> 16) & 0x0000FFFF) * (65536 - 6561));

        return acc;
}

We assume, of course, that the constant expressions used to compute the multipliers are evaluated by the compiler and that the multiplications are optimized using sequences of shift and add operations where such sequnces are faster than native multiply instructions. The magic constants on each successive line above are the squares of the constants on the previous line.

Performance: Some simple benchmarks illustrate the performance of this code. These tests were done on an Intel Core i3 3.06 GHz computer under the GCC 4.2 compiler with the default compilation options, using the 27-trit binary-coded ternary implementation from libter27.h.

Generating and converting all integer values from 1 to 4,000,000 from binary-coded-ternary to binary using the naïve for-loop algorithm given first above is 3.94 times slower than the optimized version. The overhead of incrementing the integer value over the tested range is the same in both cases, subtracting an approximation of this from both run times sugests that the actual speed ratio is closer to 7.4.

TCB to Ternary Conversion

Given an unsigned ternary-coded binary integer, and writing programs in an environment where all operations are naturally carried out in ternary, the algorithms for converting binary to ternary mirror the algorithms discussed above. Here, we assume that the data type unsigned ternary forces that data to be naturally represented in ternary, and that arithmetic operations on values of this type are the usual unsigned ternary operations. In addition, we assume that the & and | operators, and and or for binary numbers, are used for the min and max operators when applied to ternary operands. The subscript 3 on shift operators such as <<3 indicates shifts applied to the trits of of a ternary operand, and constants used for masking ternary values are given here in heptavintimal, indicated by a leading 0t (a horrible notation, but since we are using a C-like syntax here, it will suffice).

unsigned ternary tcb_to_ter( unsigned ternayr tcb ) {
        unsigned ternary acc = 0;
        int bit;
        const int shift = BITS;
        const int mask = 0t1 <<3 shift;

        for ( bit = 0; bit != BITS; bit++ ) {
                acc = (acc * 2) + ((tcb & mask)) >>3 shift);
                tcb = tcb <<3 1;
        }
        return acc;
}

As with the BCT to binary conversion code, this code processes the digits sequentially, presenting a possible bottleneck. Again, we can construct faster code by using the full parallel data path through the arithmetic unit. in the following, we assume that unsigned ternary operands are represented in 27 trits:

unsigned ternary tcb_to_ter( unsigned ternary tcb ) {
        unsigned ternary high, low, acc;

        high = (tcb >>3  1) & 0tN6N6N6N6N; /* high bits */
        low  =  tcb         & 0tN6N6N6N6N; /* low bits */
        acc  =  low + (     2 * high);     /* ternary-coded base 4 */

        high = (acc >>3  2) & 0t82KV82KV8; /* high 2-bits */
        low  =  acc         & 0t82KV82KV8; /* low 2-bits */
        acc  =  low + (     4 * high);     /* ternary-coded base 16 */

        high = (acc >>3  4) & 0tZ08V0ZK2Z; /* high nibbles */
        low  =  acc         & 0tZ08V0ZK2Z; /* low nibbles */
        acc  =  low + (    16 * high);     /* ternary-coded base 256 */

        high = (acc >>3  8) & 0t0ZZV008ZZ; /* high bytes */
        low  =  acc         & 0t0ZZV008ZZ; /* low bytes */
        acc  =  low + (   256 * high);     /* ternary-coded base 65536 */

        high = (acc >>3 16) & 0t0002ZZZZZ; /* high bytes */
        low  =  acc         & 0t0002ZZZZZ; /* low bytes */
        acc  =  low + ( 65536 * high);     /* now in ternary */

        return acc;

This code can be superoptimized as follows:

unsigned ternary tcb_to_ter( unsigned ternary tcb ) {
        unsigned ternary acc = tcb;

        acc = acc - (((acc >>3  1) & 0tN6N6N6N6N) * (       3 -     2));
        acc = acc - (((acc >>3  2) & 0t82KV82KV8) * (       9 -     4));
        acc = acc - (((acc >>3  4) & 0tZ08V0ZK2Z) * (      81 -    16));
        acc = acc - (((acc >>3  8) & 0t0ZZV008ZZ) * (    6561 -   256));
        acc = acc - (((acc >>3 16) & 0t0002ZZZZZ) * (43046721 - 65536));

        return acc;
}