Chen–Ho encoding
Chen–Ho encoding is an efficient alternate system of binary encoding for decimal digits.
The traditional system of binary encoding for decimal digits, known as binary-coded decimal (BCD), uses four bits to encode each digit, resulting in significant wastage of binary data bandwidth (since four bits can store 16 states and are being used to store only 10).[1]
The concepts behind Chen–Ho encoding were first introduced in a memo from Tien Chi Chen[2] to Irving T. Ho[3] in 1971. Both men were working for IBM at the time, although in different locations.[4][5] Tien Chi Chen also consulted with Frank C. Tung[6] to verify the results of his theories independently.[4]
Tien Chi Chen noted that the digits zero through seven were simply encoded using three binary digits. He also postulated that one could use a flag to identify a different encoding for the digits eight and nine, which would be encoded using a single bit.
Application
In practice, a series of boolean transformations are applied to the stream of input bits, compressing BCD encoded digits from 12 bits per three digits to 10 bits per three digits. Reversed transformations are used to decode the resulting coded stream to BCD. Equivalent results can also be achieved by the use of a look-up table.
The final version of Chen–Ho encoding was published in 1975 in the journal Communications of the Association for Computing Machinery (CACM).[7][8] This version included several refinements, primarily related to the application of the encoding system. It constitutes a Huffman-like prefix code.
Chen–Ho encoding is limited to encoding sets of three decimal digits into groups of 10 bits (so called declets).[1] Of the 1024 states possible by using 10 bits, it leaves only 24 states unused[1] (with don't care bits typically set to 0 on write and ignored on read). With only 0.34% wastage it gives a 20% more efficient encoding than BCD with one digit in 4 bits.[4][8]
One prominent application uses a 128-bit register to store 33 decimal digits with a three digit exponent, effectively not less than what could be achieved using binary encoding (whereas BCD encoding would need 144 bits to store the same amount of digits).
Binary encoding | Decimal digits | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Code space (1024 states) | b9 | b8 | b7 | b6 | b5 | b4 | b3 | b2 | b1 | b0 | d2 | d1 | d0 | Values encoded | Description | Possibilities (1000 states) | |
50.0% (512 states) | 0 | a | b | c | d | e | f | g | h | i | 0abc | 0def | 0ghi | (0–7) (0–7) (0–7) | Three lower digits | 51.2% (512 states) | |
37.5% (384 states) | 1 | 0 | 0 | c | d | e | f | g | h | i | 100c | 0def | 0ghi | (8–9) (0–7) (0–7) | Two lower digits, one higher digit | 38.4% (384 states) | |
1 | 0 | 1 | c | a | b | f | g | h | i | 0abc | 100f | 0ghi | (0–7) (8–9) (0–7) | ||||
1 | 1 | 0 | c | d | e | f | a | b | i | 0abc | 0def | 100i | (0–7) (0–7) (8–9) | ||||
9.375% (96 states) | 1 | 1 | 1 | c | 0 | 0 | f | a | b | i | 0abc | 100f | 100i | (0–7) (8–9) (8–9) | One lower digit, two higher digits | 9.6% (96 states) | |
1 | 1 | 1 | c | 0 | 1 | f | d | e | i | 100c | 0def | 100i | (8–9) (0–7) (8–9) | ||||
1 | 1 | 1 | c | 1 | 0 | f | g | h | i | 100c | 100f | 0ghi | (8–9) (8–9) (0–7) | ||||
3.125% (32 states, 8 used) | 1 | 1 | 1 | c | 1 | 1 | f | (0) | (0) | i | 100c | 100f | 100i | (8–9) (8–9) (8–9) | Three higher digits, bits b2 and b1 are don't care | 0.8% (8 states) |
Related encoding schemes
Chen also proposed a similar, but less efficient encoding scheme to compress sets of two decimal digits (requiring 8 bits in BCD) into groups of 7 bit.[4]
In 2002, Mike F. Cowlishaw published a further refinement of Chen–Ho encoding known as Densely Packed Decimal encoding (DPD) in IEE Proceedings – Computers and Digital Techniques.[9][10] Densely Packed Decimal has subsequently been adopted as the decimal encoding used in the IEEE 754-2008 and ISO/IEC/IEEE 60559:2011 floating-point standards.
See also
- Binary Coded Decimal (BCD)
- Densely Packed Decimal (DPD)
- Unicode Transformation Format (UTF) (similar encoding scheme)
- Length-limited Huffman code
References
- 1 2 3 Muller, Jean-Michel; Brisebarre, Nicolas; de Dinechin, Florent; Jeannerod, Claude-Pierre; Lefèvre, Vincent; Melquiond, Guillaume; Revol, Nathalie; Stehlé, Damien; Torres, Serge (2010). Handbook of Floating-Point Arithmetic (1 ed.). Birkhäuser. doi:10.1007/978-0-8176-4705-6. ISBN 978-0-8176-4704-9. LCCN 2009939668.
- ↑ "CHEN Tien Chi". Archived from the original on 2015-10-23. Retrieved 2016-02-07.
- ↑ Tseng, Li-Ling (1988-04-01). "High-Tech Leadership: Irving T. Ho". Taiwan Info. Archived from the original on 2016-01-01. Retrieved 2016-02-08.
- 1 2 3 4 Chen, Tien Chi (1971-03-29). "Decimal Number Compression" (PDF) (Internal memo to Irving T. Ho). San Jose Research Laboratory: IBM: 1–4. Archived (PDF) from the original on 2012-10-17. Retrieved 2016-02-07.
- ↑ Chen, Tien Chi (1971-03-12). "Decimal-binary integer conversion scheme" (Internal memo to Irving T. Ho). San Jose Research Laboratory: IBM.
- ↑ "IBM资深专家Frank Tung博士8月4日来我校演讲". Archived from the original on 2004-12-08. Retrieved 2016-02-06.
- ↑ Chen, Tien Chi; Ho, Irving T. (January 1975). "Storage-Efficient Representation of Decimal Data". Communications of the Association for Computing Machinery (CACM). 18 (1): 49–52. Retrieved 2016-02-07.
- 1 2 Cowlishaw, Mike F. (2014) [2000]. "A Summary of Chen-Ho Decimal Data encoding". IBM. Archived from the original on 2015-09-24. Retrieved 2016-02-07.
- ↑ Cowlishaw, Mike F. (May 2002). "Densely Packed Decimal Encoding". IEE Proceedings – Computers and Digital Techniques. Institution of Electrical Engineers (IEE). 149 (3): 102–104. doi:10.1049/ip-cdt:20020407. ISSN 1350-2387. Retrieved 2016-02-07.
- ↑ Cowlishaw, Mike F. (2007-02-13) [2000]. "A Summary of Densely Packed Decimal encoding". IBM. Archived from the original on 2015-09-24. Retrieved 2016-02-07.