Introduction#
ECFF
1992
Pretzel, Oliver. Error-Correcting Codes and Finite Fields. Oxford University Press Applied Mathematics and Computing Science Series.
The theory of error-correcting codes deals with the general problem of transmitting messages reliably, whether in space (wires) or in time (storage). In the classical theory no assumption is made about the nature of the message to be transmitted, so there are no statistics on which to base a guess of the correct message. Instead, redundancy is built into the message at the transmitter so that the receiver can use that redundancy to correct errors. This goal can be achieved with high probability but not with certainty. A weakness of the classical theory is that errors that occur before the transmitter encodes and transmits the message cannot be detected, much less corrected.
Strings#
Alphabet#
Binary Alphabet#
The Channel Model#
The situation to be modeled
Information is transmitted via a channel. The channel is prone to errors. The distorted information is processed at the receiving end to restore the original message as nearly as possible. Information is passed through the channel in separate lumps (discrete channel) or in a continuously varying form (continuous channel).
Examples of channels
Modulator (translates information into electrical signals), transmitter, receiver, demodulator: radio, television, satellite
Cable: computer file transfer
Storage medium + the means of storing and retrieving information from it (errors can occur either by a failure of the storage or reading device, or by deterioration of the medium)
Language
Coding for the binary symmetric channel#
We assume a discrete channel with symmetric random errors: each bit is flipped independently of the others, and with the same probability (the binary symmetric channel).
Message      +---------+  code word u   +---+  received word v   +-----------------+  code word u'   +---------+  Message
word x ----->| Encoder |--------------->| + |------------------->| Error processor |---------------->| Decoder |-----> word x'
             +---------+                +---+                    +-----------------+                 +---------+
                                          ^                               |
                                          |                               v
                               Random error generator                error signal
Example codes#
\((8, 7)\) Parity Check Code#
A byte, or \(8\) bits, is a common unit of information in computer systems; it can represent values from \(0\) to \(255\) inclusive.
If we want to transmit English messages we can use the following alphabet.
\( \begin{aligned} \text{L} &= \text{\{ a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z \}} \\ \text{U} &= \text{\{ A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z \}} \\ \text{D} &= \text{\{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 \}} \\ \text{P} &= \{ \texttt{!}, \texttt{"}, \texttt{\#}, \texttt{\$}, \texttt{\%}, \texttt{\&}, \texttt{'}, \texttt{(}, \texttt{)}, \texttt{*}, \texttt{+}, \texttt{,}, \texttt{-}, \texttt{.}, \texttt{/}, \texttt{:}, \texttt{;}, \texttt{<}, \texttt{=}, \texttt{>}, \texttt{?}, \texttt{@}, \texttt{[}, \texttt{\textbackslash}, \texttt{]}, \texttt{\textasciicircum}, \texttt{\_}, \texttt{`}, \texttt{\{}, \texttt{|}, \texttt{\}}, \texttt{\textasciitilde} \} \\ \text{W} &= \{ \langle \texttt{space} \rangle, \texttt{\textbackslash t}, \texttt{\textbackslash n}, \texttt{\textbackslash r}, \texttt{\textbackslash x0b}, \texttt{\textbackslash x0c} \}\\ A &= \text{L} \cup \text{U} \cup \text{D} \cup \text{P} \\ |A| &= 94 = 26 + 26 + 10 + 32 \\ \end {aligned} \)
Observe that the number of symbols, plus thirty or so more for control codes, amounts to no more than \(128 = 2^7\); thus \(7\) bits suffice for each symbol, and the eighth bit of a byte can be used as a parity bit to check that the byte has been received without error. The eighth bit is set so that the number of \(1\) s in the byte is even. This way, if a byte is transmitted and one of the bits goes wrong, then the number of \(1\) s becomes odd and the receiver can ask for a retransmission. There is no way the receiver can tell which bit went wrong, and if two bits are incorrect the receiver will let the byte through. In practice, the order of the bits is reversed so that the check bit comes first.
For example
\( \begin{array}{rr|r} \text{symbol} & 7\text{-bit ASCII code} & 8\text{-bit encoding} \\ \hline 1 & \underset{2^0}{1}000\underset{2^4}{1}\underset{2^5}{1}0 & \underbrace{1000110}1 \\ \text{A} & \underset{2^0}{1}00000\underset{2^6}{1} & \underbrace{1000001}0 \\ \end {array} \)
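The encodings in the table can be checked mechanically. Below is a minimal sketch (the function name `encode_byte` is my own) that writes the \(7\) ASCII bits least-significant bit first, as in the table, and appends the parity bit.

```python
def encode_byte(ch: str) -> str:
    """Return the 8-bit (8, 7) parity encoding of a 7-bit ASCII character."""
    code = ord(ch)
    assert code < 128, "only 7-bit ASCII symbols can be encoded"
    bits = [(code >> i) & 1 for i in range(7)]  # LSB first, as in the table
    parity = sum(bits) % 2                      # 1 iff the count of 1s is odd
    return "".join(map(str, bits + [parity]))

print(encode_byte("1"))  # 10001101
print(encode_byte("A"))  # 10000010
```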
Error-correcting capabilities
(1)
The code is economical: the encoded message is only one seventh longer than the original. This means that if the original message has length \(n\) then the encoded message has length
\( \begin{aligned} n + \frac{1}{7}n = n \left( 1 + \frac{1}{7} \right) = \frac{8}{7} n \end {aligned} \)
For each block, \(n = 7\), and so the encoded block has length \(8\).
(2)
The code cannot correct errors; it is only suitable where the receiver can ask for retransmission, since errors can be detected but cannot be located
(3)
The probability of errors during transmission should be fairly low, because the code cannot cope with two errors in a byte
Implementation
Encoder: divide the message into blocks of seven, and to each block add an eighth bit to make the number of \(1\) s even
Error Processor: (assume that the receiver adopts a correction strategy where possible) count the number of \(1\) s in the received block, and emit an error signal if the number is odd
Decoder: strip the eighth bit
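The three components above can be sketched in Python, assuming messages are strings of `"0"`/`"1"` characters whose length is a multiple of seven; the function names are mine, not from the text.

```python
def encode(message: str) -> str:
    """Split the message into blocks of seven bits and append a parity bit to each."""
    assert len(message) % 7 == 0
    blocks = [message[i:i + 7] for i in range(0, len(message), 7)]
    return "".join(b + str(b.count("1") % 2) for b in blocks)

def error_process(received: str) -> str:
    """Pass each block of eight through if its parity checks, else signal an error."""
    assert len(received) % 8 == 0
    for i in range(0, len(received), 8):
        if received[i:i + 8].count("1") % 2 != 0:
            raise ValueError(f"parity error in block {i // 8}: retransmit")
    return received

def decode(codeword: str) -> str:
    """Strip the eighth (parity) bit from each block."""
    return "".join(codeword[i:i + 7] for i in range(0, len(codeword), 8))

u = encode("1000110")                       # "10001101", as in the table
assert decode(error_process(u)) == "1000110"
```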
Triple repetition code#
Upon receiving the word \(101\), the receiver has two options: it can ask for a retransmission, or it can guess that it is more likely that a single \(1\) flipped to \(0\) (so \(111\) was sent) than that two \(0\) s flipped to \(1\) s (so \(000\) was sent).
\( \begin{array}{r|r} \text{symbol} & \text{encoding} \\ \hline 0 & 000 \\ 1 & 111 \\ \end {array} \)
the code is uneconomical: the encoded message is \(3\) times as long as the original
the code can correct single errors in a block of three (the error probability can be moderate); or alternatively where retransmission is possible, the code can be used to detect single or double errors in a block of three (the error probability can be quite high)
Encoder: repeat each bit three times
Error Processor: (assume that the receiver adopts a correction strategy where possible) take the majority vote in each block of three and make all three equal to that
Decoder: strip the last two bits
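As above, the components can be sketched in Python for `"0"`/`"1"` strings (names are my own):

```python
def encode(message: str) -> str:
    """Repeat each bit three times."""
    return "".join(bit * 3 for bit in message)

def error_process(received: str) -> str:
    """Take the majority vote in each block of three and make all three bits equal to it."""
    out = []
    for i in range(0, len(received), 3):
        block = received[i:i + 3]
        majority = "1" if block.count("1") >= 2 else "0"
        out.append(majority * 3)
    return "".join(out)

def decode(codeword: str) -> str:
    """Keep only the first bit of each block of three."""
    return codeword[::3]

assert error_process("101") == "111"  # a single flipped bit is corrected
assert decode(error_process(encode("10"))) == "10"
```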
Triple check code#
The message is divided into blocks of three bits \(abc\) where each of \(a\), \(b\), and \(c\) is \(0\) or \(1\). Three check bits \(xyz\) are added where each of \(x\), \(y\), and \(z\) is also \(0\) or \(1\). This is done in such a way that the following three conditions are satisfied.
The number of \(1\) s in \(abx\) is even.
The number of \(1\) s in \(acy\) is even.
The number of \(1\) s in \(bcz\) is even.
\( \begin{aligned} a + b + x &\equiv 0 \pmod 2 \\ a + c + y &\equiv 0 \pmod 2 \\ b + c + z &\equiv 0 \pmod 2 \\ \end {aligned} \)
So if \(abc = 110\) then \(x = 0\), \(y = 1\), and \(z = 1\) and so the code word is \(110011\).
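Since a sum of bits is even exactly when their XOR is \(0\), each condition determines one check bit: \(x = a \oplus b\), \(y = a \oplus c\), \(z = b \oplus c\). A one-line sketch (the helper name is my own):

```python
def check_bits(a: int, b: int, c: int) -> tuple[int, int, int]:
    """Return the check bits (x, y, z) making all three parity conditions hold."""
    return a ^ b, a ^ c, b ^ c

print(check_bits(1, 1, 0))  # (0, 1, 1), so 110 encodes to 110011
```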
Single-error correction mode
\( \begin{array}{rr|rrrrrr} \text{message } abc & \text{encoded message } abcxyz & a \text{ error} & b \text{ error} & c \text{ error} & x \text{ error} & y \text{ error} & z \text{ error} \\ \hline 000 & 000000 & \textcolor{red}{1}00000 & 0\textcolor{red}{1}0000 & 00\textcolor{red}{1}000 & 000\textcolor{red}{1}00 & 0000\textcolor{red}{1}0 & 00000\textcolor{red}{1} \\ 001 & 001011 & \textcolor{red}{1}01011 & 0\textcolor{red}{1}1011 & 00\textcolor{red}{0}011 & 001\textcolor{red}{1}11 & 0010\textcolor{red}{0}1 & 00101\textcolor{red}{0} \\ 010 & 010101 & \textcolor{red}{1}10101 & 0\textcolor{red}{0}0101 & 01\textcolor{red}{1}101 & 010\textcolor{red}{0}01 & 0101\textcolor{red}{1}1 & 01010\textcolor{red}{0} \\ 011 & 011110 & \textcolor{red}{1}11110 & 0\textcolor{red}{0}1110 & 01\textcolor{red}{0}110 & 011\textcolor{red}{0}10 & 0111\textcolor{red}{0}0 & 01111\textcolor{red}{1} \\ 100 & 100110 & \textcolor{red}{0}00110 & 1\textcolor{red}{1}0110 & 10\textcolor{red}{1}110 & 100\textcolor{red}{0}10 & 1001\textcolor{red}{0}0 & 10011\textcolor{red}{1} \\ 101 & 101101 & \textcolor{red}{0}01101 & 1\textcolor{red}{1}1101 & 10\textcolor{red}{0}101 & 101\textcolor{red}{0}01 & 1011\textcolor{red}{1}1 & 10110\textcolor{red}{0} \\ 110 & 110011 & \textcolor{red}{0}10011 & 1\textcolor{red}{0}0011 & 11\textcolor{red}{1}011 & 110\textcolor{red}{1}11 & 1100\textcolor{red}{0}1 & 11001\textcolor{red}{0} \\ 111 & 111000 & \textcolor{red}{0}11000 & 1\textcolor{red}{0}1000 & 11\textcolor{red}{0}000 & 111\textcolor{red}{1}00 & 1110\textcolor{red}{1}0 & 11100\textcolor{red}{1} \\ \end {array} \)
If \(a\) is incorrect then conditions \(1\) and \(2\) fail.
If \(b\) is incorrect then conditions \(1\) and \(3\) fail.
If \(c\) is incorrect then conditions \(2\) and \(3\) fail.
If \(x\) is incorrect then condition \(1\) alone fails.
If \(y\) is incorrect then condition \(2\) alone fails.
If \(z\) is incorrect then condition \(3\) alone fails.
Double-error detection mode: if a single bit is in error, then the analysis above applies and at least one condition fails; if two bits are in error, then at least one condition involves one of the two erroneous bits but not the other, and so at least one condition fails.
Properties
the code is moderately uneconomical: the encoded message is twice as long as the original
for correction, the code can deal with one error in a block of six; for detection, the code can deal with two errors in a block of six
since the code deals with one or two errors in a block of six rather than a block of three, the code is not quite as reliable as triple repetition (if used for pure error detection, the code can do rather better than this discussion suggests)
Implementation
Encoder: divide the message into blocks of three, and to each block calculate a further three bits satisfying the three conditions; output an amalgamated block of six
Error Processor: (assume that the receiver adopts a correction strategy where possible) check the three conditions; if none fail, the word is correct; if one or two fail, correct the single bit involved in the failing conditions and not in the others; if all three fail, send an error signal
Decoder: strip the last three bits off each block unless there is an error signal
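The error processor's case analysis can be sketched as a small syndrome table, assuming a six-bit block string \(abcxyz\); the names are my own.

```python
# Pattern of failing conditions (1, 2, 3) -> index of the single bit to flip in abcxyz.
SYNDROME = {
    (1, 1, 0): 0,  # a is in conditions 1 and 2
    (1, 0, 1): 1,  # b is in conditions 1 and 3
    (0, 1, 1): 2,  # c is in conditions 2 and 3
    (1, 0, 0): 3,  # x is in condition 1 alone
    (0, 1, 0): 4,  # y is in condition 2 alone
    (0, 0, 1): 5,  # z is in condition 3 alone
}

def error_process(block: str) -> str:
    """Check the three parity conditions; correct a single implicated bit, else signal."""
    a, b, c, x, y, z = (int(bit) for bit in block)
    syndrome = ((a + b + x) % 2, (a + c + y) % 2, (b + c + z) % 2)
    if syndrome == (0, 0, 0):
        return block                                   # all conditions hold
    if syndrome == (1, 1, 1):
        raise ValueError("no single bit fits: retransmit")
    bits = list(block)
    i = SYNDROME[syndrome]
    bits[i] = "1" if bits[i] == "0" else "0"           # flip the implicated bit
    return "".join(bits)

def decode(block: str) -> str:
    """Strip the three check bits."""
    return block[:3]

assert error_process("010011") == "110011"  # an error in a is corrected
assert decode(error_process("110011")) == "110"
```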