Block Codes, Weight, and Distance#

ECFF 1992: Pretzel, Oliver. *Error-Correcting Codes and Finite Fields*. Oxford University Press Applied Mathematics and Computing Science Series, 1992.




How do we assess the performance of a code over a given channel? The basis for this assessment is the Hamming distance: the number of places in which two words differ.

The worst-case error-processing performance of a code is determined by the minimum distance between its code words. Elementary probability theory can then be used to assess the code's average performance.




Block Codes#

                                                         +--------------------------+
                                                         |                          |
                                                         |  Random error generator  |
                                                         |                          |
                                                         +--------------------------+
                                                                      |                                                            +-----------+
                                                                      |                                                            |           |  Message
           +----------------+           +-----------+                 |                      +-------------------+---------------->|  Decoder  |----------->
  Message  |                |  Message  |           |  Signal         v   Distorted signal   |                   |  code words u'  |           |  words x'
---------->|  Preprocessor  |---------->|  Encoder  |--------------->|+|-------------------->|  Error processor  |                 +-----------+
  string   |                |  words x  |           |  code words u       received words v   |                   |
           +----------------+           +-----------+                                        +-------------------+----------------------------------------->
                                                                                                                    v + error signal

Definition: block#

The encoder’s preprocessor divides the input message of arbitrary length into blocks (words) of a fixed length \(m\), and the encoder then translates each word into a code word of fixed length \(n\). Such a code is called a block code.

Definition: \(A\)-word

If \(A\) is an alphabet then an \(A\)-word of length \(n\) is a sequence of \(n\) symbols from \(A\).

  1. The set of \(A\)-words of length \(n\) is denoted by \(A^n\).

  2. If \(A\) has \(q\) symbols then there are \(q\) choices for the symbol in each place in an \(A\)-word of length \(n\) and so the total number of such words is \(q^n\).

  3. Thus \(|A^n| = |A|^n = q^n\).
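A quick enumeration makes the count in items 2 and 3 concrete. Below is a minimal sketch in Python (the alphabet \(A = \{0, 1, 2\}\) and length \(n = 4\) are our own choices for illustration):

```python
from itertools import product

A = (0, 1, 2)  # an illustrative alphabet with q = 3 symbols
n = 4          # word length

# A^n: every sequence of n symbols drawn from A
words = list(product(A, repeat=n))

# |A^n| = |A|^n = q^n
assert len(words) == len(A) ** n  # 3^4 = 81
```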

Definition: block code#

Definition: Block Code

An \((n, m)\)-block code \(C\) over the alphabet \(A\) of size \(q\) consists of a set of precisely \(q^m\) code words in \(A^n\).

\(n\) is called the code’s block length.

\(m\) is called the code’s rank.

\(m/n\) is called the code’s rate; since \(m \le n\), the rate satisfies \(m/n \le 1\).

We require precisely \(q^m\) code words to ensure that an encoder exists. There are \(q^m\) possible message words and each must correspond to a distinct code word. Any further code words are not used and may as well be discarded.

For example, a binary code of rank \(m\) must have \(2^m\) code words.

Definition: encoder#

Definition: Encoder

An encoder \(E\) for \(C\) is a map from \(A^m\) to \(C\).

\(\boxed{E : A^m \to C \quad\quad x \mapsto u}\)

It translates any \(A\)-word \(x\) of length \(m\) into a code word \(u = E(x)\). The encoder is a bijection, so that each message word corresponds to one and only one code word and each code word represents a unique message word.

Often, the encoder preserves the message word \(x\) as the first part of the code word \(u = E(x)\). Such an encoder is called standard or systematic, in which case the code word is divided into message symbols and check symbols and the decoder simply strips the check symbols.

Definition: decoder#

Definition: Decoder

The corresponding decoder \(D\) is the inverse map of \(E\).

\(\boxed{D : C \to A^m \quad\quad u \mapsto x}\)

It takes every code word \(u = E(x)\) back to \(x\).
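To make the two maps concrete, here is a minimal sketch of a systematic encoder and decoder for an assumed binary \((4, 3)\)-block code that appends a single even-parity check symbol (the code and the names `encode` and `decode` are ours, chosen for illustration):

```python
from itertools import product

def encode(x):
    """E : A^3 -> C, x |-> u. Appends one even-parity check symbol,
    so the message word survives as the first part of the code word."""
    assert len(x) == 3 and all(s in (0, 1) for s in x)
    return tuple(x) + (sum(x) % 2,)

def decode(u):
    """D : C -> A^3, u |-> x. The inverse of E: strips the check symbol."""
    return tuple(u[:3])

# E is a bijection from the 2^3 = 8 message words onto the code C,
# and D takes every code word E(x) back to x.
C = {encode(x) for x in product((0, 1), repeat=3)}
assert len(C) == 2 ** 3
assert all(decode(encode(x)) == x for x in product((0, 1), repeat=3))
```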


In a binary symmetric channel the error-processing capabilities of the coding system do not depend on the encoder and decoder but only on the set of code words because these are all that the channel sees; the choice of encoder and decoder is thus only a matter of practical convenience.

Most of coding theory is concerned with (1) the construction of codes \(C\) and (2) efficient error processors.


Definition: received word#

When errors occur in transmission, the receiver reads a word \(v\) although the transmitter sent a word \(u\).

Definition: Received Word

If \(u = (u_1, u_2, \dotsc, u_n)\) and \(v = (v_1, v_2, \dotsc, v_n)\) are words in \(A^n\) then we refer to \(u_j\) as the entry of \(u\) in place \(j\), and we say that \(v\) differs from \(u\) in place \(j\) if \(u_j \ne v_j\).

The word \(v\) to be analyzed by the error processor is called the received word.

(In this context the words position and location are synonyms for the word place.)

Definition: error of weight \(k\)#

Definition: Error of weight \(k\)

If the received word \(v\) differs from the transmitted one in \(k\) places we say that an error of weight \(k\) occurred (or that \(k\) errors occurred).

Suppose \(u\) is transmitted and \(v\) is received.

\( \begin{aligned} u &= (1, \textcolor{red}{0}, 0, 1, \textcolor{red}{1}, 0) \\ v &= (1, \textcolor{red}{1}, 0, 1, \textcolor{red}{0}, 0) \\ \end{aligned} \)

Then an error of weight \(2\) has occurred.

Definition: Hamming distance and weight#

It is useful to formalize the notion of differing places between words by regarding the number of places in which two words differ as a distance between them.

Definition: Hamming distance and weight

The Hamming distance \(d(u, v)\) between two words \(u\) and \(v\) is the number of entries in which they differ.

The Hamming weight \(\text{wt}(u)\) of a word \(u\) is the number of nonzero entries in \(u\).

If \(\textbf{0} = (0, \dotsc, 0)\) then \(\text{wt}(u) = d(u, \textbf{0})\).
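Both notions translate directly into code. The following sketch (the function names are our own) checks them against the weight-2 error example above and verifies \(\text{wt}(u) = d(u, \textbf{0})\):

```python
def hamming_distance(u, v):
    """d(u, v): the number of places in which u and v differ."""
    assert len(u) == len(v)
    return sum(1 for a, b in zip(u, v) if a != b)

def hamming_weight(u):
    """wt(u): the number of nonzero entries in u."""
    return sum(1 for a in u if a != 0)

u = (1, 0, 0, 1, 1, 0)
v = (1, 1, 0, 1, 0, 0)
assert hamming_distance(u, v) == 2     # an error of weight 2 occurred
zero = (0,) * len(u)
assert hamming_weight(u) == hamming_distance(u, zero)
```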

Definition: distance function#

The Hamming distance satisfies the properties of a distance function.

Definition: Distance Function

A function \(f(x, y)\) on pairs of elements of a set \(X\) is a distance function if it satisfies the following conditions.

  1. \(f(x, y)\) is always a non-negative real number.

  2. \(f(x, y) = 0\) iff \(x = y\).

  3. \(f(x, y) = f(y, x)\)

  4. Triangle Inequality: For any three elements \(x, y, z \in X\) it is the case that \(f(x, z) \le f(x, y) + f(y, z)\)

Proof

Conditions 1, 2, and 3 follow immediately from the definition of the Hamming distance. It remains to prove condition 4, the triangle inequality.

Let

\( \begin{aligned} x &= (x_1, \dotsc, x_n) &\textcolor{#50C878}{= (1, 0, 1, 1, 0, 1, 0, 0)} \\ y &= (y_1, \dotsc, y_n) &\textcolor{#50C878}{= (0, 0, 1, 0, 1, 1, 0, 1)} \\ z &= (z_1, \dotsc, z_n) &\textcolor{#50C878}{= (0, 1, 1, 0, 0, 0, 1, 1)} \\ \end{aligned} \)

Then \(d(x, z)\) is the number of places in which \(x\) and \(z\) differ. If we denote the set of these places by \(U\) then

\(d(x, z) = |U| = |\{ i \mid x_i \ne z_i \}| \textcolor{#50C878}{= |\{ 1, 2, 4, 6, 7, 8 \}| = 6}\)

Let

\( \begin{aligned} S &= \{ i \mid x_i \ne z_i \land x_i = y_i \} \textcolor{#50C878}{= \{ 2, 6, 7 \}} \\ T &= \{ i \mid x_i \ne z_i \land x_i \ne y_i \} \textcolor{#50C878}{= \{ 1, 4, 8 \}} \\ \end{aligned} \)

Then \(U\) is the disjoint union of \(S\) and \(T\). Thus

\(d(x, z) = |S| + |T|\)

It is immediate from the definition of \(T\) that \(x\) differs from \(y\) in every place in \(T\), and \(d(x, y)\) counts all places in which \(x\) and \(y\) differ. Thus

\(|T| \le d(x, y)\)

On the other hand if \(i \in S\) then \(y_i = x_i \ne z_i\). Thus

\(|S| \le d(y, z)\)

Hence \(d(x, z) = |S| + |T| \le d(x, y) + d(y, z)\).

\(\blacksquare\)
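The set bookkeeping in the proof can be checked mechanically. A short self-contained sketch using the example vectors above, with places numbered from 1 to match the notation \(x_1, \dotsc, x_n\):

```python
x = (1, 0, 1, 1, 0, 1, 0, 0)
y = (0, 0, 1, 0, 1, 1, 0, 1)
z = (0, 1, 1, 0, 0, 0, 1, 1)

# Places (numbered from 1) in which x and z differ, split according
# to whether x agrees with y there.
U = {i for i, (a, c) in enumerate(zip(x, z), start=1) if a != c}
S = {i for i in U if x[i - 1] == y[i - 1]}  # x_i != z_i and x_i  = y_i
T = U - S                                   # x_i != z_i and x_i != y_i

assert sorted(U) == [1, 2, 4, 6, 7, 8]      # d(x, z) = |U| = 6
assert sorted(S) == [2, 6, 7]
assert sorted(T) == [1, 4, 8]
assert len(U) == len(S) + len(T)            # U is the disjoint union of S and T
```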

Definition: error processor (and error detector)#

An error processor \(P\) for \(C\) could just test each word \(v\) it receives to see if it is a code word or not and send an error signal for a non-code word, or it could attempt to correct certain non-code words. To give a flexible formal definition, we assume that when it receives a word \(v\) the processor puts out a word \(u\) and a signal (“good” or “bad”). The signal says whether the processor is putting out a code word or not.

Definition: Error Processor

An error processor \(P\) for \(C\) is a map that accepts a received word \(v\) of length \(n\) and produces a pair \((a, u)\) where \(a\) takes on two values (“good” or “bad”) and \(u\) is a word of length \(n\). The signal \(a\) has the value “good” when \(u\) is a code word and “bad” otherwise. An error processor that always leaves the received word unchanged is called an error detector and an error processor that always produces a code word is called perfect.

An error processor must start by testing the received word \(v\) to see if it is a code word or not. If \(v\) is a code word then the error processor’s job is done. It has no means of knowing what the message was and so it can do no better than accept the word it received as correct. It will transmit it unchanged with a “good” signal. This means that error patterns that distort one code word into another code word are undetectable and thus incorrectable.
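A pure error detector is then just a membership test: it passes the received word through unchanged and only sets the signal. A minimal sketch (the set `C` is assumed to be given, e.g. as built in the encoder sketch above):

```python
def error_detector(v, C):
    """Error detector for C: the output word is always the received
    word itself; the signal says whether it is a code word."""
    v = tuple(v)
    return ("good" if v in C else "bad", v)

# With the (4, 3) even-parity code from the encoder sketch:
#   error_detector((1, 0, 1, 0), C)  ->  ("good", (1, 0, 1, 0))
#   error_detector((1, 0, 1, 1), C)  ->  ("bad",  (1, 0, 1, 1))
```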

Definition: minimum distance of a code#

Suppose that we want to be able to detect all errors of weight at most \(s\). It is necessary that any two distinct code words be at Hamming distance at least \(s + 1\). It therefore makes sense to introduce a name for the smallest distance between distinct code words.

Definition: Minimum Distance

Let \(C\) be an \((n, m)\)-code. The minimum distance \(d(C)\) of \(C\) is the smallest Hamming distance between distinct code words of \(C\).

This measure is so important that we sometimes call attention to it by describing \(C\) as an \((n, m, d)\)-code.
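For small codes, \(d(C)\) can be computed by brute force over all pairs of distinct code words; a self-contained sketch:

```python
from itertools import combinations

def minimum_distance(C):
    """d(C): the smallest Hamming distance between distinct code words of C."""
    return min(sum(a != b for a, b in zip(u, v))
               for u, v in combinations(C, 2))

# The (4, 3) even-parity code from the encoder sketch has d(C) = 2.
```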

Claim: condition for error detection#

The condition for error detection derived above is both necessary and sufficient.

Claim

Let \(C\) be a code. Then it is possible for an error processor for \(C\) to detect all errors of weight \(\le s\) iff \(d(C) \ge s + 1\).

\(\boxed{\exists D_{(C, s)} \iff d(C) \ge s + 1}\)

Proof

First Direction \(\lnot Q \implies \lnot P\)

If two code words \(u\) and \(v\) are at a distance at most \(s\) then one can be distorted into the other by an error of weight at most \(s\). In that case no error processor for \(C\) can detect all errors of weight at most \(s\).

Second Direction \(Q \implies P\)

Conversely if any two code words are at distance at least \(s + 1\) then any error of weight \(s\) will distort a code word into a non-code word. An error detector can check whether a received word is a code word or not, for instance by looking it up in a table. Thus all errors of weight at most \(s\) are detectable.

\(\blacksquare\)
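For a small binary code the claim can be confirmed exhaustively: apply every error pattern of weight \(1, \dotsc, s\) to every code word and check that the result is never again a code word. A brute-force sketch (binary alphabet assumed; the function name is ours):

```python
from itertools import combinations

def detects_all(C, s):
    """True iff no error of weight 1..s turns a code word of the
    binary code C into another code word, i.e. iff d(C) >= s + 1."""
    n = len(next(iter(C)))
    for u in C:
        for k in range(1, s + 1):
            for places in combinations(range(n), k):
                v = list(u)
                for i in places:
                    v[i] ^= 1          # distort the entry in place i
                if tuple(v) in C:      # this weight-k error is undetectable
                    return False
    return True
```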

Claim: condition for error correction#

There is a similar condition for error correction. Suppose we wish to be able to correct all single errors. Then given a received word and the information that an error of weight one occurred there must be only one code word that could have been transmitted. In other words no two code words \(u\) and \(v\) may be at distance at most \(1\) from the same word \(w\). Thus by the triangle inequality the code must have minimum distance at least \(3\).

A similar argument works for larger errors. If we are to be able to correct all errors of weight up to \(t\) then given a word and the information that an error of weight at most \(t\) occurred there must be a unique code word that could have been transmitted. So no two code words may be at distance at most \(t\) from the same word \(w\). Thus the code has minimum distance at least \(2t + 1\).

Claim

There exists an error processor for the code \(C\) that corrects all errors of weight up to \(t\) iff \(C\) has minimum distance at least \(2t + 1\).

\(\boxed{\exists P_{(C, t)} \iff d(C) \ge 2t + 1}\)

Proof

First Direction \(\lnot Q \implies \lnot P\)

Suppose the code contains two code words \(u\) and \(v\) at distance at most \(2t\). Let \(w\) be a word that agrees with \(u\) at all places where \(u\) and \(v\) agree. Further let \(w\) agree with \(u\) at the first \(t\) places where \(u\) and \(v\) disagree and with \(v\) in the remaining places where \(u\) and \(v\) disagree. (If \(d(u, v) \lt t\) then take \(w = u\).) Then \(d(u, w) \le t\) and \(d(v, w) \le t\). Now suppose \(w\) is received together with the information that at most \(t\) errors occurred. Then either \(u\) or \(v\) could have been transmitted (and possibly even some other code word). From the given information alone, no error processor can decide with certainty which code word was transmitted, so it will fail to correct some errors of weight \(\le t\).

Second Direction \(Q \implies P\)

Conversely suppose that the code has minimum distance at least \(2t + 1\) and a word \(w\) is received together with the information that an error of weight at most \(t\) has occurred. If there were two code words \(u\) and \(v\) at distance at most \(t\) from \(w\) then by the triangle inequality \(d(u, v) \le 2t\), contradicting our hypothesis. Thus there is a unique code word \(u\) at distance at most \(t\) from \(w\) and we can deduce that \(u\) must have been transmitted.

To illustrate the first direction, take \(t = 3\) and

\( \begin{aligned} u &= (1, 0, 1, 1, 0, 1, 0, 0) \\ v &= (0, 0, 1, 0, 1, 1, 0, 1) \\ w &= (0, 0, 1, 0, 0, 1, 0, 1) \\ \end{aligned} \)

Here \(d(u, v) = 4 \le 2t\), while \(d(u, w) = 3 \le t\) and \(d(v, w) = 1 \le t\): if \(w\) is received with the information that at most three errors occurred, either \(u\) or \(v\) could have been transmitted.

\(\blacksquare\)
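The second direction suggests an error processor directly: search for a code word within distance \(t\) of the received word; by the argument above it is unique when \(d(C) \ge 2t + 1\). A brute-force sketch (the function name is ours):

```python
def correct(w, C, t):
    """Error processor for C: if some code word u satisfies d(u, w) <= t,
    it is unique (given d(C) >= 2t + 1) and must have been transmitted;
    otherwise pass w through unchanged with a "bad" signal."""
    for u in C:
        if sum(a != b for a, b in zip(u, w)) <= t:
            return ("good", u)
    return ("bad", tuple(w))
```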

The minimum distance \(d(C)\) of a code \(C\) thus determines its worst-case error-detecting and error-correcting performance: \(C\) can detect all errors of weight up to \(d(C) - 1\) and correct all errors of weight up to \(\lfloor (d(C) - 1)/2 \rfloor\).




Shannon’s Theorem#

Shannon’s theorem establishes the theoretical optimum for average coding performance over a noisy channel.