Principles and Analogies of Error Correction Codes

I use a very abstract visual way to explain about different error correction algorithms. Here I will over-simplify every algorithm and then use visualizations instead, just to get the principle of how they works, and why.

Linear Block Code

In linear block code, every codewords will have n-k bits of redundant checkin part, and k-bits of message.

Linear Block Code

The generator matrix would look like this, where there is some identity matrix in it, that later will be used to generate syndrome to identify errors.

Linear Block Code Table

Hamming Code
The picture below shows how to identify error with even parity bits. The bits in black is the message, the bits in green is the parity to make sure that the number of 1 in a circle is even, the bit in red is the error, that later can be identified because the overlapping circle area will show where the error occur.

Hamming Code Diagram

The Hamming Code works like shown by the table below, the parity bits (unlike the previous method) are inserted in certain positions (2^0, 2^1, 2^2 and 2^3) to make sure that each of every 4 bits of message is being screened so that later the error can be tracked.

Hamming Code Table

Cyclic Code

This is my favorite. This method use the characteristic of polynomial in GF(2^n). So that identifying errors can be done by divide the codeword with the generator g(x). If the result is not 0, there is an error. Another thing that I like from this method is that every codeword is a shift from previous codeword (that’s why this method called “cyclic”) . So the implementation will be very easy, just shifting here and there (using LFSR), which in hardware implementation, is “costless”.

Cyclic Code Table

BCH Code

This method is a little bit complicated but the point is, to make a matrix H so that this matrix can screen out every bits (that’s why there are 2 rows that are linearly redundant).

BCH Code
Reed Muller

This one is interesting. This method has orders to identify different number of errors. Look at the matrix below. R(1,3) means that the order is 1 and the message length is 3. In Reed Muller, the pattern is obvious. Each vector in the matrix (x1, x2, x3) are used to search the location of error in convergence. Each row will direct the algorithm to a smaller space of error possibility location.
Reed Muller 1
For R(2, 3), we have the same number bits of message, which is three, but this time in order 2. Means that the matrix will be expanded and thus, more error can be detected. The part of matrix in yellow is the “basic” vectors, while the green shows the “additional” vectors to help allocating more errors.

Now look at R(3,3). More rows of the matrix, and more errors to be detected.

Reed Muller 3

So oversimplifyingly, I can say that error correction code, in principle, how to encode a message (to be a codeword) to get through a channel (error correction code is in domain channel coding, while data compression is in domain source coding) so that we would be able to make the message reliable (while in source coding, the target is not reliability but efficiency), by making sure that we can locate error and fix it. And the way to decode is like the illustration of quadtree below. The methods will generate matrix, or equations, or mathematical characteristic (polynomial groups etc) to help to narrow down the search space.


Hopefully this article can be useful, I will teach Reed Solomon Code next week, and let’s see if I can add up more things here, or publish a new blogpost.