Just to complete the JPEG compression story, we will briefly look at the remaining processes.
The final stage of the JPEG process is to use lossless Huffman coding to compress all of the run-length-compressed DCT terms. As mentioned earlier, Huffman is a variable-length code based on probabilities, unlike most other codes, which are fixed-length block codes. For Huffman coding to work well there must be values in the data that occur more frequently than others; if the values are completely random then Huffman coding is pointless.
Let's now run through a typical training example that illustrates the benefits of Huffman compression.
Imagine that digital codes are used to transmit various regional weather conditions to shipping.
Let's consider that there are 4 states for visibility, as outlined in the table below. Four states would require a 2-bit code if a fixed-length block code were used.
To work out the Huffman code you need to know the probability of each state; the most probable event gets the shortest code and the least probable events get the longest codes. In our case the probabilities, and the resulting Huffman codes, are as follows.

Visibility    Probability    Huffman code
good          0.1            110
moderate      0.5            0
poor          0.3            10
very poor     0.1            111
Using a block code it will take 2 bits to transmit the information regardless of the visibility condition. In the case of Huffman coding the average code length will be

(3 bits x 0.1) + (2 bits x 0.3) + (1 bit x 0.5) + (3 bits x 0.1) = 1.7 binary digits per code

which is an average saving of 15% over the 2-bit block code.
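As a sanity check, here is a minimal sketch of the classic Huffman construction in Python. The `heapq`-based dictionary-merging approach is my own illustration (not how JPEG encoders are actually implemented); it repeatedly merges the two least probable nodes and should reproduce the code lengths and the 1.7-bit average above, although the exact bit patterns depend on tie-breaking.

```python
import heapq
from itertools import count

def huffman_codes(probs):
    """Build a Huffman code from a {symbol: probability} mapping.

    Repeatedly merges the two least probable nodes; the returned
    code lengths are optimal, but exact bit patterns depend on ties.
    """
    tiebreak = count()  # prevents comparing dicts when probabilities tie
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)   # two least probable nodes
        p2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

probs = {"good": 0.1, "moderate": 0.5, "poor": 0.3, "very poor": 0.1}
codes = huffman_codes(probs)
average = sum(probs[s] * len(codes[s]) for s in probs)
print(codes)    # e.g. {'moderate': '0', 'poor': '10', 'good': '110', ...}
print(average)  # 1.7 bits per symbol, matching the calculation above
```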
Care is needed in choosing the Huffman codes so that a string of codes is uniquely decodable: since the codes are no longer of fixed length, we need to know where one code finishes and the next starts. This is guaranteed by making the code prefix-free, so that no code is the prefix of another. For example:
110011110 = 110 | 0 | 111 | 10 = good, moderate, very poor, poor
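Because the code is prefix-free, a decoder can walk the bitstream one bit at a time and emit a symbol as soon as its buffer matches a code. A minimal sketch, using the bit patterns implied by the table above:

```python
def huffman_decode(bits, codes):
    """Decode a bitstring using a prefix-free {symbol: bitstring} code."""
    inverse = {c: s for s, c in codes.items()}
    decoded, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:  # prefix property: first match is a symbol
            decoded.append(inverse[buffer])
            buffer = ""
    if buffer:
        raise ValueError("bitstream ended in the middle of a code")
    return decoded

codes = {"good": "110", "moderate": "0", "poor": "10", "very poor": "111"}
print(huffman_decode("110011110", codes))
# ['good', 'moderate', 'very poor', 'poor']
```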
In some situations the probabilities of events are known in advance, but in the case of image data they have to be calculated for each image before coding can commence. In most cases a coding tree or dictionary has to be constructed and included in the file so that the compressed Huffman data can be decoded when the image file is opened on the computer.
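For image data, this first step amounts to counting how often each symbol occurs. A small sketch of that estimation step (the sample data below is made up purely for illustration):

```python
from collections import Counter

def symbol_probabilities(symbols):
    """Estimate per-image symbol probabilities by counting occurrences."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return {sym: n / total for sym, n in counts.items()}

# Hypothetical run-length-coded DCT terms for one image
rle_terms = [0, 0, 5, 0, -3, 0, 0, 0, 2, 0]
print(symbol_probabilities(rle_terms))
# {0: 0.7, 5: 0.1, -3: 0.1, 2: 0.1} -- these probabilities feed the
# Huffman construction, and the resulting table is stored in the file
# so the decoder can rebuild it when the image is opened
```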