Huffman compression

The Huffman compression algorithm is named after its inventor, David Huffman, formerly a professor at MIT.
Huffman compression is a lossless compression algorithm that is ideal for compressing text or program files. This probably explains why it is used a lot in compression programs like ZIP or ARJ.

How Huffman compression works

Huffman compression belongs into a family of algorithms with a variable codeword length. That means that individual symbols (characters in a text file for instance) are replaced by bit sequences that have a distinct length. So symbols that occur a lot in a file are given a short sequence while other that are used seldom get a longer bit sequence.

A practical example will show you the principle: Suppose you want to compress the following piece of data:

Since these are 6 characters, this text is 6 bytes or 48 bits long. With Huffman encoding, the file is searched for the most frequently appearing symbols (in this case the character ‘A’ occurs 3 times) and then a tree is build that replaces the symbols by shorter bit sequences. In this particular case, the algorithm would use the following substitution table: A=0, B=10, C=110, D=111. If these code words are used to compress the file, the compressed data look like this:

This mean that 11 bits are used instead of 48, a compression ratio of 4 to 1 for this particular file.

Huffman encoding can be further optimized in two different ways:

  • Adaptive Huffman code dynamically changes the code words according to the change of probabilities of the symbols.
  • Extended Huffman compression can encode groups of symbols rather than single symbols.

Advantages and disadvantages

This compression algorithm is mainly efficient in compressing text or program files. Images like they are often used in prepress are better handled by other compression algorithms.

Where is Huffman compression used

Huffman compression is mainly used in compression programs like pkZIP, lha, gz, zoo and arj. It is also used within JPEG and MPEG compression.

8 August 2013

10 responses to “Huffman compression”

  1. PRACHI says:

    thanks so much….all doubts cleared by this article….

  2. Shakti says:

    Can we use dis algo for lossless image compression and dn send it on network ??

  3. lorevi igloria says:

    What are the other algorithms of huffman that will apply in the text file compression?

  4. sachin says:

    Can this algorithm be used while sending data on network?
    Please reply..

  5. guddu khan says:

    very good artical….

  6. dheeksha says:

    thank u very much.understood clearly

  7. Nabin says:

    Its very helpful and clearly understandable. Thanks.

  8. Van says:

    clearly, thanks.

  9. Lucas says:

    very helpful… thank you.

  10. arwa says:

    very good artical

    and so helpful