Huffman Code and Its Tree

Huffman Code and Its Tree



Huffman Codes are codes that are part of a compression algorithm that generates a tree called Huffman Tree. This algorithm is implemented on text, image, and video compressions. The idea is to generate codes that will hold information while it is being light or small in bit size.

In this reading, we will try to use the Huffman Code to compress a text. The idea is to calculate the frequency of each letter and then generate a Huffman Tree. Then for each left child line or left adjacent, we give 0 value. For each right child line or right adjacent, we give 1 value. The Huffman Code for each letter is the combination of the value of the path to the letter in Huffman Tree.

Let’s us consider text “DESIGN AND ANALYSIS OF ALGORITHMS” to be compressed. First, we need to create a frequency letter for each letter of the text.

S

sp

A

I

N

D

G

L

O

E

Y

F

R

T

H

M

4

4

4

3

3

2

2

2

2

1

1

1

1

1

1

1

*Note: sp = space

Then, we generate a Huffman Tree. The way we create Huffman Tree is to select the minimum frequency letter then combine its frequency as the parent node value. This step is repeated until it reaches the topmost node as its root.


Then from that Huffman Tree, we can get the Huffman Code for each letter. Please refers to this table below.

Character/Letter

Huffman Code

Bit

S

000

3 bits

sp

001

3 bits

A

010

3 bits

I

0110

4 bits

N

0111

4 bits

D

1100

4 bits

G

1000

4 bits

L

1001

4 bits

O

1010

4 bits

E

1011

4 bits

Y

11010

5 bits

F

11011

5 bits

R

11100

5 bits

T

11101

5 bits

H

11110

5 bits

M

11111

5 bits

Total Bit

67 bits

Therefore, for text “DESIGN AND ANALYSIS OF ALGORITHMS”, the Huffman Code is described below.

o  D:1100(4 bits)

o  E:1011(4 bits)

o  S:000(3 bits)

o  I:0110(4 bits)

o  G:1000(4 bits)

o  N:0111(4 bits)

o  sp:001(3 bits)

o  A:010(3 bits)

o  N:0111(4 bits)

o  D:1100(4 bits)

o  sp:001(3 bits)

o  A:010(3 bits)

o  N:0111(4 bits)

o  A:010(3 bits)

o  L:1001(4 bits)

o  Y:11010(5 bits)

o  S:000(3 bits)

o  I:0110(4 bits)

o  S:000(3 bits)

o  sp:001(3 bits)

o  O:1010(4 bits)

o  F:11011(5 bits)

o  sp:001(3 bits)

o  A:010(3 bits)

o  L:1001(4 bits)

o  G:1000(4 bits)

o  O:1010(4 bits)

o  R:11100(5 bits)

o  I:0110(4 bits)

o  T:11101(5 bits)

o  H:11110(5 bits)

o  M:11111(5 bits)

o  S:000(3 bits)

Total bits = 126 bits

126 bits is much smaller than its original size which is 264 bits. This is why Huffman is a quite popular algorithm for compressions.

COMMENTS

No comments:

Post a Comment

POPULAR POSTS