                1 - O
                  /   \
           3A - O       2 - O
              /   \       /   \
       000010     001100 /     \
                        /       3B - O
                       /           /   \
                  5 - O      110111     111000
                    /   \
             100001      100011
Example find on 100001
Because the number of leaf nodes is at most one more than the number of internal nodes, we need to create an extra node at the zero level to guarantee that all our data can be put in the Patricia.
The convention we use here is that if a node does not have a left child, then its left pointer points back to itself. If it doesn't have a right child, then its right pointer points to its parent or to the zero level node.
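We search through a Patricia in the same manner as before: the bit at each node's level determines the direction taken. The catch is that if the bit level stays the same or decreases after following a pointer, we have followed a link back up the tree, so we stop and compare the search key with the key at the node we reached.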
![[Patricia]](pat.gif)
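The search rule above can be sketched in Python. This is only an illustration under assumptions that are not taken from these notes: keys are 6-bit integers, bit levels are numbered from 1 at the leftmost bit, the first key is stored in the zero level node, and the names `WIDTH`, `bit`, `Node`, and `Patricia` are hypothetical. It uses the common textbook way of setting a new node's spare pointer (back to itself or up to an ancestor), which may differ in detail from the pointer convention described above. The stopping test is the same: stop and compare when the level stays the same or decreases.

```python
WIDTH = 6  # assumed key width in bits, matching the 6-bit keys in the diagram


def bit(key, level):
    """Bit of `key` at `level` (1 = leftmost bit); level 0 always goes left."""
    if level == 0:
        return 0
    return (key >> (WIDTH - level)) & 1


class Node:
    def __init__(self, key, level):
        self.key = key
        self.level = level
        self.left = self    # both pointers start out referring to the node itself
        self.right = self


class Patricia:
    def __init__(self):
        self.head = Node(None, 0)  # the extra zero level node

    def _descend(self, key, stop_level=WIDTH + 1):
        """Follow the bits of `key` until a pointer leads upward (level stays
        the same or decreases) or a node at `stop_level` or deeper is reached."""
        parent, node = self.head, self.head.left
        while parent.level < node.level < stop_level:
            parent = node
            node = node.right if bit(key, node.level) else node.left
        return parent, node

    def search(self, key):
        _, node = self._descend(key)
        return node.key == key      # the single full-key comparison

    def insert(self, key):
        if self.head.key is None:   # first key lives in the zero level node
            self.head.key = key
            return
        _, closest = self._descend(key)
        if closest.key == key:
            return                  # already present
        diff = closest.key ^ key
        level = WIDTH - diff.bit_length() + 1  # first level where the keys differ
        parent, node = self._descend(key, stop_level=level)
        new = Node(key, level)
        if bit(key, level):
            new.left = node         # right pointer keeps pointing at `new`
        else:
            new.right = node        # left pointer keeps pointing at `new`
        if bit(key, parent.level):
            parent.right = new
        else:
            parent.left = new


tree = Patricia()
for k in (0b000010, 0b001100, 0b110111, 0b111000, 0b100001, 0b100011):
    tree.insert(k)
print(tree.search(0b100001))  # True
print(tree.search(0b100010))  # False
```

Because keys are inserted here in an arbitrary order, the shape of the resulting structure need not match the diagram; only the set of keys is the same.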
We have a 1MB file whose entries are the ASCII letters a, b, c, d, e, and f with the following frequencies:
entry  frequency
a      450000
b      130000
c      120000
d      160000
e      90000
f      50000

If each of these letters is represented with 8 bits, then we end up with a 1MB file.
entry 3-bit code
a 000
b 001
c 010
d 011
e 100
f 101
Because we are using only 3 bits per entry, we end up with a file that is 3/8 the size of the original 8-bit file: about 375KB.

Now what if we try something else and represent the entries in the following manner?
entry  code
a      0
b      101
c      100
d      111
e      1101
f      1100

Notice that no code is a prefix of another. For example, no other code begins with a's code of 0. Nor does any other code begin with c's code of 100.
This is Huffman encoding.

Now to decode 0101111110100, we just parse the string into the different codes and output the entry for each. Because no code is a prefix of another code, when we make a match, we are guaranteed that the matched code is the correct one.
Here is how the above bit string is decoded.
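code    0  101  111  1101  0  0
output  a  b    d    e     a  a

Encoding the whole file with these codes results in a file size of 280KB: 450000*1 + (130000 + 120000 + 160000)*3 + (90000 + 50000)*4 = 2,240,000 bits, or 280,000 bytes.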
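Both the decoding walk-through and the file sizes can be checked with a short Python sketch. The code table and frequencies are the ones given above; the function names `decode` and `encoded_size_bytes` are made up for this illustration.

```python
CODES = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}
FIXED = {"a": "000", "b": "001", "c": "010", "d": "011", "e": "100", "f": "101"}
FREQS = {"a": 450000, "b": 130000, "c": 120000,
         "d": 160000, "e": 90000, "f": 50000}


def decode(bits, codes):
    """Decode a bit string by matching codes greedily from the left.

    This works only because no code is a prefix of another: the first
    match found is guaranteed to be the right one.
    """
    lookup = {code: entry for entry, code in codes.items()}
    output, current = [], ""
    for b in bits:
        current += b
        if current in lookup:
            output.append(lookup[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(output)


def encoded_size_bytes(freqs, codes):
    """Size in bytes of the file when each entry is written with its code."""
    total_bits = sum(freqs[entry] * len(codes[entry]) for entry in freqs)
    return (total_bits + 7) // 8  # round up to whole bytes


print(decode("0101111110100", CODES))    # abdeaa
print(encoded_size_bytes(FREQS, FIXED))  # 375000
print(encoded_size_bytes(FREQS, CODES))  # 280000
```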
      /\
   0 /  \ 1
    /    \
   a     /\
        /  \
     0 /    \ 1
      /      \
     /\      /\
   0/  \1  0/  \1
   c    b  /    d
          /\
        0/  \1
        f    e

The 0's and 1's next to the branches indicate the path to follow depending upon the bit. 0, go left. 1, go right.
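Decoding by walking the tree (0 goes left, 1 goes right) can be sketched as follows. The nested-tuple representation and the names `TREE` and `decode_with_tree` are assumptions made for this illustration: a leaf is a one-letter string, and an internal node is a `(left, right)` pair.

```python
# The tree from the diagram: a is the left child of the root, and so on.
TREE = ("a", (("c", "b"), (("f", "e"), "d")))


def decode_with_tree(bits, tree):
    """Walk from the root, going left on 0 and right on 1.

    Reaching a leaf outputs its entry and restarts at the root.
    """
    output = []
    node = tree
    for b in bits:
        node = node[1] if b == "1" else node[0]
        if isinstance(node, str):   # reached a leaf
            output.append(node)
            node = tree
    return "".join(output)


print(decode_with_tree("0101111110100", TREE))  # abdeaa
```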
a: 45   b: 13   c: 12   d: 16   e: 9   f: 5

The frequencies are shown in units of 10,000. We start building the tree by grouping the two entries that are least likely to occur. We then add their frequencies to get a combined frequency for the group.
We first see that e (9) and f (5) are the two lowest frequencies. We combine them and come up with a combined frequency of 9+5=14.
a: 45   b: 13   c: 12   d: 16     14
                                 /  \
                                f    e
Now b (13) and c (12) have the two lowest frequencies. We combine them
and come up with a combined frequency of 13+12=25.
a: 45     25     d: 16     14
         /  \             /  \
        c    b           f    e
Now d (16) and the f-e group (14) have the two lowest frequencies. We
combine them and come up with a combined frequency of 16+14=30.
a: 45     25        30
         /  \      /  \
        c    b    /    d
                 /\
                /  \
               f    e
Now the c-b group (25) and the d-f-e group (30) have the two lowest
frequencies. We combine them and come up with a combined frequency of
25+30=55.
a: 45           55
               /  \
              /    \
             /      \
            /\      /\
           /  \    /  \
          c    b  /    d
                 /\
                /  \
               f    e
Now we combine the remaining groups and end up with our final Huffman
tree.
      /\
     /  \
    a   /\
       /  \
      /    \
     /      \
    /\      /\
   /  \    /  \
  c    b  /    d
         /\
        /  \
       f    e
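The construction above, repeatedly combining the two lowest frequencies, can be sketched with a priority queue. The function names are made up for this illustration, and placing the lower-frequency group on the left (the 0 branch) is an assumption chosen because it matches the diagrams above; other placements give different but equally short codes.

```python
import heapq
from itertools import count


def build_huffman_tree(freqs):
    """Combine the two least frequent groups until one tree remains.

    A leaf is a one-letter string; an internal node is a (left, right) pair.
    """
    order = count()  # tie-breaker so the heap never has to compare trees
    heap = [(f, next(order), entry) for entry, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # lowest frequency goes left
        f2, _, right = heapq.heappop(heap)   # second lowest goes right
        heapq.heappush(heap, (f1 + f2, next(order), (left, right)))
    return heap[0][2]


def assign_codes(tree, prefix=""):
    """Read the codes off the tree: 0 for a left branch, 1 for a right branch."""
    if isinstance(tree, str):
        return {tree: prefix or "0"}
    left, right = tree
    codes = assign_codes(left, prefix + "0")
    codes.update(assign_codes(right, prefix + "1"))
    return codes


freqs = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
tree = build_huffman_tree(freqs)
print(tree)  # ('a', (('c', 'b'), (('f', 'e'), 'd')))
print(assign_codes(tree))
```

For these frequencies the sketch merges groups in the same order as the steps above (14, 25, 30, 55, then the root) and yields the codes from the earlier table.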
1. Programming is a thoughtful endeavor.