------------------------------------------------------------------------------
COMP 731 - Data Struc & Algorithms - Lecture 31 - Tuesday, September 19, 2000
------------------------------------------------------------------------------
Today:
o Huffman code
o LZ77 algorithm
o How ZIP and gzip work
Explain Huffman's algorithm and how to implement it with a heap.
Give examples, showing the amount of compression you get with
different encodings of a long file.
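A minimal sketch of the heap-based construction, for use in the examples.
The function names (huffman_code, encoded_bits) and the tie-breaking counter
are illustrative choices, not part of the lecture. Any min-heap would do;
Python's heapq is used here.

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Return {symbol: bitstring} built with Huffman's algorithm."""
    freq = Counter(text)
    if len(freq) == 1:                       # degenerate case: one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tiebreak, tree). The unique tiebreak integer
    # keeps Python from comparing two trees when their weights are equal.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)      # the two lightest trees...
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next_id, (t1, t2)))  # ...become siblings
        next_id += 1
    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):          # internal node: (left, right)
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                                # leaf: a single symbol
            codes[tree] = prefix

    walk(heap[0][2], "")
    return codes

def encoded_bits(text, codes):
    """Total number of bits needed to encode text with the given code."""
    return sum(len(codes[ch]) for ch in text)
```

For a 100-character text with frequencies a=45, b=13, c=12, d=16, e=9, f=5,
encoded_bits gives 224 bits, against 300 bits for a fixed 3-bit code and
800 bits for plain 8-bit characters.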
Explain the LZ77 algorithm and give examples.
Make sure that some of these examples show the overlapping case, in
which the copy reads characters that the same copy has just written
(for example, a distance of 1 with a length of 7 repeats one character
seven times). Talk about how to implement LZ efficiently, using hashing.
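A rough sketch of both halves, assuming tokens are either a one-character
literal or a (distance, length) pair. The names lz77_decode and lz77_encode,
the 3-character hash key, and the default limits are assumptions made for
illustration. The decoder copies one character at a time, which is what
makes the self-overlapping copy work. The encoder uses a dictionary keyed on
3-character strings as its hash table.

```python
MIN_MATCH = 3   # assumed: copies shorter than this are sent as literals

def lz77_decode(tokens):
    """Literals are one-char strings; copies are (distance, length) pairs."""
    out = []
    for tok in tokens:
        if isinstance(tok, tuple):
            distance, length = tok
            start = len(out) - distance
            for i in range(length):          # one character at a time, so the
                out.append(out[start + i])   # copy may read what it just wrote
        else:
            out.append(tok)
    return "".join(out)

def lz77_encode(text, window=32 * 1024, max_len=258):
    """Greedy LZ77 using a hash table from 3-char strings to positions."""
    table = {}                               # key -> positions, oldest first
    tokens, i = [], 0
    last_key_pos = len(text) - MIN_MATCH     # last position with a full key
    while i < len(text):
        best_len, best_dist = 0, 0
        for j in reversed(table.get(text[i:i + MIN_MATCH], [])):
            if i - j > window:               # older candidates are farther still
                break
            k = 0
            while k < max_len and i + k < len(text) and text[j + k] == text[i + k]:
                k += 1                       # j + k may run past i: overlap is fine
            if k > best_len:
                best_len, best_dist = k, i - j
        if best_len >= MIN_MATCH:
            tokens.append((best_dist, best_len))
            step = best_len
        else:
            tokens.append(text[i])
            step = 1
        for p in range(i, min(i + step, last_key_pos + 1)):
            table.setdefault(text[p:p + MIN_MATCH], []).append(p)
        i += step
    return tokens
```

For example, lz77_encode("aaaaaaaa") returns ['a', (1, 7)]: one literal,
then a copy whose source overlaps its own output. This sketch searches every
candidate in a hash bucket; production compressors such as zlib cap the
length of that search to bound the running time.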
Explain the basics of the ZIP algorithm:
1. Break the string into 32 KByte chunks. Compress each chunk separately.
2. Use LZ77 on each chunk, representing
      literal    0*******  --------      * = unused bit
      position   1-------  --------      - = information bit
      length     --------
Runs longer than 256 bytes are not used; stop at 256.
(The limit can actually be 258: lengths of 0, 1, and 2 are never
used, so the 8-bit length field can be shifted to cover 3 through 258.)
3. Huffman encode each chunk, using one tree (one code) for the
literals and positions, and one tree for the lengths. There will
never be ambiguity in decoding because a length always follows the
position.
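A small sketch of the byte layout from step 2, before any Huffman coding is
applied. It assumes a literal is a byte value (0 to 255), a copy is a
(position, length) pair with a 15-bit position, and the length is stored as
length minus 3 so that one byte covers 3 through 258. The names pack_tokens
and unpack_tokens are hypothetical, and this is not the actual ZIP file
format.

```python
def pack_tokens(tokens):
    """Pack tokens as 2-byte literals and 3-byte (position, length) copies."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            position, length = tok
            if not (0 <= position < (1 << 15) and 3 <= length <= 258):
                raise ValueError("copy does not fit the assumed layout")
            # Flag bit 1, then 15 position bits, then one length byte.
            out += bytes([0x80 | (position >> 8), position & 0xFF, length - 3])
        else:
            out += bytes([0x00, tok])        # flag bit 0, 7 unused bits, the byte
    return bytes(out)

def unpack_tokens(data):
    """Inverse of pack_tokens; the top bit of the first byte picks the case."""
    tokens, i = [], 0
    while i < len(data):
        if data[i] & 0x80:                   # position: a length byte must follow
            position = ((data[i] & 0x7F) << 8) | data[i + 1]
            tokens.append((position, data[i + 2] + 3))
            i += 3
        else:                                # literal
            tokens.append(data[i + 1])
            i += 2
    return tokens
```

The decoder never has to guess what comes next: the flag bit separates
literals from positions, and a length byte is read only immediately after a
position. The same reasoning is why two separate Huffman codes can be
interleaved in step 3.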