TD 4, Lempel-Ziv-Welsh coding
Introduction to information theory
January 26 2023 |
Outline of the encoding algorithm
-
read the text until finding a word m followed by a letter
a such that m is in the dictionary, but not m||a,
- print i where i is the index of m in the dictionary,
- add m||a to the dictionary, with its index,
- Start again.
/
/All the words of length 1 have to be put in the dictionary initially.
Outline of the decoding algorithm
-
read an index i,
- print the word m of index i
- add at the end of the dictionary m||a where a is the letter that follows m
As for the encoder, all words of length 1 have to be put in the
dictionary beforehand.
We will use
Here is what you should obtain with the file hamlet.txt
% wc -c hamlet.txt
8573 hamlet.txt
% java Coder hamlet.txt > hamlet.txt.lzw
% wc -c hamlet.txt.lzw
35580 hamlet.txt.lzw
% java Decoder hamlet.txt.lzw >! hamlet.txt.dec
% diff hamlet.txt hamlet.txt.dec
%
This document was translated from LATEX par HEVEA