Let I={0,…,n−1}
(typically n=256) be a memoryless source. For all i in I the probabilities
are given by
Pr(i)=pi/M
where pi is a positive integer
and M an integer (generally a power of $2$, equal to
65536 here). We denote by
si the sum Σj<ipj.
Outline of the encoding
Initialize the interval as [a,b[=[0,M[, and let k=0 be a counter
for each letter i output by the source
[a,b[ becomes [a + si*(b−a)/M, a + (si+pi)*(b−a)/M[
(the numbers are rounded down)
the new interval [a,b[ is reduced (see below)
when all symbols are read:
if a < M/4 then 01k+1 is added to the output, or
10k+1 is added to the output.
Reduction of the interval [a,b[ while one of these conditions is met
if b ≤ M/2, the interval becomes [2a, 2b[
01k is added to the output, k is reset to 0
else if a ≥ M/2, the interval becomes [2a − M, 2b − M[
10k is added to the output, k is reset to 0
else if M/4 ≤ a < b ≤ 3M/4, the interval becomes [2a − M/2,
2b − M/2[
the counter k is incremented
For a stream (i1,…,iL) the output is the
truncated binary representation of
S(x1,…,xn).
The class Source.java contains
the definitions of the constants and the probability distributions
and some useful methods.
Implement the arithmetic encoder. You can use the following implementation of the
decoder to check your implementation. To use the decoder, run java Decoder fichier.code
where fichier.code is your encoded file.