Since the concept of public-key cryptography appeared in 1976, the search for secure public-key cryptosystems and identification schemes has been one of the most active areas in the field of cryptology. Many public-key ciphers emerged just after the invention of RSA, and their underlying problems were as varied as computing a discrete logarithm, solving a knapsack problem, or inverting some polynomial equations over a finite field. But the development of cryptanalytic methods has eventually made most of them insecure. Twenty years after the fundamental paper of Diffie and Hellman, public-key cryptography has therefore become dangerously dependent on only two problems: integer factoring and discrete logarithm. However, the class of public-key ciphers and identification schemes based on error-correcting codes still resists cryptanalysis. It relies on the hardness of decoding, or equivalently of finding a minimum-weight codeword, in a large linear code with no visible structure. The most famous of these systems are the McEliece and Niederreiter ciphers (which are equivalent from the security point of view) and the identification schemes proposed by Stern and Véron. They are at the moment one of the few alternatives to the common public-key algorithms based on number theory. Studying their security is therefore essential in order to anticipate, for example, a possible breakthrough in factoring methods. Moreover, these public-key ciphers are particularly interesting since they run much faster than any algorithm relying on number theory.
McEliece cryptosystem
A q-ary linear code of length n and dimension k is a k-dimensional vector subspace of GF(q)^n, where GF(q) denotes the finite field with q elements. A linear code can then be described by a k*n matrix, called a generator matrix of the code, whose rows form a basis of the code. The Hamming weight of a codeword is the number of its nonzero components. Finding a low-weight codeword in a linear code is a hard problem, which can therefore be used in public-key cryptography. For instance, in the public-key cipher proposed by McEliece in 1978, the structure of the code is the secret key and the public key consists of a permuted generator matrix of this code (the permutation aims at hiding the structure of the secret code). When nothing but the public key is known, the only method for finding a low-weight codeword in the code consists in enumerating all q^k codewords; this becomes intractable when the dimension of the code is large (the codes used in practice contain around 2^500 words).
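As a toy illustration, the Python sketch below uses a hypothetical [7,4] binary code (our own example, far too small to be secure): it encodes messages with a generator matrix, publishes a column-permuted copy of that matrix, and finds the lowest-weight codeword by enumerating all q^k = 2^4 words, the brute-force approach that becomes intractable at cryptographic sizes.

```python
import random
from itertools import product

# Toy [7,4] binary code: G has k = 4 rows and n = 7 columns (illustrative
# example only; the codes used in practice have dimension around 500)
G = [
    [1, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1],
]

def encode(msg, G):
    """Codeword = msg * G over GF(2)."""
    return [sum(m * row[j] for m, row in zip(msg, G)) % 2
            for j in range(len(G[0]))]

def hamming_weight(v):
    """Number of nonzero components of a word."""
    return sum(1 for x in v if x != 0)

# Hiding the structure: the public key is G with its columns permuted
perm = random.sample(range(7), 7)
G_pub = [[row[j] for j in perm] for row in G]

# Brute-force low-weight search enumerates all q^k = 2^4 codewords here;
# with dimension around 500 this means about 2^500 words
weights = (hamming_weight(encode(m, G_pub))
           for m in product([0, 1], repeat=4))
print(min(w for w in weights if w > 0))   # prints 3 for this code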
A new algorithm for finding low-weight codewords
In joint work with Florent Chabaud, we developed a new probabilistic algorithm for finding low-weight words in a large linear code. For decoding a random binary linear code of length 1000 and dimension 500, it runs more than 100 times faster than the best previously known algorithm. Using Markov chain theory, we give a very precise analysis of the complexity of this new algorithm. We notably show that it constitutes an important attack against the McEliece cryptosystem: the probability of decrypting a ciphertext within 3 months on 10 DEC Alpha workstations is 10^{-4}. We also prove that, for the McEliece cipher, knowledge of 23% of the plaintext is sufficient to recover the whole plaintext in a reasonable time. Both of these attacks point out important weaknesses in the original McEliece cipher.
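The new algorithm itself is described in the paper; as a rough sketch of the family of probabilistic methods it improves upon, here is plain information-set decoding (Prange's basic strategy), written in Python. The function name, parameters, and matrix representation are our own illustrative choices, not those of the paper.

```python
import random

def find_low_weight_codeword(G, target_weight, iterations=10000):
    """Plain information-set decoding: pick a random information set,
    put G in systematic form on it, and test whether one of the
    resulting rows is a codeword of low enough weight."""
    k, n = len(G), len(G[0])
    for _ in range(iterations):
        perm = random.sample(range(n), n)           # random column order
        M = [[row[j] for j in perm] for row in G]   # permuted copy of G
        # Gaussian elimination over GF(2) on the first k columns
        singular = False
        for i in range(k):
            pivot = next((r for r in range(i, k) if M[r][i]), None)
            if pivot is None:
                singular = True                     # resample the columns
                break
            M[i], M[pivot] = M[pivot], M[i]
            for r in range(k):
                if r != i and M[r][i]:
                    M[r] = [a ^ b for a, b in zip(M[r], M[i])]
        if singular:
            continue
        # every row of the systematic form is a codeword (in permuted order)
        for row in M:
            if 0 < sum(row) <= target_weight:
                word = [0] * n                      # undo the permutation
                for j, p in enumerate(perm):
                    word[p] = row[j]
                return word
    return None                                     # no word found in time
```

A weight-w word is found whenever the random information set happens to meet its support in exactly one position; the algorithm analysed above is a far more efficient refinement of this basic loop.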
Our algorithm also has many applications in coding theory: it makes it possible to efficiently decode a linear code with no particular structure (a toy illustration of this reduction is sketched after this paragraph). It can also be used for finding the minimum distance of a code, i.e. the lowest weight among all nonzero codewords. The minimum distance is a fundamental parameter since it determines the error-correcting capability of the code. With our algorithm, we found the true minimum distance of some BCH codes of length 511 for which only a lower bound was known. An updated table of the minimum distances of the BCH codes of length 511 is given below.
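The sketch below is our own toy example (a [7,4] binary code, not one of the BCH codes above) of the classical reduction of decoding to low-weight codeword search: appending the received word y to the generator matrix yields a code that contains the low-weight error vector, so finding that vector decodes y. Exhaustive search stands in here for the probabilistic algorithm that would be needed at realistic sizes.

```python
from itertools import product

# Toy [7,4] binary code (illustrative example only)
G = [
    [1, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1],
]

def combine(coeffs, rows):
    """GF(2) linear combination of the given rows."""
    return [sum(c * r[j] for c, r in zip(coeffs, rows)) % 2
            for j in range(len(rows[0]))]

# Received word: the first row of G with one bit flipped (one channel error)
y = [1, 0, 0, 0, 1, 1, 1]

# The code spanned by the rows of G together with y contains the error
# vector e = y + c, which has low weight.  Search only the combinations
# that actually use y (last coefficient fixed to 1).
rows = G + [y]
e = min((combine(c + (1,), rows) for c in product([0, 1], repeat=4)), key=sum)
c = [(yi + ei) % 2 for yi, ei in zip(y, e)]
print("error pattern :", e)   # [0, 0, 0, 0, 1, 0, 0]
print("decoded word  :", c)   # [1, 0, 0, 0, 0, 1, 1]
```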