News: The architecture proposed by Alpage for parsing noisy English text from the web earned the team second and third place in the shared task on this topic organized by Google in May.
Natural Language Processing (NLP) is a research domain involving computer science and linguistics, but also statistics and cognition. Its main goal is the understanding and generation of human language, in either written or spoken form. Alpage focuses essentially on the automatic understanding of French texts, with some work also on automatic generation and on other languages, including English.
The strong momentum of the Alpage team may be partially credited to the complementarity of its members: many important advances in NLP have been made possible only through a tight collaboration between computer scientists (from the former ATOLL team) and NLP-specialized linguists (from the former Paris 7 Talana team). Alpage, a joint team (UMR-I) between INRIA and University Paris 7, wishes to make a significant contribution to the automatic analysis of French. This requires a better understanding and a better formalization of linguistic phenomena, including the most complex ones, complemented by their integration within lexical and grammatical models relying on advanced symbolic and/or stochastic algorithms for parsing and lexical analysis. In the longer term, automatic generation and translation are also important objectives.
Alpage also considers NLP applications important. They are numerous and include, for instance, the development of software for information retrieval, text mining and experimental linguistics.
ALPAGE was created as an INRIA team in July 2007 and became an INRIA Project-Team (EPI) on January 1st, 2008, at the Paris-Rocquencourt INRIA Research Center. ALPAGE was a follow-up to the former EPI ATOLL (ATelier d'Outils Logiciels pour le Langage naturel), led by Éric de La Clergerie, and to the Paris 7 TALANA team (Traitement Automatique de la LAngue NAturelle), headed by Laurence Danlos, first as a Jeune Equipe (Young Team) and then within the Lattice Laboratory (UMR CNRS 8094).
ALPAGE is a member of the LabEx EFL (cluster of excellence on Empirical Foundations of Linguistics), which brings together about a dozen research laboratories from all domains involving linguistics, including natural language processing.