Patrick Paroubek
Researcher at Centre national de la recherche scientifique
Publications - 88
Citations - 3629
Patrick Paroubek is an academic researcher from Centre national de la recherche scientifique. The author has contributed to research in topics: Parsing & Sentiment analysis. The author has an h-index of 18, co-authored 80 publications receiving 3454 citations. Previous affiliations of Patrick Paroubek include University of Paris & University of Nantes.
Papers
Proceedings Article
A Protocol for Evaluating Analyzers of Syntax (PEAS)
Véronique Gendner, Gabriel Illouz, Michèle Jardino, Laura Monceaux, Patrick Paroubek, Isabelle Robba, Anne Vilnat
TL;DR: This paper presents PEAS: a Protocol for Evaluating Analyzers of Syntax (in French: Protocole d’Evaluation pour les Analyseurs Syntaxiques), based on an ongoing experiment at LIMSI which aims at developing and testing a generic quantitative black-box evaluation protocol for parsers of French.
Proceedings Article
Automatic Audio and Manual Transcripts Alignment, Time-code Transfer and Selection of Exact Transcripts
Claude Barras, Gilles Adda, Martine Adda-Decker, Benoît Habert, Philippe Boula de Mareüil, Patrick Paroubek
TL;DR: This study uses 10 hours of French radio interview archives with corresponding press-oriented transcripts to generate automatic transcripts of sibling resources of audio and written documents, such as those available in audio archives or for parliamentary debates.
The Multilingual Anonymisation Toolkit for Public Administrations (MAPA) Project
E Ajausks, Victoria Arranz, Laurent Bié, A Cerdà-I-Cucó, Khalid Choukri, Montse Cuadros, Hans Degroote, Amando Estela, Thierry Etchegoyhen, Mercedes García-Martínez, Aitor García-Pablos, Manuel Herranz, Alejandro Adolfo Kohan, Maite Melero, Mike Rosner, Roberts Rozis, Patrick Paroubek, A Vasiļevskis, Pierre Zweigenbaum
TL;DR: This paper describes the MAPA project, funded under the Connecting Europe Facility programme, whose goal is the development of an open-source de-identification toolkit for all official European Union languages.
Proceedings Article
NLP Analytics in Finance with DoRe: A French 250M Tokens Corpus of Corporate Annual Reports.
Corentin Masson, Patrick Paroubek
TL;DR: This paper describes the construction of the DoRe corpus, which is designed to be as modular as possible to allow maximum reuse across tasks pertaining to Economics, Finance and Regulation, and discusses the spectrum of possible uses of this new resource for NLP applications.
Proceedings Article
Rediscovering 50 years of discoveries in speech and language processing: A survey
Joseph Mariani, Gil Francopoulo, Patrick Paroubek, Frédéric Vernier, Nam Kyun Kim, Moon Ju Jo, Hong Kook Kim
TL;DR: The NLP4NLP corpus is created to study the content of scientific publications in the field of speech and natural language processing, comprising 65,000 documents, gathering 50,000 authors, including 325,000 references and representing approximately 270 million words.