Benoît Sagot

Researcher at French Institute for Research in Computer Science and Automation

Publications - 176
Citations - 4849

Benoît Sagot is an academic researcher at the French Institute for Research in Computer Science and Automation. The author has contributed to research on topics including lexicons and parsing. The author has an h-index of 26 and has co-authored 174 publications receiving 3389 citations. Previous affiliations of Benoît Sagot include the University of Paris.

Papers
Proceedings Article

What does BERT learn about the structure of language

TL;DR: This work provides novel support for the possibility that BERT networks capture structural information about language by performing a series of experiments to unpack the elements of English language structure learned by BERT.
Proceedings Article

CamemBERT: a Tasty French Language Model

TL;DR: This paper investigates the feasibility of training monolingual Transformer-based language models for languages other than English, taking French as an example, and evaluates the resulting language models on part-of-speech tagging, dependency parsing, named entity recognition and natural language inference tasks.

Asynchronous Pipeline for Processing Huge Corpora on Medium to Low Resource Infrastructures

TL;DR: A general, highly parallel, multithreaded pipeline to clean and classify Common Crawl by language is proposed and developed so that it runs efficiently on medium to low resource infrastructures where I/O speeds are the main constraint.
Proceedings Article

CamemBERT: a Tasty French Language Model

TL;DR: CamemBERT is a French version of Bidirectional Encoder Representations from Transformers (BERT), evaluated on part-of-speech tagging, dependency parsing, named entity recognition, and natural language inference.
Proceedings Article

The Lefff, a freely available and large-coverage morphological and syntactic lexicon for French

Benoît Sagot
TL;DR: This paper introduces the Lefff, a freely available, accurate and large-coverage morphological and syntactic lexicon for French, used in many NLP tools such as large-coverage parsers.