Author

Claire Gardent

Bio: Claire Gardent is an academic researcher from Centre national de la recherche scientifique. The author has contributed to research in topics: Parsing & Natural language generation. The author has an h-index of 25 and has co-authored 167 publications receiving 2506 citations. Previous affiliations of Claire Gardent include Facebook & Saarland University.


Papers
Proceedings ArticleDOI
04 Aug 2017
TL;DR: This paper proposes the corpus generation framework as a novel method for creating challenging data sets from which to learn NLG models capable of handling the complex interactions that occur during micro-planning between lexicalisation, aggregation, surface realisation, referring expression generation and sentence segmentation.
Abstract: In this paper, we present a novel framework for semi-automatically creating linguistically challenging micro-planning data-to-text corpora from existing Knowledge Bases. Because our method pairs data of varying size and shape with texts ranging from simple clauses to short texts, a dataset created using this framework provides a challenging benchmark for microplanning. Another feature of this framework is that it can be applied to any large scale knowledge base and can therefore be used to train and learn KB verbalisers. We apply our framework to DBpedia data and compare the resulting dataset with Wen et al. 2016’s. We show that while Wen et al.’s dataset is more than twice as large as ours, it is less diverse both in terms of input and in terms of text. We thus propose our corpus generation framework as a novel method for creating challenging data sets from which to learn NLG models capable of handling the complex interactions that occur during micro-planning between lexicalisation, aggregation, surface realisation, referring expression generation and sentence segmentation. To encourage researchers to take up this challenge, we made available a dataset of 21,855 data/text pairs created using this framework in the context of the WebNLG shared task.

320 citations

Proceedings ArticleDOI
07 Sep 2017
TL;DR: The microplanning task is introduced, and data preparation, evaluation methodology, participant results and a brief description of the participating systems are provided.
Abstract: The WebNLG challenge consists in mapping sets of RDF triples to text. It provides a common benchmark on which to train, evaluate and compare “microplanners”, i.e., generation systems that verbalise a given content by making a range of complex interacting choices including referring expression generation, aggregation, lexicalisation, surface realisation and sentence segmentation. In this paper, we introduce the microplanning task, describe data preparation, introduce our evaluation methodology, analyse participant results and provide a brief description of the participating systems.

318 citations

PatentDOI
13 Jul 2016
TL;DR: An approach for semantic parsing that uses a recurrent neural network to map a natural language question into a logical form representation of a KB query and shows how grammatical constraints on the derivation sequence can easily be integrated inside the RNN-based sequential predictor.
Abstract: A system and method are provided which employ a neural network model which has been trained to predict a sequentialized form for an input text sequence. The sequentialized form includes a sequence of symbols. The neural network model includes an encoder which generates a representation of the input text sequence based on a representation of n-grams in the text sequence and a decoder which sequentially predicts a next symbol of the sequentialized form based on the representation and a predicted prefix of the sequentialized form. Given an input text sequence, a sequentialized form is predicted with the trained neural network model. The sequentialized form is converted to a structured form and information based on the structured form is output.

159 citations

Proceedings ArticleDOI
01 Jun 2014
TL;DR: A hybrid approach to sentence simplification which combines deep semantics and monolingual machine translation to derive simple sentences from complex ones; it yields significantly simpler output that is both grammatical and meaning preserving.
Abstract: We present a hybrid approach to sentence simplification which combines deep semantics and monolingual machine translation to derive simple sentences from complex ones. The approach differs from previous work in two main ways. First, it is semantic based in that it takes as input a deep semantic representation rather than e.g., a sentence or a parse tree. Second, it combines a simplification model for splitting and deletion with a monolingual translation model for phrase substitution and reordering. When compared against current state of the art methods, our model yields significantly simpler output that is both grammatical and meaning preserving.

153 citations

Proceedings ArticleDOI
06 Jul 2002
TL;DR: An alternative, constraint-based algorithm is presented that builds on existing related algorithms in that it produces minimal descriptions for sets of individuals using positive, negative and disjunctive properties and is integrated with surface realisation.
Abstract: The incremental algorithm introduced in (Dale and Reiter, 1995) for producing distinguishing descriptions does not always generate a minimal description. In this paper, I show that when generalised to sets of individuals and disjunctive properties, this approach might generate unnecessarily long and ambiguous and/or epistemically redundant descriptions. I then present an alternative, constraint-based algorithm and show that it builds on existing related algorithms in that (i) it produces minimal descriptions for sets of individuals using positive, negative and disjunctive properties, (ii) it straightforwardly generalises to n-ary relations and (iii) it is integrated with surface realisation.

94 citations


Cited by
Journal ArticleDOI
01 Apr 1988-Nature
TL;DR: In this paper, a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) is presented.
Abstract: Deposits of clastic carbonate-dominated (calciclastic) sedimentary slope systems in the rock record have been identified mostly as linearly-consistent carbonate apron deposits, even though most ancient clastic carbonate slope deposits fit the submarine fan systems better. Calciclastic submarine fans are consequently rarely described and are poorly understood. Subsequently, very little is known especially in mud-dominated calciclastic submarine fan systems. Presented in this study are a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) that reveals a >250 m thick calciturbidite complex deposited in a calciclastic submarine fan setting. Seven facies are recognised from core and thin section characterisation and are grouped into three carbonate turbidite sequences. They include: 1) Calciturbidites, comprising mostly high- to low-density, wavy-laminated bioclast-rich facies; 2) low-density densite mudstones which are characterised by planar laminated and unlaminated mud-dominated facies; and 3) Calcidebrites which are muddy or hyper-concentrated debris-flow deposits occurring as poorly-sorted, chaotic, mud-supported floatstones. These

9,929 citations

Journal Article
TL;DR: A 540-billion parameter, densely activated, Transformer language model, which is called PaLM achieves breakthrough performance, outperforming the state-of-the-art on a suite of multi-step reasoning tasks, and outperforming average human performance on the recently released BIG-bench benchmark.
Abstract: Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model (PaLM). We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML system which enables highly efficient training across multiple TPU Pods. We demonstrate continued benefits of scaling by achieving state-of-the-art few-shot learning results on hundreds of language understanding and generation benchmarks. On a number of these tasks, PaLM 540B achieves breakthrough performance, outperforming the finetuned state-of-the-art on a suite of multi-step reasoning tasks, and outperforming average human performance on the recently released BIG-bench benchmark. A significant number of BIG-bench tasks showed discontinuous improvements from model scale, meaning that performance steeply increased as we scaled to our largest model. PaLM also has strong capabilities in multilingual tasks and source code generation, which we demonstrate on a wide array of benchmarks. We additionally provide a comprehensive analysis on bias and toxicity, and study the extent of training data memorization with respect to model scale. Finally, we discuss the ethical considerations related to large language models and discuss potential mitigation strategies.

1,429 citations

Journal ArticleDOI
TL;DR: Minimal recursion semantics (MRS) as discussed by the authors is a framework for computational semantics that is suitable for parsing and generation and can be implemented in typed feature structure formalisms, which enables a simple formulation of the grammatical constraints on lexical and phrasal semantics, including the principles of semantic composition.
Abstract: Minimal recursion semantics (MRS) is a framework for computational semantics that is suitable for parsing and generation and that can be implemented in typed feature structure formalisms. We discuss why, in general, a semantic representation with minimal structure is desirable and illustrate how a descriptively adequate representation with a nonrecursive structure may be achieved. MRS enables a simple formulation of the grammatical constraints on lexical and phrasal semantics, including the principles of semantic composition. We have integrated MRS with a broad-coverage HPSG grammar.

960 citations

01 Mar 1991

605 citations