Anne-Laure Ligozat

Other affiliations: Centre national de la recherche scientifique, Université Paris-Saclay

Bio: Anne-Laure Ligozat is an academic researcher from École Normale Supérieure. The author has contributed to research on the topics of question answering and annotation. The author has an h-index of 16 and has co-authored 87 publications that have received 781 citations. Previous affiliations of Anne-Laure Ligozat include Centre national de la recherche scientifique and Université Paris-Saclay.

Topics: Question answering, Annotation, Vocabulary, Clef, Information extraction, ...

[Chart: papers published on a yearly basis, 2005 to 2023]

Papers

Journal Article

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao, Angela Fan, Christopher Akiki, Elizabeth-Jane Pavlick +383 more

09 Nov 2022 - arXiv.org

TL;DR: BLOOM is a 176B-parameter, open-access, decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total).

Abstract: Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

407 citations

Proceedings Article

Syntactic Sentence Simplification for French

Laetitia Brouwers, Delphine Bernhard, Anne-Laure Ligozat, Thomas François

27 Apr 2014

TL;DR: The approach studies two parallel corpora to identify the linguistic phenomena involved in the manual simplification of French texts, organises them within a typology, and relies on that typology to generate simplified sentences.

Abstract: This paper presents a method for the syntactic simplification of French texts. Syntactic simplification aims at making texts easier to understand by simplifying complex syntactic structures that hinder reading. Our approach is based on the study of two parallel corpora (encyclopaedia articles and tales). It aims to identify the linguistic phenomena involved in the manual simplification of French texts and organise them within a typology. We then propose a syntactic simplification system that relies on this typology to generate simplified sentences. The module starts by generating all possible variants before selecting the best subset. The evaluation shows that about 80% of the simplified sentences produced by our system are accurate.

62 citations

Journal Article

Hybrid methods for improving information access in clinical documents: concept, assertion, and relation identification

Anne-Lyse Minard¹, Anne-Laure Ligozat¹, Asma Ben Abacha¹, Delphine Bernhard¹, Bruno Cartoni¹, Louise Deléger¹, Brigitte Grau¹, Sophie Rosset¹, Pierre Zweigenbaum¹, Cyril Grouin¹

Centre national de la recherche scientifique¹

01 Sep 2011 - Journal of the American Medical Informatics Association

TL;DR: The authors confirm that relying on machine-learning methods alone is highly dependent on the annotated training data, and thus yields better results for well-represented classes.

61 citations

CARAMBA: Concept, Assertion, and Relation Annotation using Machine-learning Based Approaches

Cyril Grouin, Asma Ben Abacha, Delphine Bernhard, Bruno Cartoni, Louise Deléger, Brigitte Grau¹, Anne-Laure Ligozat, Anne-Lyse Minard, Sophie Rosset, Pierre Zweigenbaum

École Normale Supérieure¹

12 Nov 2010

TL;DR: The authors present their methods, mainly based upon machine-learning systems, for this year’s i2b2/VA challenge, which is dedicated to medical concept extraction as well as the annotation of assertions and relationships of concepts.

Abstract: This year’s i2b2/VA challenge is dedicated to medical concept extraction as well as the annotation of assertions and relationships of concepts. Several kinds of concepts, assertions, and relations must be processed. In this paper, we present the methods we used, mainly based upon machine-learning systems. The results we obtained on the final ground truth (F-measures up to 0.773 for concepts, 0.931 for assertions, and 0.709 for relations) constitute a basis for ...

40 citations

Journal Article

A French clinical corpus with comprehensive semantic annotations: development of the Medical Entity and Relation LIMSI annOtated Text corpus (MERLOT)

Leonardo Campillos, Louise Deléger, Cyril Grouin, Thierry Hamon¹, Anne-Laure Ligozat, Aurélie Névéol

University of Paris¹

01 Jun 2018

TL;DR: This work presents a corpus of clinical narratives in French annotated for linguistic, semantic and structural information, aimed at clinical information extraction, and introduces harmonization tools that automatically identify annotation differences to be addressed to improve the overall corpus quality.

Abstract: Quality annotated resources are essential for Natural Language Processing. The objective of this work is to present a corpus of clinical narratives in French annotated for linguistic, semantic and structural information, aimed at clinical information extraction. Six annotators contributed to the corpus annotation, using a comprehensive annotation scheme covering 21 entities, 11 attributes and 37 relations. All annotators trained on a small, common portion of the corpus before proceeding independently. An automatic tool was used to produce entity and attribute pre-annotations. About a tenth of the corpus was doubly annotated and annotation differences were resolved in consensus meetings. To ensure annotation consistency throughout the corpus, we devised harmonization tools to automatically identify annotation differences to be addressed to improve the overall corpus quality. The annotation project spanned over 24 months and resulted in a corpus comprising 500 documents (148,476 tokens) annotated with 44,740 entities and 26,478 relations. The average inter-annotator agreement is 0.793 F-measure for entities and 0.789 for relations. The performance of the pre-annotation tool for entities reached 0.814 F-measure when sufficient training data was available. The performance of our entity pre-annotation tool shows the value of the corpus to build and evaluate information extraction methods. In addition, we introduced harmonization methods that further improved the quality of annotations in the corpus.

38 citations

Cited by

How to Do Things With Words

Csr Young

01 Jan 2009

7,241 citations

Sustainable Development Goals

Timothy Robinson, Aleksandra Gorb, Rob Page

21 Nov 2016

2,966 citations

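The Common European Framework of Reference for Languages: Learning, teaching, assessment (CEFR) amid cultural and linguistic diversity: is it a standard? (Lecture at the 10th Seminar of the Graduate School of Applied Linguistics, Meikai University)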

茂吉島

01 Mar 2008

TL;DR: It’s time to get used to the idea that there is no such thing as a “magic bullet”.

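Abstract: China University of Technology General Education Center Implementation Guidelines for the English Language Certification Award. Approved by the General Education Committee meeting on 8 January 2016 (year 105 of the Republic of China). 1. To encourage its students to pass English proficiency tests administered by credible institutions or to obtain certification, China University of Technology (hereinafter "the University") has established the "China University of Technology General Education Center Implementation Guidelines for the English Language Certification Award" (hereinafter "the Guidelines"). 2. Students who, while enrolled at the University, pass an English proficiency test equivalent to or above level B1 (intermediate) of the Common European Framework of Reference (CEFR) language proficiency scale may be granted an award under the Guidelines. For eligible tests, see the Center's "Comparison Table of the Common European Framework of Reference for Languages: Learning, Teaching, Assessment and the Levels of English Proficiency Tests" (see the attached table); tests not listed in the standard comparison table are not eligible for an award. 3. All students of the University, except those in the Department of Applied English, may apply. Undergraduate students may apply only once for a given level; they may apply repeatedly during their studies, but the level of each application must not be lower than that of the previous one. The award is granted once per semester to the top 10 students across the University, and the amount for each rank is given in the attached table. 4. Applicants must provide documentation proving that they took the test while enrolled and during the semester of application, together with the score report or certificate, for processing. 5. Application procedure: download the "English Language Certification Award Application Form" (Attachment 1) from the General Education Center web page, complete it, and submit it to the General Education Center together with one original and one photocopy of the score report (the photocopy signed on the back with a statement that it is identical to the original) and a photocopy of the cover of the applicant's own bank passbook (post office or Land Bank). Each semester the General Education Center may select award representatives, present the awards publicly on a chosen date, and handle the subsequent payment requests. 6. Application deadline: applications must be submitted within six months of passing the relevant certification test; late applications are deemed forfeited. 7. The award funds under the Guidelines are held in a dedicated General Education Center account opened by the University, and all income and expenditure are earmarked for this purpose; any balance remaining at the end of a year is carried over to the following year. 8. The Guidelines take effect upon announcement after being reviewed and passed by the General Education Center meeting and submitted to the President for approval; the same applies to amendments.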

1,468 citations

Learning Vocabulary In Another Language

Anna Papst

01 Jan 2016

TL;DR: Learning Vocabulary in Another Language is available in a digital library whose online access is set as public, so it can be obtained instantly, and it is compatible with any reading device.

Abstract: Thank you very much for downloading Learning Vocabulary in Another Language. As you may know, people have searched numerous times for their favorite novels like this one, but ended up with infected downloads. Rather than enjoying a good book with a cup of tea in the afternoon, they instead have to cope with a virus on their laptop. Learning Vocabulary in Another Language is available in our digital library; online access to it is set as public, so you can get it instantly. Our digital library is hosted in multiple countries, allowing you to get the lowest latency when downloading any of our books like this one. In short, Learning Vocabulary in Another Language is compatible with any reading device.

1,311 citations

Journal Article

2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text.

Özlem Uzuner¹, Brett R. South², Shuying Shen², Scott L. DuVall²

State University of New York System¹, University of Utah²

01 Sep 2011 - Journal of the American Medical Informatics Association

TL;DR: The 2010 i2b2/VA Workshop on Natural Language Processing Challenges for Clinical Records presented three tasks, which showed that machine learning approaches could be augmented with rule-based systems to determine concepts, assertions, and relations.

1,111 citations
