Author

Géraldine Damnati

Other affiliations: Oregon Health & Science University

Bio: Géraldine Damnati is an academic researcher from Orange S.A. The author has contributed to research on topics including language models and spoken language. The author has an h-index of 14 and has co-authored 83 publications receiving 891 citations. Previous affiliations of Géraldine Damnati include Oregon Health & Science University.

Papers published on a yearly basis: [chart not reproduced; years listed: 1997 to 1999 and 2001 to 2023]

Papers

Proceedings Article

Normalizing SMS: are Two Metaphors Better than One?

Catherine Kobus, François Yvon¹, Géraldine Damnati

Centre national de la recherche scientifique¹

18 Aug 2008

TL;DR: This paper presents a comparative study of systems aiming at normalizing the orthography of French SMS messages, one drawing inspiration from the Machine Translation task, the other using techniques that are commonly used in automatic speech recognition devices.

Abstract: Electronic written texts used in computer-mediated interactions (e-mails, blogs, chats, etc.) present major deviations from the norm of the language. This paper presents a comparative study of systems aiming at normalizing the orthography of French SMS messages: after discussing the linguistic peculiarities of these messages and possible approaches to their automatic normalization, we present, evaluate and contrast two systems, one drawing inspiration from the Machine Translation task, the other using techniques that are commonly used in automatic speech recognition devices. Combining both approaches, our best normalization system achieves about 11% Word Error Rate on a test set of about 3000 unseen messages.

172 citations

Proceedings Article

Robust Speech/Non-Speech Detection using LDA applied to MFCC for Continuous Speech Recognition

Arnaud Martin, Géraldine Damnati, Laurent Mauuary

01 Jan 2001

TL;DR: A method for speech/non-speech detection using a linear discriminant analysis (LDA) applied to mel frequency cepstrum coefficients (MFCC) is presented, which reduces the detection of noise segments.

Abstract: Continuous speech recognition applications need precise detection because the number of words to recognize is unknown and vocabulary words can be short. The speech/non-speech detection must be robust to the boundary precision. In this work, a new approach to evaluating detection algorithms for continuous speech recognition is presented. Speech/non-speech detection using an energy parameter combined with a Linear Discriminant Analysis (LDA) applied to Mel Frequency Cepstrum Coefficients (MFCC) is compared to an algorithm based on signal-to-noise ratio (SNR). The LDA applied to MFCC for speech/non-speech detection improves recognition performance in noisy environments and for continuous speech recognition applications.

77 citations

Journal Article

On the use of finite state transducers for semantic interpretation

Christian Raymond¹, Frédéric Béchet¹, Renato De Mori¹, Géraldine Damnati²

University of Avignon¹, Orange S.A.²

01 Mar 2006-Speech Communication

TL;DR: A spoken language understanding (SLU) system is described which generates hypotheses of conceptual constituents with a translation process by finite state transducers which accept word patterns from a lattice of word hypotheses generated by an Automatic Speech Recognition system.

76 citations

Proceedings Article

SimBow at SemEval-2017 Task 3: Soft-Cosine Semantic Similarity between Questions for Community Question Answering

Delphine Charlet¹, Géraldine Damnati¹

Orange S.A.¹

01 Aug 2017

TL;DR: The SimBow system submitted at SemEval2017-Task3 is a supervised combination of different unsupervised textual similarities based on the introduction of a relation matrix in the classical cosine similarity between bag-of-words to get a soft-cosine that takes into account relations between words.

Abstract: This paper describes the SimBow system submitted at SemEval2017-Task3, for the question-question similarity subtask B. The proposed approach is a supervised combination of different unsupervised textual similarities. These textual similarities rely on the introduction of a relation matrix in the classical cosine similarity between bag-of-words, so as to get a soft-cosine that takes into account relations between words. According to the type of relation matrix embedded in the soft-cosine, semantic or lexical relations can be considered. Our system ranked first among the official submissions of subtask B.

60 citations
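
The soft-cosine idea named in the SimBow abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the common formulation in which a word-to-word relation matrix M is inserted into the cosine, giving a·M·b / (sqrt(a·M·a) · sqrt(b·M·b)). The three-word vocabulary, the relation values, and the function names are hypothetical and chosen only for illustration.

```python
import math


def bilinear(a, m, b):
    """Compute a^T M b for two bag-of-words vectors and a relation matrix."""
    return sum(a[i] * m[i][j] * b[j]
               for i in range(len(a))
               for j in range(len(b)))


def soft_cosine(a, b, m):
    """Cosine similarity generalized with a word-to-word relation matrix m.

    With m equal to the identity matrix this reduces to the classical cosine.
    """
    denom = math.sqrt(bilinear(a, m, a)) * math.sqrt(bilinear(b, m, b))
    if denom == 0.0:
        return 0.0  # at least one vector is empty
    return bilinear(a, m, b) / denom


# Hypothetical vocabulary: ["car", "automobile", "banana"]
identity = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]
# Assumed relation matrix: "car" and "automobile" are related with weight 0.8.
relation = [[1.0, 0.8, 0.0],
            [0.8, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

q1 = [1.0, 0.0, 0.0]  # question containing only "car"
q2 = [0.0, 1.0, 0.0]  # question containing only "automobile"

hard = soft_cosine(q1, q2, identity)  # classical cosine: no shared words
soft = soft_cosine(q1, q2, relation)  # soft cosine: related words contribute
```

In this toy case the classical cosine is zero because the two questions share no word, while the soft cosine is positive because the relation matrix links the two words. How the relation matrix is built (semantic or lexical relations, per the abstract) is left outside this sketch.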

Proceedings Article

Robust speaker turn role labeling of TV Broadcast News shows

Géraldine Damnati¹, Delphine Charlet¹

Orange S.A.¹

22 May 2011

TL;DR: A mixed approach combining speaker clustering and analysis of Automatic Speech Recognition output is proposed for assigning speaker turns a role among: anchor, reporter and other.

Abstract: Speaker role recognition in TV Broadcast News shows is addressed in this paper with a particular focus on speaker turn role labeling. A mixed approach combining speaker clustering and analysis of Automatic Speech Recognition output is proposed for assigning speaker turns a role among: anchor, reporter and other. 86% classification accuracy is obtained for automatically segmented speaker turns on a 6.5-hour test corpus of 14 TVBN shows mixing news and conversational speech.

30 citations

Cited by

Book

Distributed Systems: Principles and Paradigms

Andrew S. Tanenbaum, Maarten van Steen¹

VU University Amsterdam¹

01 Jan 2001

TL;DR: Intended for use in a senior/graduate level distributed systems course or by professionals, this text systematically shows how distributed systems are designed and implemented in real systems.

Abstract: From the Publisher: Andrew Tanenbaum and Maarten van Steen cover the principles, advanced concepts, and technologies of distributed systems in detail, including: communication, replication, fault tolerance, and security. Intended for use in a senior/graduate level distributed systems course or by professionals, this text systematically shows how distributed systems are designed and implemented in real systems. Written in the superb writing style of other Tanenbaum books, the material also features unique accessibility and a wide variety of real-world examples and case studies, such as NFS v4, CORBA, DOM, Jini, and the World Wide Web. FEATURES Detailed coverage of seven key principles. An introductory chapter followed by a chapter devoted to each key principle: communication, processes, naming, synchronization, consistency and replication, fault tolerance, and security, including unique comprehensive coverage of middleware models. Four chapters devoted to state-of-the-art real-world examples of middleware. Covers object-based systems, document-based systems, distributed file systems, and coordination-based systems including CORBA, DCOM, Globe, NFS v4, Coda, the World Wide Web, and Jini. Excellent coverage of timely, advanced, distributed systems topics: Security, payment systems, recent Internet and Web protocols, scalability, and caching and replication. NEW-The Prentice Hall Companion Website for this book contains PowerPoint slides, figures in various file formats, and other teaching aids, and a link to the author's Web site.

2,011 citations

Proceedings Article

Named Entity Recognition in Tweets: An Experimental Study

Alan Ritter¹, Sam Clark¹, Oren Etzioni¹

University of Washington¹

27 Jul 2011

TL;DR: The novel T-ner system doubles F1 score compared with the Stanford NER system, and leverages the redundancy inherent in tweets to achieve this performance, using LabeledLDA to exploit Freebase dictionaries as a source of distant supervision.

Abstract: People tweet more than 100 million times daily, yielding a noisy, informal, but sometimes informative corpus of 140-character messages that mirrors the zeitgeist in an unprecedented manner. The performance of standard NLP tools is severely degraded on tweets. This paper addresses this issue by re-building the NLP pipeline beginning with part-of-speech tagging, through chunking, to named-entity recognition. Our novel T-ner system doubles F1 score compared with the Stanford NER system. T-ner leverages the redundancy inherent in tweets to achieve this performance, using LabeledLDA to exploit Freebase dictionaries as a source of distant supervision. LabeledLDA outperforms co-training, increasing F1 by 25% over ten common entity types. Our NLP tools are available at: http://github.com/aritter/twitter_nlp

1,351 citations

Journal Article

Language and the Internet

Jean Aitchison

01 Sep 2002-Literary and Linguistic Computing

764 citations

Proceedings Article

Power efficient processor architecture and the cell processor

Harm Peter Hofstee¹

IBM¹

12 Feb 2005

TL;DR: In this paper, the authors provide a background and rationale for some of the architecture and design decisions in the cell processor, a processor optimized for compute-intensive and broadband rich media applications, jointly developed by Sony Group, Toshiba, and IBM.

Abstract: This paper provides a background and rationale for some of the architecture and design decisions in the cell processor, a processor optimized for compute-intensive and broadband rich media applications, jointly developed by Sony Group, Toshiba, and IBM. The paper discusses some of the challenges microprocessor designers face and provides motivation for performance per transistor as a reasonable first-order metric for design efficiency. Common microarchitectural enhancements relative to this metric are provided. Alternate architectural choices and some of their limitations are also discussed, and non-homogeneous SMP is proposed as a means to overcome these limitations.

395 citations

Journal Article

Spoken language understanding

R. De Mori¹, Frédéric Béchet², Dilek Hakkani-Tur³, Michael F. McTear¹, Giuseppe Riccardi⁴, Gokhan Tur⁵

Ulster University¹, University of Avignon², Institute of Company Secretaries of India³, University of Trento⁴, SRI International⁵

18 Apr 2008-IEEE Signal Processing Magazine

TL;DR: Spoken language understanding and natural language understanding share the goal of obtaining a conceptual representation of natural language sentences and computational semantics performs a conceptualization of the world using computational processes for composing a meaning representation structure from available signs.

Abstract: Semantics deals with the organization of meanings and the relations between sensory signs or symbols and what they denote or mean. Computational semantics performs a conceptualization of the world using computational processes for composing a meaning representation structure from available signs and their features present, for example, in words and sentences. Spoken language understanding (SLU) is the interpretation of signs conveyed by a speech signal. SLU and natural language understanding (NLU) share the goal of obtaining a conceptual representation of natural language sentences. Specific to SLU is the fact that signs to be used for interpretation are coded into signals along with other information such as speaker identity. Furthermore, spoken sentences often do not follow the grammar of a language; they exhibit self-corrections, hesitations, repetitions, and other irregular phenomena. SLU systems contain an automatic speech recognition (ASR) component and must be robust to noise due to the spontaneous nature of spoken language and the errors introduced by ASR. Moreover, ASR components output a stream of words with no structure information like punctuation and sentence boundaries. Therefore, SLU systems cannot rely on such markers and must perform text segmentation and understanding at the same time.

222 citations
