scispace - formally typeset
Search or ask a question
Author

Evelyne Tzoukermann

Bio: Evelyne Tzoukermann is an academic researcher from Bell Labs. The author has contributed to research in topics: Part of speech & Speech synthesis. The author has an hindex of 19, co-authored 41 publications receiving 1264 citations. Previous affiliations of Evelyne Tzoukermann include Massachusetts Institute of Technology & Alcatel-Lucent.

Papers
More filters
Proceedings ArticleDOI
21 Mar 1993
TL;DR: This work focuses here on selection of training and test data, evaluation of language understanding, and the continuing search for evaluation methods that will correlate well with expected performance of the technology in applications.
Abstract: The Air Travel Information System (ATIS) domain serves as the common task for DARPA spoken language system research and development The approaches and results possible in this rapidly growing area are structured by available corpora, annotations of that data, and evaluation methods Coordination of this crucial infrastructure is the charter of the Multi-Site ATIS Data COllection Working group (MADCOW) We focus here on selection of training and test data, evaluation of language understanding, and the continuing search for evaluation methods that will correlate well with expected performance of the technology in applications

206 citations

Book ChapterDOI
01 Jan 1999
TL;DR: A natural language processing (NLP) approach to automatic indexing over controlled vocabulary which accounts for term variation is presented, applied to the French language.
Abstract: We present a natural language processing (NLP) approach to automatic indexing over controlled vocabulary which accounts for term variation. The approach combines a part of speech tagger, a generator of morphologically related forms, and a shallow transformational parser. The system is applied to the French language; it is trained on newspaper articles and tested on scientific literature.

120 citations

Proceedings ArticleDOI
23 Mar 1992
TL;DR: An understanding system, designed for both speech and text input, has been implemented based on statistical representation of task specific semantic knowledge, which extracts words and their association to the conceptual structure of the task directly from the acoustic signal.
Abstract: An understanding system, designed for both speech and text input, has been implemented based on statistical representation of task specific semantic knowledge. The core of the system is the conceptual decoder, which extracts the words and their association to the conceptual structure of the task directly from the acoustic signal. The conceptual information, which is also used to clarify the English sentences, is encoded following a statistical paradigm. A template generator and an SQL (structured query language) translator process the sentence and produce SQL code for querying a relational database. Results of the system on the official DARPA test are given. >

109 citations

Patent
02 Jul 1998
TL;DR: In this article, an index generator and query expander for information retrieval in a corpus is presented, where the disambiguated corpus is provided as an input to a transformational analyzer, using a grammar and a metagrammar for analyzing syntactic and morphosyntactic variations to conflate and generate variants.
Abstract: An index generator and query expander for use in information retrieval in a corpus. A corpus is provided as an input to an inflectional analyzer, which produces a lemmatized corpus having base forms and associated inflections for each word in the original corpus. The lemmatized corpus is provided as an input to a disambiguator, which performs part of speech tagging and morpho-syntactic disambiguation to produce a disambiguated corpus. The disambiguated corpus is provided as an input to a derivational generator, which produces an expanded corpus having all possible valid derivatives of each word of the disambiguated corpus. The disambiguated corpus is provided as an input to a transformational analyzer, using a grammar and a metagrammar for analyzing syntactic and morphosyntactic variations to conflate and generate variants, producing an index to the corpus having a minimum of variants. Alternatively, a query expander is provided utilizing similar techniques.

101 citations

Proceedings ArticleDOI
07 Jul 1997
TL;DR: The contribution of this research is the successful combination of parsing over a seed term list coupled with derivational morphology to achieve greater coverage of multi-word terms for indexing and retrieval.
Abstract: A system for the automatic production of controlled index terms is presented using linguistically-motivated techniques. This includes a finite-state part of speech tagger, a derivational morphological processor for analysis and generation, and a unification-based shallow-level parser using transformational rules over syntactic patterns. The contribution of this research is the successful combination of parsing over a seed term list coupled with derivational morphology to achieve greater coverage of multi-word terms for indexing and retrieval. Final results are evaluated for precision and recall, and implications for indexing and retrieval are discussed.

85 citations


Cited by
More filters
Book
05 Jun 2007
TL;DR: The second edition of Ontology Matching has been thoroughly revised and updated to reflect the most recent advances in this quickly developing area, which resulted in more than 150 pages of new content.
Abstract: Ontologies tend to be found everywhere. They are viewed as the silver bullet for many applications, such as database integration, peer-to-peer systems, e-commerce, semantic web services, or social networks. However, in open or evolving systems, such as the semantic web, different parties would, in general, adopt different ontologies. Thus, merely using ontologies, like using XML, does not reduce heterogeneity: it just raises heterogeneity problems to a higher level. Euzenat and Shvaikos book is devoted to ontology matching as a solution to the semantic heterogeneity problem faced by computer systems. Ontology matching aims at finding correspondences between semantically related entities of different ontologies. These correspondences may stand for equivalence as well as other relations, such as consequence, subsumption, or disjointness, between ontology entities. Many different matching solutions have been proposed so far from various viewpoints, e.g., databases, information systems, and artificial intelligence. The second edition of Ontology Matching has been thoroughly revised and updated to reflect the most recent advances in this quickly developing area, which resulted in more than 150 pages of new content. In particular, the book includes a new chapter dedicated to the methodology for performing ontology matching. It also covers emerging topics, such as data interlinking, ontology partitioning and pruning, context-based matching, matcher tuning, alignment debugging, and user involvement in matching, to mention a few. More than 100 state-of-the-art matching systems and frameworks were reviewed. With Ontology Matching, researchers and practitioners will find a reference book that presents currently available work in a uniform framework. In particular, the work and the techniques presented in this book can be equally applied to database schema matching, catalog integration, XML schema matching and other related problems. The objectives of the book include presenting (i) the state of the art and (ii) the latest research results in ontology matching by providing a systematic and detailed account of matching techniques and matching systems from theoretical, practical and application perspectives.

2,579 citations

Journal ArticleDOI
TL;DR: In this paper, a measure of semantic similarity in an IS-A taxonomy based on the notion of shared information content is presented, and experimental evaluation against a benchmark set of human similarity judgments demonstrates that the measure performs better than the traditional edge counting approach.
Abstract: This article presents a measure of semantic similarity in an IS-A taxonomy based on the notion of shared information content. Experimental evaluation against a benchmark set of human similarity judgments demonstrates that the measure performs better than the traditional edge-counting approach. The article presents algorithms that take advantage of taxonomic similarity in resolving syntactic and semantic ambiguity, along with experimental results demonstrating their effectiveness.

2,190 citations

Journal ArticleDOI
Andrei Z. Broder1
01 Sep 2002
TL;DR: This taxonomy of web searches is explored and how global search engines evolved to deal with web-specific needs is discussed.
Abstract: Classic IR (information retrieval) is inherently predicated on users searching for information, the so-called "information need". But the need behind a web search is often not informational -- it might be navigational (give me the url of the site I want to reach) or transactional (show me sites where I can perform a certain transaction, e.g. shop, download a file, or find a map). We explore this taxonomy of web searches and discuss how global search engines evolved to deal with web-specific needs.

2,094 citations

Patent
11 Jan 2011
TL;DR: In this article, an intelligent automated assistant system engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions.
Abstract: An intelligent automated assistant system engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions. The system can be implemented using any of a number of different platforms, such as the web, email, smartphone, and the like, or any combination thereof. In one embodiment, the system is based on sets of interrelated domains and tasks, and employs additional functionally powered by external services with which the system can interact.

1,462 citations

Proceedings ArticleDOI
18 Jun 1991
TL;DR: This paper will describe a method and a program for aligning sentences based on a simple statistical model of character lengths, which uses the fact that longer sentences in one language tend to be translated into longer sentence in the other language, and that shorter sentences tend to been translated into shorter sentences.
Abstract: Researchers in both machine translation (e.g., Brown et al., 1990) and bilingual lexicography (e.g., Klavans and Tzoukermann, 1990) have recently become interested in studying parallel texts, texts such as the Canadian Hansards (parliamentary proceedings) which are available in multiple languages (French and English). This paper describes a method for aligning sentences in these parallel texts, based on a simple statistical model of character lengths. The method was developed and tested on a small trilingual sample of Swiss economic reports. A much larger sample of 90 million words of Canadian Hansards has been aligned and donated to the ACL/DCI.

1,088 citations