Author

Michael Elhadad

Bio: Michael Elhadad is an academic researcher from Ben-Gurion University of the Negev. The author has contributed to research in topics: Hebrew & Parsing. The author has an h-index of 28, has co-authored 91 publications, and has received 4,224 citations. Previous affiliations of Michael Elhadad include Hebrew University of Jerusalem & Columbia University.


Papers
DOI
01 Jan 1997
TL;DR: Empirical results on the identification of strong chains and of significant sentences are presented in this paper, and plans to address shortcomings are briefly presented.
Abstract: We investigate one technique to produce a summary of an original text without requiring its full semantic interpretation, but instead relying on a model of the topic progression in the text derived from lexical chains. We present a new algorithm to compute lexical chains in a text, merging several robust knowledge sources: the WordNet thesaurus, a part-of-speech tagger, a shallow parser for the identification of nominal groups, and a segmentation algorithm. Summarization proceeds in four steps: the original text is segmented, lexical chains are constructed, strong chains are identified, and significant sentences are extracted. We present in this paper empirical results on the identification of strong chains and of significant sentences. Preliminary results indicate that quality indicative summaries are produced. Pending problems are identified. Plans to address these shortcomings are briefly presented.
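As a rough illustration of the four-step pipeline described in the abstract, the Python sketch below builds greedy lexical chains over nouns using WordNet (via NLTK), keeps the longer chains as "strong", and extracts the sentences that mention them most. It is not the authors' implementation: topical segmentation is skipped, and the relatedness test and chain-scoring heuristics are toy simplifications.

```python
# Toy sketch of lexical-chain-based extractive summarization.
# Requires NLTK data: punkt, averaged_perceptron_tagger, wordnet.
from nltk import sent_tokenize, word_tokenize, pos_tag
from nltk.corpus import wordnet as wn

def related(a, b):
    """Toy relatedness test: shared synset, or direct hypernym overlap."""
    syns_a, syns_b = set(wn.synsets(a, pos=wn.NOUN)), set(wn.synsets(b, pos=wn.NOUN))
    if syns_a & syns_b:
        return True
    hyp_a = {h for s in syns_a for h in s.hypernyms()}
    hyp_b = {h for s in syns_b for h in s.hypernyms()}
    return bool(hyp_a & syns_b) or bool(hyp_b & syns_a)

def build_chains(nouns):
    """Greedily attach each noun to the first chain holding a related word."""
    chains = []
    for noun in nouns:
        for chain in chains:
            if any(related(noun, member) for member in chain):
                chain.append(noun)
                break
        else:
            chains.append([noun])
    return chains

def summarize(text, n_sentences=3):
    sentences = sent_tokenize(text)
    nouns = [w.lower() for w, tag in pos_tag(word_tokenize(text)) if tag.startswith("NN")]
    chains = build_chains(nouns)
    mean_len = sum(map(len, chains)) / max(len(chains), 1)
    strong = [c for c in chains if len(c) > 1 and len(c) > mean_len]   # "strong" chains
    strong_words = {w for c in strong for w in c}
    # Extract the sentences that mention the most words from strong chains.
    ranked = sorted(sentences,
                    key=lambda s: sum(w.lower() in strong_words for w in word_tokenize(s)),
                    reverse=True)
    return ranked[:n_sentences]
```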

1,047 citations

Proceedings ArticleDOI
20 Jun 1999
TL;DR: A method that identifies and synthesizes similar elements across related text from a set of multiple documents; the approach is unique in its use of language generation to reformulate the wording of the summary.
Abstract: We present a method to automatically generate a concise summary by identifying and synthesizing similar elements across related text from a set of multiple documents. Our approach is unique in its usage of language generation to reformulate the wording of the summary.

433 citations

Proceedings Article
02 Jun 2010
TL;DR: A novel deterministic dependency parsing algorithm that attempts to create the easiest arcs in the dependency structure first, in a non-directional manner; it is significantly more accurate than best-first transition-based parsers and nears the performance of globally optimized parsing models.
Abstract: We present a novel deterministic dependency parsing algorithm that attempts to create the easiest arcs in the dependency structure first in a non-directional manner. Traditional deterministic parsing algorithms are based on a shift-reduce framework: they traverse the sentence from left to right and, at each step, perform one of a possible set of actions, until a complete tree is built. A drawback of this approach is that it is extremely local: while decisions can be based on complex structures on the left, they can look only at a few words to the right. In contrast, our algorithm builds a dependency tree by iteratively selecting the best pair of neighbours to connect at each parsing step. This allows incorporation of features from already built structures both to the left and to the right of the attachment point. The parser learns both the attachment preferences and the order in which they should be performed. The result is a deterministic, best-first, O(n log n) parser, which is significantly more accurate than best-first transition-based parsers, and nears the performance of globally optimized parsing models.
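The loop below is a schematic sketch of the easy-first, non-directional attachment idea: repeatedly score every pair of neighbouring partial-tree roots and attach the most confident pair. The scorer is a stand-in (the paper learns it from features of the partial structures), and this naive version rescans all pairs at each step, so it runs in O(n^2) rather than the O(n log n) achieved with a priority queue over cached scores.

```python
# Schematic easy-first attachment loop; the scorer is a placeholder.
def easy_first_parse(words, score):
    """words: list of tokens; score(head, dep) -> float, higher = more confident.
    Returns a list of (head_index, dep_index) arcs."""
    pending = list(range(len(words)))   # indices of the current partial-tree roots
    arcs = []
    while len(pending) > 1:
        best = None
        for i in range(len(pending) - 1):
            left, right = pending[i], pending[i + 1]
            # Try both attachment directions between neighbouring roots.
            for head, dep in ((left, right), (right, left)):
                s = score(words[head], words[dep])
                if best is None or s > best[0]:
                    best = (s, head, dep)
        _, head, dep = best
        arcs.append((head, dep))
        pending.remove(dep)             # dep's subtree now hangs off head
    return arcs

# Toy usage with a dummy scorer that prefers short dependents of long heads.
toy_score = lambda head, dep: len(head) - len(dep)
print(easy_first_parse(["a", "brown", "fox", "jumped"], toy_score))
```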

219 citations

DOI
01 Jan 1998
TL;DR: The results show that different parameters of an experiment can dramatically affect how well a system scores; the paper describes how these parameters can be controlled to produce a sound evaluation.
Abstract: Two methods are used for evaluation of summarization systems: an evaluation of generated summaries against an "ideal" summary, and evaluation of how well summaries help a person perform in a task such as information retrieval. We carried out two large experiments to study the two evaluation methods. Our results show that different parameters of an experiment can dramatically affect how well a system scores. For example, summary length was found to affect both types of evaluations. For the "ideal" summary based evaluation, accuracy decreases as summary length increases, while for task-based evaluations summary length and accuracy on an information retrieval task appear to correlate randomly. In this paper, we show how this parameter and others can affect evaluation results and describe how parameters can be controlled to produce a sound evaluation.

218 citations

Proceedings ArticleDOI
01 Jan 1996
TL;DR: This paper describes a short demo providing an overview of SURGE (Systemic Unification Realization Grammar of English), a syntactic realization front-end for natural language generation systems.
Abstract: This paper describes a short demo providing an overview of SURGE (Systemic Unification Realization Grammar of English), a syntactic realization front-end for natural language generation systems. Developed over the last seven years, it embeds one of the most comprehensive computational grammars of English for generation available to date. It has been successfully reused in eight generators that have little in common in terms of architecture. It has also been used for teaching natural language generation at several academic institutions.

160 citations


Cited by
Journal ArticleDOI
TL;DR: In this article, exact dynamic programming algorithms based on carefully measured thermodynamic parameters are used to compute ground states, base pairing probabilities, and thermodynamic properties of nucleic acids.
Abstract: Background Secondary structure forms an important intermediate level of description of nucleic acids that encapsulates the dominating part of the folding energy, is often well conserved in evolution, and is routinely used as a basis to explain experimental findings. Based on carefully measured thermodynamic parameters, exact dynamic programming algorithms can be used to compute ground states, base pairing probabilities, as well as thermodynamic properties.
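To show the shape of the exact dynamic programming the abstract refers to, here is a minimal Nussinov-style recursion that merely maximizes the number of complementary base pairs; the actual tools minimize free energy using measured thermodynamic parameters, so this counting version is only a simplified stand-in.

```python
# Minimal Nussinov-style DP: maximize complementary base pairs (not free energy).
def nussinov(seq, min_loop=3):
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    dp = [[0] * n for _ in range(n)]            # dp[i][j] = max pairs in seq[i..j]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]                 # case 1: j is unpaired
            for k in range(i, j - min_loop):    # case 2: j pairs with some k
                if (seq[k], seq[j]) in pairs:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1] if n else 0

print(nussinov("GGGAAAUCC"))   # small hairpin-like example
```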

3,620 citations

Journal ArticleDOI
01 Jun 1959

3,442 citations

01 Jun 2005

3,154 citations

Journal ArticleDOI
TL;DR: LexRank, as discussed by the authors, is a stochastic graph-based method for computing the relative importance of textual units for Natural Language Processing (NLP); it is based on the concept of eigenvector centrality in a graph representation of sentences.
Abstract: We introduce a stochastic graph-based method for computing relative importance of textual units for Natural Language Processing. We test the technique on the problem of Text Summarization (TS). Extractive TS relies on the concept of sentence salience to identify the most important sentences in a document or set of documents. Salience is typically defined in terms of the presence of particular important words or in terms of similarity to a centroid pseudo-sentence. We consider a new approach, LexRank, for computing sentence importance based on the concept of eigenvector centrality in a graph representation of sentences. In this model, a connectivity matrix based on intra-sentence cosine similarity is used as the adjacency matrix of the graph representation of sentences. Our system, based on LexRank, ranked in first place in more than one task in the recent DUC 2004 evaluation. In this paper, we present a detailed analysis of our approach and apply it to a larger data set including data from earlier DUC evaluations. We discuss several methods to compute centrality using the similarity graph. The results show that degree-based methods (including LexRank) outperform both centroid-based methods and other systems participating in DUC in most of the cases. Furthermore, the LexRank with threshold method outperforms the other degree-based techniques, including continuous LexRank. We also show that our approach is quite insensitive to the noise in the data that may result from an imperfect topical clustering of documents.
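A compact sketch of the thresholded-LexRank idea follows: build a cosine-similarity graph over sentences and rank them with a PageRank-style power iteration. Plain TF-IDF stands in for the paper's IDF-modified cosine, and the threshold and damping values below are illustrative, not the paper's settings.

```python
# Thresholded LexRank sketch: cosine-similarity graph + power iteration.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def lexrank(sentences, threshold=0.1, damping=0.85, iters=100):
    tfidf = TfidfVectorizer().fit_transform(sentences)
    sim = (tfidf @ tfidf.T).toarray()            # rows are L2-normalized, so this is cosine similarity
    adj = (sim > threshold).astype(float)        # thresholded adjacency matrix
    np.fill_diagonal(adj, 0.0)
    row_sums = adj.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0                # avoid division by zero for isolated sentences
    transition = adj / row_sums                  # row-stochastic transition matrix
    n = len(sentences)
    scores = np.full(n, 1.0 / n)
    for _ in range(iters):                       # power iteration with damping
        scores = (1 - damping) / n + damping * (transition.T @ scores)
    return scores

sents = ["The cat sat on the mat.",
         "A cat was sitting on a mat.",
         "Stocks fell sharply on Monday."]
print(lexrank(sents))
```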

2,367 citations

Proceedings Article
11 Jul 2010
TL;DR: This work evaluates Brown clusters, Collobert and Weston (2008) embeddings, and HLBL (Mnih & Hinton, 2009) embeddings of words on both NER and chunking, and finds that each of the three word representations improves the accuracy of these baselines.
Abstract: If we take an existing supervised NLP system, a simple and general way to improve accuracy is to use unsupervised word representations as extra word features. We evaluate Brown clusters, Collobert and Weston (2008) embeddings, and HLBL (Mnih & Hinton, 2009) embeddings of words on both NER and chunking. We use near state-of-the-art supervised baselines, and find that each of the three word representations improves the accuracy of these baselines. We find further improvements by combining different word representations. You can download our word features, for off-the-shelf use in existing NLP systems, as well as our code, here: http://metaoptimize.com/projects/wordreprs/
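The recipe is easy to sketch: concatenate an unsupervised word representation onto each token's hand-crafted feature vector before training the supervised model. In the toy example below the embedding table, features, and data are all made up for illustration; a real system would load Brown clusters or pretrained embeddings such as those linked above.

```python
# Toy illustration: word representations as extra features for a supervised tagger.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical pretrained representations (in practice read from a file).
fake_embeddings = {"paris": [0.9, 0.1], "france": [0.8, 0.2], "runs": [0.1, 0.9]}

def token_features(word, use_embedding=True):
    base = [float(word[0].isupper()), float(word.isdigit()), len(word) / 10.0]
    if use_embedding:
        base += fake_embeddings.get(word.lower(), [0.0, 0.0])   # OOV -> zero vector
    return base

# Tiny training set: 1 = part of a location name, 0 = not.
words = ["Paris", "France", "runs", "Paris", "runs"]
labels = [1, 1, 0, 1, 0]
X = np.array([token_features(w) for w in words])
clf = LogisticRegression().fit(X, labels)
print(clf.predict(np.array([token_features("France")])))
```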

2,243 citations