Showing papers by "Michael Collins published in 2000"

PDF

Open Access

Journal Article•DOI•

Logistic Regression, AdaBoost and Bregman Distances

[...]

Michael Collins¹, Robert E. Schapire¹, Yoram Singer²•Institutions (2)

AT&T Labs¹, Hebrew University of Jerusalem²

28 Jun 2000

TL;DR: A unified account of boosting and logistic regression in which each learning problem is cast in terms of optimization of Bregman distances, and a parameterized family of algorithms that includes both a sequential- and a parallel-update algorithm as special cases are described, thus showing how the sequential and parallel approaches can themselves be unified.

...read moreread less

Abstract: We give a unified account of boosting and logistic regression in which each learning problem is cast in terms of optimization of Bregman distances. The striking similarity of the two problems in this framework allows us to design and analyze algorithms for both simultaneously, and to easily adapt algorithms designed for one problem to the other. For both problems, we give new algorithms and explain their potential advantages over existing methods. These algorithms are iterative and can be divided into two types based on whether the parameters are updated sequentially (one at a time) or in parallel (all at once). We also describe a parameterized family of algorithms that includes both a sequential- and a parallel-update algorithm as special cases, thus showing how the sequential and parallel approaches can themselves be unified. For all of the algorithms, we give convergence proofs using a general formalization of the auxiliary-function proof technique. As one of our sequential-update algorithms is equivalent to AdaBoost, this provides the first general proof of convergence for AdaBoost. We show that all of our algorithms generalize easily to the multiclass case, and we contrast the new algorithms with the iterative scaling algorithm. We conclude with a few experimental results with synthetic data that highlight the behavior of the old and newly proposed algorithms in different settings.

...read moreread less

730 citations

Proceedings Article•

Discriminative Reranking for Natural Language Parsing

[...]

Michael Collins

29 Jun 2000

TL;DR: The boosting approach to ranking problems described in Freund et al. (1998) is applied to parsing the Wall Street Journal treebank, and it is argued that the method is an appealing alternative-in terms of both simplicity and efficiency-to work on feature selection methods within log-linear (maximum-entropy) models.

...read moreread less

Abstract: This article considers approaches which rerank the output of an existing probabilistic parser. The base parser produces a set of candidate parses for each input sentence, with associated probabilities that define an initial ranking of these parses. A second model then attempts to improve upon this initial ranking, using additional features of the tree as evidence. The strength of our approach is that it allows a tree to be represented as an arbitrary set of features, without concerns about how these features interact or overlap and without the need to define a derivation or a generative model which takes these features into account. We introduce a new method for the reranking task, based on the boosting approach to ranking problems described in Freund et al. (1998). We apply the boosting method to parsing the Wall Street Journal treebank. The method combined the log-likelihood under a baseline model (that of Collins [1999]) with evidence from an additional 500,000 features over parse trees that were not included in the original model. The new model achieved 89.75 F-measure, a 13 relative decrease in F-measure error over the baseline model's score of 88.2. The article also introduces a new algorithm for the boosting approach which takes advantage of the sparsity of the feature space in the parsing data. Experiments show significant efficiency gains for the new algorithm over the obvious implementation of the boosting approach. We argue that the method is an appealing alternative-in terms of both simplicity and efficiency-to work on feature selection methods within log-linear (maximum-entropy) models. Although the experiments in this article are on natural language parsing (NLP), the approach should be applicable to many other NLP problems which are naturally framed as ranking tasks, for example, speech recognition, machine translation, or natural language generation.

...read moreread less

500 citations

Proceedings Article•DOI•

Answer Extraction

[...]

Steven Abney¹, Michael Collins¹, Amit Singhal¹•Institutions (1)

AT&T¹

29 Apr 2000

TL;DR: This paper describes a system that attempts to retrieve a much smaller section of text, namely, a direct answer to a user's question, using the SMART IR system to extract a ranked set of passages that are relevant to the query.

...read moreread less

Abstract: Information retrieval systems have typically concentrated on retrieving a set of documents which are relevant to a user's query. This paper describes a system that attempts to retrieve a much smaller section of text, namely, a direct answer to a user's question. The SMART IR system is used to extract a ranked set of passages that are relevant to the query. Entities are extracted from these passages as potential answers to the question, and ranked for plausibility according to how well their type matches the query, and according to their frequency and position in the passages. The system was evaluated at the TREC-8 question answering track: we give results and error analysis on these queries.

...read moreread less

161 citations

Proceedings Article•DOI•

The Rules Behind Roles: Identifying Speaker Role in Radio Broadcasts

[...]

Regina Barzilay¹, Michael Collins², Julia Hirschberg², Steve Whittaker²•Institutions (2)

Columbia University¹, AT&T²

30 Jul 2000

TL;DR: An algorithm is implemented that classies story segments into three Speaker Roles based on several content and duration features and correctly classies about 80% of segments when applied to ASR derived transcriptions of broadcast data.

...read moreread less

Abstract: Previous work has shown that providing information about story structure is critical for browsing audio broadcasts. We investigate the hypothesis that Speaker Role is an important cue to story structure. We implement an algorithm that classies story segments into three Speaker Roles based on several content and duration features. The algorithm correctly classies about 80% of segments (compared with a baseline frequency of 35.4%) when applied to ASR derived transcriptions of broadcast data.

...read moreread less

109 citations

Proceedings Article•DOI•

Improving intonational phrasing with syntactic information

[...]

Philipp Koehn¹, Steven Abney¹, Julia Hirschberg¹, Michael Collins¹•Institutions (1)

AT&T¹

05 Jun 2000

TL;DR: This paper improved upon this work by adding syntactic information gained from a high-accuracy parser and reported significant improvement using various experimental setups, and also showed that their improved method comes close to interannotator agreement.

...read moreread less

Abstract: The prediction of intonational phrase boundaries from raw text is an important step for a text-to-speech system: locating where to place short pauses enables more natural sounding speech, that can be more easily understood. We improved upon earlier work [Hirschberg and Prieto, 1996] by adding syntactic information gained from a high-accuracy parser [Collins, 1999]. We report significant improvement using various experimental setups. We also show that our improved method comes close to interannotator agreement.

...read moreread less

60 citations

Journal Article•DOI•

Procedural costs of digital vs. analog archiving of diagnostic cardiac catheterizations.

[...]

Matthew E. Oetgen, Gishel New, Issam Moussa, Stephen Balter, Michael Collins, Sriram S. Iyer, Gary S. Roubin, Antonio Colombo, Jeffrey W. Moses¹ - Show less +5 more•Institutions (1)

Lenox Hill Hospital¹

01 Mar 2000-Catheterization and Cardiovascular Interventions

TL;DR: There is a procedural cost savings in a cardiac catheterization room that uses digital CDs versus cineangiogram film as the archival media, and this cost difference was eliminated when recording media costs were excluded from analysis.

...read moreread less

Abstract: The use of digital technology in the cardiac catheterization laboratory is expanding at a rapid pace. The cost-effectiveness of this new technology is yet to be proven. The aims of this study were to determine the direct cost differences of digital versus analog media (CDs) for the storage of diagnostic cardiac catheterizations and to explore the factors influencing these differences. Procedural costs of all diagnostic angiograms (n = 109), from three physicians, performed in an analog catheterization laboratory (room A) and a digital catheterization laboratory (room C) were compared during a 9-month period. The mean procedural cost was higher in room A than in room C ($1,102 vs. $1,087, P < 0.001). This cost difference was eliminated when recording media costs were excluded from analysis ($1,079 vs. $1,080, P = 0.931). Therefore, we conclude there is a procedural cost savings in a cardiac catheterization room that uses digital CDs versus cineangiogram film as the archival media. Cathet. Cardiovasc. Intervent. 49:246-250, 2000.

...read moreread less

2 citations