
Showing papers on "Chunking (computing)" published in 2000


Proceedings ArticleDOI
13 Sep 2000
TL;DR: The CoNLL-2000 shared task is described: dividing text into syntactically related, non-overlapping groups of words, so-called text chunking.
Abstract: We describe the CoNLL-2000 shared task: dividing text into syntactically related non-overlapping groups of words, so-called text chunking. We give background information on the data sets, present a general overview of the systems that have taken part in the shared task and briefly discuss their performance.

855 citations
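
For context, the CoNLL-2000 data represents chunks with per-word IOB tags (B-X opens a chunk of type X, I-X continues it, O falls outside any chunk). Below is a minimal Python sketch of reading chunk spans back out of such a tagging; the example sentence and the helper name iob_to_chunks are illustrative only.

```python
# Minimal sketch: recover chunk spans from CoNLL-2000 style IOB tags.
# Each token is (word, POS, chunk_tag); the sentence below is illustrative.
tagged = [
    ("He", "PRP", "B-NP"), ("reckons", "VBZ", "B-VP"),
    ("the", "DT", "B-NP"), ("current", "JJ", "I-NP"),
    ("account", "NN", "I-NP"), ("deficit", "NN", "I-NP"),
    ("will", "MD", "B-VP"), ("narrow", "VB", "I-VP"),
    (".", ".", "O"),
]

def iob_to_chunks(tokens):
    """Group consecutive B-X / I-X tags into (type, [words]) chunks."""
    chunks, current = [], None
    for word, _pos, tag in tokens:
        if tag.startswith("B-"):
            current = (tag[2:], [word])
            chunks.append(current)
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(word)
        elif tag.startswith("I-"):          # stray I- tag: treat as a chunk start
            current = (tag[2:], [word])
            chunks.append(current)
        else:                               # "O": outside any chunk
            current = None
    return chunks

print(iob_to_chunks(tagged))
# [('NP', ['He']), ('VP', ['reckons']),
#  ('NP', ['the', 'current', 'account', 'deficit']), ('VP', ['will', 'narrow'])]
```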


Posted Content
TL;DR: The CoNLL-2000 shared task on text chunking, as discussed by the authors, divided text into syntactically related, non-overlapping groups of words, so-called text chunks.
Abstract: We describe the CoNLL-2000 shared task: dividing text into syntactically related non-overlapping groups of words, so-called text chunking. We give background information on the data sets, present a general overview of the systems that have taken part in the shared task and briefly discuss their performance.

194 citations


Proceedings ArticleDOI
13 Sep 2000
TL;DR: A first attempt to create a text chunker using a Maximum Entropy model is discussed, implementing classifiers that tag every word in a sentence with a phrase-tag using very local lexical information, part-of-speech tags and phrase tags of surrounding words.
Abstract: In this paper I discuss a first attempt to create a text chunker using a Maximum Entropy model. The first experiments, implementing classifiers that tag every word in a sentence with a phrase-tag using very local lexical information, part-of-speech tags and phrase tags of surrounding words, give encouraging results.

91 citations
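
As an illustration of the kind of local features such a per-word classifier could use (the current and neighbouring words, their part-of-speech tags, and the phrase tags already assigned to the left), here is a small Python sketch; the window size and feature names are assumptions, not the paper's exact feature set.

```python
# Sketch of local features for a per-word maximum-entropy chunk tagger.
# Window size and feature names are illustrative, not the paper's exact set.
def chunk_features(words, pos_tags, prev_chunk_tags, i):
    def at(seq, j, pad="<PAD>"):
        return seq[j] if 0 <= j < len(seq) else pad

    return {
        "w0": words[i], "w-1": at(words, i - 1), "w+1": at(words, i + 1),
        "p0": pos_tags[i], "p-1": at(pos_tags, i - 1), "p+1": at(pos_tags, i + 1),
        # phrase tags already assigned to the words on the left
        "c-1": at(prev_chunk_tags, i - 1), "c-2": at(prev_chunk_tags, i - 2),
    }

# A maxent / logistic-regression model would then be trained on these feature
# dicts, e.g. with scikit-learn's DictVectorizer + LogisticRegression.
words = ["the", "current", "account", "deficit"]
pos   = ["DT", "JJ", "NN", "NN"]
print(chunk_features(words, pos, ["B-NP", "I-NP"], 2))
```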


Proceedings ArticleDOI
31 Jul 2000
TL;DR: An integrated system for tagging and chunking texts in a given language is presented, based on stochastic finite-state models that are learnt automatically, which makes the system very flexible and portable.
Abstract: In this paper we present an integrated system for tagging and chunking texts from a certain language. The approach is based on stochastic finite-state models that are learnt automatically. This includes bigram models or finite-state automata learnt using grammatical inference techniques. As the models involved in our system are learnt automatically, this is a very flexible and portable system. In order to show the viability of our approach we present results for tagging and chunking using bigram models on the Wall Street Journal corpus. We have achieved an accuracy rate for tagging of 96.8%, and a precision rate for NP chunks of 94.6% with a recall rate of 93.6%.

57 citations
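
For readers unfamiliar with the bigram models mentioned here, the sketch below shows Viterbi decoding for a first-order (bigram) HMM tagger; the tiny probability tables are invented purely to show the mechanics and bear no relation to the models trained on the Wall Street Journal corpus.

```python
import math

# Minimal Viterbi decoding for a bigram (first-order) HMM tagger.
# The tiny probability tables below are invented purely for illustration.
def viterbi(tokens, tags, trans, emit, start):
    # trans[(t_prev, t)] = P(t | t_prev), emit[(t, w)] = P(w | t)
    def lp(p):  # log probability with a floor for unseen events
        return math.log(p) if p > 0 else -1e9

    best = {t: (lp(start.get(t, 0)) + lp(emit.get((t, tokens[0]), 0)), [t])
            for t in tags}
    for w in tokens[1:]:
        new = {}
        for t in tags:
            score, path = max(
                (prev_score + lp(trans.get((pt, t), 0)) + lp(emit.get((t, w), 0)),
                 prev_path)
                for pt, (prev_score, prev_path) in best.items())
            new[t] = (score, path + [t])
        best = new
    return max(best.values())[1]

tags  = ["DT", "NN", "VBZ"]
start = {"DT": 0.8, "NN": 0.1, "VBZ": 0.1}
trans = {("DT", "NN"): 0.9, ("NN", "VBZ"): 0.8, ("NN", "NN"): 0.1, ("VBZ", "DT"): 0.6}
emit  = {("DT", "the"): 0.7, ("NN", "dog"): 0.4, ("VBZ", "barks"): 0.3}
print(viterbi(["the", "dog", "barks"], tags, trans, emit, start))  # ['DT', 'NN', 'VBZ']
```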


Proceedings ArticleDOI
13 Sep 2000
TL;DR: A system-internal combination of memory-based learning classifiers is applied to the CoNLL-2000 shared task of finding base chunks, examining whether dividing the chunking process into a boundary recognition phase and a type identification phase aids performance.
Abstract: We will apply a system-internal combination of memory-based learning classifiers to the CoNLL-2000 shared task: finding base chunks. Apart from testing different combination methods, we will also examine if dividing the chunking process into a boundary recognition phase and a type identification phase would aid performance.

57 citations
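
The two ideas in the abstract can be illustrated in a few lines: per-position majority voting over several classifiers' chunk-tag predictions, and splitting a chunk tag into a boundary decision (B/I/O) and a separate type decision (NP/VP/...). The classifier outputs below are invented; the actual paper combines memory-based learners.

```python
from collections import Counter

def majority_vote(predictions_per_classifier):
    """Per-position majority vote over several classifiers' tag sequences."""
    return [Counter(tags_at_i).most_common(1)[0][0]
            for tags_at_i in zip(*predictions_per_classifier)]

def split_tag(tag):
    """'B-NP' -> ('B', 'NP'); 'O' -> ('O', None): boundary vs. type decision."""
    return (tag[0], tag[2:]) if "-" in tag else (tag, None)

def join_tag(boundary, chunk_type):
    """Recombine the two decisions into a full chunk tag."""
    return boundary if boundary == "O" else f"{boundary}-{chunk_type}"

preds = [
    ["B-NP", "I-NP", "B-VP", "O"],
    ["B-NP", "I-NP", "I-VP", "O"],
    ["B-NP", "B-NP", "B-VP", "O"],
]
combined = majority_vote(preds)
print(combined)                          # ['B-NP', 'I-NP', 'B-VP', 'O']
print([split_tag(t) for t in combined])  # boundary/type decomposition
```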


Patent
Michael S. Lopke1
03 Oct 2000
TL;DR: In this article, a computer executes a translation program for translating a phrase from a first language to a second language; the program identifies a translatable portion of the phrase, locates a translation pair whose first-language chunk most closely matches that portion, and replaces at least part of it with the second-language chunk from the located pair.
Abstract: In one embodiment, a computer executes a translation program for translating a phrase from a first language to a second language. Initially, the executing program receives the phrase to be translated. It then identifies in the phrase a translatable portion thereof. Next, it locates from a hierarchically-ordered expanded list of translation pairs, which each have equivalent first-language and second-language chunks, a pair having a first-language chunk that most closely matches the translatable portion, with at least a part of the translatable portion being identical to the located first-language chunk. Finally, the executing program replaces at least a part of the translatable portion with the second-language chunk from the located translation pair. In this way, the executing program at least partially (if not completely) automatically translates the input phrase from the first language to the second language.

50 citations
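
A conceptual sketch of the lookup-and-replace step the patent describes: scan an ordered list of (first-language chunk, second-language chunk) pairs and substitute the first (most specific) chunk found inside the input phrase. The pair list, its ordering, and the matching rule here are illustrative assumptions, not the patent's hierarchically-ordered expanded list.

```python
# Conceptual sketch of chunk-based lookup-and-replace translation.
# The pairs, their ordering, and the matching rule are illustrative only.
translation_pairs = [
    ("kick the bucket", "casser sa pipe"),
    ("the bucket", "le seau"),
    ("bucket", "seau"),
]  # ordered so that longer / more specific chunks are tried first

def translate_chunk(phrase):
    for source_chunk, target_chunk in translation_pairs:
        if source_chunk in phrase:
            # replace the matched portion, leave the rest of the phrase untouched
            return phrase.replace(source_chunk, target_chunk, 1)
    return phrase  # no pair matched: phrase left untranslated

print(translate_chunk("he will kick the bucket soon"))
# 'he will casser sa pipe soon'
```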


Proceedings ArticleDOI
31 Jul 2000
TL;DR: A method is proposed for incorporating richer contextual information as well as patterns of constituent morphemes within a named entity, which have not been considered in previous research, and it is shown that the proposed method outperforms these previous approaches.
Abstract: This paper focuses on the issue of named entity chunking in Japanese named entity recognition. We apply the supervised decision list learning method to Japanese named entity recognition. We also investigate and incorporate several named-entity noun phrase chunking techniques and experimentally evaluate and compare their performance. In addition, we propose a method for incorporating richer contextual information as well as patterns of constituent morphemes within a named entity, which have not been considered in previous research, and show that the proposed method outperforms these previous approaches.

34 citations
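
A decision list, as used here, is an ordered set of rules sorted by a confidence score (typically a log-likelihood ratio); the first rule whose condition matches the current context decides the tag. The following sketch only shows that control flow; the features, rules, and scores are invented, not taken from the paper.

```python
# Minimal sketch of applying a decision list for named entity chunking.
# Rules are assumed already sorted by confidence; the first match decides.
decision_list = [
    # (feature, value, predicted tag, confidence) -- illustrative rules only
    ("suffix", "会社", "B-ORGANIZATION", 4.2),
    ("prev_word", "社長", "B-PERSON", 3.1),
    ("pos", "NOUN", "O", 0.5),                 # low-confidence catch-all rule
]

def apply_decision_list(context, default="O"):
    """context: dict mapping feature name -> value for the current word."""
    for feature, value, tag, _confidence in decision_list:
        if context.get(feature) == value:
            return tag
    return default

print(apply_decision_list({"suffix": "会社", "pos": "NOUN"}))  # 'B-ORGANIZATION'
print(apply_decision_list({"pos": "VERB"}))                    # 'O' (default)
```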


Proceedings ArticleDOI
13 Sep 2000
TL;DR: Compared with a standard HMM-based tagger, this tagger incorporates more contextual information into a lexical entry, and an error-driven learning approach is adopted to decrease the memory requirement by keeping only positive lexical entries and to make it possible to further incorporate more context-dependent lexical entries.
Abstract: This paper proposes an error-driven HMM-based text chunk tagger with a context-dependent lexicon. Compared with a standard HMM-based tagger, this tagger incorporates more contextual information into a lexical entry. Moreover, an error-driven learning approach is adopted to decrease the memory requirement by keeping only positive lexical entries, and makes it possible to further incorporate more context-dependent lexical entries. Finally, memory-based learning is adopted to further improve the performance of the chunk tagger.

33 citations
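
The error-driven selection idea can be sketched as follows: a candidate context-dependent lexical entry is kept only if adding it reduces errors on held-out data (a "positive" entry), which keeps the lexicon small. The function below is a schematic outline under that reading; the data structures and the tag_with callback are assumptions, not the paper's implementation.

```python
# Schematic sketch of error-driven selection of context-dependent lexical
# entries: an entry is kept only if it corrects more errors than it introduces.
def select_positive_entries(candidates, tag_with, dev_sentences):
    """
    candidates: list of candidate context-dependent lexical entries.
    tag_with(entries, sentence) -> predicted chunk tags for the sentence.
    dev_sentences: list of (sentence, gold_tags) pairs.
    """
    def error_count(entries):
        return sum(p != g
                   for sent, gold in dev_sentences
                   for p, g in zip(tag_with(entries, sent), gold))

    kept = []
    baseline_errors = error_count(kept)
    for entry in candidates:
        errors = error_count(kept + [entry])
        if errors < baseline_errors:     # "positive" entry: it reduces errors
            kept.append(entry)
            baseline_errors = errors
    return kept
```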


Proceedings ArticleDOI
13 Sep 2000
TL;DR: This work exploits chunking in two principal ways: first, as part of the authors' extraction system (Alembic) (Aberdeen et al., 1995), the chunker delineates descriptor phrases for entity extraction; second, as part of ongoing research in parsing, chunks provide the first level of a stratified approach to syntax.
Abstract: For several years, chunking has been an integral part of MITRE's approach to information extraction. Our work exploits chunking in two principal ways. First, as part of our extraction system (Alembic) (Aberdeen et al., 1995), the chunker delineates descriptor phrases for entity extraction. Second, as part of our ongoing research in parsing, chunks provide the first level of a stratified approach to syntax - the second level is defined by grammatical relations, much as in the SPARKLE effort (Carroll et al., 1997).

31 citations


Journal ArticleDOI
Bangalore Srinivas1
TL;DR: A novel approach to partial parsing that produces dependency links between the words of a sentence is presented, together with a proposal for a general framework for parser evaluation that is applicable to both constituency-based and dependency-based, partial and complete parsers.
Abstract: In this paper, we present a novel approach to partial parsing that produces dependency links between words of a sentence. The partial parser, called a lightweight dependency analyzer, uses information encoded in supertags and hence can produce constituency-based as well as dependency-based analyses. The lightweight dependency analyzer has been used for text chunking, including noun and verb group chunking. We also present a proposal for a general framework for parser evaluation that is applicable for evaluating both constituency-based and dependency-based, partial and complete parsers. The performance results of the lightweight dependency analyzer on the Wall Street Journal and Brown corpora using the proposed evaluation metrics are discussed.

20 citations
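
Very roughly, a supertag lists the categories a word expects to its left and right, and a lightweight dependency analyzer links each expectation to a nearby word of that category. The sketch below only illustrates that idea with toy "supertags"; real supertags are elementary LTAG trees, and the linking procedure in the paper is considerably more refined.

```python
# Very simplified sketch of lightweight dependency analysis: each toy
# "supertag" lists the categories a word expects to its left and right,
# and each expectation is linked to the nearest word of that category.
sentence  = ["the", "dog", "chased", "a", "cat"]
supertags = [
    {"cat": "DET",  "left": [],       "right": []},
    {"cat": "NOUN", "left": ["DET"],  "right": []},
    {"cat": "VERB", "left": ["NOUN"], "right": ["NOUN"]},
    {"cat": "DET",  "left": [],       "right": []},
    {"cat": "NOUN", "left": ["DET"],  "right": []},
]

def dependency_links(words, tags):
    links = []  # (head index, dependent index)
    for i, tag in enumerate(tags):
        for wanted in tag["left"]:    # nearest matching word to the left
            for j in range(i - 1, -1, -1):
                if tags[j]["cat"] == wanted:
                    links.append((i, j))
                    break
        for wanted in tag["right"]:   # nearest matching word to the right
            for j in range(i + 1, len(words)):
                if tags[j]["cat"] == wanted:
                    links.append((i, j))
                    break
    return links

for head, dep in dependency_links(sentence, supertags):
    print(f"{sentence[dep]} <- {sentence[head]}")
# the <- dog, dog <- chased, cat <- chased, a <- cat
```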



Journal ArticleDOI
01 Jun 2000
TL;DR: The learning paradigm CLOG is modified to produce transformation lists, and several interesting conclusions about Noun Phrase chunking are reached.
Abstract: The identification of phrases in a sentence can be useful as a pre-processing step before attempting full parsing. There is already much literature about finding simple non-recursive, non-overlapping Noun Phrases. We have modified the learning paradigm CLOG [4] to produce transformation lists, and we arrived at several interesting conclusions about Noun Phrase chunking. IBM APL2 was used to build a prototype that was later rewritten in C++ for performance purposes.
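
A transformation list in this (Brill-style) sense is an ordered sequence of rewrite rules applied to an initial tagging, each rule changing a word's chunk tag when its context matches. The sketch below shows that mechanism; the rules and tags are invented for illustration and are not CLOG's output.

```python
# Sketch of applying a transformation list to an initial NP-chunk tagging:
# rules are applied in order, each rewriting a tag when its context matches.
def apply_transformations(pos_tags, chunk_tags, rules):
    tags = list(chunk_tags)
    for rule in rules:                    # the rule list is *ordered*
        for i in range(len(tags)):
            if rule["from"] == tags[i] and rule["when"](pos_tags, tags, i):
                tags[i] = rule["to"]
    return tags

rules = [
    # A determiner followed by a noun starts an NP.
    {"from": "O", "to": "B-NP",
     "when": lambda pos, tags, i: pos[i] == "DT"
             and i + 1 < len(pos) and pos[i + 1].startswith("NN")},
    # A noun right after a B-NP/I-NP word continues the NP.
    {"from": "O", "to": "I-NP",
     "when": lambda pos, tags, i: pos[i].startswith("NN")
             and i > 0 and tags[i - 1] in ("B-NP", "I-NP")},
]

pos    = ["DT", "NN", "VBD", "DT", "NN"]
chunks = ["O", "O", "O", "O", "O"]        # trivial initial tagging
print(apply_transformations(pos, chunks, rules))
# ['B-NP', 'I-NP', 'O', 'B-NP', 'I-NP']
```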

Proceedings Article
01 Jan 2000
TL;DR: A kernel-based approach to nonlinear classification that combines the generation of “synthetic” points (to be used in the kernel) with “chunking” (working with subsets of the data) in order to significantly reduce the size of the optimization problems required to construct classifiers for massive datasets.
Abstract: We consider a kernel-based approach to nonlinear classification that combines the generation of “synthetic” points (to be used in the kernel) with “chunking” (working with subsets of the data) in order to significantly reduce the size of the optimization problems required to construct classifiers for massive datasets. Rather than solving a single massive classification problem involving all points in the training set, we employ a series of problems that gradually increase in size and which consider kernels based on small numbers of synthetic points. These synthetic points are generated by solving relatively small nonlinear unconstrained optimization problems. In addition to greatly reducing optimization problem size, the procedure that we describe also has the advantage of being easily parallelized. Computational results show that our method efficiently generates high-performance classifiers on a variety of problems involving both real and randomly generated datasets.
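
The "chunking" referred to here is the working-set decomposition used when training kernel classifiers: train on a small subset, add the points the current model gets wrong, and retrain. The sketch below shows that loop using scikit-learn's SVC as a generic kernel classifier; the paper's synthetic-point construction is not reproduced.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Sketch of the "chunking" (working-set) idea for kernel classifiers:
# train on a small subset, add points the current model misclassifies,
# and retrain, instead of solving one massive optimization problem.
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

rng = np.random.default_rng(0)
working = rng.choice(len(X), size=200, replace=False)   # initial small chunk

for _ in range(5):
    model = SVC(kernel="rbf", gamma="scale").fit(X[working], y[working])
    wrong = np.flatnonzero(model.predict(X) != y)        # violators on full data
    new = np.setdiff1d(wrong, working)[:200]             # grow the working set
    if len(new) == 0:
        break
    working = np.concatenate([working, new])

print(f"final working set: {len(working)} of {len(X)} points, "
      f"training accuracy: {model.score(X, y):.3f}")
```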