Showing papers on "Chunking (computing)" published in 2002


Proceedings Article
31 Aug 2002
TL;DR: A new statistical Japanese dependency parser based on a cascaded chunking model. It is simple and efficient because it parses a sentence deterministically, deciding only whether the current segment modifies the segment on its immediate right-hand side.
Abstract: In this paper, we propose a new statistical Japanese dependency parser using a cascaded chunking model. Conventional Japanese statistical dependency parsers are mainly based on a probabilistic model, which is not always efficient or scalable. We propose a new method that is simple and efficient, since it parses a sentence deterministically, deciding only whether the current segment modifies the segment on its immediate right-hand side. Experiments using the Kyoto University Corpus show that the method outperforms previous systems and improves parsing and training efficiency.

524 citations
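
The deterministic loop described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `cascaded_chunk_parse` and `modifies_right` are hypothetical names, `modifies_right` stands in for whatever trained classifier makes the local decision, and the toy sentence and its gold heads are made up for the example. The sketch also assumes a head-final, projective structure, so the second-to-last remaining segment is always attached to the last one, and it re-asks the classifier on every pass for simplicity.

```python
from typing import Callable, List


def cascaded_chunk_parse(
    n_segments: int,
    modifies_right: Callable[[int, int], bool],
) -> List[int]:
    """Return heads[i], the index of the segment that segment i modifies (-1 for the root)."""
    heads = [-1] * n_segments
    alive = list(range(n_segments))  # segments that still need a head

    while len(alive) > 1:
        # Tagging pass: "D" means the segment modifies its immediate right
        # neighbour among the remaining segments, "O" means undecided.
        tags = []
        for pos, seg in enumerate(alive[:-1]):
            is_penultimate = pos == len(alive) - 2  # assumption: always attaches
            decided = is_penultimate or modifies_right(seg, alive[pos + 1])
            tags.append("D" if decided else "O")
        tags.append("O")  # the last segment has no right neighbour

        # Removal pass: a "D" segment is attached and dropped only when nothing
        # to its left can still modify it (it is first, or its left neighbour is "O").
        survivors = []
        for pos, seg in enumerate(alive):
            left_is_open = pos == 0 or tags[pos - 1] == "O"
            if tags[pos] == "D" and left_is_open:
                heads[seg] = alive[pos + 1]
            else:
                survivors.append(seg)
        alive = survivors

    return heads


# Toy sentence (hypothetical): "he / her / warm / heart / was-moved"
gold = [4, 3, 3, 4, -1]
heads = cascaded_chunk_parse(5, lambda left, right: gold[left] == right)
print(heads)  # [4, 3, 3, 4, -1]
```

Each pass removes at least one segment, so the loop terminates after at most `n_segments - 1` passes. Every decision looks only at a segment and its current right neighbour, which is the source of the simplicity the abstract claims.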


01 Jan 2002
TL;DR: It is argued that a chunked syntactic representation can usefully be exploited as such for non-trivial NLP applications that do not require full text understanding, such as automatic lexical acquisition and information retrieval.
Abstract: This paper illustrates a technique of shallow parsing named “text chunking” whereby “parse incompleteness” is reinterpreted as “parse underspecification”. A text is chunked into structured units which can be identified with certainty on the basis of available knowledge. The chunking process stops at the level of granularity beyond which the analysis becomes undecidable. We argue that a chunked syntactic representation can usefully be exploited as such for non-trivial NLP applications that do not require full text understanding, such as automatic lexical acquisition and information retrieval.

28 citations


Proceedings Article
24 Aug 2002
TL;DR: The pre-processing modules for tokenisation, sentence splitting, paragraph segmentation, part-of-speech tagging, clause chunking and noun phrase extraction are outlined and the anaphora resolution module is described.
Abstract: This paper describes LINGUA - an architecture for text processing in Bulgarian. First, the pre-processing modules for tokenisation, sentence splitting, paragraph segmentation, part-of-speech tagging, clause chunking and noun phrase extraction are outlined. Next, the paper proceeds to describe in more detail the anaphora resolution module. Evaluation results are reported for each processing task.

23 citations


Proceedings Article
24 Aug 2002
TL;DR: By unifying POS tagging and chunking in the search process, the algorithm effectively reduces the influence of POS tagging errors on the chunking result; it can also dramatically reduce the search space, support a time-synchronous DP algorithm, and lead to highly consistent chunking.
Abstract: A new statistical method called "bilingual chunking" for structure alignment is proposed. Unlike existing approaches, which align hierarchical structures such as sub-trees, our method conducts alignment on chunks. The alignment is carried out by a simultaneous bilingual chunking algorithm. Using the constraints of chunk correspondence between source language (SL) and target language (TL), our algorithm can dramatically reduce the search space, support a time-synchronous DP algorithm, and lead to highly consistent chunking. Furthermore, by unifying POS tagging and chunking in the search process, our algorithm effectively reduces the influence of POS tagging errors on the chunking result. The experimental results with English-Chinese structure alignment show that our model achieves 90% precision for chunking and 87% precision for chunk alignment.

22 citations


Patent
25 Mar 2002
TL;DR: An apparatus for cutting baitfish that uses a non-symmetrically arcuate biasing member to angularly articulate a baitfish against a plurality of blade surfaces.
Abstract: Apparatus for cutting baitfish employing a non-symmetrically arcuate biasing member for angularly articulating a baitfish against a plurality of blade surfaces. In certain embodiments, the apparatus has an improved lever arm with increased mechanical advantage.

20 citations


Proceedings Article
01 Sep 2002
TL;DR: A hybrid model is presented that combines a Memory-Based Learning method with a disambiguation proposal based on lexical information and grammar rules populated from a large corpus, for chunking 9 types of Chinese base phrases.
Abstract: This paper introduces new definitions of Chinese base phrases and presents a hybrid model that combines a Memory-Based Learning method with a disambiguation proposal based on lexical information and grammar rules populated from a large corpus, for chunking 9 types of Chinese base phrases. Our experiment achieves an accuracy (F-measure) of 93.4%. The significance of the research lies in the fact that it provides a solid foundation for the Chinese parser.

12 citations



Proceedings Article
24 Aug 2002
TL;DR: It is shown that chunking of recursive noun phrases necessitates a readjustment of the finite-state cascades approach, and the property of monotonicity must be given up.
Abstract: The paper describes a method to process recursive noun phrases with finite-state cascades. It is shown that chunking of recursive noun phrases necessitates a readjustment of the finite-state cascades approach. In particular, the property of monotonicity must be given up. Furthermore, the paper explores the influence of POS tags and online agreement checking on the overall performance.

10 citations


Journal Article
TL;DR: A general statistical model for text chunking, based on a generalization of the Winnow algorithm, is proposed and then converted into a classification model.
Abstract: This paper describes a text chunking system based on a generalization of the Winnow algorithm. We propose a general statistical model for text chunking which we then convert into a classification p...

2 citations
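
For orientation, the multiplicative update at the heart of Winnow can be sketched as below. This is the classic Winnow rule for binary features, not the generalization the paper describes, which the truncated abstract does not spell out. The names `predict` and `train_winnow`, the promotion factor `alpha = 2`, the threshold equal to the number of features, and the toy disjunction data are all assumptions made for illustration.

```python
from itertools import product
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[int], int]  # (binary feature vector, label in {0, 1})


def predict(weights: Sequence[float], threshold: float, x: Sequence[int]) -> int:
    """Predict 1 when the weighted sum of active features reaches the threshold."""
    score = sum(w for w, xi in zip(weights, x) if xi)
    return 1 if score >= threshold else 0


def train_winnow(
    examples: Sequence[Example],
    n_features: int,
    alpha: float = 2.0,
    epochs: int = 50,
) -> Tuple[List[float], float]:
    """Classic Winnow: multiply or divide the weights of active features on each mistake."""
    weights = [1.0] * n_features
    threshold = float(n_features)
    for _ in range(epochs):
        mistakes = 0
        for x, y in examples:
            if predict(weights, threshold, x) == y:
                continue
            mistakes += 1
            # Promote active features on a missed positive, demote on a false positive.
            factor = alpha if y == 1 else 1.0 / alpha
            for i, xi in enumerate(x):
                if xi:
                    weights[i] *= factor
        if mistakes == 0:  # a full pass without errors: stop early
            break
    return weights, threshold


# Toy target (hypothetical): the label is 1 when feature 0 or feature 2 is active.
data = [(x, 1 if (x[0] or x[2]) else 0) for x in product([0, 1], repeat=4)]
weights, threshold = train_winnow(data, n_features=4)
print(all(predict(weights, threshold, x) == y for x, y in data))  # True
```

In a classification-based chunker, each token would be turned into a sparse binary feature vector (for example, surrounding words and POS tags) and one such classifier would score each chunk tag. That framing is a common setup and is stated here as an assumption, since the abstract is cut off before it gives details.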