
Showing papers on "Shallow parsing published in 2007"


Proceedings Article
01 Jun 2007
TL;DR: This paper shows how to use corpus statistics to validate and correct the arguments of extracted relation instances, improving the overall RE performance.
Abstract: Many errors produced by unsupervised and semi-supervised relation extraction (RE) systems occur because of wrong recognition of the entities that participate in the relations. This is especially true for systems that do not use separate named-entity recognition components, relying instead on general-purpose shallow parsing. Such systems have greater applicability, because they are able to extract relations that contain attributes of unknown types. However, this generality comes at a cost in accuracy. In this paper we show how to use corpus statistics to validate and correct the arguments of extracted relation instances, improving the overall RE performance. We test the methods on SRES – a self-supervised Web relation extraction system. We also compare the performance of corpus-based methods to the performance of validation and correction methods based on supervised NER components.

51 citations
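As a rough illustration of the corpus-statistics idea described above, the sketch below trims an extracted argument to its most plausible sub-span using n-gram counts from a small tokenized corpus. The scoring heuristic (prefer the longest sub-span whose count clears a threshold) and the toy corpus are assumptions for illustration, not the actual SRES validation procedure.

```python
# Hedged sketch: correct an extracted relation argument's boundaries
# using simple corpus n-gram counts. Heuristic and data are toy
# assumptions, not the SRES procedure itself.
from collections import Counter

def ngram_counts(corpus_sentences, max_n=4):
    """Count all n-grams up to max_n tokens in a tokenized corpus."""
    counts = Counter()
    for tokens in corpus_sentences:
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts

def correct_argument(argument_tokens, counts, min_count=2):
    """Among sub-spans of the argument seen at least min_count times
    in the corpus, return the longest (ties broken by count)."""
    best = None
    for i in range(len(argument_tokens)):
        for j in range(i + 1, len(argument_tokens) + 1):
            cand = tuple(argument_tokens[i:j])
            if counts[cand] >= min_count:
                key = (len(cand), counts[cand])
                if best is None or key > best[0]:
                    best = (key, cand)
    return list(best[1]) if best else argument_tokens

corpus = ["the company Google Inc acquired YouTube in 2006".split(),
          "Google Inc announced a new search product".split(),
          "analysts praised Google Inc after the deal".split()]
counts = ngram_counts(corpus)
print(correct_argument("company Google Inc".split(), counts))  # ['Google', 'Inc']
```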


Journal Article
TL;DR: This paper works on the output of a part-of-speech tagger and uses shallow parsing instead of complex parsing to resolve zero anaphors in written Chinese, employing centering theory and constraint rules to identify the antecedents of zero anaphors as they appear in the preceding utterances.
Abstract: Most traditional approaches to anaphora resolution are based on the integration of complex linguistic information and domain knowledge. However, the construction of a domain knowledge base is very labor-intensive and time-consuming. In this paper, we work on the output of a part-of-speech tagger and use shallow parsing instead of complex parsing to resolve zero anaphors in written Chinese. We employ centering theory and constraint rules to identify the antecedents of zero anaphors as they appear in the preceding utterances. We focus on zero anaphors that occur in the topic, subject, and object positions of utterances. The experimental results show that the precision rate of zero anaphora detection and the recall rate of zero anaphora resolution are 81% and 70%, respectively.

44 citations
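To make the centering-style resolution step concrete, here is a deliberately simplified sketch that ranks candidate noun phrases from the preceding utterances by recency and grammatical role and picks the top one as the antecedent. The ranking weights and the pre-chunked input format are assumptions for illustration; the paper's actual constraint rules are more elaborate.

```python
# Toy sketch: choose an antecedent for a zero anaphor by ranking noun
# phrases from preceding utterances. More recent utterances win, and
# within an utterance topic > subject > object, loosely in the spirit
# of centering theory. The weights are illustrative assumptions.
ROLE_RANK = {"topic": 3, "subject": 2, "object": 1}

def resolve_zero_anaphor(preceding_utterances):
    """preceding_utterances: list (oldest first) of lists of
    (noun_phrase, grammatical_role) pairs from a shallow parser."""
    best_np, best_score = None, (-1, -1)
    for recency, utterance in enumerate(preceding_utterances):
        for np, role in utterance:
            score = (recency, ROLE_RANK.get(role, 0))
            if score > best_score:
                best_np, best_score = np, score
    return best_np

utterances = [[("小明", "subject"), ("一本书", "object")],
              [("那本书", "topic")]]
print(resolve_zero_anaphor(utterances))  # -> 那本书
```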


Journal ArticleDOI
01 Jun 2007
TL;DR: The SemRol method is a corpus-based approach that uses two different statistical models, conditional Maximum Entropy (ME) probability models and the TiMBL memory-based learning program, to determine the semantic role of the constituents of a sentence.
Abstract: In this paper, a method to determine the semantic role of the constituents of a sentence is presented. This method, named SemRol, is a corpus-based approach that uses two different statistical models: conditional Maximum Entropy (ME) probability models and the TiMBL program, a memory-based learner. It consists of three phases that make use of features based on words, lemmas, PoS tags and shallow parsing information. Our method introduces a new phase into the Semantic Role Labeling task, which has usually been approached as a two-phase procedure consisting of argument recognition and argument labeling. From our point of view, the sense of the verbs in the sentence must first be disambiguated, because the set of roles to consider depends on the sense of the verb. Regarding the argument labeling phase, a tuning procedure is presented. As a result of this procedure, one of the best sets of features for the argument labeling task is detected. With this set, which differs for TiMBL and ME, precisions of 76.71% for TiMBL and 70.55% for ME are obtained. Furthermore, the semantic role information provided by our SemRol method could be used as an extension of Information Retrieval or Question Answering systems. We propose using this semantic information as an extension of an Information Retrieval system in order to reduce the number of documents or passages retrieved by the system.

35 citations
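As a rough sketch of how a maximum-entropy argument-labeling phase of this kind can be set up, the snippet below trains scikit-learn's LogisticRegression (a standard ME implementation) on toy word/lemma/PoS/chunk features keyed on a disambiguated verb sense. The feature template and data are illustrative assumptions, not SemRol's actual configuration.

```python
# Hedged sketch of an ME argument-labeling step: LogisticRegression over
# word/lemma/PoS/chunk features plus the disambiguated verb sense.
# Features and toy data are assumptions for illustration only.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def features(word, lemma, pos, chunk, verb_sense):
    return {"w": word, "l": lemma, "p": pos, "c": chunk, "vs": verb_sense}

train_X = [features("John", "john", "NNP", "B-NP", "give.01"),
           features("book", "book", "NN", "I-NP", "give.01"),
           features("Mary", "mary", "NNP", "B-NP", "give.01")]
train_y = ["A0", "A1", "A2"]  # giver, thing given, recipient

model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_X, train_y)
print(model.predict([features("book", "book", "NN", "I-NP", "give.01")]))  # expected: ['A1']
```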


01 Jan 2007
TL;DR: This paper gives a complete account of the contest in terms of how the data for the three languages was released, the performance of the participating systems, and an overview of the approaches followed for POS tagging and chunking.
Abstract: As part of the IJCAI workshop on "Shallow Parsing for South Asian Languages", a contest was held in which the participants trained and tested their shallow parsing systems for Hindi, Bengali and Telugu. This paper gives a complete account of the contest in terms of how the data for the three languages was released, the performance of the participating systems, and an overview of the approaches followed for POS tagging and chunking. We conclude with an analysis of the systems that offers insights into directions for future research on shallow parsing for South Asian languages.

22 citations


Journal ArticleDOI
TL;DR: A novel phrase chunking model based on a proposed mask method, which automatically derives more training examples from the original training data and significantly improves system performance without employing external knowledge or multiple learners.
Abstract: Automatic text chunking aims to recognize grammatical phrase structures in natural language text. Text chunking provides downstream syntactic information for further analysis, and is an important technology in the areas of text mining (TM) and natural language processing (NLP). Existing chunking systems make use of external knowledge, e.g. grammar parsers, or integrate multiple learners to achieve higher performance. However, such external knowledge is often unavailable in many domains and languages. Besides, employing multiple learners not only complicates the system architecture, but also increases training and testing time costs. In this paper, we present a novel phrase chunking model based on the proposed mask method without employing external knowledge or multiple learners. The mask method can automatically derive more training examples from the original training data, which significantly improves system performance. We evaluated our method on different chunking tasks and languages in comparison to previous studies. The experimental results show that our method achieves state-of-the-art performance in chunking tasks. In two English chunking tasks, i.e., shallow parsing and base-chunking, our method achieves F(β=1) rates of 94.22 and 93.23. When porting to Chinese, the F(β=1) rate is 92.30. Our chunker is also quite efficient: the complete chunking time for a 50K-word text is less than 10 s.

21 citations
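The F(β=1) rates quoted above are chunk-level scores; the sketch below shows the standard way such scores are computed, by decoding BIO tag sequences into labelled spans and comparing predicted spans with gold spans. BIO encoding and the toy tag sequences are general conventions assumed here, not details taken from the paper.

```python
# Sketch of chunk-level F(beta=1): decode BIO tag sequences into
# labelled spans and compare predicted spans with gold spans.
def bio_to_chunks(tags):
    chunks, start, label = [], None, None
    for i, tag in enumerate(tags + ["O"]):          # sentinel to flush the last chunk
        if tag.startswith("B-") or tag == "O" or \
           (tag.startswith("I-") and tag[2:] != label):
            if label is not None:
                chunks.append((start, i, label))
            start, label = (i, tag[2:]) if tag != "O" else (None, None)
    return set(chunks)

def f1(gold_tags, pred_tags):
    gold, pred = bio_to_chunks(gold_tags), bio_to_chunks(pred_tags)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    return 2 * precision * recall / (precision + recall) if tp else 0.0

gold = ["B-NP", "I-NP", "O", "B-VP", "B-NP"]
pred = ["B-NP", "I-NP", "O", "B-VP", "O"]
print(round(f1(gold, pred), 3))  # 2 of 2 predicted chunks correct, 2 of 3 gold found -> 0.8
```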


01 Jan 2007
TL;DR: A system which uses lexical shallow parsing to find adjectival “appraisal groups” in sentences, which convey a positive or negative appraisal of an item, is described.
Abstract: We describe a system which uses lexical shallow parsing to find adjectival “appraisal groups” in sentences, which convey a positive or negative appraisal of an item. We used a simple heuristic to detect opinion holders, determining whether a person was being quoted in a specific sentence or not, and if so, who. We also explored the use of unsupervised learners and voting to increase our coverage.

20 citations
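A toy sketch of what extracting adjectival appraisal groups can look like is given below: adjectives are matched against a small polarity lexicon and one optional preceding modifier is folded into the group and its score. The lexicon, modifier weights, and scoring are assumptions for illustration, not the resources used in the paper.

```python
# Toy sketch: extract adjectival appraisal groups with a hand-made
# polarity lexicon and simple modifier handling. Lexicon and scoring
# are illustrative assumptions only.
POLARITY = {"good": 1, "great": 1, "bad": -1, "awful": -1}
MODIFIERS = {"very": 2.0, "not": -1.0, "somewhat": 0.5}

def appraisal_groups(tokens):
    """Return (group, score) pairs for adjectives in the token list,
    folding in one optional preceding modifier."""
    groups = []
    for i, tok in enumerate(tokens):
        if tok in POLARITY:
            score = POLARITY[tok]
            group = [tok]
            if i > 0 and tokens[i - 1] in MODIFIERS:
                score *= MODIFIERS[tokens[i - 1]]
                group.insert(0, tokens[i - 1])
            groups.append((" ".join(group), score))
    return groups

print(appraisal_groups("the plot was not good but the acting was very great".split()))
# [('not good', -1.0), ('very great', 2.0)]
```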


Proceedings ArticleDOI
01 Jan 2007
TL;DR: Three natural language marking strategies based on fast and reliable shallow parsing techniques and on widely available lexical resources are presented: lexical substitution, adjective conjunction swaps, and relativiser switching.
Abstract: We present three natural language marking strategies based on fast and reliable shallow parsing techniques and on widely available lexical resources: lexical substitution, adjective conjunction swaps, and relativiser switching. We test these techniques on a random sample of the British National Corpus. Individual candidate marks are checked for goodness of structural and semantic fit, using both lexical resources and the web as a corpus. A representative sample of marks is given to 25 human judges to evaluate for acceptability and preservation of meaning. This establishes a correlation between corpus-based felicity measures and perceived quality, and makes qualified predictions. Grammatical acceptability correlates strongly with our automatic measure (Pearson's r = 0.795, p = 0.001), allowing us to account for about two thirds of the variability in human judgements. A moderate but statistically insignificant correlation (Pearson's r = 0.422, p = 0.356) is found with judgements of meaning preservation, indicating that the contextual window of five content words used for our automatic measure may need to be extended.

16 citations
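The "about two thirds of variability" figure follows directly from the coefficient of determination, as the quick check below shows (a worked computation from the numbers in the abstract, not additional results).

```python
# Coefficient of determination: r**2 is the share of variance in human
# judgements explained by the automatic felicity measure.
r_acceptability, r_meaning = 0.795, 0.422
print(round(r_acceptability ** 2, 3))  # 0.632 -> roughly two thirds
print(round(r_meaning ** 2, 3))        # 0.178 -> a much weaker relationship
```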


Journal ArticleDOI
Christoph Tillmann, Tong Zhang
TL;DR: A novel training method for a localized phrase-based prediction model for statistical machine translation (SMT) that explicitly handles local phrase reordering is presented, together with a novel stochastic gradient descent training algorithm that can easily handle millions of features.
Abstract: In this article, we present a novel training method for a localized phrase-based prediction model for statistical machine translation (SMT). The model predicts block neighbors to carry out a phrase-based translation that explicitly handles local phrase reordering. We use a maximum likelihood criterion to train a log-linear block bigram model which uses real-valued features (e.g., a language model score) as well as binary features based on the block identities themselves (e.g., block bigram features). The model training relies on an efficient enumeration of local block neighbors in parallel training data. A novel stochastic gradient descent (SGD) training algorithm is presented that can easily handle millions of features. Moreover, when viewing SMT as a block generation process, it becomes quite similar to sequential natural language annotation problems such as part-of-speech tagging, phrase chunking, or shallow parsing. Our novel approach is successfully tested on a standard Arabic-English translation task using two different phrase reordering models: a block orientation model and a phrase-distortion model.

14 citations
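A rough idea of the kind of training described above, maximum-likelihood SGD for a log-linear model mixing binary block-bigram features with real-valued features such as a language model score, is sketched below. The toy blocks, feature template, and learning rate are assumptions for illustration, not the paper's actual model or data.

```python
# Hedged sketch: stochastic gradient ascent on the log-likelihood of a
# log-linear "next block" model with real-valued and binary features.
# Features, data, and learning rate are toy assumptions.
import math
from collections import defaultdict

def feats(prev, cand, lm_score):
    return {("bigram", prev, cand): 1.0, ("lm",): lm_score}

def probs(w, prev, cands):
    scores = [sum(w[k] * v for k, v in feats(prev, c, lm).items())
              for c, lm in cands]
    z = sum(math.exp(s) for s in scores)
    return [math.exp(s) / z for s in scores]

# Each example: previous block, candidate (block, lm_score) list, gold index.
data = [("the house", [("das Haus", 0.9), ("die Maus", 0.2)], 0),
        ("is red", [("ist rot", 0.8), ("ist tot", 0.1)], 0)]

w, lr = defaultdict(float), 0.1
for _ in range(50):
    for prev, cands, gold in data:
        p = probs(w, prev, cands)
        for j, (c, lm) in enumerate(cands):
            target = 1.0 if j == gold else 0.0
            for k, v in feats(prev, c, lm).items():
                w[k] += lr * (target - p[j]) * v   # observed minus expected feature count

print(probs(w, "the house", data[0][1]))  # gold candidate gets most of the mass
```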


Journal ArticleDOI
TL;DR: This paper proposes an efficient and accurate text chunking system that uses a linear SVM kernel and a new mask-based method, which also addresses the unknown word problem to enhance system performance.
Abstract: In this paper, we propose an efficient and accurate text chunking system using a linear SVM kernel and a new technique called the masked method. Previous research indicated that system combination or external parsers can enhance chunking performance. However, the cost of constructing multiple classifiers is even higher than developing a single processor. Moreover, the use of external resources complicates the original tagging process. To remedy these problems, we employ richer features and propose a mask-based method for the unknown word problem to enhance system performance. In this way, no external resources or complex heuristics are required by the chunking system. The experiments show that when training on the CoNLL-2000 chunking dataset, our system achieves an F(β) rate of 94.12 with the linear kernel. Furthermore, our chunker is quite efficient since it adopts a linear kernel SVM: the turn-around tagging time on the CoNLL-2000 test data is less than 50 s, about 115 times faster than a polynomial kernel SVM.

7 citations
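For a concrete picture of linear-kernel SVM chunking, the minimal sketch below tags each token with a BIO chunk label using scikit-learn's LinearSVC over simple word/PoS window features. The features and toy sentence are illustrative assumptions; the paper's richer features and masked method are not reproduced here.

```python
# Minimal sketch of a linear-kernel SVM chunk tagger: classify each
# token's BIO chunk tag from simple word/PoS window features.
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

def token_features(words, pos, i):
    return {
        "w0": words[i], "p0": pos[i],
        "w-1": words[i - 1] if i > 0 else "<S>",
        "p-1": pos[i - 1] if i > 0 else "<S>",
        "w+1": words[i + 1] if i + 1 < len(words) else "</S>",
    }

sent_words = "He reckons the deficit will narrow".split()
sent_pos = ["PRP", "VBZ", "DT", "NN", "MD", "VB"]
tags = ["B-NP", "B-VP", "B-NP", "I-NP", "B-VP", "I-VP"]

X = [token_features(sent_words, sent_pos, i) for i in range(len(sent_words))]
model = make_pipeline(DictVectorizer(), LinearSVC())
model.fit(X, tags)
print(model.predict([token_features(sent_words, sent_pos, 3)]))  # expected: ['I-NP']
```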


Proceedings ArticleDOI
Yun Xing
23 Jun 2007
TL;DR: A Word Sense Disambiguation system that participated in the SemEval-2007 multilingual Chinese-English lexical sample task was implemented with a Maximum Entropy classifier and obtained a micro-average precision of 0.716, the best among all participating systems.
Abstract: This article describes the implementation of a Word Sense Disambiguation system that participated in the SemEval-2007 multilingual Chinese-English lexical sample task. We adopted a supervised learning approach with a Maximum Entropy classifier. The features used were neighboring words and their parts of speech, single words in the context, and other syntactic features based on shallow parsing. In addition, we used word category information from a Chinese thesaurus as features for verb disambiguation. For the task we participated in, we obtained a micro-average precision of 0.716, the best among all participating systems.

5 citations
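A minimal sketch of this kind of Maximum Entropy WSD setup is shown below, using scikit-learn's LogisticRegression with neighbouring-word/PoS features plus a thesaurus-category feature for the target verb. The toy instances, thesaurus code, and feature template are assumptions for illustration, not the system's real feature set.

```python
# Hedged sketch of Maximum Entropy WSD: neighbouring words/PoS plus a
# thesaurus-category feature for the target verb. Toy data only.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def wsd_features(prev_word, next_word, prev_pos, next_pos, thesaurus_cat):
    return {"w-1": prev_word, "w+1": next_word,
            "p-1": prev_pos, "p+1": next_pos, "cat": thesaurus_cat}

# Disambiguating the verb "打" (hit / play / make a phone call).
X = [wsd_features("他", "电话", "r", "n", "Hj"),
     wsd_features("他", "篮球", "r", "n", "Hj"),
     wsd_features("他", "人", "r", "n", "Hj")]
y = ["phone", "play", "hit"]

model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
model.fit(X, y)
print(model.predict([wsd_features("她", "电话", "r", "n", "Hj")]))  # expected: ['phone']
```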


Proceedings Article
06 Nov 2007
TL;DR: The purpose of this paper is to characterize a chunk boundary parsing algorithm that uses a statistical method combined with adjustment rules, serving as a supplement to traditional statistics-based parsing methods.
Abstract: Natural language processing (NLP) is a very active research domain. One important branch of it is sentence analysis, including Chinese sentence analysis. However, no mature deep-analysis theories and techniques are currently available. An alternative, which is very popular in the field, is to perform shallow parsing on sentences. Chunk identification is a fundamental task for shallow parsing. The purpose of this paper is to characterize a chunk boundary parsing algorithm that uses a statistical method combined with adjustment rules, serving as a supplement to traditional statistics-based parsing methods. The experimental results show that the model works well on a small dataset. It will contribute to subsequent processes such as chunk tagging and chunk collocation extraction.

Journal ArticleDOI
TL;DR: This paper presents a chunk segmentation algorithm using a combined statistical and rule-based approach (CSRA), in which decision rules for refining chunks are generated from chunks incorrectly segmented by a statistical model built on a training corpus.
Abstract: Deep parsing of Chinese sentences is a very challenging task due to complexities such as ambiguous word boundaries and meanings. An alternative mode of Chinese language processing is to perform shallow parsing of Chinese sentences, in which chunk segmentation plays an important role. In this paper, we present a chunk segmentation algorithm using a combined statistical and rule-based approach (CSRA). The decision rules for refining chunk segmentation are generated from chunks incorrectly segmented by a statistical model built on a training corpus. Experimental results show that the CSRA works well and produces satisfactory chunk segmentation results for subsequent processes such as chunk tagging and chunk collocation extraction.
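The general error-driven idea, learning correction rules from the places where a statistical chunker disagrees with the gold segmentation, can be sketched roughly as below. The rule format (word plus predicted tag mapped to a corrected tag) and the toy data are assumptions for illustration, not the CSRA's actual rule templates.

```python
# Hedged sketch of the combined statistical/rule-based idea: compare a
# statistical chunker's output with the gold segmentation on training
# data and turn recurring mistakes into simple correction rules.
from collections import Counter

def learn_rules(examples, min_support=2):
    """examples: list of (word, predicted_tag, gold_tag) triples."""
    errors = Counter((w, p, g) for w, p, g in examples if p != g)
    return {(w, p): g for (w, p, g), n in errors.items() if n >= min_support}

def apply_rules(tagged, rules):
    return [(w, rules.get((w, t), t)) for w, t in tagged]

train = [("的", "B", "I"), ("的", "B", "I"), ("了", "I", "I"), ("书", "B", "B")]
rules = learn_rules(train)
print(apply_rules([("那", "B"), ("本", "I"), ("的", "B")], rules))
# [('那', 'B'), ('本', 'I'), ('的', 'I')]
```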

Proceedings ArticleDOI
24 Aug 2007
TL;DR: A semi-automatic word extraction approach for the general dictionary and specialty dictionaries, based on information entropy, is presented; experiments show that CRFs achieve a 1.09% improvement in the POS tagging task and 0.67% in the shallow parsing task in terms of F-measure.
Abstract: An enhanced text analysis approach for Chinese text-to-speech (TTS) systems is presented in this paper. As the basic understanding process, text analysis needs to provide fine-grained and effective linguistic information, marked explicitly with the corresponding notation. Two kinds of work are done to improve TTS performance. First, shallow parsing information is introduced and processed with conditional random fields (CRFs), which overcomes the label bias problem. Second, because the dictionary is very important not only for Chinese word segmentation but also for Pinyin-to-character conversion, we present a semi-automatic word extraction approach for the general dictionary and the specialty dictionaries based on information entropy. The experiments show that the CRF achieves a 1.09% improvement in the POS tagging task and 0.67% in the shallow parsing task in terms of F-measure. The specialty words increase word segmentation precision by 1.80%.
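One common way to use information entropy for word extraction is branching entropy: a candidate string that can be followed by many different characters in a corpus is more likely to be a free-standing word. The sketch below computes right-branching entropy for a candidate; the corpus, candidate, and any acceptance threshold are toy assumptions, not the paper's actual procedure.

```python
# Hedged sketch: score a candidate word by the entropy of the characters
# that follow it in a corpus (high branching entropy suggests wordhood).
import math
from collections import Counter

def right_branching_entropy(candidate, corpus):
    followers = Counter()
    for text in corpus:
        start = 0
        while (i := text.find(candidate, start)) != -1:
            j = i + len(candidate)
            if j < len(text):
                followers[text[j]] += 1
            start = i + 1
    total = sum(followers.values())
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in followers.values())

corpus = ["语音合成系统很好", "语音识别和语音合成", "语音技术发展迅速"]
print(round(right_branching_entropy("语音", corpus), 3))  # 1.5 bits over {合, 识, 技}
```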

Proceedings ArticleDOI
29 Oct 2007
TL;DR: By introducing the kernel principle, SVMs can carry out training in a high-dimensional feature space at a computational cost independent of its dimensionality.
Abstract: To represent the whole hierarchical phrase structure, 10 types of Chinese chunks are defined. The paper presents a method of Chinese shallow parsing based on Support Vector Machines (SVMs). Conventional recognition techniques based on machine learning have difficulty in selecting useful features as well as finding appropriate combinations of selected features. SVMs can automatically focus on useful features and robustly handle a large feature set to develop models that maximize their generalizability. It is also well known that SVMs generalize well even in very high-dimensional feature spaces. Furthermore, by introducing the kernel principle, SVMs can carry out training in a high-dimensional space at a computational cost independent of its dimensionality. The experiments produced promising results.
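As a small illustration of the kernel principle mentioned above, the sketch below trains an SVM chunk-tag classifier with a degree-2 polynomial kernel, which implicitly uses pairwise feature conjunctions without ever building the expanded feature vectors. The toy tokens and tags are assumptions for illustration only.

```python
# Hedged illustration of the kernel principle: a polynomial kernel gives
# the SVM implicit feature conjunctions via dot products, without
# explicit feature expansion. Toy Chinese chunking data.
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline

X = [{"w": "我", "p": "r"}, {"w": "读", "p": "v"},
     {"w": "一", "p": "m"}, {"w": "本", "p": "q"}, {"w": "书", "p": "n"}]
y = ["B-NP", "B-VP", "B-NP", "I-NP", "I-NP"]

# degree-2 polynomial kernel ~ all pairwise feature combinations,
# evaluated through dot products rather than explicit expansion
model = make_pipeline(DictVectorizer(), SVC(kernel="poly", degree=2, coef0=1))
model.fit(X, y)
print(model.predict([{"w": "书", "p": "n"}]))  # expected: ['I-NP']
```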