Proceedings ArticleDOI

Coreference Resolution Using Decision Trees

TLDR
This work shows that the pessimistic error pruning method gives better generalization in a coreference resolution task than that reported in W.M. Soon et al. (2001) when the weights of positive and negative examples are properly chosen.
Abstract
Coreference resolution is the process of determining whether two expressions in natural language refer to the same entity in the world. We adopt a machine learning approach, using decision trees, to coreference resolution of general noun phrases in unrestricted text, based on well-defined features. We also use approximate matching algorithms for the string match feature, and databases of American last names and of male and female first names for the gender agreement and alias features. For evaluation we use the MUC-6 coreference corpora. We show that the pessimistic error pruning method gives better generalization in a coreference resolution task than that reported in W.M. Soon et al. (2001) when the weights of positive and negative examples are properly chosen.
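
The abstract does not say which approximate matching algorithm is used for the string match feature. As a rough illustration only, the sketch below assumes a normalized Levenshtein edit distance with a cutoff; the function names and the `max_ratio` threshold are hypothetical and are not taken from the paper.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def approx_string_match(np1: str, np2: str, max_ratio: float = 0.2) -> bool:
    """Hypothetical binary feature: True when two noun phrases are nearly equal.

    max_ratio is an assumed cutoff on edit distance divided by the length of
    the longer string; the paper's actual criterion is not given in the abstract.
    """
    a, b = np1.lower().strip(), np2.lower().strip()
    longest = max(len(a), len(b))
    if longest == 0:
        return False
    return edit_distance(a, b) / longest <= max_ratio
```

A feature like this would return True for a pair such as "Microsoft Corp." and "Microsoft Corp", where an exact string comparison would fail, and the resulting boolean could then be fed to the decision tree alongside the other features.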


Citations

Knowledge and Data Engineering for e-Learning Special Issue of IEEE Transactions on Knowledge and Data Engineering

TL;DR: In this special issue, the focus will be on the technical side, although other issues related to knowledge and data engineering for e-learning may also be considered.
Patent

Determining the degree of relevance of alerts in an entity resolution system

TL;DR: An entity resolution system and alert analysis system configured to process inbound identity records and to generate alerts based on relevant identities, entities, conditions, activities, or events is disclosed in this paper.
Patent

Determining entity relevance by relationships to other relevant entities

TL;DR: In this article, an entity resolution system configured to process an inbound identity record and to generate a relevance score for the inbound ID record is disclosed. The relevance score is computed from a base relevance score, association relevance scores, derived relevance scores, and the relationship strengths of entities related to the ID record.
Proceedings ArticleDOI

Review on natural language processing tasks for text documents

TL;DR: This survey examines which NLP task is best suited to preprocessing search keywords, which are then used for matching against the desired text documents, and concludes that POS tagging and chunking are both good options for keyword preprocessing.
Proceedings ArticleDOI

Smart and secure IOT based child behaviour and health monitoring system using hadoop

TL;DR: A smart and secure health care monitoring application that personifies the monitoring of the total health and mental status of children, and adopts Hadoop in the background to map the data effectively and to reduce it into elementary.
References
Journal ArticleDOI

A tutorial on hidden Markov models and selected applications in speech recognition

TL;DR: In this paper, the authors provide an overview of the basic theory of hidden Markov models (HMMs) as originated by L.E. Baum and T. Petrie (1966) and give practical details on methods of implementation of the theory along with a description of selected applications of HMMs to distinct problems in speech recognition.
Book

C4.5: Programs for Machine Learning

TL;DR: A complete guide to the C4.5 system as implemented in C for the UNIX environment, which starts from simple core learning methods and shows how they can be elaborated and extended to deal with typical problems such as missing data and overfitting.
Journal ArticleDOI

Induction of Decision Trees

J. R. Quinlan
- 25 Mar 1986 - 
TL;DR: This paper summarizes an approach to synthesizing decision trees that has been used in a variety of systems, describes one such system, ID3, in detail, and discusses a reported shortcoming of the basic algorithm.
Book

Foundations of Statistical Natural Language Processing

TL;DR: This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear and provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations.
ReportDOI

Building a large annotated corpus of English: the Penn Treebank

TL;DR: As a result of this grant, the researchers have now published on CD-ROM a corpus of over 4 million words of running text annotated with part-of-speech (POS) tags, which includes a fully hand-parsed version of the classic Brown corpus.