Topic
Knowledge extraction
About: Knowledge extraction is a research topic. Over the lifetime, 20251 publications have been published within this topic receiving 413401 citations.
Papers published on a yearly basis
Papers
More filters
••
IBM1
TL;DR: The authors' perspective of database mining as the confluence of machine learning techniques and the performance emphasis of database technology is presented and an algorithm for classification obtained by combining the basic rule discovery operations is given.
Abstract: The authors' perspective of database mining as the confluence of machine learning techniques and the performance emphasis of database technology is presented. Three classes of database mining problems involving classification, associations, and sequences are described. It is argued that these problems can be uniformly viewed as requiring discovery of rules embedded in massive amounts of data. A model and some basic operations for the process of rule discovery are described. It is shown how the database mining problems considered map to this model, and how they can be solved by using the basic operations proposed. An example is given of an algorithm for classification obtained by combining the basic rule discovery operations. This algorithm is efficient in discovering classification rules and has accuracy comparable to ID3, one of the best current classifiers. >
1,539 citations
••
24 Aug 2003TL;DR: This paper proposes a general framework for mining concept-drifting data streams using weighted ensemble classifiers, and shows that the proposed methods have substantial advantage over single-classifier approaches in prediction accuracy, and the ensemble framework is effective for a variety of classification models.
Abstract: Recently, mining data streams with concept drifts for actionable insights has become an important and challenging task for a wide range of applications including credit card fraud protection, target marketing, network intrusion detection, etc. Conventional knowledge discovery tools are facing two challenges, the overwhelming volume of the streaming data, and the concept drifts. In this paper, we propose a general framework for mining concept-drifting data streams using weighted ensemble classifiers. We train an ensemble of classification models, such as C4.5, RIPPER, naive Beyesian, etc., from sequential chunks of the data stream. The classifiers in the ensemble are judiciously weighted based on their expected classification accuracy on the test data under the time-evolving environment. Thus, the ensemble approach improves both the efficiency in learning the model and the accuracy in performing classification. Our empirical study shows that the proposed methods have substantial advantage over single-classifier approaches in prediction accuracy, and the ensemble framework is effective for a variety of classification models.
1,403 citations
••
04 Aug 2014TL;DR: View on new challenges identified are shared, and some of the application scenarios such as micro-blog data analysis and data processing in building next generation search engines are covered.
Abstract: Big data may contain big values, but also brings lots of challenges to the computing theory, architecture, framework, knowledge discovery algorithms, and domain specific tools and applications. Beyond the 4-V or 5-V characters of big datasets, the data processing shows the features like inexact, incremental, and inductive manner. This brings new research opportunities to research community across theory, systems, algorithms, and applications. Is there some new "theory" for the big data? How to handle the data computing algorithms in an operatable manner? This report shares some view on new challenges identified, and covers some of the application scenarios such as micro-blog data analysis and data processing in building next generation search engines.
1,364 citations
••
TL;DR: After a decade of fundamental interdisciplinary research in machine learning, the spadework in this field has been done; the 1990s should see the widespread exploitation of knowledge discovery as an aid to assembling knowledge bases.
Abstract: After a decade of fundamental interdisciplinary research in machine learning, the spadework in this field has been done; the 1990s should see the widespread exploitation of knowledge discovery as an aid to assembling knowledge bases. The contributors to the AAAI Press book Knowledge Discovery in Databases were excited at the potential benefits of this research. The editors hope that some of this excitement will communicate itself to "AI Magazine readers of this article.
1,332 citations
••
15 Oct 2008
TL;DR: KEEL as discussed by the authors is a software tool to assess evolutionary algorithms for data mining problems of various kinds including regression, classification, unsupervised learning, etc., which includes evolutionary learning algorithms based on different approaches: Pittsburgh, Michigan and IRL.
Abstract: This paper introduces a software tool named KEEL which is a software tool to assess evolutionary algorithms for Data Mining problems of various kinds including as regression, classification, unsupervised learning, etc. It includes evolutionary learning algorithms based on different approaches: Pittsburgh, Michigan and IRL, as well as the integration of evolutionary learning techniques with different pre-processing techniques, allowing it to perform a complete analysis of any learning model in comparison to existing software tools. Moreover, KEEL has been designed with a double goal: research and educational.
1,297 citations