scispace - formally typeset
Topic

Knowledge extraction

About: Knowledge extraction is a research topic. Over its lifetime, 20,251 publications have been published within this topic, receiving 413,401 citations.


Papers
Journal ArticleDOI
TL;DR: A novel conformance checking method to measure how well a process model performs in terms of precision and generalization with respect to the actual executions of a process as recorded in an event log is introduced.
Abstract: Process mining encompasses the research area which is concerned with knowledge discovery from event logs. One common process mining task focuses on conformance checking, comparing discovered or designed process models with actual real-life behavior as captured in event logs in order to assess the “goodness” of the process model. This paper introduces a novel conformance checking method to measure how well a process model performs in terms of precision and generalization with respect to the actual executions of a process as recorded in an event log. Our approach differs from related work in the sense that we apply the concept of so-called weighted artificial negative events toward conformance checking, leading to more robust results, especially when dealing with less complete event logs that only contain a subset of all possible process execution behavior. In addition, our technique offers a novel way to estimate a process model’s ability to generalize. Existing literature has focused mainly on the fitness (recall) and precision (appropriateness) of process models, whereas generalization has been much more difficult to estimate. The described algorithms are implemented in a number of ProM plugins, and a Petri net conformance checking tool was developed to inspect process model conformance in a visual manner.

81 citations

Journal ArticleDOI
TL;DR: A framework built around a concept-based model using domain-dependent ontologies is developed; exercised in a coastal zone domain, it automatically assigns concepts in the ontology to the objects.
Abstract: Earth observation data have increased significantly over the last decades with satellites collecting and transmitting to Earth receiving stations in excess of 3 TB of data a day. This data acquisition rate is a major challenge to the existing data exploitation and dissemination approaches. The lack of content- and semantic-based interactive information searching and retrieval capabilities from the image archives is an impediment to the use of the data. In this paper, we describe a framework we have developed [Intelligent Interactive Image Knowledge Retrieval (I³KR)] that is built around a concept-based model using domain-dependent ontologies. In this framework, the basic concepts of the domain are identified first and generalized later, depending upon the level of reasoning required for executing a particular query. We employ an unsupervised segmentation algorithm to extract homogeneous regions and calculate primitive descriptors for each region based on color, texture, and shape. We initially perform an unsupervised classification by means of a kernel principal components analysis method, which extracts components of features that are nonlinearly related to the input variables, followed by a support vector machine classification to generate models for the object classes. The assignment of concepts in the ontology to the objects is achieved automatically by the integration of a description logics-based inference mechanism, which processes the interrelationships between the properties held in the specific concepts of the domain ontology. The framework is exercised in a coastal zone domain.

81 citations

Journal ArticleDOI
TL;DR: The incorporation of spatial information into the knowledge discovery process is found not only to improve the accuracy of the extracted knowledge, but also to add to the explicitness and extensiveness of the extracted soil-landscape model.
Abstract: This paper develops a knowledge discovery procedure for extracting knowledge of soil-landscape models from a soil map. It has broad relevance to knowledge discovery from other natural resource maps. The procedure consists of four major steps: data preparation, data preprocessing, pattern extraction, and knowledge consolidation. In order to recover true expert knowledge from the error-prone soil maps, our study pays specific attention to the reduction of representation noise in soil maps. The data preprocessing step has exhibited an important role in obtaining greater accuracy. A specific method for sampling pixels based on modes of environmental histograms has proven to be effective in terms of reducing noise and constructing representative sample sets. Three inductive learning algorithms, the See5 decision tree algorithm, Naive Bayes, and artificial neural network, are investigated for a comparison concerning learning accuracy and result comprehensibility. See5 proves to be an accurate method and produces the most comprehensible results, which are consistent with the rules (expert knowledge) used in producing the soil map. The incorporation of spatial information into the knowledge discovery process is found not only to improve the accuracy of the extracted knowledge, but also to add to the explicitness and extensiveness of the extracted soil-landscape model.

81 citations
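
The abstract above mentions sampling pixels based on modes of environmental histograms to reduce noise. The sketch below is one possible reading of that idea, not the authors' procedure: for each soil class it bins a single environmental variable and keeps only the pixels that fall in the most populated bin. The function name `modal_samples`, the fixed `bin_width`, and the toy pixel values are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def modal_samples(pixels, bin_width):
    """Keep pixels whose value lies in the modal histogram bin of their class.

    pixels: list of (soil_class, env_value) tuples (hypothetical input layout).
    bin_width: width of each histogram bin (an assumed, fixed parameter).
    """
    # Group environmental values by soil class.
    by_class = defaultdict(list)
    for soil_class, value in pixels:
        by_class[soil_class].append(value)

    kept = []
    for soil_class, values in by_class.items():
        # Build a histogram over bin indices and find the most populated bin.
        bins = Counter(int(v // bin_width) for v in values)
        mode_bin = bins.most_common(1)[0][0]
        # Retain only the pixels that fall inside the modal bin.
        kept.extend(
            (soil_class, v) for v in values if int(v // bin_width) == mode_bin
        )
    return kept


# Toy data: class "A" clusters near 12-14 with one outlier at 41.
pixels = [
    ("A", 12.0), ("A", 13.5), ("A", 14.0), ("A", 41.0),
    ("B", 3.0), ("B", 4.0), ("B", 27.0),
]
samples = modal_samples(pixels, bin_width=10)
```

In this toy run the outlying pixels (41.0 for class "A", 27.0 for class "B") are dropped, which is the noise-reduction effect the abstract attributes to mode-based sampling. A real study would likely use several environmental variables and a data-driven bin width.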

Patent
27 Mar 2002
TL;DR: A system is described that allows a multi-dimensional data set to be mined as a single-dimension data set so that useful information can be derived from the data set in an efficient manner.
Abstract: A system is disclosed that allows a multi-dimensional data set to be mined as a single dimension data set so that useful information can be derived from the data set in an efficient manner. In one embodiment, the present invention allows for association rules and/or sequential patterns to be generated from M-dimensional data using a 1-dimensional mining process. In one implementation, one or more conditional items are appended to a data item in order to transform the multi-dimensional data to one-dimensional data.

81 citations
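
The patent abstract above describes appending conditional items to a data item so that multi-dimensional data can be handled by a one-dimensional mining process. The following is a minimal sketch of that transformation under stated assumptions: the `item@key=value` encoding, the helper names `flatten_record` and `frequent_pairs`, and the toy records are hypothetical, and the simple pair counter stands in for whatever one-dimensional miner would actually be used.

```python
from collections import Counter
from itertools import combinations


def flatten_record(items, conditions):
    """Append the conditional attributes to every item in the record.

    The result is a set of plain strings, so a 1-dimensional miner can
    treat "bread in the north in winter" as a single item.
    """
    suffix = "|".join(f"{k}={v}" for k, v in sorted(conditions.items()))
    return frozenset(f"{item}@{suffix}" for item in items)


def frequent_pairs(transactions, min_support):
    """Count co-occurring item pairs; a stand-in for a 1-D mining step."""
    counts = Counter()
    for transaction in transactions:
        for pair in combinations(sorted(transaction), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Toy multi-dimensional records: (items, conditional dimensions).
records = [
    ({"bread", "milk"}, {"region": "north", "season": "winter"}),
    ({"bread", "milk"}, {"region": "north", "season": "winter"}),
    ({"bread", "milk"}, {"region": "south", "season": "summer"}),
]
flat = [flatten_record(items, cond) for items, cond in records]
pairs = frequent_pairs(flat, min_support=2)
```

Here the bread/milk pair is frequent only under the north/winter condition, because the appended conditions make the south/summer occurrence a different pair of items. The trade-off of this encoding is that the item vocabulary grows with the number of distinct condition combinations.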

BookDOI
01 Nov 2007
TL;DR: This edited volume by highly regarded authors includes several contributors to the 2005 Data Mining and Knowledge Discovery Handbook and is suitable as a secondary textbook or reference for advanced-level students in information systems, engineering, computer science, statistics, and management.
Abstract: Data mining is the science and technology of exploring large and complex bodies of data in order to discover useful patterns. It is extremely important because it enables modeling and knowledge extraction from abundantly available data. Soft Computing for Knowledge Discovery and Data Mining introduces soft computing methods extending the envelope of problems that data mining can solve efficiently. It presents practical soft-computing approaches in data mining. This edited volume by highly regarded authors includes several contributors to the 2005 Data Mining and Knowledge Discovery Handbook. This book was written to provide investigators in the fields of information systems, engineering, computer science, statistics, and management with a profound source for the role of soft computing in data mining. Not only does this book feature illustrations of various applications including manufacturing, medical, banking, insurance, and others, but it also includes various real-world case studies with detailed results. Soft Computing for Knowledge Discovery and Data Mining is designed for practitioners and researchers in industry. Practitioners and researchers may be particularly interested in the description of real-world data mining projects performed with soft computing. This book is also suitable as a secondary textbook or reference for advanced-level students in information systems, engineering, computer science, statistics, and management.

81 citations


Network Information
Related Topics (5)
Cluster analysis
146.5K papers, 2.9M citations
90% related
Support vector machine
73.6K papers, 1.7M citations
90% related
Artificial neural network
207K papers, 4.5M citations
87% related
Fuzzy logic
151.2K papers, 2.3M citations
86% related
Feature extraction
111.8K papers, 2.1M citations
86% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    120
2022    285
2021    506
2020    660
2019    740
2018    683