Topic

Knowledge extraction

About: Knowledge extraction is a research topic. Over its lifetime, 20,251 publications have been published within this topic, receiving 413,401 citations.


Papers
Journal Article
TL;DR: It is shown that experiment- and simulation-based data mining in combination with machine learning tools provide exceptional opportunities to enable highly reliant identification of fundamental interrelations within materials for characterization and optimization in a scale-bridging manner.
Abstract: Machine learning tools represent key enablers for empowering material scientists and engineers to accelerate the development of novel materials, processes and techniques. One of the aims of using such approaches in the field of materials science is to achieve high-throughput identification and quantification of essential features along the process-structure-property-performance chain. In this contribution, machine learning and statistical learning approaches are reviewed in terms of their successful application to specific problems in the field of continuum materials mechanics. They are categorized with respect to their type of task designated to be either descriptive, predictive or prescriptive; thus to ultimately achieve identification, prediction or even optimization of essential characteristics. The respective choice of the most appropriate machine learning approach highly depends on the specific use-case, type of material, kind of data involved, spatial and temporal scales, formats, and desired knowledge gain as well as affordable computational costs. Different examples are reviewed involving case-by-case dependent application of different types of artificial neural networks and other data-driven approaches such as support vector machines, decision trees and random forests as well as Bayesian learning, and model order reduction procedures such as principal component analysis, among others. These techniques are applied to accelerate the identification of material parameters or salient features for materials characterization, to support rapid design and optimization of novel materials or manufacturing methods, to improve and correct complex measurement devices, or to better understand and predict fatigue behavior, among other examples. Besides experimentally obtained datasets, numerous studies draw required information from simulation-based data mining. 
Altogether, it is shown that experiment- and simulation-based data mining in combination with machine learning tools provide exceptional opportunities to enable highly reliant identification of fundamental interrelations within materials for characterization and optimization in a scale-bridging manner. Potentials of further utilizing applied machine learning in materials science and empowering significant acceleration of knowledge output are pointed out.

222 citations

Journal Article
TL;DR: This special section on Intelligent Mobile Knowledge Discovery and Management Systems is to bring together top-quality articles on the art and practice of mobile knowledge discovery and management systems that exhibit a level of intelligence.
Abstract: Advances in wireless communication mobile-information infrastructures such as GPS, WiFi, and mobile phone technologies have enabled us to collect, process, and manage massive amounts of mobile data from diverse information sources. These mobile data are fine-grained, information-rich, and provide unparalleled opportunities for us to understand mobile user behaviours and generate useful knowledge, which in turn allows the delivery of intelligence for real-time decision making in various real-world applications. In this context, knowledge discovery is the process of automatic extraction of interesting and useful knowledge from large amounts of mobile data, whereas knowledge management consists of a range of strategies and practices to identify, create, represent, distribute, and enable the adoption of novel insights and experiences for decision making. There is a critical emerging need to investigate knowledge discovery and management issues in the mobile context. The objective of this special section on Intelligent Mobile Knowledge Discovery and Management Systems is to bring together top-quality articles on the art and practice of mobile knowledge discovery and management systems that exhibit a level of intelligence. We received a total of 12 submissions from which 3 articles have been selected for publication after an extensive peer-review process. The first article, entitled "Mining Geographic-Temporal-Semantic Patterns in Trajectories for Location Prediction" by Ying et al., has a focus on location prediction by mining human location traces. A unique perspective of this article is to exploit a user's geographic, temporal, and semantic information simultaneously for estimating the probability of a traveler in visiting a location. The key idea underlying this study is the discovery of user trajectory patterns, which are used to capture frequent movements triggered by the user's geographic, temporal, and semantic intentions.
The article "A Framework of Traveling Companion Discovery on Trajectory Data Streams" by Tang et al. studies the problem of discovering object groups which travel together (i.e., traveling companions) from trajectory data streams. Since the solution of this problem requires a large computational cost because of expensive spatial operations, the authors propose a smart data structure to facilitate scalable and flexible companion discovery from location traces. The article "Mondrian Tree: A Fast Index for Spatial Alarm Processing" authored by M. Doo and L. Liu promotes the efficient process of spatial alarms, which remind us of the arrival of a future spatial event. A key research challenge in scaling spatial alarm processing is how to efficiently …

221 citations

Journal Article
TL;DR: This paper presents a uniform theoretical framework, based on annotated logics, for amalgamating multiple knowledge bases when these knowledge bases may contain inconsistencies, uncertainties, and nonmonotonic modes of negation.
Abstract: The integration of knowledge for multiple sources is an important aspect of automated reasoning systems. When different knowledge bases are used to store knowledge provided by multiple sources, we are faced with the problem of integrating multiple knowledge bases: Under these circumstances, we are also confronted with the prospect of inconsistency. In this paper we present a uniform theoretical framework, based on annotated logics, for amalgamating multiple knowledge bases when these knowledge bases (possibly) contain inconsistencies, uncertainties, and nonmonotonic modes of negation. We show that annotated logics may be used, with some modifications, to mediate between different knowledge bases. The multiple knowledge bases are amalgamated by a transformation of the individual knowledge bases into new annotated logic programs, together with the addition of a new axiom scheme. We characterize the declarative semantics of such amalgamated knowledge bases and study how the semantics of the amalgam is related to the semantics of the individual knowledge bases being combined.—Author's Abstract

221 citations

Book
01 Jan 2001
TL;DR: This paper presents a data mining technique and an interestingness framework based on heuristic measures of interestingness that were developed in the second part of this monograph on interestingness and data mining.
Abstract: List of Figures. List of Tables. Preface. Acknowledgments. 1. Introduction. 2. Background and Related Work. 3. A Data Mining Technique. 4. Heuristic Measures of Interestingness. 5. An Interestingness Framework. 6. Experimental Analyses. 7. Conclusion. Appendices. Index.

221 citations

Journal Article
TL;DR: Four techniques intended for noise removal to enhance data analysis in the presence of high noise levels are explored, including a hyperclique-based data cleaner (HCleaner), which generally leads to better clustering performance and higher quality association patterns as the amount of noise being removed increases.
Abstract: Removing objects that are noisy is an important goal of data cleaning as noise hinders most types of data analysis. Most existing data cleaning methods focus on removing noise that is the product of low-level data errors that result from an imperfect data collection process, but data objects that are irrelevant or only weakly relevant can also significantly hinder data analysis. Thus, if the goal is to enhance the data analysis as much as possible, these objects should also be considered as noise, at least with respect to the underlying analysis. Consequently, there is a need for data cleaning techniques that remove both types of noise. Because data sets can contain large amounts of noise, these techniques also need to be able to discard a potentially large fraction of the data. This paper explores four techniques intended for noise removal to enhance data analysis in the presence of high noise levels. Three of these methods are based on traditional outlier detection techniques: distance-based, clustering-based, and an approach based on the local outlier factor (LOF) of an object. The other technique, which is a new method that we are proposing, is a hyperclique-based data cleaner (HCleaner). These techniques are evaluated in terms of their impact on the subsequent data analysis, specifically, clustering and association analysis. Our experimental results show that all of these methods can provide better clustering performance and higher quality association patterns as the amount of noise being removed increases, although HCleaner generally leads to better clustering performance and higher quality associations than the other three methods for binary data.

220 citations
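
As a rough illustration of the distance-based noise removal mentioned in the abstract above, the following minimal sketch scores each point by the distance to its k-th nearest neighbour and discards the highest-scoring points. It is not the paper's implementation and does not cover HCleaner or LOF; the function name `knn_distance_filter` and the parameters `k` and `n_drop` are assumptions made for this example.

```python
import math


def knn_distance_filter(points, k=2, n_drop=1):
    """Remove the n_drop points that lie farthest from their k-th nearest neighbour.

    points: list of equal-length numeric tuples (hypothetical input format).
    k:      which neighbour's distance is used as the outlier score.
    n_drop: how many of the highest-scoring points to discard as noise.
    """
    if len(points) <= k:
        raise ValueError("need more than k points")

    # Outlier score of each point: Euclidean distance to its k-th nearest neighbour.
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])

    # Indices of the n_drop points with the largest scores.
    order = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    dropped = set(order[:n_drop])

    # Keep the remaining points in their original order.
    return [p for i, p in enumerate(points) if i not in dropped]


data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
cleaned = knn_distance_filter(data, k=2, n_drop=1)
print(cleaned)
```

Raising `n_drop` discards a larger fraction of the data, which corresponds to the setting the abstract describes, where the amount of removed noise is varied and the effect on later clustering or association analysis is measured.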


Network Information
Related Topics (5)
Cluster analysis
146.5K papers, 2.9M citations
90% related
Support vector machine
73.6K papers, 1.7M citations
90% related
Artificial neural network
207K papers, 4.5M citations
87% related
Fuzzy logic
151.2K papers, 2.3M citations
86% related
Feature extraction
111.8K papers, 2.1M citations
86% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    120
2022    285
2021    506
2020    660
2019    740
2018    683