Topic

Knowledge extraction

About: Knowledge extraction is a research topic. Over its lifetime, 20,251 publications have been published within this topic, receiving 413,401 citations.


Papers
Patent
04 Nov 2002
TL;DR: A data mining framework for mining high-quality structured clinical information is presented, including a data miner that mines medical information from a computerized patient record (CPR) based on domain-specific knowledge contained in a knowledge base.
Abstract: The present invention provides a data mining framework for mining high-quality structured clinical information. The data mining framework includes a data miner that mines medical information from a computerized patient record (CPR) based on domain-specific knowledge contained in a knowledge base. The data miner includes components for extracting information from the CPR, combining all available evidence in a principled fashion over time, and drawing inferences from this combination process. The mined medical information is stored in a structured CPR which can be a data warehouse.

114 citations

Journal Article
TL;DR: Preliminary results generated from the semantic retrieval research component of the Illinois Digital Library Initiative (DLI) project are presented, which aimed to create graphs of domain-specific concepts and their weighted co-occurrence relationships for all major engineering domains.
Abstract: This research presents preliminary results generated from the semantic retrieval research component of the Illinois Digital Library Initiative (DLI) project. Using a variation of the automatic thesaurus generation techniques, which we refer to as the concept space approach, we aimed to create graphs of domain-specific concepts (terms) and their weighted co-occurrence relationships for all major engineering domains. Merging these concept spaces and providing traversal paths across different concept spaces could potentially help alleviate the vocabulary (difference) problem evident in large-scale information retrieval. To address the scalability issue related to large-scale information retrieval and analysis for the current Illinois DLI project, we conducted experiments using the concept space approach on parallel supercomputers. Our test collection included computer science and electrical engineering abstracts extracted from the INSPEC database. The concept space approach called for extensive textual and statistical analysis (a form of knowledge discovery) based on automatic indexing and co-occurrence analysis algorithms, both previously tested in the biology domain. Initial testing results using a 512-node CM-5 and a 16-processor SGI Power Challenge were promising.

114 citations
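
The weighted co-occurrence graphs mentioned in the abstract above can be illustrated with a minimal sketch. This is not the Illinois DLI project's algorithm: the function name `cooccurrence_graph`, the pre-tokenized input, and the Jaccard-style weighting are all assumptions chosen for illustration, and the abstract does not specify the actual weighting scheme.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(docs):
    """Build a weighted term co-occurrence graph (illustrative sketch).

    docs: iterable of term lists, assumed to be already indexed/tokenized.
    Returns {(term_a, term_b): weight} with term_a < term_b. The weight is
    the Jaccard coefficient of the two terms' document sets, which is an
    assumption made for this example only.
    """
    doc_freq = Counter()   # number of documents containing each term
    pair_freq = Counter()  # number of documents containing both terms
    for doc in docs:
        terms = sorted(set(doc))  # count each term once per document
        doc_freq.update(terms)
        pair_freq.update(combinations(terms, 2))
    return {
        (a, b): n / (doc_freq[a] + doc_freq[b] - n)
        for (a, b), n in pair_freq.items()
    }


# Hypothetical toy collection standing in for indexed abstracts.
abstracts = [
    ["neural", "network", "retrieval"],
    ["neural", "network"],
    ["retrieval", "thesaurus"],
]
graph = cooccurrence_graph(abstracts)
```

Terms that always appear together receive weight 1.0, and the weight shrinks as either term appears more often without the other. A symmetric weight keeps the sketch short; a thesaurus-style application might prefer an asymmetric one.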

01 Jan 1991
TL;DR: The quality of database security in general is stressed as an important set of controls to secure against knowledge discovery and some of the potential security risks associated with knowledge discovery are discussed.
Abstract: This chapter investigates the effect of knowledge discovery from databases on the security of databases. First, it examines the current concern with database systems for security from knowledge discovery. Second, this chapter discusses some of the potential security risks associated with knowledge discovery. Third, some potential structure for the development of controls for such systems is examined. It is suggested that the technology itself and the voluntary or involuntary nature of the unauthorized disclosure form a basis for analysis. Fourth, the quality of database security in general is stressed as an important set of controls to secure against knowledge discovery. Finally, it is noted that any set of controls should be compared with the benefit derived from these controls. Security can be established; however, it is important not to forget that ease for the decision maker is one of the primary reasons for creating and maintaining the

114 citations

09 Nov 1996
TL;DR: The Co4 system is dedicated to the representation of formal knowledge in an object and task based manner and is fully interleaved with hyper-documents and thus provides integration of formal and informal knowledge.
Abstract: The Co4 system is dedicated to the representation of formal knowledge in an object and task based manner. It is fully interleaved with hyper-documents and thus provides integration of formal and informal knowledge. Moreover, consensus about the content of the knowledge bases is enforced with the help of a protocol for integrating knowledge through several levels of consensual knowledge bases. Co4 is presented here as addressing three claims about corporate memory: (1) it must be formalised to the greatest possible extent so that its semantics is clear and its manipulation can be automated; (2) it cannot be totally formalised and thus formal and informal knowledge must be organised such that they refer to each other; (3) in order to be useful, it must be accepted by the people involved (providers and users) and thus must be non contradictory and consensual.

114 citations

Book
15 Dec 1999
TL;DR: This book discusses current Approaches to Process Monitoring, Diagnosis and Control, and a method for Selection of Training / Test Data and Model Retraining for Supervised Learning for Operational Support.
Abstract: 1 Introduction.- 1.1 Current Approaches to Process Monitoring, Diagnosis and Control.- 1.2 Monitoring Charts for Statistical Quality Control.- 1.3 The Operating Window.- 1.4 State Space Based Process Monitoring and Control.- 1.5 Characteristics of Process Operational Data.- 1.6 System Requirement and Architecture.- 1.7 Outline of the Book.- 2 Data Mining and Knowledge Discovery - an Overview.- 2.1 Definition and Development.- 2.2 The KDD Process.- 2.3 Data Mining Techniques.- 2.4 Feature Selection with Data Mining.- 2.5 Final Remarks and Additional Resources.- 3 Data Pre-processing for Feature Extraction, Dimension Reduction and Concept Formation.- 3.1 Data Pre-processing.- 3.2 Use of Principal Component Analysis.- 3.3 Wavelet Analysis.- 3.4 Episode Approach.- 3.5 Summary.- 4 Multivariate Statistical Analysis for Data Analysis and Statistical Control.- 4.1 PCA for State Identification and Monitoring.- 4.2 Partial Least Squares (PLS).- 4.3 Variable Contribution Plots.- 4.4 Multiblock PCA and PLS.- 4.5 Batch Process Monitoring Using Multiway PCA.- 4.6 Nonlinear PCA.- 4.7 Operational Strategy Development and Product Design - an Industrial Case Study.- 4.8 General Observations.- 5 Supervised Learning for Operational Support.- 5.1 Feedforward Neural Networks.- 5.2 Variable Selection and Feature Extraction for FFNN Inputs.- 5.3 Model Validation and Confidence Bounds.- 5.4 Application of FFNN to Process Fault Diagnosis.- 5.5 Fuzzy Neural Networks.- 5.6 Fuzzy Set Covering Method.- 5.7 Fuzzy Signed Digraphs.- 5.8 Case Studies.- 5.9 General Observations.- 6 Unsupervised Learning for Operational State Identification.- 6.1 Supervised vs. Unsupervised Learning.- 6.2 Adaptive Resonance Theory.- 6.3 A Framework for Integrating Wavelet Feature Extraction and ART2.- 6.4 Application of ARTnet to the FCC Process.- 6.5 Bayesian Automatic Classification.- 6.6 Application of AutoClass to the FCC Process.- 6.7 General Comments.- 7 Inductive Learning for Conceptual Clustering and Real-time Process Monitoring.- 7.1 Inductive Learning.- 7.2 IL for Knowledge Discovery from Averaged Data.- 7.3 IL for Conceptual Clustering and Real-time Monitoring.- 7.4 Application to the Refinery MTBE Process.- 7.5 General Review.- 8 Automatic Extraction of Knowledge Rules from Process Operational Data.- 8.1 Rules Generation Using Fuzzy Set Operation.- 8.2 Rules Generation from Neural Networks.- 8.3 Rules Generation Using Rough Set Method.- 8.4 A Fuzzy Neural Network Method for Rules Extraction.- 8.5 Discussion.- 9 Inferential Models and Software Sensors.- 9.1 Feedforward Neural Networks as Software Sensors.- 9.2 A Method for Selection of Training / Test Data and Model Retraining.- 9.3 An Industrial Case Study.- 9.4 Dimension Reduction of Input Variables.- 9.5 Dynamic Neural Networks as Inferential Models.- 9.6 Summary.- 10 Concluding Remarks.- Appendix A The Continuous Stirred Tank Reactor (CSTR).- Appendix B The Residue Fluid Catalytic Cracking (R-FCC) Process.- Appendix C The Methyl Tertiary Butyl Ether (MTBE) Process.- References.

113 citations


Network Information
Related Topics (5)
Cluster analysis
146.5K papers, 2.9M citations
90% related
Support vector machine
73.6K papers, 1.7M citations
90% related
Artificial neural network
207K papers, 4.5M citations
87% related
Fuzzy logic
151.2K papers, 2.3M citations
86% related
Feature extraction
111.8K papers, 2.1M citations
86% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  120
2022  285
2021  506
2020  660
2019  740
2018  683