Topic

Knowledge extraction

About: Knowledge extraction is a research topic. Over its lifetime, 20,251 publications have been published on this topic, receiving 413,401 citations.


Papers
Journal Article, DOI
TL;DR: This paper presents a distributed ontology architecture for knowledge management in highway construction, intended to be the base for a cross-discipline knowledge exchange in the infrastructure domain.
Abstract: The ongoing plethora of rehabilitation in the infrastructure domain requires more planning and integration during design and construction. To achieve this, there is a need for developing and using semantic (ontology-based) mechanisms for the exchange of development knowledge among all project stakeholders. This paper presents a distributed ontology architecture for knowledge management in highway construction. With every other utility tied to the highway geometry, the architecture is intended to be the base for a cross-discipline knowledge exchange in the infrastructure domain. The architecture presents highway knowledge on three levels: domain knowledge (an umbrella for infrastructure shared knowledge), application knowledge (representation of highway-specific knowledge), and user knowledge (an enterprise-specific representation of highway knowledge). The proposed architecture models highway concepts using six major root concepts: project, process, product, actor, resources, and technical topics (attribu...

99 citations

Proceedings Article, DOI
28 Jun 2009
TL;DR: The article categorizes the observed techniques into classes and examines the strengths and weaknesses of information visualization and data mining, the purposes for which infovis researchers use data mining techniques, and, conversely, how data mining researchers employ infovis techniques.
Abstract: The aim of this work is to survey and reflect on the various ways to integrate visualization and data mining techniques toward a mixed-initiative knowledge discovery taking the best of human and machine capabilities. Following a bottom-up bibliographic research approach, the article categorizes the observed techniques in classes, highlighting current trends, gaps, and potential future directions for research. In particular it looks at strengths and weaknesses of information visualization and data mining, and for which purposes researchers in infovis use data mining techniques and reversely how researchers in data mining employ infovis techniques. The article further uses this information to analyze the discovery process by comparing the analysis steps from the perspective of information visualization and data mining. The comparison permits to bring to light new perspectives on how mining and visualization can best employ human and machine skills.

99 citations

Journal Article, DOI
TL;DR: This paper overviews the recent developments in learning classifier systems research, the new models, and the most interesting applications, suggesting some of the most relevant future research directions.

98 citations

Journal Article, DOI
TL;DR: This article suggests four kernels: predicate, walk, dependency and hybrid kernels to adequately encapsulate information required for a relation prediction based on the sentential structures involved in two entities, and views the dependency structure of a sentence as a graph, which allows the system to deal with an essential one from the complex syntactic structure by finding the shortest path between entities.
Abstract: Motivation: Automatic knowledge discovery and efficient information access such as named entity recognition and relation extraction between entities have recently become critical issues in the biomedical literature. However, the inherent difficulty of the relation extraction task, mainly caused by the diversity of natural language, is further compounded in the biomedical domain because biomedical sentences are commonly long and complex. In addition, relation extraction often involves modeling long range dependencies, discontiguous word patterns and semantic relations for which the pattern-based methodology is not directly applicable. Results: In this article, we shift the focus of biomedical relation extraction from the problem of pattern extraction to the problem of kernel construction. We suggest four kernels: predicate, walk, dependency and hybrid kernels to adequately encapsulate information required for a relation prediction based on the sentential structures involved in two entities. For this purpose, we view the dependency structure of a sentence as a graph, which allows the system to deal with an essential one from the complex syntactic structure by finding the shortest path between entities. The kernels we suggest are augmented gradually from the flat features descriptions to the structural descriptions of the shortest paths. As a result, we obtain a very promising result, a 77.5 F-score with the walk kernel on the Language Learning in Logic (LLL) 05 genic interaction shared task. Availability: The used algorithms are free for use for academic research and are available from our Web site http://mllab.sogang.ac.kr/~shkim/LLL05.tar.gz. Contact: shkim@lex.yonsei.ac.kr
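The shortest-path idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `shortest_dependency_path`, the edge-list representation, and the example parse are assumptions made for illustration, and the sketch covers only the path-finding step, not the kernels themselves. It treats the dependency structure as an undirected graph and uses breadth-first search to find the shortest path between two entity tokens.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return the shortest token path between source and target in an
    undirected view of a dependency graph, or None if none exists.

    edges is a list of (head, dependent) pairs (a hypothetical format).
    """
    # Build an undirected adjacency map from the head/dependent pairs.
    graph = {}
    for head, dependent in edges:
        graph.setdefault(head, set()).add(dependent)
        graph.setdefault(dependent, set()).add(head)

    if source not in graph or target not in graph:
        return None

    # Breadth-first search: the first time we reach the target,
    # the path we followed is a shortest one.
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical dependency edges for "GerE stimulates cotD transcription".
EDGES = [
    ("stimulates", "GerE"),
    ("stimulates", "transcription"),
    ("transcription", "cotD"),
]
PATH = shortest_dependency_path(EDGES, "GerE", "cotD")
```

In a kernel-based setup, features would then be derived from the tokens and relations along such a path rather than from the full parse; how that is done in the paper is not specified in this abstract.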

98 citations

Proceedings Article, DOI
02 Dec 1995
TL;DR: The advantages of using domain knowledge within the discovery process are highlighted by providing results from the application of the STRIP algorithm in the actuarial domain.
Abstract: The ideal situation for a Data Mining or Knowledge Discovery system would be for the user to be able to pose a query of the form “Give me something interesting that could be useful” and for the system to discover some useful knowledge for the user. But such a system would be unrealistic as databases in the real world are very large and so it would be too inefficient to be workable. So the role of the human within the discovery process is essential. Moreover, the measure of what is meant by “interesting to the user” is dependent on the user as well as the domain within which the Data Mining system is being used. In this paper we discuss the use of domain knowledge within Data Mining. We define three classes of domain knowledge: Hierarchical Generalization Trees (HG-Trees), Attribute Relationship Rules (AR-rules) and Environment-Based Constraints (EBC). We discuss how each one of these types of domain knowledge is incorporated into the discovery process within the EDM (Evidential Data Mining) framework for Data Mining proposed earlier by the authors [ANAN94], and in particular within the STRIP (Strong Rule Induction in Parallel) algorithm [ANAN95] implemented within the EDM framework. We highlight the advantages of using domain knowledge within the discovery process by providing results from the application of the STRIP algorithm in the actuarial domain.

98 citations


Network Information
Related Topics (5)
Cluster analysis
146.5K papers, 2.9M citations
90% related
Support vector machine
73.6K papers, 1.7M citations
90% related
Artificial neural network
207K papers, 4.5M citations
87% related
Fuzzy logic
151.2K papers, 2.3M citations
86% related
Feature extraction
111.8K papers, 2.1M citations
86% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    120
2022    285
2021    506
2020    660
2019    740
2018    683