Topic

Knowledge extraction

About: Knowledge extraction is a research topic. Over its lifetime, 20,251 publications have been published within this topic, receiving 413,401 citations.


Papers
Journal Article
TL;DR: This paper presents the experiences of the authors and others in applying exploratory data mining techniques to medical, health and clinical data and provides pointers to possible areas of future research in data mining and knowledge discovery more broadly.
Abstract: The application of data mining and knowledge discovery techniques to medical and health datasets is a rewarding but highly challenging area. Not only are the datasets large, complex, heterogeneous, hierarchical, time-varying and of varying quality but there exists a substantial medical knowledge base which demands a robust collaboration between the data miner and the health professional(s) if useful information is to be extracted. This paper presents the experiences of the authors and others in applying exploratory data mining techniques to medical, health and clinical data. In so doing, it elicits a number of general issues and provides pointers to possible areas of future research in data mining and knowledge discovery more broadly.

104 citations

Proceedings Article
03 Aug 2013
TL;DR: This paper proposes a novel knowledge-based model, called MDK-LDA, which is capable of using prior knowledge from multiple domains, and its evaluation results will demonstrate its effectiveness.
Abstract: Topic models have been widely used to identify topics in text corpora. It is also known that purely unsupervised models often result in topics that are not comprehensible in applications. In recent years, a number of knowledge-based models have been proposed, which allow the user to input prior knowledge of the domain to produce more coherent and meaningful topics. In this paper, we go one step further to study how the prior knowledge from other domains can be exploited to help topic modeling in the new domain. This problem setting is important from both the application and the learning perspectives because knowledge is inherently accumulative. We human beings gain knowledge gradually and use the old knowledge to help solve new problems. To achieve this objective, existing models have some major difficulties. In this paper, we propose a novel knowledge-based model, called MDK-LDA, which is capable of using prior knowledge from multiple domains. Our evaluation results will demonstrate its effectiveness.

104 citations

Proceedings Article
01 Jan 2001
TL;DR: This paper describes a Semantic Annotation Tool that extracts knowledge structures from web pages using simple user-defined knowledge extraction patterns, with the aim of supporting ontology population through its information extraction component.
Abstract: This paper describes a Semantic Annotation Tool for extraction of knowledge structures from web pages through the use of simple user-defined knowledge extraction patterns. The semantic annotation tool contains: an ontology-based mark-up component which allows the user to browse and to mark up relevant pieces of information; a learning component (Crystal from the University of Massachusetts at Amherst) which learns rules from examples; and an information extraction component which extracts the objects and the relations between these objects. Our final aim is to provide support for ontology population by using the information extraction component. Our system uses as its domain of study “KMi Planet”, a Web-based news server that helps to communicate relevant information between members in our institute.

104 citations

Journal Article
01 Dec 2004
TL;DR: A visual concept ontology is proposed to be used to guide experts in the visual description of the objects of their domain (e.g., pollen grain) to result in a knowledge base enabling semantic image interpretation.
Abstract: This paper details a visual-concept-ontology-driven knowledge acquisition methodology. We propose to use a visual concept ontology to guide experts in the visual description of the objects of their domain (e.g., pollen grain). The proposed knowledge acquisition process results in a knowledge base enabling semantic image interpretation. An important benefit of our approach is that the knowledge acquisition process guided by the ontology leads to a knowledge base close to low-level vision. A visual concept ontology and a dedicated knowledge acquisition tool have been developed and are presented. We propose a generic methodology that is not linked to any application domain. An example shows how the knowledge acquisition model can be applied to the description of pollen grain images.

103 citations

01 Apr 2004
TL;DR: This thesis is focused on the monotonicity property in knowledge discovery and more specifically in classification, attribute reduction, function decomposition, frequent patterns generation and missing values handling.
Abstract: The monotonicity property is ubiquitous in our lives and it appears in different roles: as domain knowledge, as a requirement, as a property that reduces the complexity of the problem, and so on. It is present in various domains: economics, mathematics, languages, operations research and many others. This thesis is focused on the monotonicity property in knowledge discovery and more specifically in classification, attribute reduction, function decomposition, frequent patterns generation and missing values handling. Four specific problems are addressed within four different methodologies, namely, rough sets theory, monotone decision trees, function decomposition and frequent patterns generation. In the first three parts, the monotonicity is domain knowledge and a requirement for the outcome of the classification process. The three methodologies are extended for dealing with monotone data in order to be able to guarantee that the outcome will also satisfy the monotonicity requirement. In the last part, monotonicity is a property that helps reduce the computation of the process of frequent patterns generation. Here the focus is on two of the best algorithms and their comparison both theoretically and experimentally.
About the Author: Viara Popova was born in Bourgas, Bulgaria in 1972. She followed her secondary education at Mathematics High School "Nikola Obreshkov" in Bourgas. In 1996 she finished her higher education at Sofia University, Faculty of Mathematics and Informatics, where she graduated with a major in Informatics and a specialization in Information Technologies in Education. She then joined the Department of Information Technologies, first as an associated member and from 1997 as an assistant professor. In 1999 she became a PhD student at Erasmus University Rotterdam, Faculty of Economics, Department of Computer Science. In 2004 she joined the Artificial Intelligence Group within the Department of Computer Science, Faculty of Sciences at Vrije Universiteit Amsterdam as a PostDoc researcher.

103 citations


Network Information
Related Topics (5)
Cluster analysis: 146.5K papers, 2.9M citations, 90% related
Support vector machine: 73.6K papers, 1.7M citations, 90% related
Artificial neural network: 207K papers, 4.5M citations, 87% related
Fuzzy logic: 151.2K papers, 2.3M citations, 86% related
Feature extraction: 111.8K papers, 2.1M citations, 86% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    120
2022    285
2021    506
2020    660
2019    740
2018    683