scispace - formally typeset
Search or ask a question
Topic

Knowledge extraction

About: Knowledge extraction is a research topic. Over the lifetime, 20251 publications have been published within this topic receiving 413401 citations.


Papers
More filters
Proceedings Article
01 Aug 1999
TL;DR: An overview of approaches for ontologies and problem-solving methods is given, which can be viewed as complementary entities that can be used to configure new knowledge systems from existing, reusable components.
Abstract: Ontologies and problem-solving methods are promising candidates for reuse in Knowledge Engineering. Ontologies define domain knowledge at a generic level, while problem-solving methods specify generic reasoning knowledge. Both type of components can be viewed as complementary entities that can be used to configure new knowledge systems from existing, reusable components. In this paper, we give an overview of approaches for ontologies and problem-solving methods.

418 citations

Proceedings ArticleDOI
01 Apr 1997
TL;DR: The study shows that the new incremental algorithm is signijcantly faster than the traditional approach of mining the whole updated database, and compared with the best algorithms for mining association rules studied so far.
Abstract: A more general incremental updating technique is developed for maintaining the association rules discovered in a database in the cases including insertion, deletion, and modijication of transactions in the database. A previously proposed algorithm FUP can only handle the maintenance problem in the case of insertion. The proposed algorithm FUP2 makes use of the previous mining result to cut down the cost of finding the new rules in an updated database. In the insertion only case, FUP2 is equivalent to FUP. In the deletion only case, FUP2 is a complementary algorithm of FUP which is very eficient when the deleted transactions is a small part of the database, which is the most applicable case. In the general case, FUP2 can elqiciently update the discovered rules when new transactions are added to a transaction database, and obsolete transactions are removed from it. The proposed algorithm has been implemented and its performance is studied and compared with the best algorithms for mining association rules studied so far. The study shows that the new incremental algorithm is signijcantly faster than the traditional approach of mining the whole updated database.

412 citations

Journal ArticleDOI
TL;DR: This paper provides an introduction to ontology-based information extraction and reviews the details of different OBIE systems developed so far to identify a common architecture among these systems and classify them based on different factors, which leads to a better understanding on their operation.
Abstract: Information extraction (IE) aims to retrieve certain types of information from natural language text by processing them automatically. For example, an IE system might retrieve information about geopolitical indicators of countries from a set of web pages while ignoring other types of information. Ontology-based information extraction (OBIE) has recently emerged as a subfield of information extraction. Here, ontologies - which provide formal and explicit specifications of conceptualizations - play a crucial role in the IE process. Because of the use of ontologies, this field is related to knowledge representation and has the potential to assist the development of the Semantic Web. In this paper, we provide an introduction to ontology-based information extraction and review the details of different OBIE systems developed so far. We attempt to identify a common architecture among these systems and classify them based on different factors, which leads to a better understanding on their operation. We also discuss the implementation details of these systems including the tools used by them and the metrics used to measure their performance. In addition, we attempt to identify the possible future directions for this field.

409 citations

Journal ArticleDOI
TL;DR: A model for a new type of topic-specific overview resource that provides efficient access to distributed information is developed, which is a freely accessible Web resource that offers one hypertext 'card' for each of the more than 7000 human genes that have an approved gene symbol published by the HUGO/GDB nomenclature committee.
Abstract: Motivation: Modem biology is shifting from the 'one gene one postdoc' approach to genomic analyses that include the simultaneous monitoring of thousands of genes. The importance of efficient access to concise and integrated biomedical information to support data analysis and decision making is therefore increasing rapidly, in both academic and industrial research. However, knowledge discovery in the widely scattered resources relevant for biomedical research is often a cumbersome and non-trivial task, one that requires a significant amount of training and effort. Results: To develop a model for a new type of topic-specific overview resource that provides efficient access to distributed information, we designed a database called 'GeneCards'. It is a freely accessible Web resource that offers one hypertext 'card' for each of the more than 7000 human genes that currently have an approved gene symbol published by the HUGO/GDB nomenclature committee. The presented information aims at giving immediate insight into current knowledge about the respective gene, including a focus on its functions in health and disease. It is compiled by Perl scripts that automatically extract relevant information from several databases, including SWISS-PROT, OMIM, Genatlas and GDB. Analyses of the interactions of users with the Web interface of GeneCards triggered development of easy-to-scan displays optimized for human browsing. Also, we developed algorithms that offer 'ready-to-click' query reformulation support, to facilitate information retrieval and exploration. Many of the long-term users turn to GeneCards to quickly access information about the function of very large sets of genes, for example in the realm of large-scale expression studies using 'DNA chip' technology or two-dimensional protein electrophoresis. Availability: Freely available at http://bioinformatics.weizmann.ac.il/cards/ Contact: cards@bioinformatics.weizmann.ac. il.

402 citations


Network Information
Related Topics (5)
Cluster analysis
146.5K papers, 2.9M citations
90% related
Support vector machine
73.6K papers, 1.7M citations
90% related
Artificial neural network
207K papers, 4.5M citations
87% related
Fuzzy logic
151.2K papers, 2.3M citations
86% related
Feature extraction
111.8K papers, 2.1M citations
86% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
2023120
2022285
2021506
2020660
2019740
2018683