Topic

Knowledge extraction

About: Knowledge extraction is a research topic. Over its lifetime, 20,251 publications have been published within this topic, receiving 413,401 citations.

Papers

Journal Article

Discovery of frequent DATALOG patterns

Luc Dehaspe¹, Hannu Toivonen²

Katholieke Universiteit Leuven¹, University of Helsinki²

01 Mar 1999-Data Mining and Knowledge Discovery

TL;DR: WARMR is presented, a general purpose inductive logic programming algorithm that addresses frequent query discovery: a very general DATALOG formulation of the frequent pattern discovery problem.

Abstract: Discovery of frequent patterns has been studied in a variety of data mining settings. In its simplest form, known from association rule mining, the task is to discover all frequent itemsets, i.e., all combinations of items that are found in a sufficient number of examples. The fundamental task of association rule and frequent set discovery has been extended in various directions, allowing more useful patterns to be discovered with special purpose algorithms. We present WARMR, a general purpose inductive logic programming algorithm that addresses frequent query discovery: a very general DATALOG formulation of the frequent pattern discovery problem. The motivation for this novel approach is twofold. First, exploratory data mining is well supported: WARMR offers the flexibility required to experiment with standard and in particular novel settings not supported by special purpose algorithms. Also, application prototypes based on WARMR can be used as benchmarks in the comparison and evaluation of new special purpose algorithms. Second, the unified representation gives insight to the blurred picture of the frequent pattern discovery domain. Within the DATALOG formulation a number of dimensions appear that relink diverged settings. We demonstrate the frequent query approach and its use on two applications, one in alarm analysis, and one in a chemical toxicology domain.
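The simplest form of the task mentioned in the abstract, finding all itemsets that occur in a sufficient number of examples, can be sketched in a few lines. This is only an illustrative brute-force level-wise search over plain itemsets, not WARMR and not the DATALOG query formulation; the function name `frequent_itemsets`, the `min_support` parameter, and the sample baskets are assumptions made up for this example.

```python
from itertools import combinations


def frequent_itemsets(examples, min_support):
    """Return {itemset: count} for itemsets found in at least min_support examples."""
    examples = [frozenset(e) for e in examples]
    items = sorted({item for e in examples for item in e})
    frequent = {}
    # Level-wise search: try all itemsets of size 1, then size 2, and so on.
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            # Support = number of examples that contain every item of the set.
            count = sum(1 for e in examples if itemset <= e)
            if count >= min_support:
                frequent[itemset] = count
                found_any = True
        # If no itemset of this size is frequent, no larger one can be.
        if not found_any:
            break
    return frequent


# Hypothetical toy data: each example is a set of items.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = frequent_itemsets(baskets, min_support=2)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The early exit relies on the fact that every subset of a frequent itemset is itself frequent, which is the same pruning idea that special purpose association rule algorithms exploit more aggressively.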

330 citations

Journal Article

Image Captioning and Visual Question Answering Based on Attributes and External Knowledge

Qi Wu¹, Chunhua Shen¹, Peng Wang¹, Anthony Dick¹, Anton van den Hengel¹

University of Adelaide¹

01 Jun 2018-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A visual question answering model that combines an internal representation of the content of an image with information extracted from a general knowledge base to answer a broad range of image-based questions and allows questions to be asked where the image alone does not contain the information required to select the appropriate answer.

Abstract: Much of the recent progress in Vision-to-Language problems has been achieved through a combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). This approach does not explicitly represent high-level semantic concepts, but rather seeks to progress directly from image features to text. In this paper we first propose a method of incorporating high-level concepts into the successful CNN-RNN approach, and show that it achieves a significant improvement on the state-of-the-art in both image captioning and visual question answering. We further show that the same mechanism can be used to incorporate external knowledge, which is critically important for answering high level visual questions. Specifically, we design a visual question answering model that combines an internal representation of the content of an image with information extracted from a general knowledge base to answer a broad range of image-based questions. It particularly allows questions to be asked where the image alone does not contain the information required to select the appropriate answer. Our final model achieves the best reported results for both image captioning and visual question answering on several of the major benchmark datasets.

329 citations

Book

Data Mining: A Tutorial Based Primer

Richard J. Roiger

06 Oct 2002

TL;DR: This book is a tutorial-based primer on data mining, covering data mining fundamentals, tools for knowledge discovery, advanced data mining techniques, and intelligent systems such as rule-based systems and intelligent agents.

Abstract: (Each chapter concludes with a Chapter Summary, Key Terms, and Exercises.)

Preface.

I. DATA MINING FUNDAMENTALS
1. Data Mining: A First View. Data Mining: A Definition. What Can Computers Learn? Is Data Mining Appropriate for my Problem? Expert Systems or Data Mining? A Simple Data Mining Process Model. Why not Simple Search? Data Mining Applications.
2. Data Mining: A Closer Look. Data Mining Strategies. Supervised Data Mining Techniques. Association Rules. Clustering Techniques. Evaluating Performance.
3. Basic Data Mining Techniques. Decision Trees. Generating Association Rules. The K-Means Algorithm. Genetic Learning. Choosing a Data Mining Technique.
4. An Excel-Based Data Mining Tool. The iData Analyzer. ESX: A Multipurpose Tool for Data Mining. iDAV Format for Data Mining. A Five-Step Approach for Unsupervised Clustering. A Six-Step Approach for Supervised Learning. Techniques for Generating Rules. Instance Typicality. Special Considerations and Features.

II. TOOLS FOR KNOWLEDGE DISCOVERY
5. Knowledge Discovery in Databases. A KDD Process Model. Step 1: Goal Identification. Step 2: Creating a Target Data Set. Step 3: Data Preprocessing. Step 4: Data Transformation. Step 5: Data Mining. Step 6: Interpretation and Evaluation. Step 7: Taking Action. The CRISP-DM Process Model. Experimenting with ESX.
6. The Data Warehouse. Operational Databases. Data Warehouse Design. On-line Analytical Processing (OLAP). Excel Pivot Tables for Data Analysis.
7. Formal Evaluation Techniques. What Should be Evaluated? Tools for Evaluation. Computing Test Set Confidence Intervals. Comparing Supervised Learner Models. Attribute Evaluation. Unsupervised Evaluation Techniques. Evaluating Supervised Models with Numeric Output.

III. ADVANCED DATA MINING TECHNIQUES
8. Neural Networks. Feed-Forward Neural Networks. Neural Network Training: A Conceptual View. Neural Network Explanation. General Considerations. Neural Network Learning: A Detailed View.
9. Building Neural Networks with iDA. A Four-Step Approach for Backpropagation Learning. A Four-Step Approach for Neural Network Clustering. ESX for Neural Network Cluster Analysis.
10. Statistical Techniques. Linear Regression Analysis. Logistic Regression. Bayes Classifier. Clustering Algorithms. Heuristics or Statistics?
11. Specialized Techniques. Time-Series Analysis. Mining the Web. Mining Textual Data. Improving Performance.

IV. INTELLIGENT SYSTEMS
12. Rule-Based Systems. Exploring Artificial Intelligence. Problem Solving as a State Space Search. Expert Systems. Structuring a Rule-Based System.
13. Managing Uncertainty in Rule-Based Systems. Uncertainty: Sources and Solutions. Fuzzy Rule-Based Systems. A Probability-Based Approach to Uncertainty.
14. Intelligent Agents. Characteristics of Intelligent Agents. Types of Agents. Integrating Data Mining, Expert Systems, and Intelligent Agents.

Appendix
Appendix A: Software Installation.
Appendix B: Datasets for Data Mining.
Appendix C: Decision Tree Attribute Selection.
Appendix D: Statistics for Performance Evaluation.
Appendix E: Excel 97 Pivot Tables.

Bibliography.

326 citations

Journal Article

Using evolutionary algorithms as instance selection for data reduction in KDD: an experimental study

José Ramón Cano¹, Francisco Herrera², Manuel Lozano²

University of Huelva¹, University of Granada²

01 Dec 2003-IEEE Transactions on Evolutionary Computation

TL;DR: The results show that the evolutionary instance selection algorithms consistently outperform the nonevolutionary ones, the main advantages being: better instance reduction rates, higher classification accuracy, and models that are easier to interpret.

Abstract: Evolutionary algorithms are adaptive methods based on natural evolution that may be used for search and optimization. As data reduction in knowledge discovery in databases (KDDs) can be viewed as a search problem, it could be solved using evolutionary algorithms (EAs). In this paper, we have carried out an empirical study of the performance of four representative EA models in which we have taken into account two different instance selection perspectives, the prototype selection and the training set selection for data reduction in KDD. This paper includes a comparison between these algorithms and other nonevolutionary instance selection algorithms. The results show that the evolutionary instance selection algorithms consistently outperform the nonevolutionary ones, the main advantages being: better instance reduction rates, higher classification accuracy, and models that are easier to interpret.

325 citations

Journal Article

Reactome graph database: Efficient access to complex pathway data.

Antonio Fabregat¹, Florian Korninger¹, Guilherme Viteri¹, Konstantinos Sidiropoulos¹, Pablo Marin-Garcia², Peipei Ping³, Guanming Wu⁴, Lincoln Stein⁵,⁶, Peter D'Eustachio⁷, Henning Hermjakob¹,⁸

European Bioinformatics Institute¹, University of Valencia², University of California, Los Angeles³, Oregon Health & Science University⁴, Ontario Institute for Cancer Research⁵, University of Toronto⁶, New York University⁷, Protein Sciences⁸

29 Jan 2018-PLOS Computational Biology

TL;DR: The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types.

Abstract: Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types.

324 citations

Performance Metrics

Papers: 20,644

Citations: 453,302

No. of papers in the topic in previous years
Year	Papers
2023	120
2022	285
2021	506
2020	660
2019	740
2018	683
