scispace - formally typeset
Search or ask a question
Topic

Knowledge extraction

About: Knowledge extraction is a research topic. Over the lifetime, 20251 publications have been published within this topic receiving 413401 citations.


Papers
More filters
Journal ArticleDOI
TL;DR: The CityPulse framework supports smart city service creation by means of a distributed system for semantic discovery, data analytics, and interpretation of large-scale (near-)real-time Internet of Things data and social media data streams to break away from silo applications and enable cross-domain data integration.
Abstract: Our world and our lives are changing in many ways. Communication, networking, and computing technologies are among the most influential enablers that shape our lives today. Digital data and connected worlds of physical objects, people, and devices are rapidly changing the way we work, travel, socialize, and interact with our surroundings, and they have a profound impact on different domains, such as healthcare, environmental monitoring, urban systems, and control and management applications, among several other areas. Cities currently face an increasing demand for providing services that can have an impact on people’s everyday lives. The CityPulse framework supports smart city service creation by means of a distributed system for semantic discovery, data analytics, and interpretation of large-scale (near-)real-time Internet of Things data and social media data streams. To goal is to break away from silo applications and enable cross-domain data integration. The CityPulse framework integrates multimodal, mixed quality, uncertain and incomplete data to create reliable, dependable information and continuously adapts data processing techniques to meet the quality of information requirements from end users. Different than existing solutions that mainly offer unified views of the data, the CityPulse framework is also equipped with powerful data analytics modules that perform intelligent data aggregation, event detection, quality assessment, contextual filtering, and decision support. This paper presents the framework, describes its components, and demonstrates how they interact to support easy development of custom-made applications for citizens. The benefits and the effectiveness of the framework are demonstrated in a use-case scenario implementation presented in this paper.

199 citations

Book
06 Feb 2008
TL;DR: The GeoPKDD project as mentioned in this paper, a research project for Geographic Privacy-Aware Knowledge Discovery and Delivery (GeoPKDD), is an example of such a project, involving 40 researchers from 7 countries.
Abstract: The technologies of mobile communications and ubiquitous computing pervade our society, and wireless networks sense the movement of people and vehicles, generating large volumes of mobility data. This is a scenario of great opportunities and risks: on one side, mining this data can produce useful knowledge, supporting sustainable mobility and intelligent transportation systems; on the other side, individual privacy is at risk, as the mobility data contain sensitive personal information. A new multidisciplinary research area is emerging at this crossroads of mobility, data mining, and privacy. This book assesses this research frontier from a computer science perspective, investigating the various scientific and technological issues, open problems, and roadmap. The editors manage a research project called GeoPKDD, Geographic Privacy-Aware Knowledge Discovery and Delivery, funded by the EU Commission and involving 40 researchers from 7 countries, and this book tightly integrates and relates their findings in 13 chapters covering all related subjects, including the concepts of movement data and knowledge discovery from movement data; privacy-aware geographic knowledge discovery; wireless network and next-generation mobile technologies; trajectory data models, systems and warehouses; privacy and security aspects of technologies and related regulations; querying, mining and reasoning on spatiotemporal data; and visual analytics methods for movement data. This book will benefit researchers and practitioners in the related areas of computer science, geography, social science, statistics, law, telecommunications and transportation engineering.

198 citations

BookDOI
01 Jan 2005
TL;DR: In this paper, a WSRF-enabled Weka Toolkit for Distributed Data Mining on Grids is presented for NER using Inductive Logic Programming for predicting protein-protein interactions from multiple Genomic Data.
Abstract: Invited Talks.- Data Analysis in the Life Sciences - Sparking Ideas -.- Machine Learning for Natural Language Processing (and Vice Versa?).- Statistical Relational Learning: An Inductive Logic Programming Perspective.- Recent Advances in Mining Time Series Data.- Focus the Mining Beacon: Lessons and Challenges from the World of E-Commerce.- Data Streams and Data Synopses for Massive Data Sets.- Long Papers.- k-Anonymous Patterns.- Interestingness is Not a Dichotomy: Introducing Softness in Constrained Pattern Mining.- Generating Dynamic Higher-Order Markov Models in Web Usage Mining.- Tree 2 - Decision Trees for Tree Structured Data.- Agglomerative Hierarchical Clustering with Constraints: Theoretical and Empirical Results.- Cluster Aggregate Inequality and Multi-level Hierarchical Clustering.- Ensembles of Balanced Nested Dichotomies for Multi-class Problems.- Protein Sequence Pattern Mining with Constraints.- An Adaptive Nearest Neighbor Classification Algorithm for Data Streams.- Support Vector Random Fields for Spatial Classification.- Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker Multiplication.- A Correspondence Between Maximal Complete Bipartite Subgraphs and Closed Patterns.- Improving Generalization by Data Categorization.- Mining Model Trees from Spatial Data.- Word Sense Disambiguation for Exploiting Hierarchical Thesauri in Text Classification.- Mining Paraphrases from Self-anchored Web Sentence Fragments.- M2SP: Mining Sequential Patterns Among Several Dimensions.- A Systematic Comparison of Feature-Rich Probabilistic Classifiers for NER Tasks.- Knowledge Discovery from User Preferences in Conversational Recommendation.- Unsupervised Discretization Using Tree-Based Density Estimation.- Weighted Average Pointwise Mutual Information for Feature Selection in Text Categorization.- Non-stationary Environment Compensation Using Sequential EM Algorithm for Robust Speech Recognition.- Hybrid Cost-Sensitive Decision Tree.- Characterization of Novel HIV Drug Resistance Mutations Using Clustering, Multidimensional Scaling and SVM-Based Feature Ranking.- Object Identification with Attribute-Mediated Dependences.- Weka4WS: A WSRF-Enabled Weka Toolkit for Distributed Data Mining on Grids.- Using Inductive Logic Programming for Predicting Protein-Protein Interactions from Multiple Genomic Data.- ISOLLE: Locally Linear Embedding with Geodesic Distance.- Active Sampling for Knowledge Discovery from Biomedical Data.- A Multi-metric Index for Euclidean and Periodic Matching.- Fast Burst Correlation of Financial Data.- A Propositional Approach to Textual Case Indexing.- A Quantitative Comparison of the Subgraph Miners MoFa, gSpan, FFSM, and Gaston.- Efficient Classification from Multiple Heterogeneous Databases.- A Probabilistic Clustering-Projection Model for Discrete Data.- Short Papers.- Collaborative Filtering on Data Streams.- The Relation of Closed Itemset Mining, Complete Pruning Strategies and Item Ordering in Apriori-Based FIM Algorithms.- Community Mining from Multi-relational Networks.- Evaluating the Correlation Between Objective Rule Interestingness Measures and Real Human Interest.- A Kernel Based Method for Discovering Market Segments in Beef Meat.- Corpus-Based Neural Network Method for Explaining Unknown Words by WordNet Senses.- Segment and Combine Approach for Non-parametric Time-Series Classification.- Producing Accurate Interpretable Clusters from High-Dimensional Data.- Stress-Testing Hoeffding Trees.- Rank Measures for Ordering.- Dynamic Ensemble Re-Construction for Better Ranking.- Frequency-Based Separation of Climate Signals.- Efficient Processing of Ranked Queries with Sweeping Selection.- Feature Extraction from Mass Spectra for Classification of Pathological States.- Numbers in Multi-relational Data Mining.- Testing Theories in Particle Physics Using Maximum Likelihood and Adaptive Bin Allocation.- Improved Naive Bayes for Extremely Skewed Misclassification Costs.- Clustering and Prediction of Mobile User Routes from Cellular Data.- Elastic Partial Matching of Time Series.- An Entropy-Based Approach for Generating Multi-dimensional Sequential Patterns.- Visual Terrain Analysis of High-Dimensional Datasets.- An Auto-stopped Hierarchical Clustering Algorithm for Analyzing 3D Model Database.- A Comparison Between Block CEM and Two-Way CEM Algorithms to Cluster a Contingency Table.- An Imbalanced Data Rule Learner.- Improvements in the Data Partitioning Approach for Frequent Itemsets Mining.- On-Line Adaptive Filtering of Web Pages.- A Bi-clustering Framework for Categorical Data.- Privacy-Preserving Collaborative Filtering on Vertically Partitioned Data.- Indexed Bit Map (IBM) for Mining Frequent Sequences.- STochFS: A Framework for Combining Feature Selection Outcomes Through a Stochastic Process.- Speeding Up Logistic Model Tree Induction.- A Random Method for Quantifying Changing Distributions in Data Streams.- Deriving Class Association Rules Based on Levelwise Subspace Clustering.- An Incremental Algorithm for Mining Generators Representation.- Hybrid Technique for Artificial Neural Network Architecture and Weight Optimization.

198 citations

Journal ArticleDOI
TL;DR: A data mining technique that integrates neural network, case-based reasoning, and rule- based reasoning is proposed; it would search the unstructured customer service records for machine fault diagnosis.

198 citations


Network Information
Related Topics (5)
Cluster analysis
146.5K papers, 2.9M citations
90% related
Support vector machine
73.6K papers, 1.7M citations
90% related
Artificial neural network
207K papers, 4.5M citations
87% related
Fuzzy logic
151.2K papers, 2.3M citations
86% related
Feature extraction
111.8K papers, 2.1M citations
86% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
2023120
2022285
2021506
2020660
2019740
2018683