Journal Article

Toward semantic data imputation for a dengue dataset

TLDR
The paper improves the efficiency of predicting missing data by applying Particle Swarm Optimization (PSO) to the numerical data cleansing problem, with K-means used to help determine the fitness value.
Abstract
Missing data are a major problem for data analysis and forecasting. Traditional imputation methods that rely on simple statistics, e.g., the mean and the mode, predict missing values poorly. In this paper, we present and discuss a novel method of imputing missing values semantically with the use of an ontology model. We make three contributions to the field. First, we improve the efficiency of predicting missing data by applying Particle Swarm Optimization (PSO) to the numerical data cleansing problem, with K-means used to help determine the fitness value and thereby enhance the performance of PSO. Second, we incorporate an ontology with PSO to narrow the search space, so that PSO predicts numerical missing values more accurately while converging quickly. Third, we provide a framework that substitutes nominal data missing from the dataset using the relationships among concepts and a reasoning mechanism over the knowledge-based model. The experimental results indicated that the proposed method estimated missing data more efficiently and with lower error than conventional methods, as measured by the root mean square error.
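
The idea of a PSO search guided by K-means can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the fitness function, so the sketch assumes that a candidate value is scored by the distance from the completed row to its nearest K-means centroid. The function names (`kmeans`, `fitness`, `pso_impute`, `rmse`), the PSO coefficients, and the `bounds` argument are all hypothetical; `bounds` merely stands in for the narrowed search range that the paper obtains from its ontology.

```python
import math
import random


def kmeans(rows, k, iters=20, seed=0):
    """Plain Lloyd's k-means on complete numeric rows; returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for r in rows:
            nearest = min(range(k), key=lambda c: math.dist(r, centroids[c]))
            groups[nearest].append(r)
        for c, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def fitness(row, centroids):
    """Assumed fitness: distance to the nearest centroid (lower is better)."""
    return min(math.dist(row, c) for c in centroids)


def pso_impute(row, missing_idx, centroids, bounds,
               particles=20, iters=50, seed=0):
    """Search for one missing numeric value with a basic 1-D PSO.

    `bounds` is a hypothetical (low, high) range; in the paper the search
    space is narrowed by an ontology, which is not modelled here.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    w, c1, c2 = 0.7, 1.5, 1.5  # assumed inertia and acceleration weights

    def score(x):
        filled = list(row)
        filled[missing_idx] = x
        return fitness(filled, centroids)

    pos = [rng.uniform(lo, hi) for _ in range(particles)]
    vel = [0.0] * particles
    best_pos = pos[:]
    best_val = [score(x) for x in pos]
    g = min(range(particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g], best_val[g]

    for _ in range(iters):
        for i in range(particles):
            vel[i] = (w * vel[i]
                      + c1 * rng.random() * (best_pos[i] - pos[i])
                      + c2 * rng.random() * (g_pos - pos[i]))
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))  # clamp to bounds
            val = score(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i], val
                if val < g_val:
                    g_pos, g_val = pos[i], val
    return g_pos


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )
```

A caller would fit `kmeans` on the complete rows, call `pso_impute` once per missing numeric cell, and then compare the imputed values against held-out true values with `rmse`, the error measure named in the abstract. Tightening `bounds` shrinks the region the swarm has to explore, which is the role the abstract assigns to the ontology.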

