Open Access Journal Article

A Review: Data Pre-Processing and Data Augmentation Techniques

Kiran Maharana, +2 more
01 Apr 2022, Vol. 3, Iss. 1, pp. 91-99
TLDR
In this paper, the authors provide an overview of data pre-processing in machine learning, focusing on the range of problems that arise while building machine learning models, and discuss flipping, rotating by slight degrees, and other methods to augment image data.
Abstract
This review paper provides an overview of data pre-processing in machine learning, focusing on the range of problems that arise while building machine learning models. It deals with two significant issues in pre-processing: (i) issues with the data and (ii) the steps to follow to analyse the data with the best approach. As raw data are vulnerable to noise, corruption, missing values, and inconsistency, it is necessary to perform pre-processing steps, which can be done using classification, clustering, association, and many other available techniques. Poor data primarily affects accuracy and can lead to false predictions, so it is necessary to improve the dataset's quality, and data pre-processing is the best way to deal with such problems. It makes knowledge extraction from the dataset much easier through cleaning, integration, transformation, and reduction methods. Missing data and significant differences in the variety of data always exist, as the information is collected from multiple sources and from real-world applications. The data augmentation approach therefore generates data for machine learning models, to decrease the dependency on training data and to improve model performance. This paper discusses flipping, rotating by slight degrees, and other methods to augment image data, and shows how to perform data augmentation without distorting the original data.
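
The flipping and slight-rotation augmentations named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (flip_horizontal, flip_vertical, rotate_slightly, augment), the list-of-lists grayscale image representation, the nearest-neighbour sampling, and the default angles of -5 and 5 degrees are all assumptions chosen to keep the example dependency-free.

```python
import math


def flip_horizontal(image):
    """Mirror each row left-to-right (hypothetical helper)."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Reverse the row order, top-to-bottom (hypothetical helper)."""
    return [row[:] for row in image[::-1]]


def rotate_slightly(image, degrees, fill=0):
    """Rotate about the image centre by `degrees`, keeping the original size.

    Assumption: nearest-neighbour sampling with inverse mapping. Output
    pixels whose source falls outside the image are set to `fill`.
    """
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    rotated = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # For each output pixel, look up the source pixel it came from.
            dx, dy = x - cx, y - cy
            src_x = round(cos_t * dx + sin_t * dy + cx)
            src_y = round(-sin_t * dx + cos_t * dy + cy)
            if 0 <= src_x < width and 0 <= src_y < height:
                rotated[y][x] = image[src_y][src_x]
    return rotated


def augment(image, degrees=(-5, 5)):
    """Return the original image plus flipped and slightly rotated copies."""
    variants = [image, flip_horizontal(image), flip_vertical(image)]
    variants += [rotate_slightly(image, d) for d in degrees]
    return variants


if __name__ == "__main__":
    sample = [[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]]
    for variant in augment(sample):
        print(variant)
```

Each call returns a new image and leaves the input untouched, so one labelled example yields several training examples that share its label. Small angles keep most of the content inside the frame, which is one plausible reading of augmenting "without distorting the original data". In practice, array libraries provide equivalents with better interpolation, for example numpy.fliplr and scipy.ndimage.rotate.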
