Author

Shweta Srivastava

Bio: Shweta Srivastava is an academic researcher who has contributed to research on data pre-processing and feature selection. The author has an h-index of 1 and has co-authored 1 publication, which has received 58 citations.

Papers
Journal ArticleDOI
TL;DR: The fundamentals of data mining steps are given: preprocessing the data, feature selection (selecting the relevant features and removing the irrelevant and redundant ones), classification, and evaluation of different classifier models using the WEKA tool.
Abstract: The basic principle of data mining is to analyze data from different perspectives, classify it and summarize it. Data mining has become very popular across applications. Although large amounts of data are available, useful information is not available in every field. Many data mining tools and software packages exist to help extract that useful information. This paper gives the fundamentals of data mining steps: preprocessing the data (removing noisy data, replacing missing values, etc.), feature selection (selecting the relevant features and removing the irrelevant and redundant ones), classification, and evaluation of different classifier models using the WEKA tool. The WEKA tool is not limited to one type of application; it can be used in various applications. It includes various algorithms for feature selection, classification and clustering. Keywords: feature selection, classification, clustering, evaluation of classifier models, evaluation of cluster models.

75 citations


Cited by
Journal ArticleDOI
TL;DR: Presents recent advances in IDS datasets, which research communities can use as a manifesto for adopting new IDS datasets to develop efficient and effective ML- and DM-based IDS.

113 citations

Journal ArticleDOI
TL;DR: This work proposes to integrate an urban growth simulation model with the multi-region input-output (MRIO) model, thereby illustrating how urban land consumption in one region can cause ecosystem services' degradation in another under five shared socioeconomic pathway (SSP) scenarios.

70 citations

Journal ArticleDOI
TL;DR: This work studies how the people of Chennai used social media, especially Twitter, in response to the country's worst flood, which had occurred recently, and finds that Random Forests is the best algorithm that can be relied on during a disaster.

61 citations

Journal ArticleDOI
TL;DR: An innovative method that is capable of simulating UGB alternatives with economic and ecological constraints is developed and indicates that increasing the shares of low energy consumption industries and tertiary industries can effectively reduce urban land demand.
Abstract: Urban growth boundaries (UGBs) have been applied in many rapidly urbanizing areas to alleviate the problems of urban sprawl. Although empirical research has stressed the importance of ecological prot

54 citations

Proceedings ArticleDOI
01 Nov 2017
TL;DR: In this paper, the authors used feature vectors from both the front and back side of a green leaf along with morphological features to arrive at a unique optimum combination of features that maximizes the identification rate.
Abstract: Identifying the correct medicinal plants that go into the preparation of a medicine is very important in the ayurvedic medicinal industry. The main features required to identify a medicinal plant are its leaf shape, colour and texture. Colour and texture from both sides of the leaf contain deterministic parameters for identifying the species. This paper explores feature vectors from both the front and back side of a green leaf, along with morphological features, to arrive at a unique optimum combination of features that maximizes the identification rate. A database of medicinal plant leaves is created from scanned images of the front and back sides of leaves of commonly used ayurvedic medicinal plants. The leaves are classified based on the unique feature combination. Identification rates of up to 99% have been obtained when tested over a wide spectrum of classifiers. The work has been extended to identification from dry leaves, for which a combination of feature vectors is obtained that achieves identification rates exceeding 94%.

51 citations