Author
Shweta Srivastava
Bio: Shweta Srivastava is an academic researcher. The author has contributed to research in topics: Data pre-processing & Feature selection. The author has an h-index of 1 and has co-authored 1 publication receiving 58 citations.
Papers
TL;DR: The fundamentals of data mining steps are given: preprocessing the data, feature selection (selecting the relevant features and removing the irrelevant and redundant ones), classification, and evaluation of different classifier models using the WEKA tool.
Abstract: The basic principle of data mining is to analyze data from different perspectives, classify it and summarize it. Data mining has become very popular in almost every application area. Although large amounts of data are available, useful information is not available in every field. Many data mining tools and software packages exist to help extract useful information. This paper gives the fundamentals of data mining steps: preprocessing the data (removing noisy data, replacing missing values, etc.), feature selection (selecting the relevant features and removing the irrelevant and redundant ones), classification, and evaluation of different classifier models using the WEKA tool. The WEKA tool is not limited to one type of application; it can be used in a variety of applications. It provides various algorithms for feature selection, classification and clustering. Keywords: feature selection, classification, clustering, evaluation of classifier models, evaluation of cluster models.
75 citations
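The steps named in the abstract above (replacing missing values, removing irrelevant and redundant features, classification, and evaluation) can be illustrated with a minimal sketch. This is not the paper's WEKA workflow: the toy data, the helper names, and the stand-in techniques (mean imputation, a variance-and-duplicate filter, a 1-nearest-neighbour classifier, hold-out accuracy) are all assumptions chosen only to keep the example self-contained.

```python
# Illustrative sketch only: toy data and simple stand-in techniques,
# not the WEKA algorithms discussed in the paper.
from statistics import mean, pvariance


def column_means(rows):
    """Per-column mean of the observed (non-None) values."""
    return [mean(r[j] for r in rows if r[j] is not None)
            for j in range(len(rows[0]))]


def impute(rows, means):
    """Preprocessing: replace each missing value (None) with its column mean."""
    return [[means[j] if v is None else v for j, v in enumerate(r)]
            for r in rows]


def select_features(rows, min_variance=1e-9):
    """Feature selection: drop constant columns (irrelevant) and
    columns that exactly repeat an earlier column (redundant)."""
    kept, seen = [], set()
    for j in range(len(rows[0])):
        column = tuple(r[j] for r in rows)
        if pvariance(column) > min_variance and column not in seen:
            kept.append(j)
            seen.add(column)
    return kept


def project(rows, kept):
    """Keep only the selected columns."""
    return [[r[j] for j in kept] for r in rows]


def predict_1nn(train_x, train_y, x):
    """Classification: label of the closest training row (squared Euclidean)."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    nearest = min(range(len(train_x)), key=lambda i: dist(train_x[i], x))
    return train_y[nearest]


def accuracy(train_x, train_y, test_x, test_y):
    """Evaluation: fraction of held-out rows classified correctly."""
    hits = sum(predict_1nn(train_x, train_y, x) == y
               for x, y in zip(test_x, test_y))
    return hits / len(test_y)


# Hypothetical toy data: column 1 is constant, column 2 duplicates column 0,
# and one value in column 3 is missing.
train_raw = [[1.0, 5.0, 1.0, 0.2], [1.2, 5.0, 1.2, None],
             [3.0, 5.0, 3.0, 0.9], [3.2, 5.0, 3.2, 1.0]]
train_y = ["a", "a", "b", "b"]
test_raw = [[0.9, 5.0, 0.9, 0.1], [2.9, 5.0, 2.9, 0.8]]
test_y = ["a", "b"]

# Fit the preprocessing on the training rows, then apply it to both sets.
means = column_means(train_raw)
train_clean = impute(train_raw, means)
test_clean = impute(test_raw, means)
kept = select_features(train_clean)
acc = accuracy(project(train_clean, kept), train_y,
               project(test_clean, kept), test_y)
print("kept columns:", kept)
print("hold-out accuracy:", acc)
```

The imputation means and the selected columns are computed from the training rows only and then applied to the held-out rows, so the evaluation does not see the test data during preprocessing.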
Cited by
TL;DR: This work covers recent advancements in IDS datasets, which various research communities can use as a manifesto for adopting new IDS datasets to develop efficient and effective ML- and DM-based IDS.
113 citations
TL;DR: This work proposes to integrate an urban growth simulation model with the multi-region input-output (MRIO) model, thereby illustrating how urban land consumption in one region can cause ecosystem services' degradation in another under five shared socioeconomic pathway (SSP) scenarios.
70 citations
TL;DR: This work studies how the people of Chennai used social media, especially Twitter, in response to the country’s worst flood, which had occurred recently, and finds that Random Forests is the best algorithm to rely on during a disaster.
61 citations
TL;DR: An innovative method that is capable of simulating UGB alternatives with economic and ecological constraints is developed and indicates that increasing the shares of low energy consumption industries and tertiary industries can effectively reduce urban land demand.
Abstract: Urban growth boundaries (UGBs) have been applied in many rapidly urbanizing areas to alleviate the problems of urban sprawl. Although empirical research has stressed the importance of ecological prot
54 citations
01 Nov 2017 TL;DR: In this paper, the authors used feature vectors from both the front and back side of a green leaf along with morphological features to arrive at a unique optimum combination of features that maximizes the identification rate.
Abstract: Identification of the correct medicinal plants that go into the preparation of a medicine is very important in the Ayurvedic medicine industry. The main features required to identify a medicinal plant are its leaf shape, colour and texture. Colour and texture from both sides of the leaf contain deterministic parameters to identify the species. This paper explores feature vectors from both the front and back sides of a green leaf, along with morphological features, to arrive at a unique optimum combination of features that maximizes the identification rate. A database of medicinal plant leaves is created from scanned images of the front and back sides of leaves of commonly used Ayurvedic medicinal plants. The leaves are classified based on the unique feature combination. Identification rates of up to 99% have been obtained when tested over a wide spectrum of classifiers. The work has been extended to include identification by dry leaves, and a combination of feature vectors has been obtained with which identification rates exceeding 94% have been achieved.
51 citations