scispace - formally typeset
Author

Vikas Jain

Bio: Vikas Jain is an academic researcher from the Indian Institutes of Information Technology. The author has contributed to research in the topics of Random forest and Decision tree, has an h-index of 3, and has co-authored 5 publications receiving 18 citations.

Papers
Proceedings ArticleDOI
01 Jul 2019
TL;DR: Experimental results over two well-known benchmark datasets show that EWRF provides competitive solutions for hyperspectral image classification compared with state-of-the-art methods.
Abstract: Hyperspectral image (HSI) classification is a challenging task due to the problem of few labeled samples versus high-dimensional features. Random forest (RF) has been applied to the HSI problem in the past, and RF-based HSI classification methods have assigned equal weights/votes to all decision trees. However, this may not always hold across varying test cases. Hence, this paper proposes an Exponentially Weighted Random Forest (EWRF) method that uses a dynamic association with the trained trees; EWRF thus tries to capture the relationship between the trained trees and the test case. The performance of the proposed method has been compared with state-of-the-art methods over two well-known benchmark datasets. The experimental results show that EWRF can provide competitive solutions for hyperspectral image classification.

15 citations
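The exponential weighting idea can be sketched in a few lines: each tree's vote on a test case is scaled by exp(-λ·d), where d is some dissimilarity between the test case and that tree. The abstract does not specify the association measure, so the per-tree distances and the default λ below are illustrative assumptions.

```python
import math
from collections import defaultdict

def ewrf_predict(tree_preds, tree_dists, lam=1.0):
    """Exponentially weighted majority vote: tree t contributes
    exp(-lam * d_t) to the score of the class it predicts."""
    scores = defaultdict(float)
    for pred, dist in zip(tree_preds, tree_dists):
        scores[pred] += math.exp(-lam * dist)
    return max(scores, key=scores.get)
```

With distances [0.1, 2.0, 2.0] the single nearby tree outweighs the two distant ones (exp(-0.1) ≈ 0.90 versus 2·exp(-2.0) ≈ 0.27), so the weighted vote can overturn a plain majority.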

Proceedings ArticleDOI
01 Oct 2018
TL;DR: A joint splitting criterion using two of the most widely used criteria, Information Gain and Gini index, is investigated: data points are split where Information Gain is maximum and Gini index is minimum.
Abstract: The decision tree is a well-accepted supervised classifier in machine learning. It splits the given data points based on features, comparing each against a threshold value. In general, a single predefined splitting criterion is used, which may lead to poor performance. To this end, in this paper, we investigate a joint splitting criterion using two of the most widely used criteria, i.e., Information Gain and Gini index. We propose to split the data points where Information Gain is maximum and Gini index is minimum. The proposed approach is rigorously tested and compared by constructing decision-tree-based random forests. All the experiments are performed on UCI machine learning datasets.

12 citations
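One plausible reading of the joint criterion is to prefer a threshold where Information Gain is maximal and the children's weighted Gini index is simultaneously minimal. The sketch below follows that reading (the fallback rule when the two optima disagree is an assumption, not stated in the abstract):

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_joint_split(values, labels):
    """Among candidate thresholds, prefer one where Information Gain is
    maximal AND the children's weighted Gini index is minimal; fall back
    to the maximum-IG threshold when the two optima disagree."""
    n = len(labels)
    scores = {}
    for t in sorted(set(values))[:-1]:  # both children stay non-empty
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        ig = entropy(labels) - (len(left) / n) * entropy(left) \
             - (len(right) / n) * entropy(right)
        g = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
        scores[t] = (ig, g)
    best_ig = max(ig for ig, g in scores.values())
    min_g = min(g for ig, g in scores.values())
    joint = [t for t, (ig, g) in scores.items() if ig == best_ig and g == min_g]
    return joint[0] if joint else max(scores, key=lambda t: scores[t][0])
```

On the separable toy data [1, 2, 3, 4] with labels [0, 0, 1, 1], the threshold 2 gives IG = 1 and weighted Gini = 0, so both conditions are satisfied at once.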

Journal ArticleDOI
TL;DR: This work proposes to use multiple features at a node for splitting the data as in the axis-parallel method, and empirically shows that the performance of MaRF improves due to the improvement in the strength of the M-ary trees.
Abstract: Random Forest (RF) is composed of decision trees as base classifiers. In general, a decision tree recursively partitions the feature space into two disjoint subspaces using a single feature as an axis-parallel split at each internal node. The oblique decision tree instead uses a linear combination of features (forming a hyperplane) to partition the feature space into two subspaces; however, computing the best-suited hyperplane for the latter approach is NP-hard. In this work, we propose to use multiple features at a node for splitting the data, as in the axis-parallel method. Each feature independently divides the space into two subspaces, and this is done for multiple features at one node. Hence, the given space is divided into multiple subspaces simultaneously, which in turn constructs M-ary trees; the resulting forest is named the M-ary Random Forest (MaRF). To measure task performance in MaRF, we have extended the notion of tree strength from the regression tree. We empirically show that the performance of MaRF improves due to the improvement in the strength of the M-ary trees. We have evaluated the performance on a wide range of datasets: UCI datasets, a hyperspectral dataset, MNIST, Caltech 101, and Caltech 256. The efficiency of the MaRF approach is found to be satisfactory compared with state-of-the-art methods.

7 citations
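One way to picture the multi-feature split is that k axis-parallel tests at a node jointly route a sample into one of up to 2^k child sub-regions. The sketch below illustrates only that routing step under this assumption; the paper's actual node construction and feature selection are not detailed in the abstract.

```python
def mary_route(sample, features, thresholds):
    """Route a sample through k simultaneous axis-parallel tests:
    each (feature, threshold) pair contributes one bit, so the node
    has up to 2**k children instead of the usual two."""
    child = 0
    for f, t in zip(features, thresholds):
        child = (child << 1) | (1 if sample[f] > t else 0)
    return child
```

With two features the node becomes 4-ary: a sample exceeding both thresholds lands in child 3, one exceeding neither in child 0.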

Book ChapterDOI
17 Dec 2019
TL;DR: A dynamic weighting scheme between test samples and decision trees in RF is proposed using the exponential distribution, and is rigorously tested over benchmark datasets from the UCI repository for both classification and regression tasks.
Abstract: Random forest (RF) is a supervised, non-parametric, ensemble-based machine learning method used for classification and regression tasks. It is easy to implement and scalable, hence attracting many researchers. Being an ensemble-based method, it assigns equal weights/votes to all atomic units, i.e., decision trees. However, this may not always hold for varying test cases; hence, the correlation between decision trees and data samples has been explored in the recent past to address such issues. In this paper, a dynamic weighting scheme between test samples and decision trees in RF is proposed. The correlation is defined in terms of the similarity between the test case and the decision tree using the exponential distribution; hence, the proposed method is named Exponentially Weighted Random Forest (EWRF). The performance of the proposed method is rigorously tested over benchmark datasets from the UCI repository for both classification and regression tasks.

6 citations
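For the regression task, the same exponential-distribution weighting can be read as a weighted average of tree outputs. As before, the per-tree distances and the rate parameter below are illustrative assumptions, since the chapter's exact similarity definition is not reproduced here.

```python
import math

def ewrf_regress(tree_preds, tree_dists, lam=1.0):
    """Exponentially weighted mean of per-tree regression outputs:
    trees judged more similar to the test case get larger weight."""
    weights = [math.exp(-lam * d) for d in tree_dists]
    return sum(w * p for w, p in zip(weights, tree_preds)) / sum(weights)
```

When all trees are equally similar the result reduces to the plain ensemble mean; a much closer tree pulls the prediction toward its own output.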

Book ChapterDOI
17 Dec 2019
TL;DR: The proposed MaRF method has been tested on hyperspectral imaging (HSI) classification and shows satisfactory improvement with respect to other state-of-the-art methods.
Abstract: Random forest (RF) is a supervised ensemble of decision trees. Each decision tree recursively partitions the feature space into two disjoint sub-regions using axis-parallel splits until each sub-region becomes homogeneous with respect to a particular class or a stopping criterion is reached. The conventional RF uses one feature at a time for splitting and therefore does not consider feature inter-dependency. With this aim in mind, the current paper introduces an approach to perform multi-feature splitting, which partitions the feature space into M regions using axis-parallel splits; the forest created this way is named the M-ary Random Forest (MaRF). The suitability of the proposed method is tested over various heterogeneous UCI datasets. Experimental results show that the proposed MaRF performs better for both classification and regression. The proposed MaRF method has also been tested on hyperspectral imaging (HSI) classification, where it shows satisfactory improvement with respect to other state-of-the-art methods.

2 citations


Cited by
Journal ArticleDOI
TL;DR: In this paper, a method based on learning automata is presented through which adaptability to the problem space, as well as independence from the data domain, are added to the random forest to increase its efficiency.
Abstract: The goal of aggregating base classifiers is to achieve an aggregated classifier with a higher resolution than the individual classifiers. Random forest is one type of ensemble learning method, and it has received more attention than other ensemble methods due to its simple structure, ease of understanding, and higher efficiency than similar methods. The ability and efficiency of classical methods are always influenced by the data. Independence from the data domain and the ability to adapt to the conditions of the problem space are the most challenging issues for different types of classifiers. In this paper, a method based on learning automata is presented through which adaptability to the problem space, as well as independence from the data domain, are added to the random forest to increase its efficiency. Using the idea of reinforcement learning in the random forest makes it possible to address data that have dynamic behaviour, i.e., variability in the behaviour of a data sample across different domains. Therefore, to evaluate the proposed method and to create an environment with dynamic behaviour, different domains of data have been considered. In the proposed method, the idea is added to the random forest using learning automata; the reason for this choice is the simple structure of learning automata and their compatibility with the problem space. The evaluation results confirm the improvement in random forest efficiency.

22 citations
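The abstract does not detail the automaton's update rule, but a standard scheme for learning automata is linear reward-inaction (L_R-I). The sketch below is a generic illustration of that scheme, not the paper's exact method: it shows how per-action (e.g., per-tree) selection probabilities could adapt to reinforcement from the environment.

```python
def lri_update(probs, chosen, rewarded, a=0.1):
    """Linear reward-inaction update: on reward, shift probability mass
    toward the chosen action; on penalty, leave the distribution as-is."""
    if not rewarded:
        return list(probs)  # inaction branch
    updated = [p - a * p for p in probs]
    updated[chosen] = probs[chosen] + a * (1.0 - probs[chosen])
    return updated
```

The update preserves a valid probability distribution, so repeated rewards for the same action steadily concentrate selection on it.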

Journal ArticleDOI
TL;DR: Zhang et al. propose a multiscale dual-branch residual spectral–spatial network with attention for hyperspectral image classification, which can learn and fuse deeper hierarchical spectral–spatial features with fewer training samples.
Abstract: The development of remote sensing images in recent years has made it possible to identify materials in inaccessible environments and study natural materials on a large scale. Hyperspectral images (HSIs), with their unique features, are a particularly rich source of information for various applications. However, several problems reduce the accuracy of HSI classification: ineffective extracted features, noise, the correlation of bands, and, most importantly, the limited labeled samples. To improve accuracy in the case of limited training samples, we propose a multiscale dual-branch residual spectral–spatial network with attention for HSI classification, named MDBRSSN, in this article. First, due to the correlation and redundancy between HSI bands, a principal component analysis operation is applied to preprocess the raw HSI data. Then, in MDBRSSN, a dual-branch structure is designed to extract the useful spectral–spatial features of HSI. The multiscale abstract features extracted by the convolutional neural network improve the classification accuracy of complex hyperspectral data. In addition, attention mechanisms applied separately to each branch enable MDBRSSN to optimize and refine the extracted feature maps. Such an MDBRSSN framework can learn and fuse deeper hierarchical spectral–spatial features with fewer training samples. The purpose of designing the MDBRSSN model is to achieve high classification accuracy compared with state-of-the-art methods when training samples are limited, which is demonstrated by the experiments on four datasets in this article. On Salinas, Pavia University, Indian Pines, and Houston 2013, the proposed model obtained 99.64%, 98.93%, 98.17%, and 96.57% overall accuracy using only 1%, 1%, 5%, and 5% of the labeled data for training, respectively, which is much better than the state-of-the-art methods.

15 citations
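The PCA preprocessing step mentioned above is a standard band-decorrelation pass. A minimal NumPy sketch, assuming the HSI cube is stored as a (height, width, bands) array and that the number of retained components k is a tunable choice:

```python
import numpy as np

def pca_reduce(cube, k):
    """Project each pixel's band vector onto the top-k principal
    components of the band covariance, reducing (H, W, B) to (H, W, k)."""
    h, w, b = cube.shape
    x = cube.reshape(-1, b).astype(float)
    x -= x.mean(axis=0)                        # center each band
    vals, vecs = np.linalg.eigh(np.cov(x, rowvar=False))
    top = vecs[:, np.argsort(vals)[::-1][:k]]  # k largest-variance axes
    return (x @ top).reshape(h, w, k)
```

Because HSI bands are highly correlated, a small k typically retains most of the spectral variance while shrinking the input the network must process.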

Proceedings ArticleDOI
26 Sep 2020
TL;DR: This work proposes a two-branch deep learning based method for few-shot HSI classification, where two branches separately accomplish HSI classification at a cube-wise level and a cube-pair level, with a shared feature extractor sub-network.
Abstract: Despite the success of deep learning based methods for hyperspectral imagery (HSI) classification, they demand large amounts of labeled samples for training, whereas the labeled samples in many applications are insufficient due to the expensive cost of manual annotation. To address this problem, we propose a two-branch deep learning based method for few-shot HSI classification, where two branches separately accomplish HSI classification at a cube-wise level and a cube-pair level. With a shared feature extractor sub-network, the self-supervised knowledge contained in the cube-pair branch provides an effective way to regularize the original few-shot HSI classification branch (i.e., the cube-wise branch) with limited labeled samples, which thus improves the performance of HSI classification. The superiority of the proposed method for few-shot HSI classification is demonstrated experimentally on two HSI benchmark datasets.

11 citations