Author

Amit Saxena

Bio: Amit Saxena is an academic researcher at Guru Ghasidas University. The author has contributed to research on the topics of feature selection and features (computer vision). The author has an h-index of 10 and has co-authored 25 publications receiving 587 citations.

Papers
Journal ArticleDOI
TL;DR: Applications of clustering in fields such as image segmentation, object and character recognition, and data mining are highlighted, and the approaches used in these methods are discussed along with their state of the art and applicability.

745 citations

Journal ArticleDOI
01 Nov 2015
TL;DR: A novel approach is proposed to improve the classification performance of a polynomial neural network (PNN) based on all possible combinations of two features of the training input patterns of a dataset.
Abstract: In this paper, a novel approach is proposed to improve the classification performance of a polynomial neural network (PNN). In this approach, the partial descriptions (PDs) are generated at the first layer based on all possible combinations of two features of the training input patterns of a dataset. The set of PDs from the first layer, the set of all input features, and a bias constitute the chromosome of the real-coded genetic algorithm (RCGA). A system of equations is solved to determine the values of the real coefficients of each chromosome of the RCGA for the training dataset with the mean classification accuracy (CA) as the fitness value of each chromosome. To adjust these values for unknown testing patterns, the RCGA is iterated in the usual manner using simple selection, crossover, mutation, and elitist selection. The method is tested extensively with the University of California, Irvine benchmark datasets by utilizing tenfold cross validation of each dataset, and the performance is compared with various well-known state-of-the-art techniques. The results obtained from the proposed method in terms of CA are superior and outperform other known methods on various datasets.

72 citations
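
The first-layer step described in the PNN abstract above (one partial description per pair of input features) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the quadratic form of each PD, the least-squares fit, and the name `pairwise_partial_descriptions` are assumptions, since the abstract does not specify them, and the RCGA stage is omitted.

```python
from itertools import combinations

import numpy as np


def pairwise_partial_descriptions(X, y):
    """Fit one quadratic partial description (PD) per pair of features.

    Assumed PD form (not stated in the abstract):
        c0 + c1*xi + c2*xj + c3*xi*xj + c4*xi**2 + c5*xj**2
    Returns a dict mapping each feature pair (i, j) to its six coefficients.
    """
    n_samples, n_features = X.shape
    pds = {}
    for i, j in combinations(range(n_features), 2):
        xi, xj = X[:, i], X[:, j]
        # Design matrix holding the six polynomial terms for this pair.
        design = np.column_stack(
            [np.ones(n_samples), xi, xj, xi * xj, xi**2, xj**2]
        )
        # Ordinary least squares fit of the coefficients.
        coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
        pds[(i, j)] = coeffs
    return pds


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = X[:, 0] * X[:, 1]  # toy target that depends on features 0 and 1 only
pds = pairwise_partial_descriptions(X, y)
print(len(pds))  # number of feature pairs
```

With d input features this produces d(d-1)/2 PDs, so the first layer grows quadratically with the number of features.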

Journal ArticleDOI
TL;DR: Four methods are proposed for feature selection in an unsupervised manner by using genetic algorithms that select a set of features using a task independent criterion that can preserve the geometric structure (topology) of the original data in the reduced feature space.

23 citations

Proceedings ArticleDOI
01 Nov 2017
TL;DR: A new fuzzy logic-based QE approach that considers the relevance score produced by different rank aggregation approaches is proposed and combines different weights of each term using fuzzy rules to infer the weights of the additional query terms.
Abstract: Individual query expansion term selection methods have been widely investigated in an attempt to improve their performance. Each expansion term selection method has its own weaknesses and strengths. To overcome the weaknesses and utilize the strengths of individual methods, this paper combines multiple term selection methods. First, the possibility of improving the overall performance using individual query expansion (QE) term selection methods is explored. Second, some well-known rank aggregation approaches are used for combining multiple QE term selection methods. Third, a new fuzzy logic-based QE approach that considers the relevance score produced by different rank aggregation approaches is proposed. The proposed fuzzy logic approach combines different weights of each term using fuzzy rules to infer the weights of the additional query terms. Finally, the Word2vec approach is used to filter semantically irrelevant terms obtained after applying the fuzzy logic approach. The experimental results demonstrate that the proposed approaches achieve significant improvements over each individual term selection method, aggregated method, and related state-of-the-art method.

22 citations
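
The idea of combining several term selection methods through rank aggregation, mentioned in the abstract above, can be illustrated with a Borda count. This is only a sketch under stated assumptions: the abstract does not name the aggregation schemes used, so Borda count is a stand-in, and the function name `borda_aggregate` and the example terms are hypothetical.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Merge several ranked term lists into one using a Borda count.

    Each list awards n points to its top term, n - 1 to the next, and so on,
    where n is the length of that list. Ties are broken alphabetically.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, term in enumerate(ranking):
            scores[term] += n - position
    return sorted(scores, key=lambda term: (-scores[term], term))


# Hypothetical candidate expansion terms ranked by three selection methods.
rankings = [
    ["engine", "motor", "wheel"],
    ["motor", "engine", "wheel"],
    ["engine", "wheel", "motor"],
]
print(borda_aggregate(rankings))  # → ['engine', 'motor', 'wheel']
```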

Journal ArticleDOI
TL;DR: A novel fuzzy logic-based expansion approach considering the relevance score produced by different rank aggregation approaches is proposed, combining different weights of each term by using fuzzy rules to infer the weights of the additional query terms.
Abstract: In this paper, a novel fuzzy logic-based expansion approach considering the relevance score produced by different rank aggregation approaches is proposed. It is well known that different rank aggregation approaches yield different relevance scores for each term. The proposed fuzzy logic approach combines different weights of each term by using fuzzy rules to infer the weights of the additional query terms. Experimental results demonstrate that the proposed approach achieves significant improvement over individual expansion, aggregated, and other related state-of-the-art methods.

17 citations


Cited by
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

01 Jan 2002

9,314 citations

01 Jan 2004
TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance and describes numerous important application areas such as image based rendering and digital libraries.
Abstract: From the Publisher: The accessible presentation of this book gives both a general view of the entire computer vision enterprise and also offers sufficient detail to be able to build useful applications. Users learn techniques that have proven to be useful by first-hand experience and a wide range of mathematical methods. A CD-ROM with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image based rendering and digital libraries. Many important algorithms are broken down and illustrated in pseudo code. Appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.

3,627 citations

Journal Article
TL;DR: In this article, the authors explore the effect of dimensionality on the nearest neighbor problem and show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance to the farthest data point.
Abstract: We explore the effect of dimensionality on the nearest neighbor problem. We show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance to the farthest data point. To provide a practical perspective, we present empirical results on both real and synthetic data sets that demonstrate that this effect can occur for as few as 10-15 dimensions. These results should not be interpreted to mean that high-dimensional indexing is never meaningful; we illustrate this point by identifying some high-dimensional workloads for which this effect does not occur. However, our results do emphasize that the methodology used almost universally in the database literature to evaluate high-dimensional indexing techniques is flawed, and should be modified. In particular, most such techniques proposed in the literature are not evaluated versus simple linear scan, and are evaluated over workloads for which nearest neighbor is not meaningful. Often, even the reported experiments, when analyzed carefully, show that linear scan would outperform the techniques being proposed on the workloads studied in high (10-15) dimensionality!

1,992 citations
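
The effect described in the abstract above, where the nearest and farthest distances converge as dimensionality grows, can be observed with a small simulation. This is a rough sketch, not the paper's experimental setup: it assumes i.i.d. uniform data (only one of the conditions the paper covers), Euclidean distance, and a hypothetical helper name `nearest_to_farthest_ratio`.

```python
import numpy as np


def nearest_to_farthest_ratio(dim, n_points=1000, seed=0):
    """Return (nearest distance) / (farthest distance) for one random query.

    Data and query are drawn i.i.d. uniform on the unit hypercube, which is
    an assumption made here for simplicity. A ratio close to 1 means the
    nearest neighbor is barely closer than the farthest point.
    """
    rng = np.random.default_rng(seed)
    data = rng.random((n_points, dim))
    query = rng.random(dim)
    dists = np.linalg.norm(data - query, axis=1)
    return dists.min() / dists.max()


for dim in (2, 10, 100, 1000):
    print(dim, round(nearest_to_farthest_ratio(dim), 3))
```

In this setting the printed ratio rises toward 1 as `dim` increases, which is the sense in which a nearest neighbor query loses contrast in high dimensions.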

Journal ArticleDOI
TL;DR: This paper presents a comprehensive survey of the state-of-the-art work on EC for feature selection, which identifies the contributions of these different algorithms.
Abstract: Feature selection is an important task in data mining and machine learning to reduce the dimensionality of the data and increase the performance of an algorithm, such as a classification algorithm. However, feature selection is a challenging task due mainly to the large search space. A variety of methods have been applied to solve feature selection problems, where evolutionary computation (EC) techniques have recently gained much attention and shown some success. However, there are no comprehensive guidelines on the strengths and weaknesses of alternative approaches. This leads to a disjointed and fragmented field with ultimately lost opportunities for improving performance and successful applications. This paper presents a comprehensive survey of the state-of-the-art work on EC for feature selection, which identifies the contributions of these different algorithms. In addition, current issues and challenges are also discussed to identify promising areas for future research.

1,237 citations