Author
Ehtesham Hassan
Other affiliations: Indian Institute of Technology Delhi, Harvard University
Bio: Ehtesham Hassan is an academic researcher from Tata Consultancy Services. The author has contributed to research in topics: Multiple kernel learning & Search engine indexing. The author has an h-index of 12 and has co-authored 45 publications receiving 363 citations. Previous affiliations of Ehtesham Hassan include Indian Institute of Technology Delhi & Harvard University.
Papers
07 Dec 2015
TL;DR: A two-stage pedestrian detector is proposed that outperforms the state of the art on the INRIA benchmark dataset, with a miss rate of 10.35% at FPPI = 10^-1.
Abstract: In this paper, we propose a two-stage pedestrian detector. The first stage uses a cascade of Aggregated Channel Features (ACF) to extract potential pedestrian windows from an image. We further introduce a thresholding technique on the ACF confidence scores that segregates candidate windows lying at the extremes of the ACF score distribution. The windows with ACF scores between the upper and lower bounds are passed on to a Mixture of Experts (MoE) of CNNs for more refined classification in the second stage. Results show that the designed detector outperforms the state of the art on the INRIA benchmark dataset, with a miss rate of 10.35% at FPPI = 10^-1.
31 citations
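The score-band routing described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `route_windows` and the threshold values are hypothetical, and the decision to accept windows at or above the upper bound and reject those at or below the lower bound is an assumption. The abstract only states that windows between the two bounds are forwarded to the second stage.

```python
from typing import List, Sequence, Tuple


def route_windows(
    scores: Sequence[float], lower: float, upper: float
) -> Tuple[List[int], List[int], List[int]]:
    """Split candidate windows by first-stage confidence score.

    Returns three lists of window indices:
    accepted  - score >= upper (assumed confident positives)
    rejected  - score <= lower (assumed confident negatives)
    uncertain - lower < score < upper (sent to the second-stage classifier)
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    accepted: List[int] = []
    rejected: List[int] = []
    uncertain: List[int] = []
    for idx, score in enumerate(scores):
        if score >= upper:
            accepted.append(idx)
        elif score <= lower:
            rejected.append(idx)
        else:
            uncertain.append(idx)
    return accepted, rejected, uncertain


# Hypothetical scores and thresholds, for illustration only.
accepted, rejected, uncertain = route_windows(
    [-5.0, 0.5, 12.0, 3.0], lower=0.0, upper=10.0
)
```

Only the `uncertain` indices would need the more expensive second-stage classification, which is the practical appeal of thresholding at both extremes.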
01 Mar 2017
TL;DR: A novel method is presented to update assets for telecommunication infrastructure from Google Street View (GSV) images, comparing HOG descriptors with SVM, the Deformable Parts Model (DPM), and deep learning with Faster R-CNN.
Abstract: We present a novel method to update assets for telecommunication infrastructure using Google Street View (GSV) images. The problem is formulated as an object recognition task, followed by triangulation to estimate the object coordinates from sensor plane coordinates. To this end, we have explored different state-of-the-art object recognition techniques, both feature-engineered and deep-learning-based, namely HOG descriptors with SVM, the Deformable Parts Model (DPM), and deep learning (DL) using Faster R-CNN. While HOG+SVM has proved to be a robust human detector, DPM, which is based on probabilistic graphical models, and DL, which is a non-linear classifier, have proved their versatility in different types of object recognition problems. Asset recognition from street view images, however, poses unique challenges: assets can be installed on the ground in various poses and orientations and with occlusions, objects can be camouflaged in the background, and in some cases inter-class variation is small. We present the comparative performance of these techniques on a specific use case involving telecom equipment, aiming for the highest precision and recall. The blocks of the proposed pipeline are detailed and compared to traditional inventory management methods.
28 citations
01 Sep 2016
TL;DR: An Augmented Reality (AR) based re-configurable framework for inspection that can be utilized in cross-domain applications such as maintenance and repair assistance in industrial inspection, health sector to record vitals, and automotive/avionics domain inspection, amongst others is presented.
Abstract: We present an Augmented Reality (AR) based re-configurable framework for inspection that can be utilized in cross-domain applications such as maintenance and repair assistance in industrial inspection, health sector to record vitals, and automotive/avionics domain inspection, amongst others. The novelty of the inspection framework compared to its existing counterparts is threefold. First, the inspection check-list can be prioritized by detecting the parts in the inspector's field of view using deep learning principles. Second, the backend of the framework is easily configurable for different applications, where instructions and assistance manuals can be directly imported and visually integrated with the inspection type. Third, we conduct a feasibility study on inspection modes such as Google Glass, Google Cardboard, paper-based and tablet for inspection turnaround time, ease, and usefulness, taking a 3D printer inspection use case.
25 citations
TL;DR: The scheme extends distance-based hashing to kernel space, generating the indexing structure from similarity in kernel space and using multiple kernel learning to incorporate multiple features for defining the image indexing space.
Abstract: The paper presents a novel feature-based indexing scheme for image collections. The scheme extends distance-based hashing to kernel space, generating the indexing structure from similarity in kernel space. The objective of the scheme is to incorporate multiple features for defining the image indexing space using the concept of multiple kernel learning. However, indexing problems are defined with a unique learning objective; therefore, a novel application of a genetic algorithm is presented for the optimization task. The proposed concept is evaluated extensively by developing a word-based document indexing application for Devanagari, Bengali, and English scripts. In addition, the efficacy of the proposed concept is shown by experimental evaluations on handwritten digits and a natural image collection.
22 citations
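To make the idea of distance-based hashing in kernel space concrete, the sketch below uses the standard line-projection function from distance-based hashing with a kernel-induced distance, d^2(x, y) = k(x, x) - 2 k(x, y) + k(y, y). It is an illustration under stated assumptions, not the paper's method: the two base kernels, the fixed weights, the pivots, and the thresholds `t1` and `t2` are all hypothetical, and the genetic-algorithm optimization of the kernel weights mentioned in the abstract is not shown.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]
Kernel = Callable[[Vector, Vector], float]


def linear_kernel(x: Vector, y: Vector) -> float:
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x: Vector, y: Vector, gamma: float = 0.5) -> float:
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def combined_kernel(x: Vector, y: Vector, w_lin: float = 0.5, w_rbf: float = 0.5) -> float:
    # Non-negative weighted sum of kernels; the weights here are fixed by hand
    # (assumption), whereas the paper learns the combination.
    return w_lin * linear_kernel(x, y) + w_rbf * rbf_kernel(x, y)


def kernel_sq_dist(x: Vector, y: Vector, kernel: Kernel) -> float:
    # Squared distance in the feature space induced by the kernel.
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)


def line_projection(x: Vector, p1: Vector, p2: Vector, kernel: Kernel) -> float:
    # Project x onto the "line" through pivots p1 and p2 using distances only.
    d12_sq = kernel_sq_dist(p1, p2, kernel)
    if d12_sq <= 0.0:
        raise ValueError("pivots must be distinct in kernel space")
    num = kernel_sq_dist(x, p1, kernel) + d12_sq - kernel_sq_dist(x, p2, kernel)
    return num / (2.0 * math.sqrt(d12_sq))


def hash_bit(x: Vector, p1: Vector, p2: Vector, t1: float, t2: float, kernel: Kernel) -> int:
    # Binarize the projection: 1 if it falls inside [t1, t2], else 0.
    return 1 if t1 <= line_projection(x, p1, p2, kernel) <= t2 else 0
```

Concatenating bits from several pivot pairs gives a hash key; items with equal keys share a bucket in the index.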
01 Sep 2016
TL;DR: An AR based re-configurable inspection framework that can be utilized in cross-domain applications such as maintenance and repair assistance in industrial inspection and automotive/avionics domain inspection, amongst others.
Abstract: With the advancement in camera technologies and data streaming protocols, AR based applications are proving to be an important aid for inspection, training and supervision tasks in various operations, including the automotive industry and education. We demonstrate an AR based re-configurable inspection framework that can be utilized in cross-domain applications such as maintenance and repair assistance in industrial inspection and automotive/avionics domain inspection, amongst others. A deep learning component accurately detects the parts in the inspector's Field-of-View (FoV), and the corresponding inspection check-list can be prioritized based on the detection results. The back-end of the framework is easily configurable for different applications, where instructions can be directly imported and visually integrated with the inspection type. Accurate recording of inspection status is provided through evidence capture of images, notes and videos. Our current framework supports all Android-based devices and will be demonstrated on Google Glass, Google Cardboard with a smartphone, and a tablet, using a 3D printer inspection use case.
21 citations
Cited by
01 Jan 2004
TL;DR: Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance and describes numerous important application areas such as image based rendering and digital libraries.
Abstract: From the Publisher:
The accessible presentation of this book gives a general view of the entire computer vision enterprise and also offers sufficient detail to build useful applications. Users learn techniques that have proven useful through first-hand experience, along with a wide range of mathematical methods. A CD-ROM with every copy of the text contains source code for programming practice, color images, and illustrative movies. Comprehensive and up-to-date, this book includes essential topics that either reflect practical significance or are of theoretical importance. Topics are discussed in substantial and increasing depth. Application surveys describe numerous important application areas such as image based rendering and digital libraries. Many important algorithms are broken down and illustrated in pseudo code. Appropriate for use by engineers as a comprehensive reference to the computer vision enterprise.
3,627 citations
TL;DR: This work incorporates ring partition and invariant vector distance into an image hashing algorithm to enhance rotation robustness and discriminative capability, and demonstrates that the proposed hashing algorithm is robust to commonly used digital operations on images.
Abstract: Robustness and discrimination are two of the most important objectives in image hashing. We incorporate ring partition and invariant vector distance into an image hashing algorithm to enhance rotation robustness and discriminative capability. As ring partition is unrelated to image rotation, the statistical features extracted from image rings in a perceptually uniform color space, i.e., the CIE L*a*b* color space, are rotation invariant and stable. In particular, the Euclidean distance between vectors of these perceptual features is invariant to commonly used digital operations on images (e.g., JPEG compression, gamma correction, and brightness/contrast adjustment), which helps make the image hash compact and discriminative. We conduct experiments to evaluate the efficiency with 250 color images, and demonstrate that the proposed hashing algorithm is robust to commonly used digital operations on images. In addition, using the receiver operating characteristics curve, we illustrate that our hashing is much better than existing popular hashing algorithms in robustness and discrimination.
192 citations
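The rotation invariance of ring-based statistics can be illustrated with a small sketch. It is a simplification under stated assumptions rather than the published algorithm: it works on a single-channel square image given as nested lists, uses equal-area rings inside the inscribed circle, and keeps only the mean of each ring, whereas the abstract refers to statistical features in the CIE L*a*b* color space. The names `ring_means` and `rotate90` are hypothetical.

```python
from typing import List, Sequence


def ring_means(image: Sequence[Sequence[float]], n_rings: int) -> List[float]:
    """Mean pixel value in each of n_rings equal-area rings around the center.

    Pixels outside the inscribed circle are ignored, since they are not
    preserved under arbitrary rotation.
    """
    size = len(image)
    center = (size - 1) / 2.0
    max_radius_sq = (size / 2.0) ** 2
    sums = [0.0] * n_rings
    counts = [0] * n_rings
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            dist_sq = (r - center) ** 2 + (c - center) ** 2
            if dist_sq > max_radius_sq:
                continue
            # Equal-area rings: ring k covers squared radii in
            # [k/n * R^2, (k+1)/n * R^2).
            k = min(int(n_rings * dist_sq / max_radius_sq), n_rings - 1)
            sums[k] += value
            counts[k] += 1
    return [s / n if n else 0.0 for s, n in zip(sums, counts)]


def rotate90(image: Sequence[Sequence[float]]) -> List[List[float]]:
    # Rotate a square image by 90 degrees clockwise.
    return [list(row) for row in zip(*image[::-1])]


# Synthetic 8x8 test image (illustrative values only).
img = [[(r * 7 + c * 13) % 256 for c in range(8)] for r in range(8)]
features = ring_means(img, n_rings=4)
features_rotated = ring_means(rotate90(img), n_rings=4)
```

Because each pixel keeps its distance to the center under rotation, `features` and `features_rotated` agree for this 90-degree case; for arbitrary angles, interpolation would make the agreement approximate rather than exact.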
TL;DR: An efficient image hashing method based on ring partition and nonnegative matrix factorization (NMF) is designed, which has both rotation robustness and good discriminative capability.
Abstract: This paper designs an efficient image hashing method based on ring partition and nonnegative matrix factorization (NMF), which has both rotation robustness and good discriminative capability. The key contribution is a novel construction of a rotation-invariant secondary image, which is used for the first time in image hashing and helps make the image hash resistant to rotation. In addition, NMF coefficients change approximately linearly under content-preserving manipulations, so hash similarity can be measured with the correlation coefficient. We conduct experiments illustrating the efficiency with 346 images. Our experiments show that the proposed hashing is robust against content-preserving operations, such as image rotation, JPEG compression, watermark embedding, Gaussian low-pass filtering, gamma correction, brightness adjustment, contrast adjustment, and image scaling. Receiver operating characteristics (ROC) curve comparisons with state-of-the-art algorithms are also conducted, and demonstrate that the proposed hashing is much better than all these algorithms in classification performance with respect to robustness and discrimination.
181 citations
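The reasoning behind using the correlation coefficient can be shown with a small example: if a manipulation changes hash values roughly linearly, the Pearson correlation between the original and manipulated hashes stays close to 1. The sketch below assumes the Pearson form of the correlation coefficient and uses made-up hash vectors; `pearson` is a hypothetical helper, not code from the paper.

```python
import math
from typing import Sequence


def pearson(u: Sequence[float], v: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(u) != len(v) or len(u) < 2:
        raise ValueError("vectors must have equal length of at least 2")
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    std_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    std_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if std_u == 0.0 or std_v == 0.0:
        raise ValueError("correlation is undefined for constant vectors")
    return cov / (std_u * std_v)


# Made-up hash vector and a linearly changed copy (scale 2, offset 3).
original_hash = [0.8, 1.9, 0.4, 2.7, 1.1]
manipulated_hash = [2.0 * h + 3.0 for h in original_hash]
similarity = pearson(original_hash, manipulated_hash)
```

A positive linear change leaves the correlation at 1 up to floating-point error, which is why a linear response to content-preserving manipulations makes this a convenient similarity measure.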
TL;DR: The nature of texts and inherent challenges addressed by word spotting methods are thoroughly examined and the use of retrieval enhancement techniques based on relevance feedback which improve the retrieved results are investigated.
Abstract: This work reviews the word spotting methods for document indexing. The nature of texts addressed by word spotting techniques is analyzed. The core steps that compose a word spotting system are thoroughly explored. Several boosting mechanisms which enhance the retrieved results are examined. Results achieved by the state of the art imply that there are still goals to be reached. Vast collections of documents available in image format need to be indexed for information retrieval purposes. In this framework, word spotting is an alternative solution to optical character recognition (OCR), which is rather inefficient for recognizing text of degraded quality and unknown fonts usually appearing in printed text, or writing style variations in handwritten documents. Over the past decade there has been a growing interest in addressing document indexing using word spotting which is reflected by the continuously increasing number of approaches. However, there exist very few comprehensive studies which analyze the various aspects of a word spotting system. This work aims to review the recent approaches as well as fill the gaps in several topics with respect to the related works. The nature of texts and inherent challenges addressed by word spotting methods are thoroughly examined. After presenting the core steps which compose a word spotting system, we investigate the use of retrieval enhancement techniques based on relevance feedback which improve the retrieved results. Finally, we present the datasets which are widely used for word spotting, we describe the evaluation standards and measures applied for performance assessment and discuss the results achieved by the state of the art.
134 citations
Posted Content
TL;DR: TimeNet, a deep recurrent neural network trained on diverse time series in an unsupervised manner using sequence-to-sequence (seq2seq) models to extract features from time series, attempts to generalize time series representation across domains by ingesting time series from several domains simultaneously.
Abstract: Inspired by the tremendous success of deep Convolutional Neural Networks as generic feature extractors for images, we propose TimeNet: a deep recurrent neural network (RNN) trained on diverse time series in an unsupervised manner using sequence-to-sequence (seq2seq) models to extract features from time series. Rather than relying on data from the problem domain, TimeNet attempts to generalize time series representation across domains by ingesting time series from several domains simultaneously. Once trained, TimeNet can be used as a generic off-the-shelf feature extractor for time series. The representations or embeddings given by a pre-trained TimeNet are found to be useful for time series classification (TSC). For several publicly available datasets from the UCR TSC Archive and industrial telematics sensor data from vehicles, we observe that a classifier learned over the TimeNet embeddings yields significantly better performance compared to (i) a classifier learned over the embeddings given by a domain-specific RNN, as well as (ii) a nearest neighbor classifier based on Dynamic Time Warping.
112 citations
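For readers unfamiliar with baseline (ii) in the abstract above, a nearest neighbor classifier based on Dynamic Time Warping (DTW) can be sketched as below. This illustrates the baseline only, not TimeNet itself. The unconstrained warping window, the absolute-difference local cost, and the toy training series are assumptions made for this example; the abstract does not specify the DTW variant used.

```python
from typing import List, Sequence, Tuple


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic-programming DTW with absolute-difference local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also aligned with earlier b
                cost[i][j - 1],      # b[j-1] also aligned with earlier a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def nn_classify(
    query: Sequence[float], train: List[Tuple[Sequence[float], str]]
) -> str:
    """Return the label of the training series closest to query under DTW."""
    if not train:
        raise ValueError("training set must not be empty")
    _, label = min(train, key=lambda item: dtw_distance(query, item[0]))
    return label


# Toy labelled series (illustrative only).
train_set = [
    ([0.0, 0.0, 1.0, 2.0, 1.0, 0.0], "peak"),
    ([0.0, 0.0, 0.0, 0.0, 0.0, 0.0], "flat"),
]
predicted = nn_classify([0.0, 1.0, 2.0, 1.0, 0.0, 0.0], train_set)
```

The query is a time-shifted copy of the "peak" series, so DTW can align the two at zero cost, which a point-by-point Euclidean comparison would not do.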