
Showing papers on "Histogram of oriented gradients published in 2016"


Journal ArticleDOI
TL;DR: An innovative representation learning framework for breast cancer diagnosis in mammography is described; it integrates deep learning techniques to automatically learn discriminative features, avoiding the design of specific hand-crafted image-based feature detectors.

366 citations


Journal ArticleDOI
01 Sep 2016
TL;DR: A new traffic sign detection and recognition method carried out in three main steps, which uses invariant geometric moments to classify shapes instead of machine learning algorithms; the results obtained are satisfactory when compared to state-of-the-art methods.
Abstract: In this paper we present a new traffic sign detection and recognition (TSDR) method, which is achieved in three main steps. The first step segments the image based on thresholding of HSI color space components. The second step detects traffic signs by processing the blobs extracted by the first step. The last one performs the recognition of the detected traffic signs. The main contributions of the paper are as follows. First, we propose, in the second step, to use invariant geometric moments to classify shapes instead of machine learning algorithms. Second, inspired by the existing features, new ones have been proposed for the recognition. The histogram of oriented gradients (HOG) features have been extended to the HSI color space and combined with the local self-similarity (LSS) features to get the descriptor we use in our algorithm. As classifiers, random forest and support vector machine (SVM) classifiers have been tested together with the new descriptor. The proposed method has been tested on both the German Traffic Sign Detection and Recognition Benchmark and the Swedish Traffic Signs data sets. The results obtained are satisfactory when compared to the state-of-the-art methods.
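The "invariant geometric moments" used for shape classification in the second step are not spelled out in the abstract; the classic choice is Hu's moment invariants. Below is a minimal numpy sketch (an illustration of the idea, not the authors' implementation) of the first two Hu invariants, which stay unchanged when a binary blob is rotated:

```python
import numpy as np

def hu_first_two(blob):
    """First two Hu invariant moments of a binary blob."""
    ys, xs = np.nonzero(blob)
    m00 = len(xs)
    cx, cy = xs.mean(), ys.mean()
    # central moments mu_pq
    def mu(p, q):
        return (((xs - cx) ** p) * ((ys - cy) ** q)).sum()
    # normalized central moments: eta_pq = mu_pq / m00**(1 + (p+q)/2)
    def eta(p, q):
        return mu(p, q) / m00 ** (1 + (p + q) / 2)
    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return phi1, phi2

# A rectangular blob and its 90-degree rotation share the same invariants.
rect = np.zeros((20, 20)); rect[5:9, 3:17] = 1
rot = np.rot90(rect)
print(hu_first_two(rect))
```

Because the invariants depend only on normalized central moments, a rotated or translated copy of the same shape maps to (numerically) the same values, which is what makes them usable in place of a learned shape classifier.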

137 citations


Journal ArticleDOI
TL;DR: The Histogram of Oriented Gradients is extended and two new feature descriptors are proposed: Co-occurrence HOG (Co-HOG) and Convolutional Co-HOG (ConvCo-HOG), for accurate recognition of scene texts in different languages.

130 citations


Journal ArticleDOI
TL;DR: The results of the experimental work reveal that both approaches offer comparable or even better performance with respect to the best results reported in the literature and are compatible with real-time operation as well.
Abstract: Highlights: New methods are proposed for circular traffic sign detection and recognition. Comparable performances are attained with respect to the best performing methods. Compatibility with real-time operation is validated. Automatic traffic sign detection and recognition play crucial roles in several expert systems such as driver assistance and autonomous driving systems. In this work, novel approaches for circular traffic sign detection and recognition on color images are proposed. In traffic sign detection, a new approach, which utilizes a recently developed circle detection algorithm and an RGB-based color thresholding technique, is proposed. In traffic sign recognition, an ensemble of features including histogram of oriented gradients, local binary patterns and Gabor features is employed within a support vector machine classification framework. Performances of the proposed detection and recognition approaches are evaluated on the German Traffic Sign Detection and Recognition Benchmark datasets, respectively. The results of the experimental work reveal that both approaches offer comparable or even better performances with respect to the best ones reported in the literature and are compatible with real-time operation as well.

107 citations


Journal ArticleDOI
TL;DR: A novel face recognition algorithm in which Histogram of Oriented Gradients features are extracted from both the test image and the training images and given to a Support Vector Machine classifier; it outperforms the PCA-based algorithm.
Abstract: A novel face recognition algorithm is presented in this paper. Histogram of Oriented Gradients features are extracted for both the test image and the training images and given to the Support Vector Machine classifier. The detailed steps of HOG feature extraction and the classification using SVM are presented. The algorithm is compared with the Eigen feature based face recognition algorithm. The proposed algorithm and PCA are verified using 8 different datasets. Results show that on all the face datasets the proposed algorithm achieves a higher face recognition rate than the traditional Eigen feature based face recognition algorithm. There is an improvement of 8.75% in face recognition rate when compared with the PCA based face recognition algorithm. The experiment is conducted on the ORL database with 2 face images for testing and 8 face images for training for each person. Three performance curves, namely CMC, EPC and ROC, are considered. The curves show that the proposed algorithm outperforms the PCA algorithm. Index Terms: Facial features, Histogram of Oriented Gradients, Support Vector Machine, Principal Component Analysis.
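Several papers in this listing rest on the same HOG-plus-SVM pipeline. As a rough illustration of the feature-extraction half, here is a deliberately simplified HOG sketch in numpy (per-cell orientation histograms only; real implementations such as Dalal and Triggs add block normalization and interpolation, and this is not any specific paper's code):

```python
import numpy as np

def hog_features(img, cell=8, bins=9):
    """Simplified HOG: per-cell histograms of unsigned gradient
    orientation, weighted by gradient magnitude, L2-normalized per cell."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    h, w = img.shape
    feats = []
    for y in range(0, h - cell + 1, cell):
        for x in range(0, w - cell + 1, cell):
            a = ang[y:y+cell, x:x+cell].ravel()
            m = mag[y:y+cell, x:x+cell].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0, 180), weights=m)
            hist /= np.linalg.norm(hist) + 1e-6
            feats.append(hist)
    return np.concatenate(feats)

img = np.tile(np.arange(32.0), (32, 1))   # horizontal intensity ramp
f = hog_features(img)
print(f.shape)   # 16 cells x 9 bins = (144,)
```

The resulting fixed-length vector is what gets fed to an SVM; for a pure horizontal gradient, all the mass lands in the first orientation bin of every cell.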

101 citations


Proceedings ArticleDOI
24 Jul 2016
TL;DR: The proposed approach first detects bike riders from surveillance video using background subtraction and object segmentation, then determines whether the bike rider is wearing a helmet using visual features and a binary classifier.
Abstract: In this paper, we propose an approach for automatic detection of bike riders without helmets in surveillance videos in real time. The proposed approach first detects bike riders from surveillance video using background subtraction and object segmentation. Then it determines whether the bike rider is wearing a helmet using visual features and a binary classifier. We also present a consolidation approach for violation reporting, which helps in improving the reliability of the proposed approach. In order to evaluate our approach, we provide a performance comparison of three widely used feature representations, namely histogram of oriented gradients (HOG), scale-invariant feature transform (SIFT), and local binary patterns (LBP), for classification. The experimental results show a detection accuracy of 93.80% on real-world surveillance data. It has also been shown that the proposed approach is computationally less expensive and performs in real time with a processing time of 11.58 ms per frame.
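The abstract does not specify which background subtraction technique is used. A toy median-background frame-differencing sketch (illustrative only, with an arbitrary threshold) conveys the idea of that first stage:

```python
import numpy as np

def moving_mask(frames, thresh=25):
    """Toy background subtraction: build a per-pixel median background
    from a frame buffer, then threshold the last frame's absolute
    difference against it to get a foreground mask."""
    stack = np.stack([f.astype(float) for f in frames])
    background = np.median(stack, axis=0)
    return np.abs(stack[-1] - background) > thresh

# Static scene, with an object appearing only in the last frame.
frames = [np.zeros((8, 8)) for _ in range(4)]
frames.append(np.zeros((8, 8))); frames[-1][2:5, 2:5] = 200
mask = moving_mask(frames)
print(mask.sum())   # 9 foreground pixels
```

In a real system the foreground blobs from such a mask would then be segmented into candidate bike-rider regions before classification.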

94 citations


Journal ArticleDOI
TL;DR: A novel histogram of oriented gradients-like feature for SAR ATR and a classifier based on SDDL and sparse representation, in which both the reconstruction error and the classification error are considered; the method achieves state-of-the-art performance on the MSTAR database.
Abstract: Automatic target recognition (ATR) in synthetic aperture radar (SAR) images plays an important role in both national defense and civil applications. Although many methods have been proposed, SAR ATR is still very challenging due to the complex application environment. Feature extraction and classification are key points in SAR ATR. In this paper, we first design a novel feature, which is a histogram of oriented gradients (HOG)-like feature for SAR ATR (called SAR-HOG). Then, we propose a supervised discriminative dictionary learning (SDDL) method to learn a discriminative dictionary for SAR ATR and propose a strategy to simplify the optimization problem. Finally, we propose a SAR ATR classifier based on SDDL and sparse representation (called SDDLSR), in which both the reconstruction error and the classification error are considered. Extensive experiments are performed on the MSTAR database under standard operating conditions and extended operating conditions. The experimental results show that SAR-HOG can reliably capture the structures of targets in SAR images, and SDDL can further capture subtle differences among the different classes. By virtue of the SAR-HOG feature and SDDLSR, the proposed method achieves state-of-the-art performance on the MSTAR database. Especially for the extended operating conditions (EOC) scenario "Training 17°, Testing 45°", the proposed method improves remarkably with respect to previous works.

83 citations


Journal ArticleDOI
TL;DR: A state-of-the-art offline signature verification system that uses a score-level fusion of complementary classifiers that use different local features (histogram of oriented gradients, local binary patterns and scale-invariant feature transform descriptors), where each classifier uses a feature-level fusion to represent local features at coarse-to-fine levels, is presented.

77 citations


Proceedings ArticleDOI
01 Dec 2016
TL;DR: In this article, the authors proposed hybrid features that consider the local features and their global statistics in the signature image, by creating a vocabulary of histograms of oriented gradients (HOGs) and imposing weights on these local features based on the height information of water reservoirs obtained from the signature.
Abstract: This paper considers the offline signature verification problem, an important research line in the field of pattern recognition. In this work we propose hybrid features that consider the local features and their global statistics in the signature image. This has been done by creating a vocabulary of histograms of oriented gradients (HOGs). We impose weights on these local features based on the height information of water reservoirs obtained from the signature. Spatial information between local features is thought to play a vital role in capturing the geometry of the signatures, which distinguishes the originals from the forged ones. Nevertheless, learning a condensed set of higher order neighbouring features based on visual words, e.g., doublets and triplets, continues to be a challenging problem as the possible combinations of visual words grow exponentially. To avoid this explosion of size, we create a code of local pairwise features which are represented as joint descriptors. Local features are paired based on the edges of a graph representation built upon the Delaunay triangulation. We reveal the advantage of combining both types of visual codebooks (order one and pairwise) for the signature verification task. This is validated through encouraging results on two benchmark datasets, viz. CEDAR and GPDS300.
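The vocabulary-of-HOGs step amounts to a bag-of-visual-words encoding. A hedged numpy sketch with a toy codebook follows; the `weights` argument is a generic placeholder for the paper's water-reservoir-based weighting, whose exact form is not given in the abstract:

```python
import numpy as np

def bovw_histogram(descriptors, codebook, weights=None):
    """Assign each local descriptor (e.g., a HOG vector) to its nearest
    visual word and accumulate a normalized, optionally weighted histogram."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)              # nearest word per descriptor
    if weights is None:
        weights = np.ones(len(descriptors))
    hist = np.zeros(len(codebook))
    np.add.at(hist, words, weights)        # weighted vote per word
    return hist / (hist.sum() + 1e-12)

codebook = np.array([[0.0, 0.0], [1.0, 1.0]])        # toy 2-word vocabulary
desc = np.array([[0.1, 0.0], [0.9, 1.1], [1.0, 0.8]])
h = bovw_histogram(desc, codebook)
print(h)
```

The pairwise (Delaunay-edge) codebook in the paper extends this idea by building a second vocabulary over joint descriptors of neighbouring features.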

74 citations


Journal ArticleDOI
TL;DR: This is the first report of results on the MuHAVi-uncut dataset, which has a large number of action categories and a large set of camera views with noisy silhouettes; it can be used by future workers as a baseline to improve on, and the method compares well to similar state-of-the-art approaches.
Abstract: In this study, a new multi-view human action recognition approach is proposed by exploiting low-dimensional motion information of actions. Before feature extraction, pre-processing steps are performed to remove noise from silhouettes, incurred due to imperfect, but realistic, segmentation. Two-dimensional motion templates based on the motion history image (MHI) are computed for each view/action video. Histograms of oriented gradients (HOGs) are used as an efficient description of the MHIs, which are classified using a nearest neighbour (NN) classifier. Compared with existing approaches, the proposed method has three advantages: (i) it does not require a fixed number of cameras to be set up during the training and testing stages, hence missing camera views can be tolerated; (ii) it requires less memory and bandwidth; and hence (iii) it is computationally efficient, which makes it suitable for real-time action recognition. As far as the authors know, this is the first report of results on the MuHAVi-uncut dataset, which has a large number of action categories and a large set of camera views with noisy silhouettes, and which can be used by future workers as a baseline to improve on. Experimentation on multiple views of this dataset gives a high accuracy rate of 95.4% using the leave-one-sequence-out cross-validation technique and compares well to similar state-of-the-art approaches.
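The MHI templates that the HOG descriptors are computed from follow the standard Bobick-Davis update rule: a pixel where motion is detected is set to a maximum timestamp value, and otherwise its previous value decays toward zero. A small numpy sketch (the parameter values are illustrative, not the paper's):

```python
import numpy as np

def motion_history_image(frames, tau=255, delta=32, diff_thresh=25):
    """Motion history image: pixels where motion occurs are set to tau;
    elsewhere the previous value decays by delta, clipped at zero."""
    mhi = np.zeros_like(frames[0], dtype=float)
    prev = frames[0].astype(float)
    for f in frames[1:]:
        cur = f.astype(float)
        moving = np.abs(cur - prev) > diff_thresh
        mhi = np.where(moving, tau, np.maximum(mhi - delta, 0))
        prev = cur
    return mhi

# A bright block sliding one column per frame leaves a fading trail:
frames = []
for t in range(4):
    f = np.zeros((6, 10)); f[2:4, t:t+2] = 255
    frames.append(f)
mhi = motion_history_image(frames)
```

Recent motion shows up bright and older motion progressively darker, so a single grayscale template encodes the direction and recency of movement for the HOG stage.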

59 citations


Proceedings ArticleDOI
01 Dec 2016
TL;DR: A facial expression recognition framework which infers emotional states in real time, enabling computers to interact more intelligently with people.
Abstract: This paper presents a facial expression recognition framework which infers the emotional states in real-time, thereby enabling the computers to interact more intelligently with people. The proposed method determines the face as well as the facial landmark points, extracts discriminating features from suitable facial regions, and classifies the expressions in real-time from live webcam feed. The speed of the system is improved by the appropriate combination of the detection and tracking algorithms. Further, instead of the whole face, histogram of oriented gradients (HOG) features are extracted from the active facial patches which makes the system robust against the scale and pose variations. The feature vectors are further fed to a support vector machine (SVM) classifier to classify into neutral or six universal expressions. Experimental results show an accuracy of 95% with 5 folds cross-validation in extended Cohn-Kanade (CK+) dataset.

Patent
21 Apr 2016
TL;DR: In this article, a system for counting stacked items using image analysis is described, where an image of an inventory location with stacked items is obtained and processed to determine the number of items stacked at the inventory location.
Abstract: Described is a system for counting stacked items using image analysis. In one implementation, an image of an inventory location with stacked items is obtained and processed to determine the number of items stacked at the inventory location. In some instances, the item closest to the camera that obtains the image may be the only item viewable in the image. Using image analysis, such as depth mapping or Histogram of Oriented Gradients (HOG) algorithms, the distance of the item from the camera and the shelf of the inventory location can be determined. Using this information, and known dimension information for the item, a count of the number of items stacked at an inventory location may be determined.
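The count-from-depth idea in the patent reduces to simple arithmetic: subtract the depth-mapped distance of the topmost item from the camera-to-shelf distance and divide by the known item height. A sketch with hypothetical numbers (the function name and units are illustrative, not from the patent):

```python
def stacked_item_count(shelf_distance, top_item_distance, item_height):
    """Estimate how many items are stacked, given the camera-to-shelf
    distance, the measured distance to the topmost item, and the known
    item height (all in the same units, e.g. centimeters)."""
    return round((shelf_distance - top_item_distance) / item_height)

# Camera mounted 200 cm above the shelf; depth map puts the top of the
# stack at 140 cm; each item is 20 cm tall -> 3 items in the stack.
print(stacked_item_count(200, 140, 20))
```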

Journal ArticleDOI
TL;DR: A breast cancer risk index (BCRI) is developed using significant KLPP features, which can discriminate the two classes using a single integrated index and can help radiologists discriminate the normal and malignant classes during screening and validate their findings.
Abstract: Breast cancer is one of the prime causes of death in women worldwide. Thermography has shown great potential in screening for breast cancer and overcomes the limitations of mammography. Moreover, interpretations of thermogram images are dependent on the specialists, which may lead to errors and uneven results. A preliminary screening method should detect hazardous, destructive tumours effectively to improve accuracy. The growth of a malignant tumour can increase the internal temperature, which can be captured by thermograms. Thus, in this work, a locally normalised histogram of oriented gradients (HOG) based preliminary screening computer aided diagnosis tool is proposed. HOG is able to record the minute internal variations in thermograms. In order to reduce the dimensions of the extracted HOG descriptors, kernel locality preserving projection (KLPP) is used. The resulting KLPP features are then ranked to form an efficient classification model. Various machine learning algorithms are used to validate...

Journal ArticleDOI
01 Sep 2016
TL;DR: A real-time and energy-efficient multi-scale object detector hardware implementation is presented in this paper, using Histogram of Oriented Gradients features and Support Vector Machine classification to detect objects of different sizes.
Abstract: A real-time and energy-efficient multi-scale object detector hardware implementation is presented in this paper. Detection is done using Histogram of Oriented Gradients (HOG) features and Support Vector Machine (SVM) classification. Multi-scale detection is essential for robust and practical applications to detect objects of different sizes. Parallel detectors with balanced workload are used to increase the throughput, enabling voltage scaling and energy consumption reduction. Image pre-processing is also introduced to further reduce power and area costs of the image scales generation. This design can operate on high definition 1080HD video at 60 fps in real-time with a clock rate of 270 MHz, and consumes 45.3 mW (0.36 nJ/pixel) based on post-layout simulations. The ASIC has an area of 490 kgates and 0.538 Mbit on-chip memory in a 45 nm SOI CMOS process.

Journal ArticleDOI
TL;DR: An appearance-based discrete head pose estimation aiming to determine the driver attention level from monocular visible spectrum images, even if the facial features are not visible, is proposed.
Abstract: A great deal of interest is focused on driver assistance systems using the head pose as an indicator of the visual focus of attention and the mental state. In fact, head pose estimation is a technique for deducing head orientation relative to the camera view, and it can be performed by model-based or appearance-based approaches. Model-based approaches use a face geometrical model usually obtained from facial features, whereas appearance-based techniques use the whole face image characterized by a descriptor and generally consider pose estimation as a classification problem. Appearance-based methods are faster and better adapted to discrete pose estimation. However, their performance depends strongly on the head descriptor, which should be well chosen in order to reduce the information about identity and lighting contained in the face appearance. In this paper, we propose an appearance-based discrete head pose estimation aiming to determine the driver attention level from monocular visible spectrum images, even if the facial features are not visible. Explicitly, we first propose a novel descriptor resulting from the fusion of the four most relevant orientation-based head descriptors, namely the steerable filters, the histogram of oriented gradients (HOG), the Haar features, and an adapted version of the speeded up robust features (SURF) descriptor. Second, in order to derive a compact, relevant, and consistent subset of descriptor features, a comparative study is conducted on some well-known feature selection algorithms. Finally, the obtained subset is subject to the classification process, performed by the support vector machine (SVM), to learn head pose variations. As we show in experiments with the public database (Pointing'04) as well as with our real-world sequence, our approach describes the head with high accuracy and provides robust estimation of the head pose, compared to state-of-the-art methods.

Proceedings ArticleDOI
Qian Wang1, Jingjun Wang1, Shengshan Hu1, Qin Zou1, Kui Ren2 
30 May 2016
TL;DR: This work proposes two novel privacy-preserving HOG outsourcing protocols, by efficiently encrypting image data by somewhat homomorphic encryption (SHE) integrated with single-instruction multiple-data (SIMD), designing a new batched secure comparison protocol, and carefully redesigning every step of HOG to adapt it to the ciphertext domain.
Abstract: The abundant multimedia data generated in our daily life has intrigued a variety of very important and useful real-world applications such as object detection and recognition. Alongside these applications, many popular feature descriptors have been developed, e.g., SIFT, SURF and HOG. Manipulating massive multimedia data locally, however, is a storage and computation intensive task, especially for resource-constrained clients. In this work, we focus on exploring how to securely outsource the famous feature extraction algorithm, Histogram of Oriented Gradients (HOG), to untrusted cloud servers, without revealing the data owner's private information. For the first time, we investigate this secure outsourcing computation problem under two different models and accordingly propose two novel privacy-preserving HOG outsourcing protocols, by efficiently encrypting image data with somewhat homomorphic encryption (SHE) integrated with single-instruction multiple-data (SIMD), designing a new batched secure comparison protocol, and carefully redesigning every step of HOG to adapt it to the ciphertext domain. Explicit security and effectiveness analyses are presented to show that our protocols are practically secure and can approximate well the performance of the original HOG executed in the plaintext domain. Our extensive experimental evaluations further demonstrate that our solutions achieve high efficiency and perform comparably to the original HOG when applied to human detection.

Proceedings ArticleDOI
Jiawei Chen1, Jonathan Wu1, Kristi Richter1, Janusz Konrad1, Prakash Ishwar1 
06 Mar 2016
TL;DR: A non-linear regression method for the estimation of human head pose from extremely low resolution images captured by a monocular RGB camera, which evaluates the common histogram of oriented gradients (HoG) feature, proposes a new gradient-based feature, and uses Support Vector Regression (SVR) to estimate head pose.
Abstract: The estimation of human head pose is of interest in some surveillance and human-computer interaction scenarios. Traditionally, this is not a difficult task if high- or even standard-definition video cameras are used. However, such cameras cannot be used in scenarios requiring privacy protection. In this paper, we propose a non-linear regression method for the estimation of human head pose from extremely low resolution images captured by a monocular RGB camera. We evaluate the common histogram of oriented gradients (HoG) feature, propose a new gradient-based feature, and use Support Vector Regression (SVR) to estimate head pose. We evaluate our algorithm on the Biwi Kinect Head Pose Dataset by re-sizing full-resolution RGB images to extremely low resolutions. The results are promising. At 10×10-pixel resolution, we achieve 6.95, 9.92 and 12.88 degree mean-absolute errors (MAE) for roll, yaw and pitch angles, respectively. These errors are very close to state-of-the-art results for full-resolution images.

Book ChapterDOI
30 Mar 2016
TL;DR: A new GP method is proposed that automatically detects different regions of an image, extracts HoG features from those regions, and simultaneously evolves a classifier for image classification by extending an existing GP region selection approach to incorporate the HoG algorithm.
Abstract: Image analysis is a key area in the computer vision domain that has many applications. Genetic Programming (GP) has been successfully applied to this area extensively, with promising results. High-level features extracted from methods such as Speeded Up Robust Features (SURF) and Histogram of Oriented Gradients (HoG) are commonly used for object detection with machine learning techniques. However, GP techniques are not often used with these methods, despite being applied extensively to image analysis problems. Combining the training process of GP with the powerful features extracted by SURF or HoG has the potential to improve the performance by generating high-level, domain-tailored features. This paper proposes a new GP method that automatically detects different regions of an image, extracts HoG features from those regions, and simultaneously evolves a classifier for image classification. By extending an existing GP region selection approach to incorporate the HoG algorithm, we present a novel way of using high-level features with GP for image classification. The ability of GP to explore a large search space in an efficient manner allows all stages of the new method to be optimised simultaneously, unlike in existing approaches. The new approach is applied across a range of datasets, with promising results when compared to a variety of well-known machine learning techniques. Some high-performing GP individuals are analysed to give insight into how GP can effectively be used with high-level features for image classification.

Journal ArticleDOI
TL;DR: This paper adopts the conventional camera approach that uses sliding windows and histogram of oriented gradients (HOG) features, and describes how the feature extraction step of the conventional approach should be modified for a theoretically correct and effective use in omnidirectional cameras.
Abstract: In this paper, we present an omnidirectional vision-based method for object detection. We first adopt the conventional camera approach that uses sliding windows and histogram of oriented gradients (HOG) features. Then, we describe how the feature extraction step of the conventional approach should be modified for a theoretically correct and effective use in omnidirectional cameras. Main steps are modification of gradient magnitudes using Riemannian metric and conversion of gradient orientations to form an omnidirectional sliding window. In this way, we perform object detection directly on the omnidirectional images without converting them to panoramic or perspective images. Our experiments, with synthetic and real images, compare the proposed approach with regular (unmodified) HOG computation on both omnidirectional and panoramic images. Results show that the proposed approach should be preferred.

Journal ArticleDOI
TL;DR: The obtained results show that the classical engineered features and CNN-based features can complement each other for recognition purposes, and this study combines the latest successes in both directions.
Abstract: Reliable facial recognition systems are of crucial importance in various applications from entertainment to security. Thanks to the deep-learning concepts introduced in the field, a significant improvement in the performance of the unimodal facial recognition systems has been observed in the recent years. At the same time a multimodal facial recognition is a promising approach. This study combines the latest successes in both directions by applying deep learning convolutional neural networks (CNN) to the multimodal RGB, depth, and thermal (RGB-D-T) based facial recognition problem outperforming previously published results. Furthermore, a late fusion of the CNN-based recognition block with various hand-crafted features (local binary patterns, histograms of oriented gradients, Haar-like rectangular features, histograms of Gabor ordinal measures) is introduced, demonstrating even better recognition performance on a benchmark RGB-D-T database. The obtained results in this study show that the classical engineered features and CNN-based features can complement each other for recognition purposes.

Journal ArticleDOI
TL;DR: This work develops an efficient and effective peak and valley detection algorithm from real-case time series data, and obtains significantly improved classification accuracies over existing approaches, including NNDTW and shapelet transform.
Abstract: Time series classification (TSC) arises in many fields and has a wide range of applications. Here, we adopt the bag-of-words (BoW) framework to classify time series. Our algorithm first samples local subsequences from time series at feature-point locations when available. It then builds local descriptors, and models their distribution by Gaussian mixture models (GMM), and at last it computes a Fisher Vector (FV) to encode each time series. The encoded FV representations of time series are readily used by existing classifiers, e.g., SVM, for training and prediction. In our work, we focus on detecting better feature points and crafting better local representations, while using existing techniques to learn codebook and encode time series. Specifically, we develop an efficient and effective peak and valley detection algorithm from real-case time series data. Subsequences are sampled from these peaks and valleys, instead of sampled randomly or uniformly as was done previously. Then, two local descriptors, Histogram of Oriented Gradients (HOG-1D) and Dynamic time warping-Multidimensional scaling (DTW-MDS), are designed to represent sampled subsequences. Both descriptors complement each other, and their fused representation is shown to be more descriptive than individual ones. We test our approach extensively on 43 UCR time series datasets, and obtain significantly improved classification accuracies over existing approaches, including NNDTW and shapelet transform.
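The HOG-1D descriptor adapts orientation histograms to one dimension. The paper's exact formulation is not reproduced in the abstract; the simplified slope-histogram sketch below captures the idea of describing a subsequence by the distribution of its local gradients (bin count and clipping range are illustrative choices):

```python
import numpy as np

def hog_1d(subsequence, bins=8, max_slope=2.0):
    """HOG-1D sketch: histogram of the local slopes of a 1-D subsequence,
    with slopes clipped to [-max_slope, max_slope] before binning,
    then L2-normalized."""
    slopes = np.clip(np.diff(subsequence), -max_slope, max_slope)
    hist, _ = np.histogram(slopes, bins=bins, range=(-max_slope, max_slope))
    return hist / (np.linalg.norm(hist) + 1e-6)

ramp = np.arange(9.0)       # constant slope of +1
h = hog_1d(ramp)
print(h)
```

Descriptors like this, computed at detected peaks and valleys, are what the GMM/Fisher-vector stage of the pipeline then encodes into a fixed-length representation per time series.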

Journal ArticleDOI
TL;DR: A sparse representation based approach is proposed for pedestrian detection from thermal images, using the histogram of sparse codes to represent image features and detecting pedestrians with the extracted features in a unimodal and a multimodal framework, respectively.

Journal ArticleDOI
TL;DR: A novel object localization method that predicts object regions before extracting them, which can remarkably reduce the search area for targets and runs fast.

Proceedings ArticleDOI
01 Sep 2016
TL;DR: This work proposes a novel multi-class image database for hair detection in the wild, called Figaro, and tackles the problem of hair detection without relying on a-priori information related to head shape and location.
Abstract: Hair is one of the elements that most characterize people's appearance. Being able to detect hair in images can be useful in many applications, such as face recognition, gender classification, and video surveillance. To this end we propose a novel multi-class image database for hair detection in the wild, called Figaro. We tackle the problem of hair detection without relying on a-priori information related to head shape and location. Without using any human-body part classifier, we first classify image patches into hair vs. non-hair by relying on Histogram of Gradients (HOG) and Local Ternary Pattern (LTP) texture features in a random forest scheme. Then we obtain results at pixel level by refining classified patches with a graph-based multiple segmentation method. The achieved segmentation accuracy (85%) is comparable to the state-of-the-art on less challenging databases.

Journal ArticleDOI
TL;DR: A computer vision based system for fast, robust Traffic Sign Detection and Recognition (TSDR), consisting of three steps, which compares four feature descriptors: Histogram of Oriented Gradients (HOG), Gabor, Local Binary Pattern (LBP), and Local Self-Similarity (LSS).
Abstract: In this paper, we present a computer vision based system for fast, robust Traffic Sign Detection and Recognition (TSDR), consisting of three steps. The first step consists of image enhancement and thresholding using the three components of the Hue, Saturation and Value (HSV) space. Then we use the distance-to-border feature and a Random Forest classifier to detect circular, triangular and rectangular shapes in the segmented images. The last step consists of identifying the information included in the detected traffic signs. We compare four feature descriptors: Histogram of Oriented Gradients (HOG), Gabor, Local Binary Pattern (LBP), and Local Self-Similarity (LSS). We also compare their different combinations. For the classifiers we have carried out a comparison between Random Forests and Support Vector Machines (SVMs). The best results are given by the combination of HOG with LSS together with the Random Forest classifier. The proposed method has been tested on the Swedish Traffic Signs data set and gives satisfactory results.

Journal ArticleDOI
TL;DR: A robust encoding method is utilized in which the residuals of local descriptors, with respect to a discriminative model, are aggregated into fixed length vectors, resulting in a powerful vector representation.
Abstract: We focus on the problem of pose-based gait recognition. Our contribution is two-fold. First, we incorporate a local histogram descriptor that allows us to encode the trajectories of selected limbs via a one-dimensional version of histogram of oriented gradients features. In this way, a gait sequence is encoded into a sequence of local gradient descriptors. Second, we utilize a robust encoding method in which the residuals of local descriptors, with respect to a discriminative model, are aggregated into fixed length vectors. This technique combines the advantages of both residual aggregation and soft-assignment techniques, resulting in a powerful vector representation. For classification purposes, we use a nonlinear kernel to map vectors into a reproducing kernel Hilbert space. Then, we classify an encoded gait sequence according to the sparse representation-based classification method. Experimental evaluation on two publicly available datasets demonstrates the effectiveness of the proposed scheme on both recognition and verification tasks.
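The residual-aggregation encoding described here is closest in spirit to VLAD: the residuals of local descriptors to their nearest codebook centroid are summed per centroid, concatenated, and normalized, giving a fixed-length vector regardless of sequence length. A simplified numpy sketch (the paper aggregates against a discriminative model rather than plain centroids, so this is the generic form, not the authors' exact scheme):

```python
import numpy as np

def aggregate_residuals(descriptors, centroids):
    """VLAD-style encoding: sum each descriptor's residual to its nearest
    centroid, concatenate the per-centroid sums, then L2-normalize."""
    d2 = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    nearest = d2.argmin(axis=1)
    k, dim = centroids.shape
    enc = np.zeros((k, dim))
    for i, c in enumerate(nearest):
        enc[c] += descriptors[i] - centroids[c]
    enc = enc.ravel()
    return enc / (np.linalg.norm(enc) + 1e-12)

cents = np.array([[0.0, 0.0], [10.0, 10.0]])
desc = np.array([[1.0, 0.0], [9.0, 10.0]])
code = aggregate_residuals(desc, cents)
print(code.shape)   # fixed length k*dim = (4,)
```

The fixed length is what lets variable-length gait sequences be compared in a common space, e.g. via a kernel and sparse-representation classification as in the paper.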

Journal ArticleDOI
TL;DR: A new supervised learning method comprising three different stages combined serially into a single framework, which successfully detects damaged cars from static images, is proposed.
Abstract: In this paper, a novel approach for automatic road accident detection is proposed. The approach is based on detecting damaged vehicles in footage from surveillance cameras installed on roads and highways, which would indicate the occurrence of a road accident. Detection of damaged cars falls under the category of object detection in the field of machine vision and has not been achieved so far. In this paper, a new supervised learning method comprising three different stages combined serially into a single framework, which successfully detects damaged cars from static images, is proposed. The three stages use five support vector machines trained with Histogram of Oriented Gradients (HOG) and Gray Level Co-occurrence Matrix (GLCM) features. Since damaged car detection has not been attempted before, two datasets of damaged cars, Damaged Cars Dataset-1 (DCD-1) and Damaged Cars Dataset-2 (DCD-2), were compiled for public release. Experiments were conducted on DCD-1 and DCD-2, which differ in the distance at which the images were captured and in image quality. The accuracy of the system is 81.83% for DCD-1, captured at approximately 2 meters with good quality, and 64.37% for DCD-2, captured at approximately 20 meters with poor quality.
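The GLCM texture features used alongside HOG can be illustrated with a minimal sketch. The horizontal offset, level count, and the contrast statistic below are illustrative choices, not the paper's configuration:

```python
import numpy as np

def glcm_horizontal(gray, levels):
    """Normalised gray-level co-occurrence matrix for the horizontal
    (0, 1) pixel offset.

    gray: 2-D integer array with values in [0, levels). Counts how
    often value i appears immediately left of value j, then
    normalises the counts to probabilities.
    """
    g = np.asarray(gray)
    m = np.zeros((levels, levels))
    for i, j in zip(g[:, :-1].ravel(), g[:, 1:].ravel()):
        m[i, j] += 1
    total = m.sum()
    return m / total if total else m

# One classic GLCM statistic: contrast = sum over (i - j)^2 * p(i, j).
p = glcm_horizontal([[0, 1], [2, 3]], levels=4)
i_idx, j_idx = np.indices(p.shape)
contrast = float(((i_idx - j_idx) ** 2 * p).sum())
```

Statistics such as contrast, energy, and homogeneity computed from `p` would form the texture part of the feature vector fed to the SVMs.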

Proceedings ArticleDOI
04 Jul 2016
TL;DR: A novel method to learn rotation-invariant HOG (RIHOG) features for object detection in optical remote sensing images is proposed via optimizing a new objective function, which constrains the training samples before and after rotation to share similar features, thereby achieving rotation-invariance.
Abstract: Object detection in very high resolution (VHR) optical remote sensing images is one of the most fundamental but challenging problems in the field of remote sensing image analysis. As object detection is usually carried out in feature space, effective feature representation is very important for constructing a high-performance object detection system. Over the last decades, a great deal of effort has been made to develop various feature representations for the detection of different types of objects. Among the features developed for visual object detection, the histogram of oriented gradients (HOG) feature is perhaps one of the most popular and has been successfully applied throughout the computer vision community. However, although the HOG feature has achieved great success on natural scene images, it is problematic to use it directly for object detection in optical remote sensing images because it cannot effectively handle object rotation variations. To explore a possible solution to this problem, this paper proposes a novel method to learn rotation-invariant HOG (RIHOG) features for object detection in optical remote sensing images. This is achieved by learning a rotation-invariant transformation model via optimizing a new objective function, which constrains the training samples before and after rotation to share similar features, thereby achieving rotation-invariance. In the experiments, we evaluate the proposed method on a publicly available 10-class VHR geospatial object detection dataset, and comprehensive comparisons with state-of-the-art methods demonstrate the effectiveness of the proposed method.
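The idea of constraining pre- and post-rotation samples to share similar features can be sketched as a toy objective. The linear map, loss form, and regulariser below are illustrative and differ from the paper's actual formulation:

```python
import numpy as np

def rotation_invariance_loss(W, X, X_rot, lam=0.1):
    """Toy RIHOG-style objective: a linear map W should produce
    similar features for each sample and its rotated copy.

    X, X_rot: (n, d) HOG features before/after rotation.
    Returns mean squared feature difference plus an L2 penalty on W.
    """
    diff = X @ W - X_rot @ W
    return float((diff ** 2).mean() + lam * (W ** 2).sum())

# If the rotated features already equal the originals, only the
# regulariser contributes to the loss.
X = np.array([[1.0, 0.0], [0.0, 1.0]])
loss = rotation_invariance_loss(np.eye(2), X, X.copy())
```

Minimising such a loss over real rotated pairs (e.g. by gradient descent) would drive `W` toward a map under which rotated versions of an object land near each other in feature space.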

Journal ArticleDOI
TL;DR: Comparative analysis on these databases using various descriptors shows the superiority of BSIF with the Cosine, Chi-square and Cityblock distance measures, using 1-NN as the classifier, over other descriptors and distance measures, and even over some of the current state-of-the-art benchmark database results.
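A minimal sketch of 1-NN classification with the Chi-square distance over histogram descriptors (such as BSIF codes); the toy histograms and labels below are illustrative:

```python
import numpy as np

def chi_square_dist(h1, h2, eps=1e-10):
    """Chi-square distance between two histograms."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    return 0.5 * float(np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))

def nn_classify(query, gallery, labels):
    """1-NN classification: return the label of the gallery
    descriptor closest to the query under chi-square distance."""
    dists = [chi_square_dist(query, g) for g in gallery]
    return labels[int(np.argmin(dists))]

# Toy gallery of two histogram descriptors with class labels.
label = nn_classify([1, 0, 1], [[1, 0, 1], [0, 1, 0]], ["A", "B"])
```

Swapping `chi_square_dist` for a cosine or cityblock (L1) distance reproduces the other measures compared in the paper.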

Journal ArticleDOI
TL;DR: A new descriptor, Self-Similarity of Gradients (GSS), is proposed, which can effectively describe the similarities in a HOG feature map and achieves improved performance over the state of the art on the real-world smile dataset (GENKI-4K).
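One way to read "self-similarity of gradients" is as pairwise similarities between the cell histograms of a HOG feature map. The cosine-similarity sketch below illustrates that idea and is not the paper's exact construction:

```python
import numpy as np

def gradient_self_similarity(hog_map):
    """Pairwise cosine similarities between the cell histograms of a
    HOG feature map, returned as a flat descriptor.

    hog_map: (n_cells, n_bins) array; each row is one cell's
    orientation histogram.
    """
    h = np.asarray(hog_map, dtype=float)
    norms = np.linalg.norm(h, axis=1, keepdims=True)
    h = h / np.maximum(norms, 1e-10)          # unit-normalise each cell
    sim = h @ h.T                             # all pairwise cosines
    iu = np.triu_indices(len(h), k=1)         # keep each pair once
    return sim[iu]

# Three toy cells: the first and third share the same orientation.
desc = gradient_self_similarity([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
```

The descriptor thus captures which regions of the face share gradient structure, rather than the raw gradient values themselves.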