
Showing papers on "Histogram of oriented gradients" published in 2019


Journal ArticleDOI
TL;DR: In this paper, a method called invariant attribute profiles (IAPs) is proposed to extract the spatial invariant features by exploiting isotropic filter banks or convolutional kernels on HSI and spatial aggregation techniques in the Cartesian coordinate system.
Abstract: Up to the present, an enormous number of advanced techniques have been developed to enhance and extract the spatially semantic information in hyperspectral image processing and analysis. However, locally semantic change, such as scene composition, relative position between objects, spectral variability caused by illumination, atmospheric effects, and material mixture, has been less frequently investigated in modeling spatial information. As a consequence, identifying the same materials from spatially different scenes or positions can be difficult. In this paper, we propose a solution to address this issue by locally extracting invariant features from hyperspectral imagery (HSI) in both spatial and frequency domains, using a method called invariant attribute profiles (IAPs). IAPs extract the spatial invariant features by exploiting isotropic filter banks or convolutional kernels on HSI and spatial aggregation techniques (e.g., superpixel segmentation) in the Cartesian coordinate system. Furthermore, they model invariant behaviors (e.g., shift, rotation) by the means of a continuous histogram of oriented gradients constructed in a Fourier polar coordinate. This yields a combinatorial representation of spatial-frequency invariant features with application to HSI classification. Extensive experiments conducted on three promising hyperspectral datasets (Houston2013 and Houston2018) demonstrate the superiority and effectiveness of the proposed IAP method in comparison with several state-of-the-art profile-related techniques. The codes will be available from the website: this https URL.

106 citations


Journal ArticleDOI
TL;DR: A framework for integrating a multi-view learning algorithm and a sparse representation method to track ships efficiently and effectively is proposed and shown to outperform the conventional and typical ship tracking methods.
Abstract: Conventional visual ship tracking methods employ single and shallow features for the ship tracking task, which may fail when a ship presents a different appearance and shape in maritime surveillance videos. To overcome this difficulty, we propose to employ a multi-view learning algorithm to extract a highly coupled and robust ship descriptor from multiple distinct ship feature sets. First, we explore multiple distinct ship feature sets consisting of a Laplacian-of-Gaussian (LoG) descriptor, a Local Binary Patterns (LBP) descriptor, a Gabor filter, a Histogram of Oriented Gradients (HOG) descriptor and a Canny descriptor, which present geometry structure, texture and contour information, and more. Then, we propose a framework for integrating a multi-view learning algorithm and a sparse representation method to track ships efficiently and effectively. Finally, our framework is evaluated in four typical maritime surveillance scenarios. The experimental results show that the proposed framework outperforms the conventional and typical ship tracking methods.

94 citations


Journal ArticleDOI
01 Nov 2019
TL;DR: The results from empirical evaluations and statistical tests indicate the superiority of the proposed models over other advanced PSO variants and classical search methods pertaining to discriminative feature selection and optimal hyper-parameter identification for deep learning networks in lesion classification as well as other disease diagnosis.
Abstract: In this research, we propose an intelligent decision support system for skin cancer detection. Since generating an effective lesion representation is a vital step to ensure the success of lesion classification, the discriminative power of different types of features is exploited. Specifically, we combine clinically important asymmetry, border irregularity, colour and dermoscopic structure features with texture features extracted using Grey Level Run Length Matrix, Local Binary Patterns, and Histogram of Oriented Gradients operators for lesion representation. Then, we propose two enhanced Particle Swarm Optimization (PSO) models for feature optimization. The first model employs adaptive acceleration coefficients, multiple remote leaders, in-depth sub-dimension feature search and re-initialization mechanisms to overcome stagnation. The second model uses random acceleration coefficients, instead of adaptive ones, based on non-linear circle, sine and helix functions, respectively, to increase diversification and intensification. Ensemble classifiers are also constructed with each base model trained using each optimized feature subset. A deep convolutional neural network is devised whose hyper-parameters are fine-tuned using the proposed PSO models. Extensive experimental studies using dermoscopic skin lesion data, medical data from the UCI machine learning repository, and ALL-IDB2 image data are conducted to evaluate the model efficiency systematically. The results from empirical evaluations and statistical tests indicate the superiority of the proposed models over other advanced PSO variants and classical search methods pertaining to discriminative feature selection and optimal hyper-parameter identification for deep learning networks in lesion classification as well as other disease diagnosis.

82 citations


Journal ArticleDOI
TL;DR: In a deep learning environment, the U-Net segmentation algorithm is found to be the best method for segmentation, and it helps to improve the classification performance of melanoma.
Abstract: Objective: The main objective of this study is to improve the classification performance of melanoma using deep learning based automatic skin lesion segmentation. It can assist medical experts in the early diagnosis of melanoma on dermoscopy images. Methods: First, a Convolutional Neural Network (CNN) based U-Net algorithm is used for the segmentation process. Then color, texture and shape features are extracted from the segmented image using Local Binary Pattern (LBP), Edge Histogram (EH), Histogram of Oriented Gradients (HOG) and Gabor methods. Finally, all the features extracted by these methods are fed into Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbor (KNN) and Naive Bayes (NB) classifiers to diagnose whether the skin image shows melanoma or a benign lesion. Results: Experimental results show the effectiveness of the proposed method. A Dice coefficient of 77.5% is achieved for image segmentation, and the SVM classifier produced 85.19% accuracy. Conclusion: In a deep learning environment, the U-Net segmentation algorithm is found to be the best method for segmentation, and it helps to improve the classification performance.

70 citations


Journal ArticleDOI
20 Mar 2019
TL;DR: A novel instrument for pedestrian detection by combining stereo vision cameras with a thermal camera is presented, and it significantly outperforms the traditional histogram of oriented gradients features.
Abstract: Pedestrian detection is a critical feature of an autonomous vehicle or advanced driver assistance system. This paper presents a novel instrument for pedestrian detection by combining stereo vision cameras with a thermal camera. A new dataset for vehicle applications is built from data recorded by the test vehicle when driving on city roads. Data received from multiple cameras are aligned using the trifocal tensor with pre-calibrated parameters. Candidates are generated from each image frame using sliding windows across multiple scales. A reconfigurable detector framework is proposed, in which feature extraction and classification are two separate stages. The input to the detector can be the color image, disparity map, thermal data, or any of their combinations. When applied to convolutional channel features, feature extraction utilizes the first three convolutional layers of a pre-trained convolutional neural network cascaded with an AdaBoost classifier. The evaluation results show that it significantly outperforms the traditional histogram of oriented gradients features. The proposed pedestrian detector with multi-spectral cameras can achieve a 9% log-average miss rate. The experimental dataset is made available at http://computing.wpi.edu/dataset.html.

63 citations


Proceedings ArticleDOI
02 Apr 2019
TL;DR: An automatic system of face expression recognition which is able to recognize all eight basic facial expressions which are normal, happy, angry, contempt, surprise, sad, fear and disgust is presented.
Abstract: Facial Expression Recognition (FER) has been an active research topic from the 1990s until now and, owing to its importance, has come to play a major role in image processing. FER is typically performed in three stages: face detection, feature extraction and classification. This paper presents an automatic system of facial expression recognition which is able to recognize all eight basic facial expressions (normal, happy, angry, contempt, surprise, sad, fear and disgust), whereas many FER systems were proposed for recognizing only some facial expressions. For validating the method, the Extended Cohn-Kanade (CK+) dataset is used. The presented method uses the Viola-Jones algorithm for face detection. Histogram of Oriented Gradients (HOG) is used as a descriptor for feature extraction from the images of expressive faces. Principal Component Analysis (PCA) is applied to reduce the dimensionality of the features and obtain the most significant ones. Finally, the presented method uses three different classifiers, Support Vector Machine (SVM), K-Nearest Neighbor (KNN) and Multilayer Perceptron Neural Network (MLPNN), to classify the facial expressions, and their results are compared. The experimental results show that the presented method achieves a recognition rate of 93.53% with the SVM classifier, 82.97% with the MLP classifier and 79.97% with the KNN classifier, which indicates that the presented method performs best with SVM as the classifier.

51 citations
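
The HOG feature-extraction step named in the abstract above can be illustrated with a minimal sketch. This is a simplified stand-in, not the authors' implementation: it assumes a grayscale image, unsigned orientations, hard bin assignment, and a single global L2 normalization instead of the overlapping block normalization of standard HOG. The function names `hog_cells` and `hog_descriptor` and the defaults (8-pixel cells, 9 bins) are illustrative assumptions.

```python
import numpy as np

def hog_cells(image, cell=8, bins=9):
    """Return per-cell orientation histograms, shape (rows, cols, bins)."""
    img = np.asarray(image, dtype=float)
    # np.gradient returns derivatives along axis 0 (rows, gy) then axis 1 (columns, gx).
    gy, gx = np.gradient(img)
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation folded into [0, 180) degrees.
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    # Hard assignment to one bin; the clamp guards against rounding up to exactly 180.
    bin_index = np.minimum((angle / (180.0 / bins)).astype(int), bins - 1)

    rows, cols = img.shape[0] // cell, img.shape[1] // cell
    hist = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * cell, (r + 1) * cell), slice(c * cell, (c + 1) * cell))
            # Each pixel votes for its orientation bin, weighted by gradient magnitude.
            hist[r, c] = np.bincount(bin_index[sl].ravel(),
                                     weights=magnitude[sl].ravel(),
                                     minlength=bins)
    return hist

def hog_descriptor(image, cell=8, bins=9, eps=1e-6):
    """Flatten the cell histograms and L2-normalize the whole vector (assumed scheme)."""
    v = hog_cells(image, cell, bins).ravel()
    return v / (np.linalg.norm(v) + eps)
```

For an image containing a single vertical edge, all gradient energy falls into the 0-degree bin, which is a quick sanity check on the binning.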


Journal ArticleDOI
TL;DR: A novel multi-level ship detection algorithm is proposed to detect various types of offshore ships more precisely and quickly under all possible imaging variations, which can produce more accurate candidate regions compared with the threshold segmentation.
Abstract: Automatic ship detection by Unmanned Airborne Vehicles (UAVs) and satellites is one of the fundamental challenges in maritime research due to the variable appearances of ships and complex sea backgrounds. To address this issue, in this paper, a novel multi-level ship detection algorithm is proposed to detect various types of offshore ships more precisely and quickly under all possible imaging variations. Our object detection system consists of two phases. First, in the category-independent region proposal phase, the steerable pyramid for multi-scale analysis is performed to generate a set of saliency maps in which the candidate region pixels are assigned to high salient values. Then, the set of saliency maps is used for constructing the graph-based segmentation, which can produce more accurate candidate regions compared with the threshold segmentation. More importantly, the proposed algorithm can produce a rather smaller set of candidates in comparison with the classical sliding window object detection paradigm or the other region proposal algorithms. Second, in the target identification phase, a rotation-invariant descriptor, which combines the histogram of oriented gradients (HOG) cells and the Fourier basis together, is investigated to distinguish between ships and non-ships. Meanwhile, the main direction of the ship can also be estimated in this phase. The overall algorithm can account for large variations in scale and rotation. Experiments on optical remote sensing (ORS) images demonstrate the effectiveness and robustness of our detection system.

34 citations
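
The abstract above pairs HOG cells with a Fourier basis to obtain rotation invariance. The underlying principle can be sketched as follows, under stated assumptions: a circular (0 to 360 degree) orientation histogram, and a rotation that is an exact multiple of the bin width, so that rotating the patch circularly shifts the histogram. The magnitudes of the discrete Fourier transform are unchanged by such a shift. This illustrates the idea only and is not the descriptor from the paper; `rotation_invariant_signature` is a hypothetical name.

```python
import numpy as np

def rotation_invariant_signature(orientation_hist):
    """Magnitudes of the DFT of a circular orientation histogram."""
    return np.abs(np.fft.fft(np.asarray(orientation_hist, dtype=float)))

# Hypothetical 8-bin histogram covering 0 to 360 degrees.
hist = np.array([5.0, 1.0, 0.0, 2.0, 7.0, 3.0, 0.0, 1.0])
# A rotation by three bin widths shifts the histogram circularly.
rotated = np.roll(hist, 3)

same = np.allclose(rotation_invariant_signature(hist),
                   rotation_invariant_signature(rotated))
```

The discarded DFT phase carries the shift, which is consistent with the abstract's remark that the main direction of the ship can also be estimated.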


Journal ArticleDOI
TL;DR: A new feature selection (FS) algorithm based on Late Hill Climbing and Memetic Algorithm is proposed, which reduces the feature dimension to a significant amount as well as increases the recognition accuracy as compared to other methods.
Abstract: Facial Emotion Recognition (FER) is an important research domain which allows us to provide a better interactive environment between humans and computers. Some standard and popular features extracted from facial expression images include Uniform Local Binary Pattern (uLBP), Horizontal-Vertical Neighborhood Local Binary Pattern (hvnLBP), Gabor filters, Histogram of Oriented Gradients (HOG) and Pyramidal HOG (PHOG). However, these feature vectors may contain some features that are irrelevant or redundant in nature, thereby increasing the overall computational time as well as recognition error of a classification system. To counter this problem, we have proposed a new feature selection (FS) algorithm based on Late Hill Climbing and Memetic Algorithm (MA). A novel local search technique called Late Acceptance Hill Climbing through Redundancy and Relevancy (LAHCRR) has been used in this regard. It combines the concepts of Local Hill-Climbing and minimal-Redundancy Maximal-Relevance (mRMR) to form a more effective local search mechanism in MA. The algorithm is then evaluated on the said feature vectors extracted from the facial images of two popular FER datasets, namely RaFD and JAFFE. LAHCRR is used as local search in MA to form Late Hill Climbing based Memetic Algorithm (LHCMA). LHCMA is compared with state-of-the-art methods. The experimental outcomes show that the proposed FS algorithm reduces the feature dimension to a significant amount as well as increases the recognition accuracy as compared to other methods.

34 citations


Journal ArticleDOI
TL;DR: A hybrid system for pedestrian detection, in which both thermal and visible images of the same scene are used, and tests are done to check the presence of pedestrians in the generated hypotheses.
Abstract: In this paper, we propose a hybrid system for pedestrian detection, in which both thermal and visible images of the same scene are used. The proposed method is achieved in two basic steps: (1) hypotheses generation (HG), where the locations of possible pedestrians in an image are determined, and (2) hypotheses verification (HV), where tests are done to check the presence of pedestrians in the generated hypotheses. The HG step segments the thermal image using a modified version of the Otsu thresholding technique. The segmentation results are mapped into the corresponding visible image to obtain the regions of interest (possible pedestrians). Post-processing is done on the resulting regions of interest to keep only significant ones. HV is performed using random forest as the classifier and a color-based histogram of oriented gradients (HOG) together with the histograms of oriented optical flow (HOOF) as features. The proposed approach has been tested on the OSU Color-Thermal, INO Video Analytics and LITIV data sets and the results justify its effectiveness.

33 citations
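
The hypothesis-generation step above relies on a modified Otsu threshold whose details the abstract does not give. As a point of reference, here is a minimal sketch of the standard, unmodified Otsu method, assuming an 8-bit image with integer intensities in [0, 255]; `otsu_threshold` is an illustrative name.

```python
import numpy as np

def otsu_threshold(image, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value > t form the foreground class. Assumes integer
    intensities in [0, levels - 1].
    """
    values = np.asarray(image, dtype=np.int64).ravel()
    hist = np.bincount(values, minlength=levels).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                    # probability of class 0 for each t
    mu = np.cumsum(prob * np.arange(levels))   # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0                  # thresholds that leave one class empty
    return int(np.argmax(sigma_b))
```

On a thermal frame, `image > otsu_threshold(image)` would give a candidate mask of warm regions, which the paper then maps into the visible image.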


Journal ArticleDOI
TL;DR: An efficient method is presented that combines two successful local feature descriptors, Pyramid Histogram of Oriented Gradients and Local Directional Patterns, to represent ear images and achieves promising recognition performance in comparison with other existing successful methods.
Abstract: Achieving higher recognition performance in uncontrolled scenarios is a key issue for ear biometric systems. It is difficult to generate all discriminative features by using a single feature extraction method. This paper presents an efficient method that combines two of the most successful local feature descriptors, Pyramid Histogram of Oriented Gradients (PHOG) and Local Directional Patterns (LDP), to represent ear images. The PHOG represents spatial shape information and the LDP efficiently encodes local texture information. As the feature sets suffer from the curse of dimensionality, we used principal component analysis (PCA) to reduce the dimension prior to normalization and fusion. Then, the two normalized heterogeneous feature sets are combined to produce a single feature vector. Finally, the Kernel Discriminant Analysis (KDA) method is employed to extract nonlinear discriminant features for efficient recognition using a nearest neighbor (NN) classifier. Experiments on three standard datasets, IIT Delhi versions I and II and University of Notre Dame collection E, reveal that the proposed method can achieve promising recognition performance in comparison with other existing successful methods.

31 citations
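
The reduce, normalize, then fuse sequence described in the abstract above can be sketched as follows. The abstract does not state which normalization is used, so z-score normalization is an assumption here, as are the function names and the choice of the same target dimension `k` for both feature sets. The inputs stand in for PHOG and LDP feature matrices with one row per image.

```python
import numpy as np

def pca_reduce(X, k):
    """Project centered rows of X onto the top-k principal directions (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:k].T

def zscore(X, eps=1e-12):
    """Per-feature z-score normalization (assumed; the paper does not specify)."""
    return (X - X.mean(axis=0)) / (X.std(axis=0) + eps)

def fuse(features_a, features_b, k):
    """Reduce each feature set with PCA, normalize, then concatenate per sample."""
    return np.hstack([zscore(pca_reduce(features_a, k)),
                      zscore(pca_reduce(features_b, k))])
```

Normalizing after PCA keeps either descriptor from dominating the fused vector purely because of its scale.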


Proceedings ArticleDOI
21 Mar 2019
TL;DR: A comparative study of three commonly used approaches for face detection, namely Haar-like cascade, Histogram of Oriented Gradients with Support Vector Machine and Local Binary Pattern cascade, shows that the HOG+SVM approach is more robust and accurate than the LBP and Haar approaches.
Abstract: Face detection is an essential part of any face recognition system as a first step to detect faces. This paper presents a comparative study of three commonly used approaches for face detection, namely Haar-like cascade, Histogram of Oriented Gradients with Support Vector Machine and Local Binary Pattern cascade. For this aim, video sequences from the Database for Emotion Analysis using Physiological Signals (DEAP) were explored. The proposed methods were developed using the Python language with the OpenCV and Dlib libraries. The obtained results show that the HOG+SVM approach is more robust and accurate than the LBP and Haar approaches, with an average detection rate of 92.68%.

Journal ArticleDOI
TL;DR: A novel multiclass vehicle detection system based on tensor decomposition and object proposal based on a state-of-the-art object-proposal method, local features, and image region similarity is presented.
Abstract: Night-time vehicle detection is essential in building intelligent transportation systems (ITS) for road safety. Most current night-time vehicle detection approaches focus on one or two classes of vehicles. In this paper, we present a novel multiclass vehicle detection system based on tensor decomposition and object proposal. Commonly used features such as histogram of oriented gradients and local binary pattern often produce useless image blocks (regions), which can result in unsatisfactory detection performance. Thus, we select blocks via feature ranking after tensor decomposition and only extract features from these selected blocks. To generate windows that contain all vehicles, we propose a novel object-proposal approach based on a state-of-the-art object-proposal method, local features, and image region similarity. The three terms are summed with learned weights to compute the reliability score of each proposal. A bio-inspired image enhancement method is used to enhance the brightness and contrast of input images. We have built a Hong Kong night-time multiclass vehicle dataset for evaluation. Our proposed vehicle detection approach can successfully detect four types of vehicles: 1) car; 2) taxi; 3) bus; and 4) minibus. Occluded vehicles and vehicles in the rain can also be detected. Our proposed method obtains a 95.82% detection rate at 0.05 false positives per image, and it outperforms several state-of-the-art night-time vehicle detection approaches.

Journal ArticleDOI
TL;DR: Experimental results indicate that the proposed local maxima of difference image (LMDI) based interest point detection technique provides better performance compared to earlier reported techniques.
Abstract: Human action recognition which needs video processing in real time, requires large memory size and execution time. This work proposes a local maxima of difference image (LMDI) based interest point detection technique, random projection tree with overlapping split and modified voting score for human action recognition. In LMDI based interest point detection method, difference images are obtained using consecutive frame differencing technique and next, 3D peak detection is applied on the bunch of calculated difference images. Histogram of oriented gradients and histogram of optical flow as local features are extracted by defining a block of size 16 × 16 around each of the interest point. These local features are then indexed by random projection trees. Overlapping split is used during tree structuring to reduce failure probability. Hough voting technique is applied on testing video to compute highest similarity matching score with individual training classes. In addition to Hough voting score, the number of matched interest points of a single query video with each training class, is considered for recognition. The proposed method is evaluated on segmented UT-interaction dataset, J-HMDB dataset and UCF101 dataset. The experimental results indicate that the proposed technique provides better performance compared to earlier reported techniques.
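
The LMDI interest-point step above (consecutive frame differencing followed by 3D peak detection) can be sketched as below. The abstract does not specify the neighborhood or threshold, so the 3x3x3 strict-maximum test and the `threshold` parameter are assumptions, and the function names are illustrative.

```python
import numpy as np

def difference_images(frames):
    """Absolute difference between consecutive frames; output shape (T-1, H, W)."""
    frames = np.asarray(frames, dtype=float)
    return np.abs(frames[1:] - frames[:-1])

def local_maxima_3d(volume, threshold=0.0):
    """Return (t, y, x) indices that are strict maxima of their 3x3x3 neighborhood."""
    # Pad with -inf so that border voxels are compared only with real neighbors.
    padded = np.pad(volume, 1, mode="constant", constant_values=-np.inf)
    is_peak = volume > threshold
    T, H, W = volume.shape
    for dt in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                if dt == 0 and dy == 0 and dx == 0:
                    continue
                neighbor = padded[1 + dt:1 + dt + T,
                                  1 + dy:1 + dy + H,
                                  1 + dx:1 + dx + W]
                is_peak &= volume > neighbor
    return np.argwhere(is_peak)
```

Each returned index would then serve as the center of the 16 × 16 block from which the HOG and optical-flow histograms are extracted.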

Journal ArticleDOI
TL;DR: A framework based on hybrid feature set and hierarchical classification approach to segment blood vessels from digital retinal images that can achieve better results than most state-of-the-art methods is proposed.
Abstract: Retinal blood vessels play an imperative role in detection of many ailments, such as cardiovascular diseases, hypertension, and diabetic retinopathy. The automated way of segmenting vessels from retinal images can help in early detection of many diseases. In this paper, we propose a framework based on hybrid feature set and hierarchical classification approach to segment blood vessels from digital retinal images. Firstly, we apply bidirectional histogram equalization on the inverted green channel to enhance the fundus image. Six discriminative feature extraction methods have been employed, comprising local intensities, local binary patterns, histogram of gradients, divergence of vector field, high-order local autocorrelations, and morphological transformation. The selection of feature sets has been carried out by classifying vessel and background pixels using random forests and evaluating the segmentation performance for each category of features. The selected feature sets are then used in conjunction with our proposed hierarchical classification approach to segment the vessels. The proposed framework has been tested on DRIVE, STARE, and CHASEDB1, which are the benchmark datasets for retinal vessel segmentation methods. The results obtained from the experimental analysis show that the proposed framework can achieve better results than most state-of-the-art methods.

Journal ArticleDOI
TL;DR: An unsupervised classification method is used to detect and differentiate environmental classes (scene interpretation) in the target or investigated area by using the high-resolution images acquired through autonomous drone navigation aided with landmark detection and recognition.
Abstract: A method is presented for scene detection and estimation using high-resolution imagery acquired through autonomous drone navigation aided with landmark detection and recognition. The proposed system comprises a drone platform that facilitates efficient autonomous flight; it can capture images and provide real-time video streaming of the ground cover using a camera equipped with a 14-megapixel CMOS sensor and a fish-eye lens. In addition, landmark detection and recognition was performed by applying the histogram of oriented gradients and linear support vector machine methods on each frame of the video stream. The high spatial resolution of the acquired drone images makes the detection and interpretation of environments less complicated. First, through image processing, orthomosaic images and 3-D environment reconstruction (point clouds) of the scene are generated from a set of drone images by using an automatic photogrammetric technique called “structure from motion.” Subsequently, an unsupervised classification method is used to detect and differentiate environmental classes (scene interpretation) in the target or investigated area by using the high-resolution images. Finally, the results of the proposed method are evaluated by comparing them against ground-truth points.

Journal ArticleDOI
Zhitao Fu, Qianqing Qin, Bin Luo, Chun Wu, Hong Sun
TL;DR: The experimental results confirm that the proposed HoDM descriptor is robust to the nonlinear intensity changes of multispectral images and has a superior matching performance as well as a much higher computational efficiency.
Abstract: Due to the significant nonlinear intensity changes of multispectral images, automatic image feature point matching is a challenging task. This letter addresses the problem and proposes a novel descriptor combining the structure and texture information to solve the nonlinear intensity variations of multispectral images. We first propose directional maps, i.e., the directional response maps (DMs) and the directional response binary maps (DBMs), which can capture the common structure and texture properties of multispectral images, respectively. We then use the spatial pooling pattern of the histogram of oriented gradients to separately describe the local region of each point of interest based on the DMs and DBMs. In order to speed up the calculation, we apply Gaussian filters to the DMs and average filters to the DBMs to construct the per-pixel histogram bins. Finally, we conjoin the normalized feature vectors corresponding to the structure description and texture description of each point of interest to obtain the histograms of directional maps (HoDMs). The proposed HoDM descriptor was evaluated using three data sets composed of images obtained in both visible light and infrared spectra. The experimental results confirm that the proposed HoDM descriptor is robust to the nonlinear intensity changes of multispectral images and has a superior matching performance as well as a much higher computational efficiency.

Journal ArticleDOI
TL;DR: This paper investigates various feature sets based on the fusion of acoustic and visual feature aggregation for acoustic scene classification based on spectral centroid, spectral entropy, spectral flux, spectral roll-off, short-time energy, zero-crossing rate and Mel-frequency Cepstral coefficients.
Abstract: Acoustic scene classification has gained great interest in recent years due to its diverse applications. Various acoustic and visual features have been proposed and evaluated. However, few studies have investigated acoustic and visual feature aggregation for acoustic scene classification. In this paper, we investigated various feature sets based on the fusion of acoustic and visual features. Specifically, acoustic features are directly extracted from the waveform: spectral centroid, spectral entropy, spectral flux, spectral roll-off, short-time energy, zero-crossing rate, and Mel-frequency cepstral coefficients. For visual features, we calculate local binary pattern, histogram of gradients, and moments based on the audio scene time-frequency representation. Then, three feature selection algorithms are applied to various feature sets to reduce feature dimensionality: correlation-based feature selection, principal component analysis, and ReliefF. Experimental results show that our proposed system was able to achieve an accuracy improvement of 15.43% compared to the baseline system with the development set. When all development sets are used for training, the performance based on the evaluation set provided by the TUT Acoustic scene 2016 challenge is 87.44%, which is the fourth best among all non-neural network systems.

Journal ArticleDOI
TL;DR: In the proposed system, spatially enhanced local binary pattern (SLBP) and histogram of oriented gradients (HOG) features are extracted to classify human gender with an SVM classifier; the combination of the two local descriptors provides a good representation of the face image, which the SVM classifier labels as male or female.
Abstract: Gender classification from facial images plays a significant role in biometric technology, viz. gender medicine, surveillance, electronic banking systems and human-computer interaction. However, it has many challenges due to variations of pose, expression, aging, race, make-up, occlusion and illumination. In the proposed system, spatially enhanced local binary pattern (SLBP) and histogram of oriented gradients (HOG) features are extracted to classify human gender with an SVM classifier. This hybrid feature selection has increased the power of the proposed system due to its representation of texture micro-patterns and local shape by capturing the edge or gradient structure from the image. The gender classification accuracy is studied by using the local feature representations of the face images separately, and these features are also concatenated to provide a better recognition rate. The combination of two different local descriptors provides a good representation of the face image, which is given to an SVM classifier that classifies it as male or female. Also, the proposed work is compared with two other traditional classifiers, k-nearest neighbor and the sparse representation classifier. The performance was evaluated on the FERET and LFW databases. The highest classification accuracy of 99.1% is achieved on the FERET database and 95.7% on the LFW database by applying a cubic SVM with fusion of SLBP and HOG features.


Journal ArticleDOI
TL;DR: In this article, the Histogram of Oriented Gradients (HOG) is used to capture the edge patterns of arcs in strong lensing systems; on observed data, one best-case parameterization yields an AUC of only 0.6 in the F814 filter image.
Abstract: Forthcoming surveys such as the Large Synoptic Survey Telescope (LSST) and Euclid necessitate automatic and efficient identification methods of strong lensing systems. We present a strong lensing identification approach that utilizes a feature extraction method from computer vision, the Histogram of Oriented Gradients (HOG), to capture edge patterns of arcs. We train a supervised classifier model on the HOG of mock strong galaxy-galaxy lens images similar to observations from the Hubble Space Telescope (HST) and LSST. We assess model performance with the area under the curve (AUC) of a Receiver Operating Characteristic (ROC) curve. Models trained on 10,000 lens and non-lens containing images exhibit an AUC of 0.975 for an HST-like sample, 0.625 for one exposure of LSST, and 0.809 for 10-year mock LSST observations. Performance appears to continually improve with the training set size. Models trained on fewer images perform better in the absence of the lens galaxy light. However, with larger training data sets, information from the lens galaxy actually improves model performance, indicating that HOG captures much of the morphological complexity of the arc finding problem. We test our classifier on data from the Sloan Lens ACS Survey and find that small scale image features reduce the efficiency of our trained model. However, these preliminary tests indicate that some parameterizations of HOG can compensate for differences between observed and mock data. One example best-case parameterization results in an AUC of 0.6 in the F814 filter image, with other parameterization results equivalent to random performance.
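
The AUC figures quoted above can be read through the rank interpretation of the ROC curve: the AUC equals the probability that a randomly chosen positive (lens) example receives a higher score than a randomly chosen negative one. A minimal pure-Python sketch of that definition follows; `roc_auc` is an illustrative name, labels are assumed to be 0 or 1, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))
```

Under this definition 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why the 0.625 single-exposure result is close to chance while 0.975 is not.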

Journal ArticleDOI
Han Liu, Li Zhang
TL;DR: The proposed ensemble learning framework is used effectively to build an ensemble of ensembles acting as a group of expert systems, which show the capability to achieve more stable performance of pattern recognition, in comparison with building a single classifier that acts as a single expert system.
Abstract: Classification is a special type of machine learning task, which is essentially achieved by training a classifier that can be used to classify new instances. In order to train a high-performance classifier, it is crucial to extract representative features from raw data, such as text and images. In reality, instances can be highly diverse even if they belong to the same class, which indicates that different instances of the same class can exhibit very different characteristics. For example, in a facial expression recognition task, some instances may be better described by Histogram of Oriented Gradients features, while others may be better represented by Local Binary Patterns features. From this point of view, it is necessary to adopt ensemble learning to train different classifiers on different feature sets and to fuse these classifiers towards more accurate classification of each instance. On the other hand, different algorithms are likely to show different suitability for training classifiers on different feature sets. This again shows the necessity of adopting ensemble learning to advance classification performance. Furthermore, a multi-class classification task becomes increasingly complex as the number of classes grows, i.e., it becomes more difficult to discriminate between the classes. In this paper, we propose an ensemble learning framework that involves transforming a multi-class classification task into a number of binary classification tasks and fusing classifiers trained on different feature sets by using different learning algorithms. We report experimental studies on a UCI data set on Sonar and the CK+ data set on facial expression recognition. The results show that our proposed ensemble learning approach leads to considerable advances in classification performance, in comparison with popular learning approaches including decision tree ensembles and deep neural networks. In practice, the proposed approach can be used effectively to build an ensemble of ensembles acting as a group of expert systems, which show the capability to achieve more stable performance of pattern recognition, in comparison with building a single classifier that acts as a single expert system.
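
The fusion idea in this abstract can be illustrated with a minimal sketch. It assumes each feature set (e.g., HOG or LBP) already has one binary "is this class?" classifier per class, and that their scores are fused by a plain average. The abstract does not state the fusion rule, so the averaging, the function name `fuse_one_vs_rest`, and the example scores are all illustrative assumptions, not the authors' method.

```python
from statistics import mean

def fuse_one_vs_rest(scores_per_model):
    """Fuse one-vs-rest scores from several models by simple averaging.

    scores_per_model: list of dicts, one per feature set / learner, each
    mapping a class label to that model's binary score in [0, 1] for
    "the instance belongs to this class".
    Averaging is an assumed fusion rule chosen for illustration only.
    """
    labels = list(scores_per_model[0].keys())
    fused = {label: mean(m[label] for m in scores_per_model) for label in labels}
    # The predicted class is the one whose fused binary score is highest.
    best = max(fused, key=fused.get)
    return best, fused

# Hypothetical scores from a HOG-based and an LBP-based set of binary classifiers.
hog_scores = {"happy": 0.6, "sad": 0.3, "surprise": 0.4}
lbp_scores = {"happy": 0.2, "sad": 0.7, "surprise": 0.3}
best, fused = fuse_one_vs_rest([hog_scores, lbp_scores])
```

Here the HOG-based model alone would pick "happy", while the fused scores favor "sad", which shows how a second feature set can overturn a single model's decision.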

Journal ArticleDOI
TL;DR: Experiments conducted on three public kinship databases show that the proposed descriptor can outperform many state-of-the-art kinship verification algorithms and descriptors including those that are based on deep Convolutional Neural Nets.
Abstract: Texture descriptors such as Local Binary Pattern (LBP), Local Phase Quantization (LPQ), and Histogram of Oriented Gradients (HOG) have been widely used for face image analysis. This work introduces a novel framework for image-based kinship verification able to efficiently combine local and global facial information extracted from diverse descriptors. The proposed scheme relies on two main points: (1) we model the face images using a Pyramid Multi-level (PML) representation where local descriptors are extracted from several blocks at different resolution scales; (2) we compute the covariance (second-order statistics) between diverse local features characterizing each individual block in the PML representation. This gives rise to a face descriptor with two interesting properties: (i) thanks to the PML representation, scales and face parts are explicitly encoded in the final descriptor without having to detect the facial landmarks; (ii) the covariance descriptor encodes spatial features of any type allowing the integration of several state-of-the-art texture and color features. Experiments conducted on three public kinship databases show that the proposed descriptor can outperform many state-of-the-art kinship verification algorithms and descriptors including those that are based on deep Convolutional Neural Nets.
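
As a rough illustration of point (2), the covariance between local features inside one block can be sketched as follows. The sketch assumes each pixel (or sample) in a block has already been mapped to a small feature vector; the function name `block_covariance` and the use of the unbiased (n - 1) estimator are assumptions, since the abstract does not give these details.

```python
def block_covariance(features):
    """Sample covariance matrix of the feature vectors inside one block.

    features: list of equal-length lists, one feature vector per pixel/sample.
    Returns a d x d matrix (list of lists) using the unbiased n - 1 divisor.
    """
    n = len(features)
    d = len(features[0])
    if n < 2:
        raise ValueError("need at least two samples to estimate a covariance")
    # Per-dimension mean over all samples in the block.
    means = [sum(f[j] for f in features) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for f in features:
        for i in range(d):
            for j in range(d):
                cov[i][j] += (f[i] - means[i]) * (f[j] - means[j])
    return [[c / (n - 1) for c in row] for row in cov]

# Three samples with two perfectly correlated features.
cov = block_covariance([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
```

The resulting matrix has a fixed size d x d regardless of how many pixels the block contains, which is what lets blocks at different pyramid levels be described uniformly.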

Proceedings ArticleDOI
15 Apr 2019
TL;DR: This paper proposes a warning notification diffusion solution related to real-time pedestrian presence detection, through an inter-vehicle communication system, using a Histogram of Oriented Gradients descriptor with a linear Support Vector Machine classifier for pedestrian detection, and a Haar feature-based cascade classifier for vehicle detection.
Abstract: The ability to perceive and understand the behaviors of surrounding road users is crucial for self-driving vehicles to correctly plan reliable reactions. Computer vision, which relies mostly on machine learning techniques, enables autonomous vehicles to perform several required tasks such as pedestrian detection. Furthermore, within a fully autonomous driving environment, a driverless vehicle has to communicate and share perceived data with its neighboring vehicles for safer navigation. In this context, our paper proposes a warning notification diffusion solution related to real-time pedestrian presence detection, through an inter-vehicle communication system. To achieve this purpose, pedestrian and vehicle recognition is required. Thus, we implemented the required detectors. We used a Histogram of Oriented Gradients (HOG) descriptor with a linear Support Vector Machine (SVM) classifier for the pedestrian detector, and a Haar feature-based cascade classifier for vehicle detection. The performance evaluation of our solution shows fairly good detection accuracy: around 90% for pedestrians and 88% for vehicles.
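
For readers unfamiliar with the HOG descriptor mentioned here, the core step (an orientation histogram over one cell) can be sketched in plain Python. This is a simplified illustration, not the detector used in the paper: it assumes central-difference gradients, unsigned orientations in [0, 180) degrees, 9 bins, and hard binning without the interpolation or block normalization that full HOG implementations apply. The name `cell_histogram` is hypothetical.

```python
import math

def cell_histogram(cell, n_bins=9):
    """Gradient-orientation histogram for one cell (simplified HOG step).

    cell: 2D list of grayscale intensities.
    Border pixels are skipped so central differences are always defined.
    Each interior pixel votes with its gradient magnitude into the bin
    that contains its unsigned orientation (hard binning, an assumption).
    """
    h, w = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            # Fold the signed angle into [0, 180) to get an unsigned orientation.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The extra modulo guards against rounding that lands exactly on 180.
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist

# A horizontal intensity ramp: every gradient points along the x axis.
ramp = [[x for x in range(4)] for _ in range(4)]
hist = cell_histogram(ramp)
```

In a full pipeline, histograms like this are concatenated over many cells, normalized over blocks, and the resulting vector is passed to a linear SVM.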

Journal ArticleDOI
TL;DR: The experimental results show that the pointer extraction method is robust to interferences caused by connected components of digits and also that the established character segmentation classifier has a more accurate detection result.
Abstract: Computer vision based detection approaches are widely employed to detect or calibrate different types of meters nowadays. However, traditional detection algorithms suffer drawbacks in accuracy and adaptability when detecting various types of automobile dashboards. Many parameters of these algorithms need to be tuned to suit certain types of dashboards. Besides, these algorithms cannot automatically read the speed value, which requires manual setting operations. In this paper, a novel approach is presented to adaptively detect different types of automobile dashboards. A contour analysis based method is first implemented to extract the connected component of the pointer. A robust character segmentation classifier, which is designed by cascading a histogram of oriented gradients (HOG)/support vector machine (SVM) binary classifier, a character filter, and a HOG/multiclass SVM digit classifier, is then proposed to recognize digit characters on the dashboard. Tick marks are then extracted based on the recognition results. Finally, a Newton interpolation linear relationship is established to diagnose the potential responding errors of the pointer. The experimental results show that the pointer extraction method is robust to interferences caused by connected components of digits and also that the established character segmentation classifier has a more accurate detection result. Furthermore, compared with similar algorithms, it has a significant advantage in detecting a vast majority of different dashboards without manual tuning of the parameters.
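
The Newton interpolation step can be sketched as below. The sketch assumes the recognized tick marks give pairs of (pointer angle, speed value) and that the reading for an arbitrary pointer angle is obtained from the Newton divided-difference polynomial through those pairs. The function names and the example tick values are hypothetical; the abstract does not specify how the nodes are chosen.

```python
def newton_coefficients(xs, ys):
    """Divided-difference coefficients of the Newton interpolating polynomial.

    xs: distinct node positions (e.g., tick-mark angles in degrees).
    ys: values at those nodes (e.g., speed labels read from the digits).
    """
    coef = list(ys)
    n = len(xs)
    for order in range(1, n):
        # Update in place from the end so lower-order entries stay available.
        for i in range(n - 1, order - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - order])
    return coef

def newton_eval(xs, coef, x):
    """Evaluate the Newton-form polynomial at x using nested multiplication."""
    result = coef[-1]
    for i in range(len(coef) - 2, -1, -1):
        result = result * (x - xs[i]) + coef[i]
    return result

# Hypothetical dashboard: ticks at 0, 45 and 90 degrees labeled 0, 40 and 80.
angles = [0.0, 45.0, 90.0]
speeds = [0.0, 40.0, 80.0]
coef = newton_coefficients(angles, speeds)
reading = newton_eval(angles, coef, 22.5)
```

When the tick values are evenly spaced, as in this example, the higher-order coefficients vanish and the mapping reduces to a linear relationship between pointer angle and reading.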

Journal ArticleDOI
TL;DR: A novel shape characterization and representation scheme is presented by blending phase congruency (PC) with histogram of oriented gradients (HOG), labelled as PC-HOG, whose inherent nature makes it invariant to different affine transformations.
Abstract: Shape matching and retrieval is a challenging issue in computer vision owing to the complications in realizing highly accurate descriptors. Herein, a novel shape characterization and representation scheme is presented by blending phase congruency (PC) with histogram of oriented gradients (HOG), labelled as PC-HOG. Firstly, PC is applied to the shapes to obtain contour points, which are then processed by HOG to formulate the feature vector. The resulting descriptor is evaluated on shape datasets such as MPEG-7 CE shape-1 part B, TARI-1000 and Kimia’s 99. A relatively consistent Bull’s Eye Retrieval rate of 90% was achieved by the proposed descriptor across the diverse datasets. Also, noise analysis of the proposed descriptor on diverse datasets is performed to demonstrate the scheme’s robustness against noise. Furthermore, the inherent nature of PC-HOG makes it invariant to different affine transformations.

Journal ArticleDOI
TL;DR: The experimental outcome shows that the proposed methodology improved accuracy in breast cancer classification by 3% to 9% compared to other existing methods.
Abstract: Breast cancer detection is one of the most challenging aspects in the field of health monitoring systems. In this paper, breast cancer detection was assessed by employing the Mammographic Image Analysis Society (MIAS) dataset. The proposed approach contains four major steps, namely, image preprocessing, segmentation, feature extraction, and classification. Initially, Laplacian filtering, which is very sensitive to noise, was utilized to identify the edge regions in mammogram images. Then, segmentation was carried out using modified Adaptively Regularized Kernel‐based Fuzzy‐C‐Means (ARKFCM), a flexible high-level machine learning technique for localizing objects in complex templates. With conventional ARKFCM, it was hard to segment the ill‐defined masses in mammogram images. To address this concern, the Euclidean distance in ARKFCM was replaced by a correlation function in order to improve the segmentation efficiency. Hybrid feature extraction (Histogram of Oriented Gradients (HOG), homogeneity, and energy) was performed on the segmented cancer region to extract feature subsets. The respective feature values were given as the input to a multi‐objective classifier, a Deep Neural Network (DNN), for classifying the normal and abnormal regions in mammogram images. The experimental outcome shows that the proposed methodology improved accuracy in breast cancer classification by 3% to 9% compared to other existing methods.

Journal ArticleDOI
TL;DR: In this paper, a new method is introduced for facial expression recognition using the FER2013 database, which comprises seven classes (Surprise, Fear, Angry, Neutral, Sad, Disgust, Happy). Over the past few decades, methods for recognizing facial expressions have been an active research area, and many applications have been developed for feature extraction and inference.
Abstract: Objectives: A new method is introduced in this study for facial expression recognition using the FER2013 database, which comprises seven classes (Surprise, Fear, Angry, Neutral, Sad, Disgust, Happy). Over the past few decades, methods for recognizing facial expressions have been an active research area, and many applications have been developed for feature extraction and inference. However, the task is still challenging due to the high intra-class variation. Methods/Statistical Analysis: We analyzed in depth the accuracy of both handcrafted features, such as HOG, and learned features. This study proposed two models: (1) FER using a Deep Convolutional Neural Network (FER-CNN) and (2) a Histogram of Oriented Gradients based Deep Convolutional Neural Network (FER-HOGCNN). The training and testing accuracies of the FER-CNN model were 98% and 72%, with losses of 0.02 and 2.02, respectively. The training and testing accuracies of the FER-HOGCNN model were 97% and 70%, with losses of 0.04 and 2.04, respectively. Findings: The accuracy of the FER-HOGCNN model is good overall but not better than that of the simple FER-CNN. The images in the dataset are of low quality and small dimensions; for that reason, HOG loses some important features during training and testing. Application/Improvements: The study helps to improve FER systems in image processing. In future, this work will be extended to extract the important features from images by combining the LBP and HOG operators with deep learning models. Keywords: Deep Learning, Emotion Recognition, Facial Expression, CNN, FER, HOG

Journal ArticleDOI
TL;DR: The findings of this study indicate that automated density scoring in mammograms can aid clinical diagnosis by introducing artificial intelligence-powered decision-support systems and contribute to the ‘democratization’ of healthcare by overcoming limitations, such as the geographic location of patients or the lack of expert radiologists.
Abstract: Potentially suspicious breast neoplasms could be masked by high tissue density, thus increasing the probability of a false‑negative diagnosis. Furthermore, differentiating breast tissue type enables patient pre‑screening stratification and risk assessment. In this study, we propose and evaluate advanced machine learning methodologies aiming at an objective and reliable method for breast density scoring from routine mammographic images. The proposed image analysis pipeline incorporates texture [Gabor filters and local binary pattern (LBP)] and gradient‑based features [histogram of oriented gradients (HOG) as well as speeded‑up robust features (SURF)]. Additionally, transfer learning approaches with ImageNet trained weights were also used for comparison, as well as a convolutional neural network (CNN). The proposed CNN model was fully trained on two open mammography datasets and was found to be the optimal performing methodology (AUC up to 87.3%). Thus, the findings of this study indicate that automated density scoring in mammograms can aid clinical diagnosis by introducing artificial intelligence‑powered decision‑support systems and contribute to the 'democratization' of healthcare by overcoming limitations, such as the geographic location of patients or the lack of expert radiologists.

Proceedings ArticleDOI
04 Sep 2019
TL;DR: An intelligent vision system embedded on a smartphone and deployed in the wild detects and recognizes British Sign Language (BSL) signs automatically, showing an accuracy of over 99% with an average processing time of 170 ms, and is thus appropriate for real-time visual signing.
Abstract: Developing assistive, cost-effective, non-invasive technologies to aid communication of people with hearing impairments is of prime importance in our society, in order to widen accessibility and inclusiveness. For this purpose, we have developed an intelligent vision system embedded on a smartphone and deployed in the wild. In particular, it integrates both computer vision methods involving Histogram of Oriented Gradients (HOG) and machine learning techniques such as a multi-class Support Vector Machine (SVM) to detect and recognize British Sign Language (BSL) signs automatically. Our system was successfully tested on a real-world dataset containing 13,066 samples and showed an accuracy of over 99% with an average processing time of 170 ms, making it appropriate for real-time visual signing.

Proceedings ArticleDOI
01 Oct 2019
TL;DR: A set of algorithms is proposed to automatically detect floodwater that may be present in an image captured by mobile phones or other types of optical cameras; for flood area segmentation, superpixel-based methods and a Fully Convolutional Neural Network are investigated.
Abstract: Detecting roadway segments inundated due to floodwater has important applications for vehicle routing and traffic management decisions. This paper proposes a set of algorithms to automatically detect floodwater that may be present in an image captured by mobile phones or other types of optical cameras. For this purpose, image classification and flood area segmentation methods are developed. For the classification task, we used Local Binary Patterns (LBP), Histogram of Oriented Gradients (HOG) and pre-trained deep neural network (VGG-16) as feature extractors and trained logistic regression, k-nearest neighbors, and decision tree classifiers on the extracted features. Pre-trained VGG-16 network with logistic regression classifier outperformed all other methods. For the flood area segmentation task, we investigated superpixel based methods and Fully Convolutional Neural Network (FCN). Similar to the classification task, we trained logistic regression and k-nearest neighbors classifiers on the superpixel areas and compared that with an end-to-end trained FCN. Conditional Random Fields (CRF) method was applied after both segmentation methods to post-process coarse segmentation results. FCN offered the highest scores in all metrics; it was followed by superpixel-based logistic regression and then superpixel-based KNN.