Bio: Fatemeh Zamani is an academic researcher from Sharif University of Technology. The author has contributed to research on the topics of multiple kernel learning and contextual image classification, has an h-index of 2, and has co-authored 3 publications receiving 15 citations.
TL;DR: A method for CBIR that combines texture, edge map, and color: an adaptive edge detector produces a binary edge image, and color statistics in two different color spaces provide complementary information for retrieving images.
Abstract: In this paper we propose a method for CBIR based on the combination of texture, edge map, and color. Since the texture of edges yields important information about an image, we utilize an adaptive edge detector that produces a binary edge image. In addition, color statistics in two different color spaces provide complementary information for retrieval. Our method is time-efficient because texture calculations are applied to the binary edge image. Experimental results on the SIMPLIcity database show both higher accuracy and lower time complexity than similar related works.
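The pipeline described above can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: the adaptive rule (mean plus one standard deviation of gradient magnitude), the second color space (a simple opponent space), and the use of edge density as the texture cue are all assumptions made for the sketch.

```python
import numpy as np

def adaptive_edge_map(gray):
    """Binary edge image from gradient magnitude with an image-dependent
    threshold (mean + std); the exact adaptive rule is an assumption."""
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    thresh = mag.mean() + mag.std()
    return (mag > thresh).astype(np.uint8)

def color_stats(img):
    """Per-channel mean and std in two color spaces (RGB and a simple
    opponent space), giving complementary color information."""
    rgb = img.astype(float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    opp = np.stack([r - g, r + g - 2 * b, r + g + b], axis=-1)
    feats = []
    for space in (rgb, opp):
        feats += [space.mean(axis=(0, 1)), space.std(axis=(0, 1))]
    return np.concatenate(feats)

def describe(img):
    """Concatenate an edge-texture cue (computed cheaply on the binary
    edge image) with the color statistics."""
    gray = img.mean(axis=-1)
    edge_density = np.array([adaptive_edge_map(gray).mean()])
    return np.concatenate([edge_density, color_stats(img)])

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3))
vec = describe(img)
print(vec.shape)  # 1 edge feature + 12 color stats -> (13,)
```

Retrieval would then rank database images by distance between such descriptor vectors.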
TL;DR: A feature fusion based multiple kernel learning (MKL) model for image classification that addresses the first challenge by using multiple kernels extracted from multiple features, and the second by assigning localized weights to each kernel.
Abstract: Real-world image classification, which aims to determine the semantic class of unlabeled images, is a challenging task. In this paper, we focus on two challenges of image classification and propose a method that addresses both simultaneously. The first challenge is that representing images by heterogeneous features, such as color, shape and texture, helps to provide better classification accuracy. The second challenge comes from dissimilarities in the visual appearance of images from the same class (intra-class variance) and similarities between images from different classes (inter-class relationship). Beyond these two challenges, the feature space of real-world images is highly complex, so the images cannot be classified linearly; the kernel trick is effective for classifying them. This paper proposes a feature fusion based multiple kernel learning (MKL) model for image classification. By using multiple kernels extracted from multiple features, we address the first challenge. To address the second, we use the idea of localized MKL by assigning separate local weights to each kernel. We employ the spatial pyramid match (SPM) representation of images and compute kernel weights based on the χ² kernel. Experimental results demonstrate that the proposed model achieves promising results.
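The core idea of fusing heterogeneous features through kernels can be sketched as below: one χ² kernel per feature type, combined by weights. This is a simplified sketch, not the paper's model; in particular, it uses a single scalar weight per kernel, whereas the localized-MKL idea assigns sample-specific local weights, and the γ value and feature dimensions are arbitrary.

```python
import numpy as np

def chi2_kernel(X, Y, gamma=1.0):
    """Exponential chi-squared kernel between nonnegative histogram rows:
    k(x, y) = exp(-gamma * sum_i (x_i - y_i)^2 / (x_i + y_i))."""
    d = X[:, None, :] - Y[None, :, :]
    s = X[:, None, :] + Y[None, :, :] + 1e-12  # avoid division by zero
    return np.exp(-gamma * (d ** 2 / s).sum(axis=-1))

def combined_kernel(feature_sets, weights):
    """Weighted sum of one chi-squared kernel per feature type (e.g. color,
    shape, texture histograms). Localized MKL would replace each scalar
    weight with per-sample local weights; scalars are the simplification."""
    K = sum(w * chi2_kernel(F, F) for w, F in zip(weights, feature_sets))
    return K / sum(weights)

rng = np.random.default_rng(1)
color = rng.random((5, 16))    # toy color histograms
texture = rng.random((5, 32))  # toy texture histograms
K = combined_kernel([color, texture], weights=[0.7, 0.3])
print(K.shape)  # (5, 5)
```

The combined Gram matrix K would then be handed to a kernel classifier such as an SVM.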
TL;DR: An MKL-SRC with non-fixed kernel weights for dictionary atoms is proposed; the resulting optimization problem is proved to be convex and solvable via standard algorithms.
Abstract: A kernel-based sparse representation classifier (KSRC) can classify images with acceptable performance. Multiple kernel learning based SRC (MKL-SRC) goes further and computes a weighted sum of multiple kernels to construct a unified kernel, where the weight of each kernel is calculated as a fixed value in the training phase. In this paper, an MKL-SRC with non-fixed kernel weights for dictionary atoms is proposed. Kernel weights are embedded as new variables in the main KSRC objective function, and the resulting optimization problem is solved to find the sparse coefficients and kernel weights simultaneously. As a result, an atom-specific multiple kernel dictionary is computed in the training phase, which SRC then uses to classify test images. It is also proved that the resulting optimization problem is convex and solvable via standard algorithms. Experimental results demonstrate the effectiveness of the proposed approach.
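The SRC decision rule in kernel space, with a weighted sum of base kernels, can be sketched as follows. This is a simplified stand-in, not the proposed method: the kernel weights here are fixed scalars (the paper learns per-atom weights jointly with the codes), and a ridge solve replaces the ℓ1-sparse coding step for brevity. The residual formula expands ‖φ(y) − Φ(D_c)x_c‖² = k(y,y) − 2k(y,D_c)x_c + x_cᵀK_c x_c.

```python
import numpy as np

def rbf(X, Y, gamma=0.5):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def multi_kernel(X, Y, weights, gammas):
    """Weighted sum of base RBF kernels (fixed scalar weights: a
    simplification of the paper's learned, atom-specific weights)."""
    return sum(w * rbf(X, Y, g) for w, g in zip(weights, gammas))

def src_classify(y, atoms, labels, weights, gammas, lam=1e-3):
    """SRC-style decision in kernel space: code y over each class's atoms
    (ridge solve as a stand-in for the sparse code) and pick the class
    with the smallest feature-space reconstruction residual."""
    best, best_r = None, np.inf
    for c in np.unique(labels):
        Dc = atoms[labels == c]
        Kc = multi_kernel(Dc, Dc, weights, gammas)
        kyc = multi_kernel(y[None, :], Dc, weights, gammas)[0]
        x = np.linalg.solve(Kc + lam * np.eye(len(Dc)), kyc)
        kyy = multi_kernel(y[None, :], y[None, :], weights, gammas)[0, 0]
        r = kyy - 2 * kyc @ x + x @ Kc @ x  # ||phi(y) - Phi(Dc) x||^2
        if r < best_r:
            best, best_r = c, r
    return best

rng = np.random.default_rng(2)
atoms = np.vstack([rng.normal(0, 1, (10, 4)), rng.normal(3, 1, (10, 4))])
labels = np.array([0] * 10 + [1] * 10)
pred = src_classify(atoms[0] + 0.1, atoms, labels, [0.6, 0.4], [0.1, 1.0])
print(pred)
```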
TL;DR: In this article, a joint projection and low-rank dictionary learning method using dual graph constraints (JP-LRDL) is proposed to learn, from data with large intra-class variability, features on top of which dictionaries can be better learned.
Abstract: For an object classification system, the most critical obstacles to real-world application are often caused by large intra-class variability, arising from different lightings, occlusion and corruption, in limited sample sets. Most methods in the literature fail when the training samples are heavily occluded, corrupted, or subject to significant illumination or viewpoint variations. Moreover, most existing methods, and especially deep learning-based methods, need large training sets to achieve satisfactory recognition performance. Although pre-training a network on a generic large-scale dataset and fine-tuning it on the small target dataset is a widely used technique, this does not help when the contents of the base and target datasets are very different. To address these issues, we propose a joint projection and low-rank dictionary learning method using dual graph constraints (JP-LRDL). The proposed joint learning method enables us to learn, from data with large intra-class variability, the features on top of which dictionaries can be better learned. Specifically, a structured class-specific dictionary is learned, and the discrimination is further improved by imposing a graph constraint on the coding coefficients that maximizes intra-class compactness and inter-class separability. We also enforce low-rank and structural-incoherence constraints on the sub-dictionaries to make them more compact and robust to variations and outliers, and to reduce the redundancy among them. To simultaneously preserve the intrinsic structure of the data and penalize unfavourable relationships among training samples, we introduce a projection graph into the framework, which significantly enhances the discriminative ability of the projection matrix and makes the method robust to small-sized and high-dimensional datasets.
TL;DR: In this study, eight crop types were identified using gamma naught values and polarimetric parameters calculated from TerraSAR-X (or TanDEM-X) dual-polarimetric (HH/VV) data, and the classification accuracy of four widely used machine-learning algorithms was evaluated.
Abstract: Cropland maps are useful for the management of agricultural fields and the estimation of harvest yield. Some local governments have documented field properties, including crop type and location, based on site investigations. This process, which is generally done manually, is labor-intensive, and remote-sensing techniques can be used as alternatives. In this study, eight crop types (beans, beetroot, grass, maize, potatoes, squash, winter wheat, and yams) were identified using gamma naught values and polarimetric parameters calculated from TerraSAR-X (or TanDEM-X) dual-polarimetric (HH/VV) data. Three indices (difference (D-type), simple ratio (SR), and normalized difference (ND)) were calculated from the gamma naught values and m-chi decomposition parameters and were evaluated for crop classification. We also evaluated the classification accuracy of four widely used machine-learning algorithms (kernel-based extreme learning machine, support vector machine, multilayer feedforward neural network (FNN), and random forest) and two multiple-kernel methods (multiple kernel extreme learning machine (MKELM) and multiple kernel learning (MKL)). MKL performed best, achieving an overall accuracy of 92.1%, and proved useful for the identification of crops with small sample sizes. The difference (raw or normalized) between double-bounce scattering and odd-bounce scattering helped to improve the identification of squash and yam fields.
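The three band indices named above have standard forms, sketched below for two toy bands. The sample values are invented for illustration; only the index formulas (difference, simple ratio, normalized difference) come from the abstract.

```python
import numpy as np

def band_indices(a, b):
    """Difference (D-type), simple ratio (SR) and normalized difference (ND)
    between two bands, e.g. gamma naught HH vs VV or two m-chi
    decomposition parameters."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return {
        "D": a - b,           # difference
        "SR": a / b,          # simple ratio
        "ND": (a - b) / (a + b),  # normalized difference
    }

# toy linear-power values for two fields (not real TerraSAR-X data)
idx = band_indices([0.20, 0.35], [0.10, 0.25])
print(idx["ND"])  # ≈ [0.333, 0.167]
```

Each index yields one feature per pixel or field, which is then fed to the classifiers compared in the study.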
TL;DR: Experimental results show that the IRAbMC algorithm outperforms the Markovian semantic indexing (MSI) method, with an improved relevance score for the retrieved ranked images.
Abstract: Image recommendation is an important feature of search engines, as a tremendous number of images is available online, and relevant images must be retrieved to meet the user's requirements. In this paper, we present an algorithm, image recommendation with absorbing Markov chain (IRAbMC), to retrieve relevant images for a user's input query. Images are ranked by calculating the keyword relevance probability between keywords annotated from the log and the keywords of the user's input query. Keyword relevance is computed using an absorbing Markov chain, and the images are then reranked using visual features. Experimental results show that the IRAbMC algorithm outperforms the Markovian semantic indexing (MSI) method, with an improved relevance score for the retrieved ranked images.
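The standard computation behind absorbing Markov chains is sketched below. The interpretation of transient states as query/log keywords and absorbing states as candidate images, and the toy transition values, are assumptions for illustration; the matrix algebra itself is the textbook result.

```python
import numpy as np

def absorption_probs(Q, R):
    """For a chain with transition matrix [[Q, R], [0, I]] (Q = transient to
    transient, R = transient to absorbing), the probability of eventually
    being absorbed in each absorbing state is B = N @ R, where
    N = (I - Q)^{-1} is the fundamental matrix."""
    N = np.linalg.inv(np.eye(len(Q)) - Q)
    return N @ R

# toy chain: 2 transient (keyword) states, 2 absorbing (image) states;
# each row of [Q | R] sums to 1
Q = np.array([[0.2, 0.3],
              [0.1, 0.4]])
R = np.array([[0.5, 0.0],
              [0.2, 0.3]])
B = absorption_probs(Q, R)
print(B.sum(axis=1))  # each row sums to 1: absorption is certain
```

In a ranking setting, row i of B would score how strongly keyword state i leads to each candidate image.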
TL;DR: This study proposes an approach called Randomized Distributed Hashing (RDH), which uses Locality Sensitive Hashing (LSH) in a distributed scheme and is promising for searching images in large datasets with multiple nodes.
Abstract: Approximate Nearest Neighbor (ANN) search approaches, which use probable neighbors instead of exact neighbors, have been widely investigated in recent years. ANN approaches are usually applied in a centralized manner, yet in real-world applications data is usually stored in a distributed manner, which creates the need to implement ANN methods in a distributed way. In this study, our goal is to perform fast and accurate search on large image datasets using distributed environments. For this purpose, we propose an approach called Randomized Distributed Hashing (RDH), which uses Locality Sensitive Hashing (LSH) in a distributed scheme. In this approach, we randomly distribute the data to different nodes of a cluster. After the distribution, each node indexes its local data using the same set of randomized hash functions. At the query stage, the query sample is searched locally in each node. By exploiting parallelism, the query time is significantly reduced: we achieve a speedup of 8 in the distributed scheme with 10 nodes, while the Mean Average Precision (MAP) scores remain high and comparable to those of other methods. We also investigated using different, selected randomized hash functions in different nodes rather than the same indexing, creating the selected hash functions according to their data-division properties before indexing. Since LSH is a data-independent method, we obtained results similar to those with the same hash functions. We compared our experimental results with state-of-the-art methods from a recent study. The proposed distributed scheme is promising for searching images in large datasets with multiple nodes.
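The distributed scheme described above can be sketched with random-hyperplane LSH. This is an illustrative single-process simulation of the idea (shared hash set, random data split, local bucket lookup, merged candidates); the hash family, bit count, and node count are assumptions, and real RDH runs the nodes as separate machines.

```python
import numpy as np

def make_hash(dim, bits, seed):
    """Random-hyperplane LSH: one sign bit per random projection."""
    planes = np.random.default_rng(seed).normal(size=(bits, dim))
    def h(x):
        return tuple((planes @ x > 0).astype(int))
    return h

class Node:
    """One cluster node: a local bucket table built with the SHARED hash
    set, as in RDH (same randomized hash functions on every node)."""
    def __init__(self, hash_fn):
        self.h, self.buckets = hash_fn, {}
    def index(self, vecs, ids):
        for v, i in zip(vecs, ids):
            self.buckets.setdefault(self.h(v), []).append(i)
    def query(self, q):
        return self.buckets.get(self.h(q), [])

rng = np.random.default_rng(3)
data = rng.normal(size=(100, 16))
h = make_hash(16, 8, seed=42)                # same hash set for all nodes
nodes = [Node(h) for _ in range(4)]
for n, part in zip(nodes, np.array_split(np.arange(100), 4)):
    n.index(data[part], part)                # random/even split of the data

q = data[7]
candidates = sorted(set().union(*[set(n.query(q)) for n in nodes]))
print(7 in candidates)  # an identical vector always lands in its own bucket
```

In a real deployment, the per-node queries run in parallel, which is the source of the reported speedup; the merged candidate set is then re-ranked by exact distance.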
TL;DR: A novel neuromorphic Perception Understanding Action (PUA) system is presented that combines the feature-extraction benefits of CNNs with the low-latency processing of SCNNs. It delivers robust results of over 96% accuracy and 81% Intersection over Union, showing that such a system can be successfully used for object recognition, classification, and tracking problems.
Abstract: Traditionally, the Perception Action cycle is the first stage of building an autonomous robotic system and a practical way to implement a low-latency reactive system within a low Size, Weight and Power (SWaP) package. However, within complex scenarios, this method can lack contextual understanding of the scene, such as object recognition-based tracking or system attention. Object detection, identification and tracking, along with semantic segmentation and attention, are all modern computer vision tasks in which Convolutional Neural Networks (CNNs) have shown significant success, although such networks often have a large computational overhead and power requirements that are not ideal for smaller robotics tasks. Furthermore, cloud computing and massively parallel processing, such as in Graphics Processing Units (GPUs), fall outside the specification of many tasks due to their latency and SWaP constraints. In response, Spiking Convolutional Neural Networks (SCNNs) aim to provide the feature-extraction benefits of CNNs while maintaining low latency and power overhead thanks to their asynchronous, spiking, event-based processing. A novel neuromorphic Perception Understanding Action (PUA) system is presented that aims to combine the feature-extraction benefits of CNNs with the low-latency processing of SCNNs. The PUA uses a neuromorphic vision sensor for Perception, which facilitates asynchronous processing within a spiking fully convolutional neural network (SpikeCNN) that provides semantic segmentation and Understanding of the scene; the output is fed to a spiking control system providing Actions. With this approach, the aim is to bring features of deep learning into the lower levels of autonomous robotics while maintaining a biologically plausible STDP rule throughout the learned encoding part of the network. The network is shown to provide a more robust and predictable management of spiking activity, with an improved thresholding response.
The reported experiments show that this system delivers robust results of over 96% accuracy and 81% Intersection over Union, ensuring that such a system can be successfully used for object recognition, classification and tracking problems. This demonstrates that the attention of the system can be tracked accurately, while the asynchronous processing means the controller can give precise track updates with minimal latency.
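The "biologically plausible STDP rule" mentioned above is, in its common pair-based form, a weight update driven by the relative timing of pre- and post-synaptic spikes. The sketch below shows that generic rule; the learning rates, time constant, and soft-bound scheme are illustrative assumptions, not the PUA network's parameters.

```python
import numpy as np

def stdp_update(w, dt, a_plus=0.01, a_minus=0.012, tau=20.0, w_max=1.0):
    """Pair-based STDP: potentiate when the presynaptic spike precedes the
    postsynaptic one (dt = t_post - t_pre > 0), depress otherwise.
    Soft bounds keep the weight in [0, w_max]."""
    if dt > 0:
        w = w + a_plus * np.exp(-dt / tau) * (w_max - w)   # LTP
    else:
        w = w - a_minus * np.exp(dt / tau) * w             # LTD
    return float(np.clip(w, 0.0, w_max))

w = 0.5
w = stdp_update(w, dt=+5.0)   # pre leads post -> weight increases
w = stdp_update(w, dt=-5.0)   # post leads pre -> weight decreases
print(0.0 <= w <= 1.0)
```

Applied across the encoding layers, such timing-driven updates let the SpikeCNN learn features without backpropagation, which is what keeps the learning rule biologically plausible.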