Author

Mingliang Gao

Bio: Mingliang Gao is an academic researcher from Shandong University of Technology. The author has contributed to research in topics: Computer science & Artificial intelligence. The author has an h-index of 12 and has co-authored 51 publications receiving 402 citations. Previous affiliations of Mingliang Gao include Sichuan University & Minzu University of China.


Papers
Journal ArticleDOI
TL;DR: Experimental results demonstrate that the proposed fusion solution, i.e., SEDRFuse, outperforms the state-of-the-art fusion methods in terms of both subjective and objective evaluations.
Abstract: Image fusion is an important task for computer vision as a diverse range of applications are benefiting from the fusion operation. The existing image fusion methods are largely implemented at the pixel level, which may introduce artifacts and/or inconsistencies, while the computational complexity is relatively high. In this article, we propose a symmetric encoder–decoder with residual block (SEDRFuse) network to fuse infrared and visible images for night vision applications. At the training stage, the SEDRFuse network is trained to create a fixed feature extractor. At the fusing stage, the trained extractor is utilized to extract the intermediate and compensation features, which are generated by the residual block and the first two convolutional layers from the input source images, respectively. Two attention maps, which are derived from the intermediate features, are then multiplied by the intermediate features for fusion. The salient compensation features obtained through elementwise selection are passed to the corresponding deconvolutional layers for processing. Finally, the fused intermediate features and the selected compensation features are decoded to reconstruct the fused image. Experimental results demonstrate that the proposed fusion solution, i.e., SEDRFuse, outperforms the state-of-the-art fusion methods in terms of both subjective and objective evaluations.

74 citations
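
The fusion stage described in the entry above lends itself to a compact sketch. The following PyTorch-style code is a minimal illustration under assumed layer sizes and fusion rules (L1-norm attention with a softmax across the two sources, element-wise maximum for the compensation features); it is not the authors' exact SEDRFuse configuration.

```python
# Minimal sketch of a SEDRFuse-style fusion pass (illustrative, not the authors' exact network).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1)

    def forward(self, x):
        return x + self.conv2(F.relu(self.conv1(x)))

class Encoder(nn.Module):
    """Two conv layers (compensation features) followed by a residual block (intermediate features)."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, 3, padding=1)   # channel sizes are assumptions
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        self.res = ResidualBlock(32)

    def forward(self, x):
        c1 = F.relu(self.conv1(x))
        c2 = F.relu(self.conv2(c1))
        return c1, c2, self.res(c2)

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.deconv1 = nn.ConvTranspose2d(32, 16, 3, padding=1)
        self.deconv2 = nn.ConvTranspose2d(16, 1, 3, padding=1)

    def forward(self, inter, c1, c2):
        x = F.relu(self.deconv1(inter + c2))   # pass selected compensation features to the deconv layers
        return torch.sigmoid(self.deconv2(x + c1))

def attention(feat):
    # Activity map: L1 norm across channels, kept as a single-channel weight map.
    return feat.abs().sum(dim=1, keepdim=True)

def fuse(ir, vis, enc, dec):
    c1_i, c2_i, f_i = enc(ir)
    c1_v, c2_v, f_v = enc(vis)
    # Softmax over the two sources' attention maps gives per-pixel fusion weights.
    a = torch.softmax(torch.cat([attention(f_i), attention(f_v)], dim=1), dim=1)
    fused_inter = a[:, :1] * f_i + a[:, 1:] * f_v
    # Element-wise selection (maximum) of the compensation features.
    c1, c2 = torch.maximum(c1_i, c1_v), torch.maximum(c2_i, c2_v)
    return dec(fused_inter, c1, c2)

enc, dec = Encoder(), Decoder()
ir, vis = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
print(fuse(ir, vis, enc, dec).shape)  # torch.Size([1, 1, 64, 64])
```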

Journal ArticleDOI
TL;DR: Comparative results show that the BA-based tracker outperforms the other three trackers, namely, particle filter, meanshift and particle swarm optimization.

52 citations

Journal ArticleDOI
TL;DR: A general optimisation-based tracking architecture is proposed, and the sensitivity and adjustment of the FA parameters in the tracking system are studied, showing that the FA-based tracker can robustly track an arbitrary target in various challenging conditions.
Abstract: The firefly algorithm (FA) is a new meta-heuristic optimisation algorithm that mimics the social behaviour of fireflies flying in the tropical and temperate summer sky. In this study, a novel application of the FA is presented as it is applied to solve the tracking problem. A general optimisation-based tracking architecture is proposed, and the sensitivity and adjustment of the FA parameters in the tracking system are studied. Experimental results show that the FA-based tracker can robustly track an arbitrary target in various challenging conditions. The authors compare the speed and accuracy of the FA with three typical tracking algorithms, including the particle filter, meanshift and particle swarm optimisation. Comparative results show that the FA-based tracker outperforms the other three trackers.

51 citations
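
As a rough illustration of optimisation-based tracking with the firefly algorithm, the sketch below searches candidate window positions that maximise similarity to a target template. The fitness function, parameter values and synthetic frame are assumptions for demonstration, not the authors' setup.

```python
# Illustrative firefly-based tracking step: search candidate window positions that
# maximise similarity to a target template (assumed fitness; parameters are examples).
import numpy as np

rng = np.random.default_rng(0)

def fitness(frame, template, top_left):
    """Negative mean absolute difference between the template and the candidate patch."""
    h, w = template.shape
    y, x = int(top_left[0]), int(top_left[1])
    patch = frame[y:y + h, x:x + w]
    if patch.shape != template.shape:
        return -np.inf
    return -np.mean(np.abs(patch.astype(float) - template.astype(float)))

def firefly_track(frame, template, prev_pos, n=20, iters=30,
                  beta0=1.0, gamma=0.01, alpha=2.0):
    # Initialise fireflies (candidate positions) around the previous target position.
    pos = prev_pos + rng.normal(0, 10, size=(n, 2))
    light = np.array([fitness(frame, template, p) for p in pos])
    for _ in range(iters):
        for i in range(n):
            for j in range(n):
                if light[j] > light[i]:          # move firefly i towards brighter firefly j
                    r2 = np.sum((pos[i] - pos[j]) ** 2)
                    beta = beta0 * np.exp(-gamma * r2)
                    pos[i] += beta * (pos[j] - pos[i]) + alpha * rng.normal(0, 1, 2)
                    light[i] = fitness(frame, template, pos[i])
    return pos[np.argmax(light)]                  # brightest firefly = estimated position

# Toy usage: locate a bright square in a synthetic frame (true top-left corner is (60, 70)).
frame = np.zeros((120, 120)); frame[60:80, 70:90] = 255
template = np.full((20, 20), 255.0)
print(firefly_track(frame, template, prev_pos=np.array([55.0, 65.0])))
```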

Journal ArticleDOI
01 Sep 2015 - Optik
TL;DR: An improved particle filter based on firefly algorithm is proposed to solve the problem of sample impoverishment, where the number of meaningful particles can be increased, and the particles can approximate the true state of the target more accurately.

42 citations
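
One way to read this idea is to let low-weight particles be attracted toward high-weight ones before resampling, which counteracts sample impoverishment. The sketch below uses a 1-D toy state and a Gaussian likelihood; all constants are illustrative assumptions rather than the paper's configuration.

```python
# Toy sketch of firefly-style particle refinement before resampling (1-D state,
# Gaussian likelihood; all constants are illustrative assumptions).
import numpy as np

rng = np.random.default_rng(1)

def likelihood(particles, observation, sigma=1.0):
    return np.exp(-0.5 * ((particles - observation) / sigma) ** 2)

def firefly_refine(particles, weights, beta0=0.5, gamma=0.1, alpha=0.05):
    """Attract dimmer (low-weight) particles towards brighter (high-weight) ones."""
    refined = particles.copy()
    for i in range(len(particles)):
        for j in range(len(particles)):
            if weights[j] > weights[i]:
                r2 = (particles[i] - particles[j]) ** 2
                beta = beta0 * np.exp(-gamma * r2)
                refined[i] += beta * (particles[j] - particles[i]) + alpha * rng.normal()
    return refined

def pf_step(particles, observation, process_noise=0.5):
    particles = particles + rng.normal(0, process_noise, particles.shape)   # predict
    weights = likelihood(particles, observation)
    particles = firefly_refine(particles, weights)     # pull particles into the high-likelihood region
    weights = likelihood(particles, observation)
    weights /= weights.sum()
    idx = rng.choice(len(particles), size=len(particles), p=weights)        # resample
    return particles[idx], np.sum(weights * particles)                      # new particle set + estimate

particles = rng.uniform(-10, 10, 200)
particles, estimate = pf_step(particles, observation=3.0)
print(round(estimate, 2))   # should be close to 3.0
```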

Journal ArticleDOI
TL;DR: This survey will not only enable researchers to get a good overview of the state-of-the-art methods for RGB-D-based object recognition but also provide a reference for other multimodal machine learning applications, e.g., multimodal medical image fusion, audio-visual speech recognition, and multimedia retrieval and generation.
Abstract: Object recognition in real-world environments is one of the fundamental and key tasks in the computer vision and robotics communities. With advanced sensing technologies and low-cost depth sensors, high-quality RGB and depth images can be recorded synchronously, and object recognition performance can be improved by jointly exploiting them. RGB-D-based object recognition has evolved from early methods that use hand-crafted representations to the current state-of-the-art deep learning-based methods. With the undeniable success of deep learning, especially convolutional neural networks (CNNs), in the visual domain, the natural progression of deep learning research points to problems involving larger and more complex multimodal data. In this paper, we provide a comprehensive survey of recent multimodal CNN (MMCNN)-based approaches that have demonstrated significant improvements over previous methods. We highlight two key issues, namely, training data deficiency and multimodal fusion. In addition, we summarize and discuss the publicly available RGB-D object recognition datasets and present a comparative performance evaluation of the proposed methods on these benchmark datasets. Finally, we identify promising avenues of research in this rapidly evolving field. This survey will not only enable researchers to get a good overview of the state-of-the-art methods for RGB-D-based object recognition but also provide a reference for other multimodal machine learning applications, e.g., multimodal medical image fusion, audio-visual speech recognition, and multimedia retrieval and generation.

39 citations
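
A common fusion pattern in this line of work is a two-stream CNN whose RGB and depth features are concatenated before classification. The sketch below is a minimal late-fusion example with assumed layer sizes; it does not reproduce any specific network from the survey.

```python
# Minimal two-stream RGB-D classifier with late fusion by concatenation (illustrative layer sizes).
import torch
import torch.nn as nn

def stream():
    """A tiny convolutional feature extractor, used with the same structure for both modalities."""
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
    )

class RGBDNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.rgb_stream = stream()
        self.depth_stream = stream()           # depth encoded as 3 channels (e.g., colourised depth)
        self.classifier = nn.Linear(32 + 32, num_classes)

    def forward(self, rgb, depth):
        fused = torch.cat([self.rgb_stream(rgb), self.depth_stream(depth)], dim=1)
        return self.classifier(fused)

model = RGBDNet()
rgb, depth = torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64)
print(model(rgb, depth).shape)   # torch.Size([2, 10])
```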


Cited by
Journal ArticleDOI
TL;DR: This survey provides the background knowledge and the available features related to crowded scenes, along with existing models, popular algorithms, evaluation protocols, and system performance corresponding to different aspects of crowded scene analysis.
Abstract: Automated scene analysis has been a topic of great interest in computer vision and cognitive science. Recently, with the growth of crowd phenomena in the real world, crowded scene analysis has attracted much attention. However, the visual occlusions and ambiguities in crowded scenes, as well as the complex behaviors and scene semantics, make the analysis a challenging task. In the past few years, an increasing number of works on crowded scene analysis have been reported, covering different aspects including crowd motion pattern learning, crowd behavior and activity analyses, and anomaly detection in crowds. This paper surveys the state-of-the-art techniques on this topic. We first provide the background knowledge and the available features related to crowded scenes. Then, existing models, popular algorithms, evaluation protocols, and system performance are provided corresponding to different aspects of crowded scene analysis. We also outline the available datasets for performance evaluation. Finally, some research problems and promising future directions are presented with discussions.

457 citations

Posted Content
TL;DR: In this article, a spatiotemporal architecture for anomaly detection in videos including crowded scenes is proposed, which includes two main components, one for spatial feature representation, and one for learning the temporal evolution of the spatial features.
Abstract: We present an efficient method for detecting anomalies in videos. Recent applications of convolutional neural networks have shown the promise of convolutional layers for object detection and recognition, especially in images. However, convolutional neural networks are supervised and require labels as learning signals. We propose a spatiotemporal architecture for anomaly detection in videos, including crowded scenes. Our architecture includes two main components, one for spatial feature representation and one for learning the temporal evolution of the spatial features. Experimental results on the Avenue, Subway and UCSD benchmarks confirm that the detection accuracy of our method is comparable to that of state-of-the-art methods at a considerable speed of up to 140 fps.

332 citations
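
The two-component design described above (spatial features plus their temporal evolution) is often realised as a clip-level autoencoder scored by reconstruction error. The sketch below uses 3-D convolutions as a simple stand-in for the paper's spatial and temporal components; shapes and layer sizes are assumptions.

```python
# Sketch of a spatiotemporal autoencoder scored by reconstruction error
# (3-D convolutions stand in for separate spatial/temporal components; shapes are assumptions).
import torch
import torch.nn as nn

class STAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(            # learns joint spatial-temporal features of a clip
            nn.Conv3d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(            # reconstructs the input clip
            nn.ConvTranspose3d(16, 8, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose3d(8, 1, 3, stride=2, padding=1, output_padding=1), nn.Sigmoid(),
        )

    def forward(self, clip):
        return self.decoder(self.encoder(clip))

def anomaly_score(model, clip):
    """High reconstruction error on a clip suggests an unfamiliar (anomalous) event."""
    with torch.no_grad():
        return torch.mean((model(clip) - clip) ** 2).item()

model = STAutoencoder()
clip = torch.rand(1, 1, 8, 64, 64)    # (batch, channel, frames, height, width)
print(model(clip).shape, anomaly_score(model, clip))
```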

Journal ArticleDOI
TL;DR: Different levels of an intelligent video surveillance system (IVVS) are studied in this paper, where techniques related to feature extraction and description for behavior representation are reviewed, and available datasets and metrics for performance evaluation are presented.
Abstract: Different levels of an intelligent video surveillance system (IVVS) are studied in this review. Existing approaches for abnormal behavior recognition relative to each level of an IVVS are extensively reviewed. Challenging datasets for IVVS evaluation are presented. Limitations of the abnormal behavior recognition area are discussed. With the increasing number of surveillance cameras in both indoor and outdoor locations, there is a growing demand for an intelligent system that detects abnormal events. Although human action recognition is a highly researched topic in computer vision, abnormal behavior detection has lately been attracting more research attention. Indeed, several systems have been proposed in order to ensure human safety. In this paper, we are interested in the study of the two main steps composing a video surveillance system, which are behavior representation and behavior modeling. Techniques related to feature extraction and description for behavior representation are reviewed. Classification methods and frameworks for behavior modeling are also provided. Moreover, available datasets and metrics for performance evaluation are presented. Finally, examples of existing video surveillance systems used in the real world are described.

243 citations

Journal ArticleDOI
TL;DR: This paper presents the state-of-the-art dimensionality reduction techniques and their suitability for different types of data and application areas, as well as the issues of dimensionality reduction techniques that can affect the accuracy and relevance of results.

212 citations
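
As a concrete instance of such techniques, principal component analysis projects data onto the directions of maximum variance. The NumPy sketch below is a generic example, not tied to the paper.

```python
# Generic PCA example: reduce 5-D data to 2-D via the top eigenvectors of the covariance matrix.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))   # correlated 5-D data

def pca(X, k):
    Xc = X - X.mean(axis=0)                    # centre the data
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)           # eigenpairs sorted by ascending eigenvalue
    components = vecs[:, ::-1][:, :k]          # keep the k largest-variance directions
    explained = vals[::-1][:k] / vals.sum()
    return Xc @ components, explained

Z, ratio = pca(X, k=2)
print(Z.shape, ratio.round(2))                 # (200, 2) and the variance explained by each component
```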

Journal ArticleDOI
TL;DR: Results show that the proposed Hybrid SCA-DE-based tracker tracks an arbitrary target more robustly in various challenging conditions than the other trackers and is very competitive with state-of-the-art metaheuristic algorithms.

195 citations