scispace - formally typeset
Author

Zhengming Li

Bio: Zhengming Li is an academic researcher at Harbin Institute of Technology. The author has contributed to research on topics including computer science and K-SVD, has an h-index of 5, and has co-authored 7 publications receiving 438 citations.

Papers
Journal ArticleDOI
TL;DR: A discriminative dictionary learning algorithm, called the locality-constrained and label embedding dictionary learning (LCLE-DL) algorithm, was proposed for image classification, which can achieve better performance than some state-of-the-art algorithms.
Abstract: Locality and label information of training samples play an important role in image classification. However, previous dictionary learning algorithms do not take the locality and label information of atoms into account together in the learning process, and thus their performance is limited. In this paper, a discriminative dictionary learning algorithm, called the locality-constrained and label embedding dictionary learning (LCLE-DL) algorithm, was proposed for image classification. First, the locality information was preserved using the graph Laplacian matrix of the learned dictionary instead of the conventional one derived from the training samples. Then, the label embedding term was constructed using the label information of atoms instead of the classification error term, which contained discriminating information of the learned dictionary. The optimal coding coefficients derived by the locality-based and label-based reconstruction were effective for image classification. Experimental results demonstrated that the LCLE-DL algorithm can achieve better performance than some state-of-the-art algorithms.

163 citations
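The locality idea in LCLE-DL can be illustrated with a toy sketch. The paper's exact objective is not reproduced here; the snippet below assumes a Gaussian-kernel affinity graph over the dictionary atoms and a simple coding objective ||X − DS||² + α·tr(SᵀLS) + λ||S||², which has a closed-form solution. The function names `atom_laplacian` and `locality_code` are illustrative, not from the paper:

```python
import numpy as np

def atom_laplacian(D, sigma=1.0):
    # Gaussian-kernel affinity between dictionary atoms (columns of D),
    # then the unnormalized graph Laplacian L = diag(W 1) - W
    sq = np.sum(D**2, axis=0)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2 * D.T @ D, 0)
    W = np.exp(-dist2 / (2 * sigma**2))
    np.fill_diagonal(W, 0)
    return np.diag(W.sum(axis=1)) - W

def locality_code(X, D, alpha=0.1, lam=1e-3):
    # Closed-form minimizer of ||X - D S||^2 + alpha tr(S^T L S) + lam ||S||^2
    L = atom_laplacian(D)
    K = D.shape[1]
    return np.linalg.solve(D.T @ D + alpha * L + lam * np.eye(K), D.T @ X)

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 8))   # 8 atoms in 20-D
X = rng.standard_normal((20, 5))   # 5 samples
S = locality_code(X, D)
print(S.shape)  # (8, 5)
```

The Laplacian term tr(SᵀLS) penalizes codes that differ strongly across similar atoms, which is one way to read the "locality preserved via the graph Laplacian of the learned dictionary" claim.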

Journal ArticleDOI
TL;DR: This paper proposes to exploit the symmetry of the face to generate new samples and devises a representation-based method to perform face recognition, which outperforms state-of-the-art face recognition methods including sparse representation classification (SRC), linear regression classification (LRC), collaborative representation (CR), and two-phase test sample sparse representation (TPTSSR).

160 citations

Journal ArticleDOI
TL;DR: A survey of dictionary learning algorithms for face recognition is provided to understand the profiles of this subject and to grasp the theoretical rationales and potentials as well as their applicability to different cases of face recognition.
Abstract: During the past several years, as one of the most successful applications of sparse coding and dictionary learning, dictionary-based face recognition has received significant attention. Although some surveys of sparse coding and dictionary learning have been reported, there is no specialized survey concerning dictionary learning algorithms for face recognition. This paper provides a survey of dictionary learning algorithms for face recognition. To provide a comprehensive overview, we not only categorize existing dictionary learning algorithms for face recognition but also present details of each category. Since the number of atoms has an important impact on classification performance, we also review the algorithms for selecting the number of atoms. Specifically, we select six typical dictionary learning algorithms with different numbers of atoms to perform experiments on face databases. In summary, this paper provides a broad view of dictionary learning algorithms for face recognition and advances study in this field. It is very useful for readers to understand the profiles of this subject and to grasp the theoretical rationales and potentials as well as their applicability to different cases of face recognition.

118 citations

Journal ArticleDOI
TL;DR: Experimental results demonstrate that the proposed algorithm framework outperforms some previous state-of-the-art dictionary learning and sparse coding algorithms in face recognition and can be applied to other pattern classification tasks.

66 citations

Journal ArticleDOI
Ruijun Ma, Bob Zhang, Yicong Zhou, Zhengming Li, Fangyuan Lei
TL;DR: This paper proposed a novel network, namely, the PID controller guide attention neural network (PAN-Net), which takes advantage of both the proportional-integral-derivative (PID) controller and attention neural networks for real photograph denoising.
Abstract: Real photograph denoising is extremely challenging in low-level computer vision since the noise is sophisticated and cannot be fully modeled by explicit distributions. Although deep-learning techniques have been actively explored for this issue and achieved convincing results, most of the networks may cause vanishing or exploding gradients, and usually entail more time and memory to obtain a remarkable performance. This article overcomes these challenges and presents a novel network, namely, PID controller guide attention neural network (PAN-Net), taking advantage of both the proportional-integral-derivative (PID) controller and attention neural network for real photograph denoising. First, a PID-attention network (PID-AN) is built to learn and exploit discriminative image features. Meanwhile, we devise a dynamic learning scheme by linking the neural network and control action, which significantly improves the robustness and adaptability of PID-AN. Second, we explore both the residual structure and share-source skip connections to stack the PID-ANs. Such a framework provides a flexible way to feature residual learning, enabling us to facilitate the network training and boost the denoising performance. Extensive experiments show that our PAN-Net achieves superior denoising results against the state-of-the-art in terms of image quality and efficiency.

19 citations
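The PID component PAN-Net borrows from control theory is easy to state on its own. Below is a generic discrete PID controller driving a scalar toward a setpoint; this is not the paper's network, just the standard control rule the paper builds on (the gains here are arbitrary):

```python
class PID:
    """Discrete PID: u_t = Kp*e_t + Ki*sum(e_0..e_t) + Kd*(e_t - e_{t-1})."""
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0   # running sum of errors (I term)
        self.prev = 0.0       # previous error (for the D term)

    def step(self, error):
        self.integral += error
        u = (self.kp * error
             + self.ki * self.integral
             + self.kd * (error - self.prev))
        self.prev = error
        return u

# Drive a scalar state toward a setpoint of 1.0
pid = PID(kp=0.6, ki=0.1, kd=0.05)
x, target = 0.0, 1.0
for _ in range(50):
    x += pid.step(target - x)
print(round(x, 3))  # ≈ 1.0
```

In the paper's framing, an analogous feedback signal guides the attention network's updates; the exact coupling to the network is specific to PAN-Net and not sketched here.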


Cited by
Journal ArticleDOI
TL;DR: A novel discriminative sparse representation method is proposed, and its noticeable performance in image classification is demonstrated by experimental results; the proposed method outperforms existing state-of-the-art sparse representation methods.
Abstract: Sparse representation has shown an attractive performance in a number of applications. However, the available sparse representation methods still suffer from some problems, and it is necessary to design more efficient methods. In particular, designing a computationally inexpensive, easily solvable, and robust sparse representation method is a significant task. In this paper, we explore the design of simple, robust, and highly efficient sparse representation methods for image classification. The contributions of this paper are as follows. First, a novel discriminative sparse representation method is proposed and its noticeable performance in image classification is demonstrated by the experimental results. More importantly, the proposed method outperforms the existing state-of-the-art sparse representation methods. Second, the proposed method is not only very computationally efficient but also has an intuitive and easily understandable idea. It exploits a simple algorithm to obtain a closed-form solution and discriminative representation of the test sample. Third, the feasibility, computational efficiency, and remarkable classification accuracy of the proposed $l_{2}$ regularization-based representation are comprehensively shown by extensive experiments and analysis. The code of the proposed method is available at http://www.yongxu.org/lunwen.html.

171 citations

Journal ArticleDOI
TL;DR: A softmax regression-based deep sparse autoencoder network (SRDSAN) is proposed to recognize facial emotion in human-robot interaction; it uses softmax regression to handle the large output of deep learning and to overcome local-extrema and gradient-diffusion problems in the training process.

158 citations

Journal ArticleDOI
Rejeesh M R
TL;DR: The performance of the proposed ANFIS-ABC technique is evaluated on the ORL database (400 images of 40 individuals), the YALE-B database (165 images of 15 individuals), and real-time video; the detection and false-alarm rates of the proposed and existing methods are compared to demonstrate the system's efficiency.
Abstract: In this paper, an efficient face recognition method using AGA and ANFIS-ABC is proposed. In the first stage, the face images gathered from the database are preprocessed. In the second stage, interest points are determined to improve the detection rate, with the parameters used in interest-point determination optimized by an adaptive genetic algorithm (AGA). Finally, face images are classified with ANFIS using the extracted features. During the training process, the parameters of ANFIS are optimized using the artificial bee colony (ABC) algorithm in order to improve accuracy. The performance of the proposed ANFIS-ABC technique is evaluated on the ORL database (400 images of 40 individuals), the YALE-B database (165 images of 15 individuals), and real-time video, where the detection and false-alarm rates of the proposed and existing methods are compared to demonstrate the system's efficiency.

151 citations

Journal ArticleDOI
TL;DR: The classification approach of the ADDL model is very efficient, because it avoids the extra time-consuming sparse reconstruction that most existing DL algorithms perform with the trained dictionary for each new test sample.
Abstract: In this paper, we propose an analysis-mechanism-based structured analysis discriminative dictionary learning (ADDL) framework. ADDL seamlessly integrates discriminative dictionary learning, analysis representation, and analysis classifier training into a unified model. The applied analysis mechanism ensures that the learned dictionaries, representations, and linear classifiers over different classes are as independent and discriminating as possible. The dictionary is obtained by minimizing a reconstruction error and an analytical incoherence-promoting term that encourages the subdictionaries associated with different classes to be independent. To obtain the representation coefficients, ADDL imposes a sparse $l_{2,1}$-norm constraint on the coding coefficients instead of using the $l_{0}$ or $l_{1}$ norm, since the $l_{0}$- or $l_{1}$-norm constraint applied in most existing DL criteria makes the training phase time consuming. The code-extraction projection that bridges data with the sparse codes by extracting special features from the given samples is calculated via minimizing a sparse-code approximation term. Then we compute a linear classifier based on the approximated sparse codes by an analysis mechanism to simultaneously consider the classification and representation powers. Thus, the classification approach of our model is very efficient, because it avoids the extra time-consuming sparse reconstruction that most existing DL algorithms perform with the trained dictionary for each new test sample. Simulations on real image databases demonstrate that our ADDL model obtains superior performance over other state-of-the-art methods.

140 citations
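The $l_{2,1}$ norm ADDL relies on can be made concrete: it sums the $l_{2}$ norms of the rows of the coefficient matrix, which pushes entire rows to zero, and its proximal operator is a simple row-wise shrinkage. A small self-contained illustration, independent of the ADDL optimizer itself:

```python
import numpy as np

def l21_norm(S):
    # Sum of the l2 norms of the rows: encourages whole rows of S to vanish
    return float(np.sum(np.linalg.norm(S, axis=1)))

def l21_prox(S, tau):
    # Proximal operator of tau * ||.||_{2,1}: shrink each row toward zero,
    # zeroing rows whose l2 norm is below tau
    norms = np.linalg.norm(S, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return S * scale

S = np.array([[3.0, 4.0],    # row norm 5
              [0.1, 0.1],    # row norm ~0.141 (killed by tau=0.5)
              [0.0, 2.0]])   # row norm 2
print(l21_norm(S))           # 5 + sqrt(0.02) + 2 ≈ 7.141
print(l21_prox(S, 0.5))      # second row shrinks exactly to zero
```

This row-sparsity is why the abstract contrasts $l_{2,1}$ with $l_{0}$/$l_{1}$: the penalty is convex and its proximal step is cheap and closed-form, so training avoids expensive elementwise sparse coding.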

Journal ArticleDOI
TL;DR: The four Vs of multi-output learning are characterized, i.e., volume, velocity, variety, and veracity, and the ways in which the four Vs both benefit and bring challenges to multi- output learning by taking inspiration from big data are examined.
Abstract: The aim of multi-output learning is to simultaneously predict multiple outputs given an input. It is an important learning problem for decision-making since making decisions in the real world often involves multiple complex factors and criteria. In recent times, an increasing number of research studies have focused on ways to predict multiple outputs at once. Such efforts have transpired in different forms according to the particular multi-output learning problem under study. Classic cases of multi-output learning include multi-label learning, multi-dimensional learning, multi-target regression, and others. From our survey of the topic, we were struck by a lack in studies that generalize the different forms of multi-output learning into a common framework. This article fills that gap with a comprehensive review and analysis of the multi-output learning paradigm. In particular, we characterize the four Vs of multi-output learning, i.e., volume, velocity, variety, and veracity, and the ways in which the four Vs both benefit and bring challenges to multi-output learning by taking inspiration from big data. We analyze the life cycle of output labeling, present the main mathematical definitions of multi-output learning, and examine the field’s key challenges and corresponding solutions as found in the literature. Several model evaluation metrics and popular data repositories are also discussed. Last but not least, we highlight some emerging challenges with multi-output learning from the perspective of the four Vs as potential research directions worthy of further studies.

124 citations