Author

Bo Ning

Bio: Bo Ning is an academic researcher from the University of Science and Technology of China. The author has contributed to research in topics: Codebook & Bag-of-words model. The author has an h-index of 2 and has co-authored 3 publications receiving 54 citations.

Papers
Journal ArticleDOI
TL;DR: A novel approach for key poses selection is proposed, which models the descriptor space utilizing a manifold learning technique to recover the geometric structure of the descriptors on a lower dimensional manifold and develops a PageRank-based centrality measure.
Abstract: In action recognition, bag of visual words based approaches have been shown to be successful, for which the quality of the codebook is critical. In a large vocabulary of poses (visual words), some key poses play a more decisive role than others in the codebook. This paper proposes a novel approach for key pose selection, which models the descriptor space using a manifold learning technique to recover the geometric structure of the descriptors on a lower-dimensional manifold. A PageRank-based centrality measure is developed to select key poses according to the recovered geometric structure. In each step, a key pose is selected from the manifold and the remaining model is modified to maximize the discriminative power of the selected codebook. With the obtained codebook, each action can be represented with a histogram of the key poses. To resolve the ambiguity between some action classes, a pairwise subdivision is executed to select discriminative codebooks for further recognition. Experiments on benchmark datasets showed that our method is able to obtain better performance compared with other state-of-the-art methods.
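The selection and representation steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a plain k-nearest-neighbour graph built directly on the descriptors (the manifold learning step and the per-step model update are omitted), a standard PageRank power iteration as the centrality measure, and hypothetical function names (`knn_affinity`, `pagerank`, `select_key_poses`, `pose_histogram`).

```python
import numpy as np

def knn_affinity(X, k=3):
    """Symmetric k-nearest-neighbour affinity matrix over descriptor rows (assumes k < len(X))."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)              # exclude self-links
    idx = np.argsort(d, axis=1)[:, :k]       # k closest descriptors per row
    W = np.zeros_like(d)
    rows = np.repeat(np.arange(len(X)), k)
    W[rows, idx.ravel()] = 1.0
    return np.maximum(W, W.T)                # symmetrise the graph

def pagerank(W, damping=0.85, iters=100):
    """PageRank scores by power iteration on a column-normalised affinity matrix."""
    n = len(W)
    col_sums = W.sum(axis=0)
    col_sums[col_sums == 0] = 1.0            # guard against isolated nodes
    P = W / col_sums                         # column-stochastic transition matrix
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1 - damping) / n + damping * (P @ r)
    return r

def select_key_poses(X, n_keys, k=3):
    """Indices of the n_keys descriptors with the highest centrality (one-shot, not iterative)."""
    scores = pagerank(knn_affinity(X, k))
    return np.argsort(scores)[::-1][:n_keys]

def pose_histogram(frames, codebook):
    """Normalised histogram of nearest key poses for one action sequence."""
    d = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
    counts = np.bincount(d.argmin(axis=1), minlength=len(codebook))
    return counts / counts.sum()
```

Given a descriptor matrix `X` with one pose per row, `pose_histogram(X, X[select_key_poses(X, 5)])` would give a five-bin representation of the sequence.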

52 citations

Journal ArticleDOI
TL;DR: A novel bag of visual words based method is proposed to detect pedestrians in unseen scenes by dynamically updating the key words by using three strategies covering key word selection, detector invariance, and codebook update.

6 citations

Proceedings ArticleDOI
01 Nov 2011
TL;DR: A novel approach to select key poses for the codebook is proposed, which models the descriptor space utilizing manifold learning to recover the geometric structure of the descriptors on a lower dimensional manifold space.
Abstract: In action recognition, bag of words based approaches have been shown to be successful, for which the quality of the codebook is critical. This paper proposes a novel approach to select key poses for the codebook, which models the descriptor space using manifold learning to recover the geometric structure of the descriptors on a lower-dimensional manifold space. A PageRank-based centrality measure is developed to select key poses on the manifold. In each step, a key pose is selected and the remaining model is modified to maximize the discriminative power of the selected codebook. In classification, the ambiguity of each action pair is evaluated through cross validation. An additional subdivision is executed for ambiguous pairs. Experiments on the UT-Tower dataset showed that our method is able to obtain better performance than the state-of-the-art methods.

Cited by
Journal ArticleDOI
TL;DR: This approach improves traditional methods by adopting multiview locality-sensitive sparse coding in the retrieving process, and incorporates a local similarity preserving term into the objective of sparse coding, which groups similar silhouettes to alleviate the instability of sparse codes.
Abstract: Image-based 3-D human pose recovery is usually conducted by retrieving relevant poses with image features. However, it suffers from the high dimensionality of image features and the low efficiency of the retrieving process. Particularly for multiview data, the integration of different types of features is difficult. In this paper, a novel approach is proposed to recover 3-D human poses from silhouettes. This approach improves traditional methods by adopting multiview locality-sensitive sparse coding in the retrieving process. First, it incorporates a local similarity preserving term into the objective of sparse coding, which groups similar silhouettes to alleviate the instability of sparse codes. Second, the objective function of sparse coding is improved by integrating multiview data. The experimental results show that the retrieval error has been reduced by 20% to 50%, which demonstrates the effectiveness of the proposed method.

242 citations

Journal ArticleDOI
TL;DR: A novel method for human action recognition is proposed, based on boosted key-frame selection and correlated pyramidal motion feature representations; the correlogram focuses not only on probabilistic distributions within one frame but also on the temporal relationships of the action sequence.

127 citations

Journal ArticleDOI
TL;DR: The experimental results demonstrate that the proposed pLSC algorithm outperforms the manifold regularized sparse coding algorithms including the standard Laplacian regularization sparse coding algorithm with a proper p.
Abstract: Human activity analysis in videos has increasingly attracted attention in computer vision research with the massive number of videos now accessible online. Although many recognition algorithms have been reported recently, activity representation is challenging. Recently, manifold regularized sparse coding has obtained promising performance in action recognition, because it simultaneously learns the sparse representation and preserves the manifold structure. In this paper, we propose a generalized version of Laplacian regularized sparse coding for human activity recognition called $p$-Laplacian regularized sparse coding (pLSC). The proposed method exploits $p$-Laplacian regularization to preserve the local geometry. The $p$-Laplacian is a nonlinear generalization of the standard graph Laplacian and has a tighter isoperimetric inequality. As a result, pLSC has stronger theoretical support than standard Laplacian regularized sparse coding with a proper $p$. We also provide a fast iterative shrinkage-thresholding algorithm for the optimization of pLSC. Finally, we input the sparse codes learned by the pLSC algorithm into support vector machines and conduct extensive experiments on the unstructured social activity attribute dataset and human motion database (HMDB51) for human activity recognition. The experimental results demonstrate that the proposed pLSC algorithm outperforms the manifold regularized sparse coding algorithms including the standard Laplacian regularized sparse coding algorithm with a proper $p$.
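For orientation, the following is a rough sketch of a fast iterative shrinkage-thresholding (FISTA) loop for the standard graph-Laplacian special case only, not the paper's $p$-Laplacian formulation. It assumes a fixed dictionary `D`, a precomputed symmetric graph Laplacian `L` over the samples, and the objective $\tfrac{1}{2}\|X - DS\|_F^2 + \lambda\|S\|_1 + \tfrac{\beta}{2}\operatorname{tr}(S L S^\top)$; the function names and default parameters are hypothetical.

```python
import numpy as np

def soft_threshold(Z, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def lsc_objective(X, D, S, L, lam, beta):
    """Reconstruction error + L1 sparsity + Laplacian smoothness penalty."""
    return (0.5 * np.linalg.norm(X - D @ S) ** 2
            + lam * np.abs(S).sum()
            + 0.5 * beta * np.trace(S @ L @ S.T))

def fista_lsc(X, D, L, lam=0.1, beta=0.1, iters=200):
    """FISTA for Laplacian-regularised sparse codes S with a fixed dictionary D."""
    # Step size from an upper bound on the Lipschitz constant of the smooth part.
    step = 1.0 / (np.linalg.norm(D.T @ D, 2) + beta * np.linalg.norm(L, 2))
    S = np.zeros((D.shape[1], X.shape[1]))
    Y, t = S.copy(), 1.0
    for _ in range(iters):
        grad = D.T @ (D @ Y - X) + beta * (Y @ L)   # gradient of the smooth terms at Y
        S_next = soft_threshold(Y - step * grad, step * lam)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Y = S_next + ((t - 1.0) / t_next) * (S_next - S)   # momentum extrapolation
        S, t = S_next, t_next
    return S
```

Columns of the returned `S` are the sparse codes of the corresponding columns of `X`; in the pipeline the abstract describes, such codes would then be passed to a support vector machine.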

112 citations

Journal ArticleDOI
TL;DR: The proposed scheme takes advantage of local and global features and, therefore, provides a discriminative representation for human actions and outperforms the state-of-the-art methods on the IXMAS action recognition dataset.
Abstract: In this paper, we propose a novel scheme for human action recognition that combines the advantages of both local and global representations. We explore human silhouettes for human action representation by taking into account the correlation between sequential poses in an action. A modified bag-of-words model, named bag of correlated poses, is introduced to encode temporally local features of actions. To utilize the property of visual word ambiguity, we adopt the soft assignment strategy to reduce the dimensionality of our model and circumvent the penalty of computational complexity and quantization error. To compensate for the loss of structural information, we propose an extended motion template, i.e., extensions of the motion history image, to capture the holistic structural features. The proposed scheme takes advantage of local and global features and, therefore, provides a discriminative representation for human actions. Experimental results confirm the complementary properties of the two descriptors, and the proposed approach outperforms the state-of-the-art methods on the IXMAS action recognition dataset.
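The soft assignment idea mentioned in this abstract can be sketched as follows. The abstract does not give the weighting function, so this example assumes a Gaussian kernel over squared Euclidean distances with a hypothetical bandwidth `sigma`; the function name is likewise illustrative.

```python
import numpy as np

def soft_assign_histogram(features, codebook, sigma=1.0):
    """Bag-of-words histogram where each feature votes for every visual word,
    weighted by a Gaussian kernel on its distance to that word."""
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    # Shift by the row minimum before exponentiating for numerical stability.
    w = np.exp(-(d2 - d2.min(axis=1, keepdims=True)) / (2.0 * sigma ** 2))
    w /= w.sum(axis=1, keepdims=True)        # each feature contributes total weight 1
    h = w.sum(axis=0)
    return h / h.sum()                       # normalise over the whole sequence
```

A small `sigma` approaches hard nearest-word assignment, while a larger value spreads each feature's vote across neighbouring words.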

97 citations

Journal ArticleDOI
TL;DR: A new classifier named weighted local naive Bayes nearest neighbor is proposed for the final action classification, which is demonstrated to be more accurate and robust than other classifiers, e.g., support vector machine (SVM) and naive Bayes nearest neighbor.
Abstract: In this paper, we present a new approach for human action recognition based on key-pose selection and representation. Poses in video frames are described by the proposed extensive pyramidal features (EPFs), which include the Gabor, Gaussian, and wavelet pyramids. These features are able to encode the orientation, intensity, and contour information and therefore provide an informative representation of human poses. Because not all poses in a sequence are discriminative and representative, we further utilize the AdaBoost algorithm to learn a subset of discriminative poses. Given the boosted poses for each video sequence, a new classifier named weighted local naive Bayes nearest neighbor is proposed for the final action classification, which is demonstrated to be more accurate and robust than other classifiers, e.g., support vector machine (SVM) and naive Bayes nearest neighbor. The proposed method is systematically evaluated on the KTH data set, the Weizmann data set, the multiview IXMAS data set, and the challenging HMDB51 data set. Experimental results show that our method outperforms the state-of-the-art techniques in terms of recognition rate.

96 citations