Author

Zhihua Chen

Bio: Zhihua Chen is an academic researcher from East China University of Science and Technology. The author has contributed to research in topics: Image segmentation & Pixel. The author has an h-index of 9 and has co-authored 55 publications, which have received 298 citations.


Papers
Journal ArticleDOI
TL;DR: This work presents a novel real-time method for hand gesture recognition in which the hand region is extracted from the background with the background subtraction method and the palm and fingers are segmented so as to detect and recognize the fingers.
Abstract: Hand gesture recognition is very significant for human-computer interaction. In this work, we present a novel real-time method for hand gesture recognition. In our framework, the hand region is extracted from the background with the background subtraction method. Then, the palm and fingers are segmented so as to detect and recognize the fingers. Finally, a rule classifier is applied to predict the labels of hand gestures. The experiments on a data set of 1300 images show that our method performs well and is highly efficient. Moreover, our method shows better performance than a state-of-the-art method on another data set of hand gestures.
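The first step of the pipeline described in this abstract, background subtraction, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the grayscale nested-list image format, and the threshold of 30 are all assumptions chosen for illustration, and the later palm/finger segmentation and rule classifier are not shown.

```python
def subtract_background(frame, background, threshold=30):
    """Return a binary mask: 1 where the frame differs from the background.

    frame and background are same-sized grayscale images stored as nested
    lists of intensities (an assumed format). threshold is a hypothetical
    tuning value, not one taken from the paper.
    """
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def count_foreground(mask):
    """Count the pixels marked as foreground (candidate hand region)."""
    return sum(sum(row) for row in mask)


# Toy 3x4 example: a uniform background and a frame with a bright blob.
background = [[10, 10, 10, 10], [10, 10, 10, 10], [10, 10, 10, 10]]
frame = [[10, 200, 210, 10], [12, 220, 205, 10], [10, 10, 15, 10]]
mask = subtract_background(frame, background)
```

In this toy case the four bright pixels are marked as foreground, while small intensity changes (such as 10 to 12) stay below the threshold and are treated as background.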

99 citations

Journal ArticleDOI
TL;DR: An optimization approach that integrates the multiclass cues of the image, making full use of the cues in the given image instead of imposing extra requirements on the scene; the qualitative results show that the approach outperforms other methods under similar conditions.
Abstract: Illumination is a significant component of an image, and estimating the illumination of an outdoor scene from given images remains challenging despite its wide applications. Most traditional illumination estimation methods require prior knowledge or fixed objects within the scene, which often limits them to particular scenes. We propose an optimization approach that integrates the multiclass cues of the image(s) [a main input image and optional auxiliary input image(s)]. First, Sun visibility is estimated by the efficient broad learning system. Then, for a scene with a visible Sun, we classify the information in the image with the proposed classification algorithm, which combines geometric information and shadow information to make the most of the available cues. We apply a separate algorithm to each class to estimate the illumination parameters. Finally, our approach integrates all of the estimation results with a Markov random field. We make full use of the cues in the given image instead of imposing extra requirements on the scene, and the qualitative results show that our approach outperforms other methods under similar conditions.

23 citations

Journal ArticleDOI
TL;DR: A robust exemplar-based inpainting algorithm that combines color features and the spatial distance between two patches to search for the optimized patch, avoiding texture inconsistency.

22 citations

Journal ArticleDOI
TL;DR: A new framework for seamless video composition that uses a contour-flow propagation model to yield a trimap for each frame (a pre-segmentation into definite foreground, definite background and unknown) and then applies optimized mean-value cloning.
Abstract: As the process of pasting a source video patch into a target video sequence, seamless video composition is an important and useful video editing operation. Recently, a novel composition approach based on Mean-Value Coordinates has been presented. However, its composition result is often degraded by smudging and discoloration artifacts. Thus we propose optimized mean-value cloning to eliminate these artifacts by matting technique and interpolation constraint. On the basis of this optimized approach, we further present a new framework for seamless video composition. In the framework, we first develop a propagation model based on contour flow to yield each trimap that provides each frame with a pre-segmentation: definite foreground, definite background and unknown. This propagation model constructs the contour flow of inter-frame by minimizing a cost function, and employs it to relabel the trimap. Moreover, when the trimap propagation model is inefficient due to abrupt feature change and complex scene pattern, our framework has also implemented a convenient interactive tool to create and modify trimap. Then, we can use the high-quality trimap to achieve the optimized mean-value cloning. Experimental results demonstrate the effectiveness of our seamless video composition framework.

21 citations

Journal ArticleDOI
TL;DR: The proposed ESR model and region-matching algorithm are highly effective at image retrieval, and can achieve more accurate query results than current state-of-the-art methods.

18 citations


Cited by
Proceedings ArticleDOI
21 Jul 2017
TL;DR: A simplified convolutional neural network that combines local and global information through a multi-resolution 4×5 grid structure and uses a loss function inspired by the Mumford-Shah functional, which penalizes errors on the boundary, enabling near real-time, high performance saliency detection.
Abstract: Saliency detection aims to highlight the most relevant objects in an image. Methods using conventional models struggle whenever salient objects are pictured on top of a cluttered background while deep neural nets suffer from excess complexity and slow evaluation speeds. In this paper, we propose a simplified convolutional neural network which combines local and global information through a multi-resolution 4×5 grid structure. Instead of enforcing spatial coherence with a CRF or superpixels as is usually the case, we implemented a loss function inspired by the Mumford-Shah functional which penalizes errors on the boundary. We trained our model on the MSRA-B dataset, and tested it on six different saliency benchmark datasets. Results show that our method is on par with the state-of-the-art while reducing computation time by a factor of 18 to 100, enabling near real-time, high performance saliency detection.

505 citations

Journal ArticleDOI
TL;DR: A diverse NMF (DiNMF) approach is proposed, which enhances the diversity and reduces the redundancy among multiview representations with a newly defined diversity term and enables learning in linear execution time, along with a locality preserved DiNMF (LP-DiNMF) for more accurate learning.
Abstract: Non-negative matrix factorization (NMF), a method for finding parts-based representation of non-negative data, has shown remarkable competitiveness in data analysis. Given that real-world datasets often comprise multiple features or views which describe data from various perspectives, it is important to exploit diversity from multiple views for comprehensive and accurate data representations. Moreover, real-world datasets often come with high-dimensional features, which demands efficient low-dimensional representation learning approaches. To address these needs, we propose a diverse NMF (DiNMF) approach. It enhances the diversity and reduces the redundancy among multiview representations with a newly defined diversity term and enables learning in linear execution time. We further propose a locality preserved DiNMF (LP-DiNMF) for more accurate learning, which ensures diversity from multiple views while preserving the local geometry structure of data in each view. Efficient iterative updating algorithms are derived for both DiNMF and LP-DiNMF, along with proofs of convergence. Experiments on synthetic and real-world datasets have demonstrated the efficiency and accuracy of the proposed methods against the state-of-the-art approaches, proving the advantages of incorporating the proposed diversity term into NMF.
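For readers unfamiliar with the baseline this abstract builds on, the following is a minimal sketch of plain single-view NMF using the standard multiplicative update rules for the squared Frobenius error. It is not DiNMF or LP-DiNMF: there is no diversity term, no multiview input, and no locality preservation. The function names, the toy matrix, the random initialization, and the iteration count are all illustrative assumptions.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random initialization (an arbitrary choice).
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Toy non-negative matrix of rank 2.
v = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 1.0, 1.0]]
w0, h0 = nmf(v, rank=2, iterations=0)  # the random starting point
w, h = nmf(v, rank=2)                  # same seed, after 200 updates
```

Because every update multiplies a non-negative entry by a non-negative ratio, W and H stay non-negative throughout, which is what gives NMF its parts-based interpretation.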

107 citations

Journal ArticleDOI
TL;DR: A survey of recent research on processing large collections of images and video, including work on analysis, manipulation, and synthesis, with suggestions for possible future directions in this emerging research area.
Abstract: In recent years, the computer graphics and computer vision communities have devoted significant attention to research based on Internet visual media resources. The huge number of images and videos continually being uploaded by millions of people has stimulated a variety of visual media creation and editing applications, while also posing serious challenges of retrieval, organization, and utilization. This article surveys recent research on processing large collections of images and video, including work on analysis, manipulation, and synthesis. It discusses the problems involved, and suggests possible future directions in this emerging research area.

93 citations

Journal ArticleDOI
TL;DR: The survey finds that computer vision plays an important role in, and has large potential for, addressing challenges in agriculture, and that among existing machine learning techniques SVM gives better classification accuracy.

68 citations