Author

Huchuan Lu

Other affiliations: Adobe Systems
Bio: Huchuan Lu is an academic researcher from Dalian University of Technology. The author has contributed to research in topics: Computer science & Feature (computer vision). The author has an h-index of 71 and has co-authored 395 publications receiving 26,261 citations. Previous affiliations of Huchuan Lu include Adobe Systems.


Papers
Proceedings ArticleDOI
23 Jun 2013
TL;DR: This work considers both foreground and background cues in a different way: it ranks the similarity of image elements to foreground or background cues via graph-based manifold ranking, with saliency defined by their relevance to the given seeds or queries.
Abstract: Most existing bottom-up methods measure the foreground saliency of a pixel or region based on its contrast within a local context or the entire image, whereas a few methods focus on segmenting out background regions and thereby salient objects. Instead of considering the contrast between the salient objects and their surrounding regions, we consider both foreground and background cues in a different way. We rank the similarity of the image elements (pixels or regions) with foreground cues or background cues via graph-based manifold ranking. The saliency of the image elements is defined based on their relevance to the given seeds or queries. We represent the image as a closed-loop graph with superpixels as nodes. These nodes are ranked based on their similarity to background and foreground queries, using affinity matrices. Saliency detection is carried out in a two-stage scheme to extract background regions and foreground salient objects efficiently. Experimental results on two large benchmark databases demonstrate that the proposed method performs favorably against state-of-the-art methods in terms of accuracy and speed. We also create a more difficult benchmark database containing 5,172 images to test the proposed saliency model, and we make this database publicly available with this paper for further studies in the saliency field.

2,278 citations
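As a side note, the ranking step at the heart of this method has a simple closed form. Below is a minimal numpy sketch under simplifying assumptions: a dense random affinity matrix stands in for the paper's superpixel graph, and the per-side border maps of stage one are collapsed into a single background query vector.

```python
import numpy as np

def manifold_rank(W, y, alpha=0.99):
    """Closed-form graph-based manifold ranking: f* = (I - alpha*S)^(-1) y,
    with S the symmetrically normalized affinity matrix of the graph."""
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d) + 1e-12)  # S_ij = W_ij / sqrt(d_i * d_j)
    return np.linalg.solve(np.eye(len(W)) - alpha * S, y)

def normalize(f):
    return (f - f.min()) / (f.max() - f.min() + 1e-12)

# Toy two-stage run on a random graph of 50 "superpixels".
rng = np.random.default_rng(0)
W = rng.random((50, 50)); W = (W + W.T) / 2; np.fill_diagonal(W, 0.0)

# Stage 1: border superpixels serve as background queries; the complement
# of their relevance gives an initial saliency estimate.
y_bg = np.zeros(50); y_bg[:10] = 1.0  # pretend nodes 0-9 touch the image border
saliency = 1.0 - normalize(manifold_rank(W, y_bg))

# Stage 2: binarize the stage-1 map to obtain foreground queries, rank again.
y_fg = (saliency > saliency.mean()).astype(float)
final = normalize(manifold_rank(W, y_fg))
```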

Proceedings ArticleDOI
16 Jun 2012
TL;DR: A simple yet robust tracking method based on the structural local sparse appearance model which exploits both partial information and spatial information of the target based on a novel alignment-pooling method and employs a template update strategy which combines incremental subspace learning and sparse representation.
Abstract: Sparse representation has been applied to visual tracking by finding the best candidate with minimal reconstruction error using target templates. However, most sparse-representation-based trackers only consider the holistic representation and do not make full use of the sparse coefficients to discriminate between the target and the background, and hence are more likely to fail when similar objects or occlusions appear in the scene. In this paper, we develop a simple yet robust tracking method based on the structural local sparse appearance model. This representation exploits both partial information and spatial information of the target based on a novel alignment-pooling method. The similarity obtained by pooling across the local patches helps not only locate the target more accurately but also handle occlusion. In addition, we employ a template update strategy which combines incremental subspace learning and sparse representation. This strategy adapts the template to appearance changes of the target with a lower risk of drifting, and reduces the influence of occluded target templates as well. Both qualitative and quantitative evaluations on challenging benchmark image sequences demonstrate that the proposed tracking algorithm performs favorably against several state-of-the-art methods.

1,305 citations
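A rough sketch of the local sparse coding and pooling idea follows, with scikit-learn's Lasso standing in as the L1 solver. The paper codes each patch over a joint dictionary built from all templates and pools spatially aligned coefficient blocks; this is simplified here to one per-position dictionary per patch, and all names and shapes are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso  # stand-in for the paper's L1 solver

def local_sparse_similarity(cand, templates, lam=0.01):
    """cand: (P, d) vectorized local patches of one candidate region.
    templates: (T, P, d) patches from T target templates on the same grid.
    Each candidate patch is coded over the template patches at the SAME
    location; summing the non-negative coefficients pools the aligned
    evidence, so occluded patches contribute little mass."""
    P = cand.shape[0]
    scores = np.empty(P)
    for i in range(P):
        D = templates[:, i, :].T            # (d, T) aligned dictionary atoms
        solver = Lasso(alpha=lam, positive=True, max_iter=2000)
        solver.fit(D, cand[i])
        scores[i] = solver.coef_.sum()      # pooled coefficient mass here
    return scores.mean()                    # higher = better match to target
```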

Proceedings ArticleDOI
17 Dec 2018
TL;DR: In this paper, a deep mutual learning (DML) strategy is proposed in which, rather than one-way transfer of knowledge from a static teacher to a student network, an ensemble of students learn collaboratively and teach each other throughout the training process.
Abstract: Model distillation is an effective and widely used technique to transfer knowledge from a teacher to a student network. The typical application is to transfer from a powerful large network or ensemble to a small network, in order to meet the low-memory or fast-execution requirements. In this paper, we present a deep mutual learning (DML) strategy. Different from the one-way transfer between a static pre-defined teacher and a student in model distillation, with DML, an ensemble of students learn collaboratively and teach each other throughout the training process. Our experiments show that a variety of network architectures benefit from mutual learning and achieve compelling results on both category and instance recognition tasks. Surprisingly, it is revealed that no prior powerful teacher network is necessary: mutual learning of a collection of simple student networks works, and moreover outperforms distillation from a more powerful yet static teacher.

1,286 citations
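A minimal PyTorch sketch of the two-student mutual-learning objective described above: each student minimizes its supervised cross-entropy plus a KL term pulling it toward its peer. Detaching the peer's distribution mimics the paper's alternating per-network updates; extending to more students averages the KL terms over all peers.

```python
import torch
import torch.nn.functional as F

def dml_losses(logits1, logits2, labels):
    """Per-student losses for two-network deep mutual learning."""
    ce1 = F.cross_entropy(logits1, labels)
    ce2 = F.cross_entropy(logits2, labels)
    p1 = F.softmax(logits1, dim=1)
    p2 = F.softmax(logits2, dim=1)
    # KL(peer || self): F.kl_div expects log-probs as input, probs as target.
    kl1 = F.kl_div(F.log_softmax(logits1, dim=1), p2.detach(),
                   reduction='batchmean')
    kl2 = F.kl_div(F.log_softmax(logits2, dim=1), p1.detach(),
                   reduction='batchmean')
    return ce1 + kl1, ce2 + kl2  # backprop each loss into its own network
```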

Proceedings ArticleDOI
16 Jun 2012
TL;DR: A robust appearance model that exploits both holistic templates and local representations is proposed and the update scheme considers both the latest observations and the original template, thereby enabling the tracker to deal with appearance change effectively and alleviate the drift problem.
Abstract: In this paper, we propose a robust object tracking algorithm using a collaborative model. As the main challenge for object tracking is to account for drastic appearance change, we propose a robust appearance model that exploits both holistic templates and local representations. We develop a sparsity-based discriminative classifier (SDC) and a sparsity-based generative model (SGM). In the SDC module, we introduce an effective method to compute the confidence value that assigns more weight to the foreground than the background. In the SGM module, we propose a novel histogram-based method that takes the spatial information of each patch into consideration with an occlusion handling scheme. Furthermore, the update scheme considers both the latest observations and the original template, thereby enabling the tracker to deal with appearance change effectively and alleviate the drift problem. Numerous experiments on various challenging videos demonstrate that the proposed tracker performs favorably against several state-of-the-art algorithms.

1,069 citations
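The collaborative idea reduces to scoring each candidate by the product of a discriminative confidence and a generative, occlusion-aware similarity. The numpy sketch below illustrates only that combination step; the threshold, the histogram-intersection choice, and all names are illustrative rather than the paper's exact formulation.

```python
import numpy as np

def sgm_similarity(cand_hist, templ_hist, recon_err, occ_thresh=0.1):
    """Histogram-based generative similarity with occlusion handling.
    cand_hist, templ_hist: (P, K) per-patch sparse-coding histograms;
    recon_err: (P,) per-patch reconstruction errors. Patches whose error
    exceeds the threshold are treated as occluded and masked out."""
    visible = (recon_err < occ_thresh).astype(float)[:, None]  # (P, 1)
    return np.minimum(cand_hist * visible, templ_hist * visible).sum()

def collaborative_score(sdc_confidence, sgm_sim):
    # SDC (discriminative) and SGM (generative) scores are combined
    # multiplicatively to rank candidates within the tracking framework.
    return sdc_confidence * sgm_sim
```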

Proceedings ArticleDOI
07 Dec 2015
TL;DR: An in-depth study of the properties of CNN features pre-trained offline on massive image data for the ImageNet classification task motivates the design of the tracking system; extensive evaluation shows that the proposed tracker significantly outperforms the state of the art.
Abstract: We propose a new approach for general object tracking with a fully convolutional neural network. Instead of treating the convolutional neural network (CNN) as a black-box feature extractor, we conduct an in-depth study on the properties of CNN features pre-trained offline on massive image data for the classification task on ImageNet. The discoveries motivate the design of our tracking system. It is found that convolutional layers at different levels characterize the target from different perspectives. A top layer encodes more semantic features and serves as a category detector, while a lower layer carries more discriminative information and can better separate the target from distractors with similar appearance. Both layers are used jointly with a switch mechanism during tracking. It is also found that, for a tracking target, only a subset of neurons is relevant. A feature map selection method is developed to remove noisy and irrelevant feature maps, which reduces computational redundancy and improves tracking accuracy. Extensive evaluation on the widely used tracking benchmark [36] shows that the proposed tracker outperforms the state of the art significantly.

1,061 citations
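One way to picture the feature-map selection step: score each channel of a convolutional layer by how well it localizes the target and keep the top-k. The paper learns this via a regression on a target heat map and per-map significance; the sketch below substitutes a simple normalized cross-correlation against a binary target mask, so it is an illustration of the idea rather than the paper's exact method.

```python
import numpy as np

def select_feature_maps(feats, target_mask, k=100):
    """feats: (C, H, W) activations of one conv layer on the first frame;
    target_mask: (H, W) binary map of the target region (assumed given).
    Returns the k channels whose responses best correlate with the target."""
    C = feats.shape[0]
    m = target_mask - target_mask.mean()
    scores = np.empty(C)
    for c in range(C):
        f = feats[c] - feats[c].mean()
        denom = np.linalg.norm(f) * np.linalg.norm(m) + 1e-12
        scores[c] = (f * m).sum() / denom  # normalized cross-correlation
    keep = np.argsort(scores)[-k:]         # indices of the k best channels
    return feats[keep], keep
```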


Cited by
Proceedings ArticleDOI
07 Jun 2015
TL;DR: Inception is a deep convolutional neural network architecture that achieves a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22-layer-deep network, the quality of which is assessed in the context of classification and detection.

40,257 citations
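A minimal PyTorch sketch of the Inception building block the abstract alludes to: parallel 1x1, 3x3, and 5x5 convolutions plus a pooled projection, with 1x1 "reduce" convolutions keeping the computational budget in check before the larger filters. ReLUs after every convolution are mostly omitted here for brevity.

```python
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    """Four parallel branches whose outputs are concatenated channel-wise."""
    def __init__(self, c_in, c1, c3r, c3, c5r, c5, cp):
        super().__init__()
        self.b1 = nn.Conv2d(c_in, c1, 1)                       # 1x1 branch
        self.b3 = nn.Sequential(nn.Conv2d(c_in, c3r, 1),       # 1x1 reduce
                                nn.ReLU(inplace=True),
                                nn.Conv2d(c3r, c3, 3, padding=1))
        self.b5 = nn.Sequential(nn.Conv2d(c_in, c5r, 1),       # 1x1 reduce
                                nn.ReLU(inplace=True),
                                nn.Conv2d(c5r, c5, 5, padding=2))
        self.bp = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                nn.Conv2d(c_in, cp, 1))        # pool projection

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.bp(x)], 1)

# The first module of GoogLeNet ("inception 3a") uses these widths:
# InceptionModule(192, 64, 96, 128, 16, 32, 32) -> 64+128+32+32 = 256 channels.
```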

Proceedings ArticleDOI
23 Jun 2013
TL;DR: Large scale experiments are carried out with various evaluation criteria to identify effective approaches for robust tracking and provide potential future research directions in this field.
Abstract: Object tracking is one of the most important components in numerous applications of computer vision. While much progress has been made in recent years with efforts on sharing code and datasets, it is of great importance to develop a library and benchmark to gauge the state of the art. After briefly reviewing recent advances of online object tracking, we carry out large scale experiments with various evaluation criteria to understand how these algorithms perform. The test image sequences are annotated with different attributes for performance evaluation and analysis. By analyzing quantitative results, we identify effective approaches for robust tracking and provide potential future research directions in this field.

3,828 citations
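For reference, the benchmark's standard evaluation criteria are straightforward to reproduce. Below is a short numpy sketch of the success-plot metric (fraction of frames whose predicted/ground-truth box overlap exceeds each threshold, summarized by the area under the curve); the companion precision plot based on center-location error follows the same pattern. Box format and helper names are illustrative.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2 = min(a[0] + a[2], b[0] + b[2])
    y2 = min(a[1] + a[3], b[1] + b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)

def success_curve(pred_boxes, gt_boxes, thresholds=np.linspace(0, 1, 21)):
    """Success rate at each overlap threshold; np.trapz of this curve
    gives the AUC score commonly used to rank trackers."""
    overlaps = np.array([iou(p, g) for p, g in zip(pred_boxes, gt_boxes)])
    return np.array([(overlaps > t).mean() for t in thresholds])
```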

Journal ArticleDOI
TL;DR: A broad survey of recent advances in convolutional neural networks is provided in this article; the authors discuss improvements to CNNs in several aspects, namely layer design, activation function, loss function, regularization, optimization, and fast computation.

3,125 citations

Journal ArticleDOI
TL;DR: In this article, a review of deep learning-based object detection frameworks is provided, focusing on typical generic object detection architectures along with some modifications and useful tricks to improve detection performance further.
Abstract: Due to object detection's close relationship with video analysis and image understanding, it has attracted much research attention in recent years. Traditional object detection methods are built on handcrafted features and shallow trainable architectures. Their performance easily stagnates, even when complex ensembles are constructed that combine multiple low-level image features with high-level context from object detectors and scene classifiers. With the rapid development of deep learning, more powerful tools, which are able to learn semantic, high-level, deeper features, have been introduced to address the problems of traditional architectures. These models differ in network architecture, training strategy, and optimization function. In this paper, we provide a review of deep learning-based object detection frameworks. Our review begins with a brief introduction to the history of deep learning and its representative tool, the convolutional neural network. Then, we focus on typical generic object detection architectures along with some modifications and useful tricks to improve detection performance further. As distinct specific detection tasks exhibit different characteristics, we also briefly survey several specific tasks, including salient object detection, face detection, and pedestrian detection. Experimental analyses are also provided to compare various methods and draw some meaningful conclusions. Finally, several promising directions and tasks are provided to serve as guidelines for future work in both object detection and relevant neural network-based learning systems.

3,097 citations