
Showing papers by Guo-Jun Qi published in 2008


Proceedings ArticleDOI
23 Jun 2008
TL;DR: This work proposes an integrated multi-label multi-instance learning (MLMIL) approach based on hidden conditional random fields (HCRFs), which simultaneously captures both the connections between semantic labels and regions and the correlations among the labels in a single formulation.
Abstract: In the real world, an image is usually associated with multiple labels that are characterized by different regions in the image. Image classification is therefore naturally posed as both a multi-label and a multi-instance learning problem. Unlike existing research, which has considered these two problems separately, we propose an integrated multi-label multi-instance learning (MLMIL) approach based on hidden conditional random fields (HCRFs), which simultaneously captures both the connections between semantic labels and regions and the correlations among the labels in a single formulation. We apply this MLMIL framework to image classification and report superior performance compared to key existing approaches on the MSR Cambridge (MSRC) and Corel data sets.
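To make the formulation concrete, here is a hypothetical sketch of how an MLMIL-style model might score a joint label assignment: a unary term connects hidden region-to-concept assignments to region features, and a pairwise term encodes label correlations. All parameter names here are illustrative, not taken from the paper.

```python
import numpy as np

def mlmil_score(regions, labels, W_region, W_corr):
    """Score a candidate multi-label assignment for one image.

    regions:  (n_regions, d) array of region features (the instances)
    labels:   (n_concepts,) binary vector of image-level labels
    W_region: (n_concepts, d) weights linking region features to concepts
    W_corr:   (n_concepts, n_concepts) label co-occurrence weights
    """
    # Unary term: each active concept is supported by its best-matching
    # region (a max over hidden region-to-concept assignments).
    region_scores = regions @ W_region.T          # (n_regions, n_concepts)
    unary = np.sum(labels * region_scores.max(axis=0))

    # Pairwise term: reward or penalize label co-occurrences.
    pairwise = labels @ W_corr @ labels
    return unary + pairwise
```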

245 citations


Proceedings ArticleDOI
23 Jun 2008
TL;DR: This paper proposes a two-dimensional active learning scheme that considers not only the sample dimension but also the label dimension, and shows that the traditional active learning formulation is a special case of 2DAL when there is only one label.
Abstract: In this paper, we propose a two-dimensional active learning scheme and show its application to image classification. Traditional active learning methods select samples only along the sample dimension. While this is the right strategy for binary classification, it is sub-optimal for multi-label classification. In multi-label classification, we argue that, for each selected sample, only the more informative labels need to be annotated, while the others can be inferred by exploiting the correlations among the labels. The reason is that, due to the inherent label correlations, different labels contribute differently to minimizing the classification error. To this end, we propose to select sample-label pairs, rather than only samples, to minimize a multi-label Bayesian classification error bound. This new active learning strategy considers not only the sample dimension but also the label dimension, and we call it Two-Dimensional Active Learning (2DAL). We also show that the traditional active learning formulation is a special case of 2DAL when there is only one label. Extensive experiments conducted on two real-world applications show that 2DAL significantly outperforms the best existing approaches, which do not take label correlations into account.
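As a rough illustration of selecting along both dimensions, the following sketch scores every un-annotated (sample, label) pair by prediction uncertainty weighted by how strongly that label correlates with the others. This heuristic is a stand-in for the Bayesian error-bound criterion derived in the paper; all inputs (`probs`, `label_corr`, `annotated`) are assumptions for the sketch.

```python
import numpy as np

def select_pairs(probs, label_corr, annotated, k):
    """Pick the k most valuable un-annotated (sample, label) pairs.

    probs:      (n_samples, n_labels) predicted P(y=1 | x)
    label_corr: (n_labels, n_labels) label correlation matrix
    annotated:  boolean mask of already-annotated (sample, label) pairs
    Returns a list of (sample_index, label_index) tuples.
    """
    uncertainty = 1.0 - np.abs(2.0 * probs - 1.0)   # peaks at p = 0.5
    # A label correlated with many others is more informative to query,
    # since its value helps infer the remaining labels.
    informativeness = np.abs(label_corr).sum(axis=1)
    score = uncertainty * informativeness            # broadcast over labels
    score[annotated] = -np.inf                       # skip known pairs
    flat = np.argsort(score, axis=None)[::-1][:k]
    return [np.unravel_index(f, score.shape) for f in flat]
```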

144 citations


Journal ArticleDOI
TL;DR: This article proposes another paradigm for video annotation that simultaneously annotates the concepts and models the correlations among them in a single step via the proposed Correlative Multilabel (CML) method, which benefits from the compensation of complementary information between different labels.
Abstract: Automatic video annotation is an important ingredient for semantic-level video browsing, search, and navigation, and has attracted much attention in recent years. This research has evolved through two paradigms. In the first paradigm, each concept is annotated individually by a pre-trained binary classifier. This method, however, ignores the rich correlations among video concepts and has achieved only limited success. The methods in the second paradigm, which evolved from the first, add an extra step on top of the individual classifiers to fuse the multiple concept detections. However, the performance of these methods can be degraded by errors propagated from the first step to the second fusion step. In this article, another paradigm of video annotation is proposed to address these problems. It simultaneously annotates the concepts and models the correlations among them in a single step via the proposed Correlative Multilabel (CML) method, which benefits from the compensation of complementary information between different labels. Furthermore, since video clips are composed of temporally ordered frame sequences, we extend the proposed method to exploit the rich temporal information in the videos. Specifically, a temporal kernel is incorporated into the CML method based on the discriminative information between Hidden Markov Models (HMMs) learned from the videos. We compare the proposed approach against state-of-the-art approaches from the first and second paradigms on the widely used TRECVID data set and show that the proposed method achieves superior performance.
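The one-step formulation can be pictured as a structured linear model over a "combination feature vector" phi(x, y) that stacks per-concept feature blocks with blocks encoding pairwise label configurations (this construction is also what the correlative multi-label patent listed below describes). The block layout in this sketch is illustrative rather than the paper's exact definition.

```python
import numpy as np
from itertools import combinations

def combination_feature(x, y):
    """x: (d,) low-level feature vector; y: (K,) labels in {-1, +1}.
    Returns phi(x, y) of length K*d + K*(K-1)//2 * 4."""
    K = len(y)
    blocks = [y[k] * x for k in range(K)]           # per-concept features
    for j, k in combinations(range(K), 2):          # label co-occurrence
        # One indicator per joint configuration of (y_j, y_k).
        pair = np.zeros(4)
        pair[(y[j] > 0) * 2 + (y[k] > 0)] = 1.0
        blocks.append(pair)
    return np.concatenate(blocks)

# Annotation then amounts to maximizing w . phi(x, y) over label vectors y,
# so one weight vector jointly scores all labels and their correlations.
```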

68 citations


Proceedings ArticleDOI
26 Oct 2008
TL;DR: This paper proposes a scalable framework for annotation-based video search, as well as a novel approach to enable large-scale semantic concept annotation, that is, online multi-label active learning, scalable to both the video sample dimension and concept label dimension.
Abstract: Existing video search engines have not taken advantage of video content analysis and semantic understanding. Video search in academia uses semantic annotation to approach content-based indexing, and we argue this is a promising direction toward real content-based video search. However, due to the complexity of both video data and semantic concepts, existing techniques for automatic video annotation still cannot handle large-scale video sets and large-scale concept sets, in terms of both annotation accuracy and computation cost. To address this problem, in this paper we propose a scalable framework for annotation-based video search, together with a novel approach to enable large-scale semantic concept annotation: online multi-label active learning. This framework is scalable in both the video sample dimension and the concept label dimension. Large-scale unlabeled video samples are assumed to arrive consecutively in batches, with an initial pre-labeled training set from which a preliminary multi-label classifier is built. For each arriving batch, a multi-label active learning engine is applied, which automatically selects a set of unlabeled sample-label pairs for manual annotation. An online learner then updates the original classifier by taking the newly labeled sample-label pairs into account. This process repeats until all data have arrived. During the process, new labels, even those without any pre-labeled training samples, can be incorporated at any time. Experiments on the TRECVID dataset demonstrate the effectiveness and efficiency of the proposed framework.
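Here is a runnable toy version of this batch loop; all components (uncertainty-based pair selection, a simulated human oracle, an online logistic update) are simplified stand-ins for the paper's selection engine and online learner.

```python
import numpy as np

rng = np.random.default_rng(0)
n_labels, dim, k = 5, 16, 8
W = np.zeros((n_labels, dim))                      # per-label linear models
true_W = rng.normal(size=(n_labels, dim))          # hidden "ground truth"

def oracle(x, j):                                  # simulated human annotator
    return 1.0 if true_W[j] @ x > 0 else -1.0

for _ in range(20):                                # batches arrive over time
    batch = rng.normal(size=(50, dim))
    probs = 1.0 / (1.0 + np.exp(-(batch @ W.T)))   # preliminary predictions
    score = 1.0 - np.abs(2.0 * probs - 1.0)        # uncertainty per pair
    top = np.argsort(score, axis=None)[::-1][:k]   # pick k sample-label pairs
    for flat in top:
        i, j = np.unravel_index(flat, score.shape)
        y = oracle(batch[i], j)                    # manual annotation
        p = 1.0 / (1.0 + np.exp(-(W[j] @ batch[i])))
        W[j] += 0.1 * ((y + 1) / 2 - p) * batch[i] # online logistic update
```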

46 citations


Journal ArticleDOI
TL;DR: KLNP improves on the recently proposed linear neighborhood propagation (LNP) by tackling the limitation of its locally linear assumption on the distribution of semantics, combining the consistency assumption with the local linear embedding method in a nonlinear kernel-mapped space.
Abstract: The insufficiency of labeled training data for representing the distribution of an entire dataset is a major obstacle in the automatic semantic annotation of large-scale video databases. Semi-supervised learning algorithms, which attempt to learn from both labeled and unlabeled data, are a promising way to solve this problem. In this paper, a novel graph-based semi-supervised learning method named kernel linear neighborhood propagation (KLNP) is proposed and applied to video annotation. This approach combines the consistency assumption, the basic assumption in semi-supervised learning, with the local linear embedding (LLE) method in a nonlinear kernel-mapped space. KLNP improves on the recently proposed linear neighborhood propagation (LNP) by tackling the limitation of its locally linear assumption on the distribution of semantics. Experiments conducted on the TRECVID data set demonstrate that this approach outperforms other popular graph-based semi-supervised learning methods for video semantic annotation.
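A minimal sketch of this pipeline under stated assumptions: LLE-style reconstruction weights are solved in an RBF-kernel-induced space, and labels are then propagated through the resulting graph. The regularization constant and the closed-form propagation step are illustrative choices, not the paper's exact settings.

```python
import numpy as np

def klnp(X, Y, n_neighbors=5, gamma=1.0, alpha=0.9):
    """X: (n, d) features; Y: (n, c) labels, zero rows for unlabeled points."""
    n = len(X)
    sq = ((X[:, None] - X[None]) ** 2).sum(-1)     # pairwise squared dists
    K = np.exp(-gamma * sq)                        # RBF kernel matrix
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(sq[i])[1:n_neighbors + 1]
        # Local Gram matrix in the kernel-mapped space:
        # G_jk = <phi(x_j) - phi(x_i), phi(x_k) - phi(x_i)>
        G = (K[np.ix_(nbrs, nbrs)] - K[i, nbrs][None]
             - K[i, nbrs][:, None] + K[i, i])
        w = np.linalg.solve(G + 1e-3 * np.eye(n_neighbors),
                            np.ones(n_neighbors))
        W[i, nbrs] = w / w.sum()                   # reconstruction weights
    # Propagate labels to convergence (closed form of the iteration).
    return np.linalg.solve(np.eye(n) - alpha * W, (1 - alpha) * Y)
```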

44 citations


Proceedings ArticleDOI
26 Oct 2008
TL;DR: This paper proposes an integrated graph-based semi-supervised learning framework to utilize these two types of representations simultaneously, and explores an effective and computationally efficient strategy to convert the multiple-instance representation into a single-instance one.
Abstract: Recently, many learning methods based on multiple-instance (local) or single-instance (global) representations of images have been proposed for image annotation. Their performance on image annotation, however, is mixed: for certain concepts the single-instance representations are more suitable, while for others the multiple-instance representations are better. In this paper, we therefore explore a unified learning framework that combines the multiple-instance and single-instance representations for image annotation. More specifically, we propose an integrated graph-based semi-supervised learning framework that utilizes these two types of representations simultaneously, and explore an effective and computationally efficient strategy to convert the multiple-instance representation into a single-instance one. Experiments conducted on the Corel image dataset show the effectiveness and efficiency of the proposed integrated framework.
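One plausible reading of such a conversion step, sketched below with assumed details: each bag of region features is embedded as a fixed-length vector of its maximal similarities to a set of prototype instances, and the affinity graphs built from the two representations are then blended for label propagation. This is illustrative, not necessarily the paper's exact strategy.

```python
import numpy as np

def bag_to_vector(bag, prototypes, gamma=1.0):
    """bag: (n_regions, d) region features; prototypes: (m, d).
    Returns an m-dim vector: max similarity of any region to each prototype."""
    sq = ((bag[:, None] - prototypes[None]) ** 2).sum(-1)
    return np.exp(-gamma * sq).max(axis=0)

def combined_affinity(global_feats, bag_vecs, gamma=1.0, beta=0.5):
    """Blend affinity graphs from the single- and multiple-instance views."""
    def rbf(Z):
        sq = ((Z[:, None] - Z[None]) ** 2).sum(-1)
        return np.exp(-gamma * sq)
    return beta * rbf(np.asarray(global_feats)) + \
           (1 - beta) * rbf(np.asarray(bag_vecs))
```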

25 citations


Patent
Guo-Jun Qi, Xian-Sheng Hua, Yong Rui, Hong-Jiang Zhang, Shipeng Li
13 Feb 2008
TL;DR: In this patent, a classifier annotates an image by implementing a labeling function that maps an input feature space and a label space to a combination feature vector, which models both the features of the individual concepts and the correlations among the concepts.
Abstract: Correlative multi-label image annotation may entail annotating an image by indicating respective labels for respective concepts. In an example embodiment, a classifier is to annotate an image by implementing a labeling function that maps an input feature space and a label space to a combination feature vector. The combination feature vector models both features of individual ones of the concepts and correlations among the concepts.

22 citations


Patent
25 Sep 2008
TL;DR: In this paper, a preliminary classifier is constructed from a pre-labeled training set included with an initial batch of annotated data samples, and a first batch of sample-label pairs is selected using a sample-label pair selection module.
Abstract: Online multi-label active annotation may include building a preliminary classifier from a pre-labeled training set included with an initial batch of annotated data samples, and selecting a first batch of sample-label pairs from the initial batch of annotated data samples. The sample-label pairs may be selected using a sample-label pair selection module. The first batch of sample-label pairs may be provided to online participants, who manually annotate them based on the preliminary classifier. The preliminary classifier may be updated to form a first updated classifier based on an outcome of providing the first batch of sample-label pairs to the online participants.

21 citations


01 Jan 2008
TL;DR: This paper describes the MSRA experiments for TRECVID 2008, which investigated the benefit of global and local low-level features using a variety of learning-based methods, including supervised and semi-supervised learning algorithms.
Abstract: This paper describes the MSRA experiments for TRECVID 2008. We participated in the high-level feature extraction and automatic search tasks. For high-level feature extraction, we investigated the benefit of global and local low-level features using a variety of learning-based methods, including supervised and semi-supervised learning algorithms. For automatic search, we focused on text and visual baselines, query-independent learning, and various reranking methods.

18 citations


Proceedings ArticleDOI
23 Jun 2008
TL;DR: A new distance measure is proposed that integrates joint appearance-spatial image features and is computed as an upper bound of an information-theoretic discrimination, and can be computed efficiently in a recursive formulation that scales well to image size.
Abstract: The goal of image categorization is to classify a collection of unlabeled images into a set of predefined classes to support semantic-level image retrieval. The distance measures used in most existing approaches either ignore the spatial structures of images or use them only in a separate step; as a result, they have achieved only limited success. To address these difficulties, in this paper we propose a new distance measure that integrates joint appearance-spatial image features. The distance is computed as an upper bound of an information-theoretic discrimination, and can be evaluated efficiently in a recursive formulation that scales well with image size. In addition, the upper-bound approximation can be further tightened via adaptation learning from a universal reference model. Extensive experiments on two widely used data sets show that the proposed approach significantly outperforms the state-of-the-art approaches.
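A loose illustration of this kind of distance, with assumed details: each image is summarized as a grid of local Gaussian appearance models, and per-node Gaussian KL divergences are accumulated in a single scan that folds in each node's already-visited neighbors. This toy recursion stands in for the paper's information-theoretic upper bound rather than reproducing it.

```python
import numpy as np

def gaussian_kl(mu0, var0, mu1, var1):
    """KL divergence between two diagonal Gaussians."""
    return 0.5 * np.sum(np.log(var1 / var0)
                        + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

def grid_distance(model_a, model_b, rho=0.5):
    """model_*: dict mapping (row, col) -> (mu, var) over the same grid.
    Spatial context enters by mixing each node's divergence with that of
    its upper and left neighbors (weight rho) in one scan-order pass."""
    acc, total = {}, 0.0
    for (r, c) in sorted(model_a):                 # row-major scan order
        d = gaussian_kl(*model_a[(r, c)], *model_b[(r, c)])
        ctx = [acc[n] for n in ((r - 1, c), (r, c - 1)) if n in acc]
        acc[(r, c)] = d + rho * np.mean(ctx) if ctx else d
        total += acc[(r, c)]
    return total
```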

10 citations


Patent
24 Sep 2008
TL;DR: In this article, a kernelized spatial-contextual image classification is described, which consists of generating a first spatial-contextual model to represent a first image, the first spatial-contextual model having a plurality of interconnected nodes arranged in a first pattern of connections with each node connected to at least one other node.
Abstract: Kernelized spatial-contextual image classification is disclosed. One embodiment comprises generating a first spatial-contextual model to represent a first image, the first spatial-contextual model having a plurality of interconnected nodes arranged in a first pattern of connections with each node connected to at least one other node; generating a second spatial-contextual model to represent a second image using the first pattern of connections; and estimating the distance between corresponding nodes in the first spatial-contextual model and the second spatial-contextual model based on a relationship with adjacent connected nodes, to determine a distance between the first image and the second image.


Proceedings ArticleDOI
26 Aug 2008
TL;DR: The proposed approach takes a query-document pair as a sample and extracts a set of query-independent textual and visual features from each pair; it is suitable for a real-world video search system since the learned relevance relation is independent of any query.
Abstract: Most existing learning-based methods for query-by-example treat the query examples as "positive" and build a model for each query. These methods, referred to as query-dependent, have achieved only limited success, as they can hardly be applied to real-world applications in which an arbitrary query may be given. To address this problem, we propose to learn a query-independent model by exploiting the relevance information that exists in query-document pairs. The proposed approach takes a query-document pair as a sample and extracts a set of query-independent textual and visual features from each pair. It is general and suitable for a real-world video search system, since the learned relevance relation is independent of any query. We conducted extensive experiments on the TRECVID 2005-2007 corpus and show superior performance (+37% in Mean Average Precision) over query-dependent learning approaches.
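A minimal sketch of the query-independent idea under stated assumptions: each training sample is a (query, document) pair described only by relational features, so a single model serves any future query. The two toy features below (term overlap and visual-feature similarity) are illustrative stand-ins for the paper's textual and visual feature set.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Query:
    terms: set            # query keywords
    visual: np.ndarray    # feature of the visual query example

@dataclass
class Doc:
    terms: set            # document/ASR keywords
    visual: np.ndarray    # feature of the document keyframe

def pair_features(q: Query, d: Doc) -> np.ndarray:
    """Query-independent relational features for one query-document pair."""
    overlap = len(q.terms & d.terms) / max(len(q.terms), 1)
    vis_sim = float(q.visual @ d.visual /
                    (np.linalg.norm(q.visual) * np.linalg.norm(d.visual) + 1e-9))
    return np.array([overlap, vis_sim])

# Stack pair_features over judged pairs from many past queries and fit any
# standard binary classifier on relevant / non-relevant labels; at search
# time, rank documents for an unseen query by the classifier's score.
```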