Conference

Conference on Image and Video Retrieval

About: The Conference on Image and Video Retrieval is an academic conference. It publishes mainly in the areas of image retrieval and image processing. Over its lifetime, the conference has published 596 papers, which have received 20,374 citations.

Topics: Image retrieval, Image processing, Visual Word, TRECVID, Automatic image annotation

Papers

Proceedings Article

NUS-WIDE: a real-world web image database from National University of Singapore

Tat-Seng Chua¹, Jinhui Tang¹, Richang Hong¹, Haojie Li¹, Zhiping Luo¹, Yan-Tao Zheng¹

National University of Singapore¹

08 Jul 2009

TL;DR: The benchmark results indicate that it is possible to learn effective models from a sufficiently large image dataset to facilitate general image retrieval. The paper also identifies four research issues in web image annotation and retrieval.

Abstract: This paper introduces a web image dataset created by NUS's Lab for Media Search. The dataset includes: (1) 269,648 images and the associated tags from Flickr, with a total of 5,018 unique tags; (2) six types of low-level features extracted from these images, including 64-D color histogram, 144-D color correlogram, 73-D edge direction histogram, 128-D wavelet texture, 225-D block-wise color moments extracted over 5x5 fixed grid partitions, and 500-D bag of words based on SIFT descriptions; and (3) ground-truth for 81 concepts that can be used for evaluation. Based on this dataset, we highlight characteristics of Web image collections and identify four research issues on web image annotation and retrieval. We also provide the baseline results for web image annotation by learning from the tags using the traditional k-NN algorithm. The benchmark results indicate that it is possible to learn effective models from a sufficiently large image dataset to facilitate general image retrieval.
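
As a rough illustration of the k-NN tag-learning baseline mentioned in the abstract, the sketch below ranks candidate tags for a query image by counting how often each tag occurs among its nearest tagged neighbours. The function name `knn_annotate`, the Euclidean metric, the vote-counting rule, and the toy feature vectors are all assumptions made for illustration; the abstract does not specify them.

```python
import math
from collections import Counter


def knn_annotate(query, train_features, train_tags, k=3, top_n=2):
    """Rank tags for `query` by votes from its k nearest training images.

    Hypothetical helper: Euclidean distance and simple vote counting are
    assumptions, not details taken from the NUS-WIDE paper.
    """
    # Order training images by distance to the query feature vector.
    order = sorted(
        range(len(train_features)),
        key=lambda i: math.dist(query, train_features[i]),
    )
    votes = Counter()
    for i in order[:k]:
        votes.update(train_tags[i])  # each neighbour votes once per tag
    # Sort by vote count (descending), then alphabetically to break ties.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:top_n]]


# Toy data: 2-D stand-ins for the low-level features described above.
train_features = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
train_tags = [{"sky", "clouds"}, {"sky", "sea"}, {"grass", "tree"}, {"grass", "dog"}]
predicted = knn_annotate([0.85, 0.15], train_features, train_tags, k=2, top_n=1)
print(predicted)
```

In this toy setup the two nearest neighbours share only the tag "sky", so it receives the most votes. With real data the feature vector would be one of the six descriptor types listed in the abstract.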

2,648 citations

Proceedings Article

Representing shape with a spatial pyramid kernel

Anna Bosch¹, Andrew Zisserman², X. Munoz¹

University of Girona¹, University of Oxford²

09 Jul 2007

TL;DR: This work introduces a descriptor that represents local image shape and its spatial layout, together with a spatial pyramid kernel that is designed so that the shape correspondence between two images can be measured by the distance between their descriptors using the kernel.

Abstract: The objective of this paper is classifying images by the object categories they contain, for example motorbikes or dolphins. There are three areas of novelty. First, we introduce a descriptor that represents local image shape and its spatial layout, together with a spatial pyramid kernel. These are designed so that the shape correspondence between two images can be measured by the distance between their descriptors using the kernel. Second, we generalize the spatial pyramid kernel, and learn its level weighting parameters (on a validation set). This significantly improves classification performance. Third, we show that shape and appearance kernels may be combined (again by learning parameters on a validation set). Results are reported for classification on Caltech-101 and retrieval on the TRECVID 2006 data sets. For Caltech-101 it is shown that the class specific optimization that we introduce exceeds the state-of-the-art performance by more than 10%.
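
The idea of comparing descriptors level by level with per-level weights can be sketched as follows. This is a minimal sketch under stated assumptions: the quantized-orientation input format, the chi-square distance, the fixed example weights, and the helper names `pyramid_histograms`, `chi_square` and `pyramid_distance` are all illustrative choices that the abstract does not specify. In the paper the level weights are learned on a validation set.

```python
def pyramid_histograms(points, levels, bins):
    """Build one normalized histogram per pyramid level.

    `points` is a list of (x, y, b) with x, y in [0, 1) and b an integer
    bin index (for example a quantized edge orientation; an assumption).
    Level l splits the image into a 2**l by 2**l grid of cells.
    """
    pyramid = []
    for level in range(levels + 1):
        cells = 2 ** level
        hist = [0.0] * (cells * cells * bins)
        for x, y, b in points:
            cx = min(int(x * cells), cells - 1)
            cy = min(int(y * cells), cells - 1)
            hist[(cy * cells + cx) * bins + b] += 1.0
        total = sum(hist) or 1.0
        pyramid.append([h / total for h in hist])
    return pyramid


def chi_square(p, q):
    """Chi-square distance between two histograms (assumed metric)."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def pyramid_distance(pyr_a, pyr_b, weights):
    """Weighted sum of per-level distances; `weights` has one entry per level."""
    return sum(w * chi_square(a, b) for w, a, b in zip(weights, pyr_a, pyr_b))


# Two toy "images" with the same global histogram but a swapped layout.
img_a = [(0.1, 0.1, 0), (0.9, 0.9, 1)]
img_b = [(0.9, 0.9, 0), (0.1, 0.1, 1)]
pyr_a = pyramid_histograms(img_a, levels=1, bins=2)
pyr_b = pyramid_histograms(img_b, levels=1, bins=2)
level_weights = [0.5, 0.5]  # placeholder values, not learned
distance = pyramid_distance(pyr_a, pyr_b, level_weights)
print(distance)
```

The two toy images are indistinguishable at level 0 and differ only once the grid is subdivided, which is the kind of layout information a pyramid representation is meant to capture.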

1,496 citations

Proceedings Article

Towards optimal bag-of-features for object categorization and semantic video retrieval

Yu-Gang Jiang¹, Chong-Wah Ngo¹, Jun Yang²

City University of Hong Kong¹, Carnegie Mellon University²

09 Jul 2007

TL;DR: This paper evaluates various factors which govern the performance of Bag-of-features, and proposes a novel soft-weighting method to assess the significance of a visual word to an image and experimentally shows it can consistently offer better performance than other popular weighting methods.

Abstract: Bag-of-features (BoF) deriving from local keypoints has recently appeared promising for object and scene classification. Whether BoF can naturally survive the challenges such as reliability and scalability of visual classification, nevertheless, remains uncertain due to various implementation choices. In this paper, we evaluate various factors which govern the performance of BoF. The factors include the choices of detector, kernel, vocabulary size and weighting scheme. We offer some practical insights into how to optimize the performance by choosing a good keypoint detector and kernel. For the weighting scheme, we propose a novel soft-weighting method to assess the significance of a visual word to an image. We experimentally show that the proposed soft-weighting scheme can consistently offer better performance than other popular weighting methods. On both PASCAL-2005 and TRECVID-2006 datasets, our BoF setting generates competitive performance compared to the state-of-the-art techniques. We also show that the BoF is highly complementary to global features. By incorporating the BoF with color and texture features, an improvement of 50% is reported on the TRECVID-2006 dataset.
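
To make the notion of soft weighting concrete, the sketch below lets each keypoint descriptor contribute to its several nearest visual words instead of only the closest one. It is a generic soft-assignment sketch, not the paper's scheme: the function name `soft_weight_histogram`, the halving of contributions by neighbour rank, and the similarity 1 / (1 + distance) are assumptions, since the abstract does not give the formula.

```python
import math


def soft_weight_histogram(descriptors, vocabulary, top_n=2):
    """Accumulate a soft-weighted bag-of-features vector.

    Each descriptor votes for its `top_n` nearest visual words. The vote
    is an assumed similarity, 1 / (1 + distance), halved for every step
    down in neighbour rank.
    """
    weights = [0.0] * len(vocabulary)
    for d in descriptors:
        # Rank visual words by Euclidean distance to this descriptor.
        ranked = sorted(
            range(len(vocabulary)),
            key=lambda k: math.dist(d, vocabulary[k]),
        )
        for rank, k in enumerate(ranked[:top_n]):
            similarity = 1.0 / (1.0 + math.dist(d, vocabulary[k]))
            weights[k] += similarity / (2 ** rank)
    return weights


# Toy vocabulary of three 2-D "visual words" and a single descriptor.
vocabulary = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
descriptors = [[0.0, 0.0]]
bof = soft_weight_histogram(descriptors, vocabulary, top_n=2)
print(bof)
```

The descriptor coincides with the first word, so that word receives the full vote, the second-nearest word receives a smaller share, and the distant third word receives nothing.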

694 citations

Proceedings Article

Evaluation of GIST descriptors for web-scale image search

Matthijs Douze¹, Hervé Jégou¹, Harsimrat Sandhawalia¹, Laurent Amsaleg², Cordelia Schmid¹

French Institute for Research in Computer Science and Automation¹, Centre national de la recherche scientifique²

08 Jul 2009

TL;DR: This paper evaluates the search accuracy and complexity of the global GIST descriptor for two applications, for which a local description is usually preferred: same location/object recognition and copy detection, and proposes an indexing strategy for global descriptors that optimizes the trade-off between memory usage and precision.

Abstract: The GIST descriptor has recently received increasing attention in the context of scene recognition. In this paper we evaluate the search accuracy and complexity of the global GIST descriptor for two applications, for which a local description is usually preferred: same location/object recognition and copy detection. We identify the cases in which a global description can reasonably be used. The comparison is performed against a state-of-the-art bag-of-features representation. To evaluate the impact of GIST's spatial grid, we compare GIST with a bag-of-features restricted to the same spatial grid as in GIST. Finally, we propose an indexing strategy for global descriptors that optimizes the trade-off between memory usage and precision. Our scheme provides a reasonable accuracy in some widespread application cases together with very high efficiency: In our experiments, querying an image database of 110 million images takes 0.18 second per image on a single machine. For common copyright attacks, this efficiency is obtained without noticeably sacrificing the search accuracy compared with state-of-the-art approaches.

429 citations

Proceedings Article

Video copy detection: a comparative study

Julien Law-To, Li Chen, Alexis Joly¹, Ivan Laptev¹, Olivier Buisson, Valérie Gouet-Brunet¹, Nozha Boujemaa¹, Fred Stentiford

French Institute for Research in Computer Science and Automation¹

09 Jul 2007

TL;DR: Local methods demonstrate their superior performance over the global ones, when detecting video copies subjected to various transformations, as well as for different lengths of video segments.

Abstract: This paper presents a comparative study of methods for video copy detection. Different state-of-the-art techniques, using various kinds of descriptors and voting functions, are described: global video descriptors, based on spatial and temporal features; local descriptors based on spatial, temporal as well as spatio-temporal information. Robust voting functions are adapted to these techniques to enhance their performance and to compare them. Then, a dedicated framework for evaluating these systems is proposed. All the techniques are tested and compared within the same framework, by evaluating their robustness under single and mixed image transformations, as well as for different lengths of video segments. We discuss the performance of each approach according to the transformations and the applications considered. Local methods demonstrate their superior performance over the global ones, when detecting video copies subjected to various transformations.

301 citations

Performance

Metrics

Papers: 596

Citations: 20,374

No. of papers from the Conference in previous years
Year	Papers
2011	1
2010	62
2009	53
2008	83
2007	101
2006	62