
Showing papers on "Content-based image retrieval published in 2005"


Proceedings ArticleDOI
10 Nov 2005
TL;DR: Some of the key contributions in the current decade related to image retrieval and automated image annotation are discussed, spanning 120 references, and the paper concludes with a study on the trends in volume and impact of publications in the field with respect to venues/journals and sub-topics.
Abstract: The last decade has witnessed great interest in research on content-based image retrieval. This has paved the way for a large number of new techniques and systems, and a growing interest in associated fields to support such systems. Likewise, digital imagery has expanded its horizon in many directions, resulting in an explosion in the volume of image data required to be organized. In this paper, we discuss some of the key contributions in the current decade related to image retrieval and automated image annotation, spanning 120 references. We also discuss some of the key challenges involved in the adaptation of existing image retrieval techniques to build useful systems that can handle real-world data. We conclude with a study on the trends in volume and impact of publications in the field with respect to venues/journals and sub-topics.

500 citations


Journal ArticleDOI
01 Dec 2005
TL;DR: A novel approach for texture image retrieval is proposed by jointly using a set of dual-tree rotated complex wavelet filters (DT-RCWF) and the dual-tree complex wavelet transform (DT-CWT), which obtains texture features in 12 different directions.
Abstract: A new set of two-dimensional (2-D) rotated complex wavelet filters (RCWFs) are designed with complex wavelet filter coefficients, which gives texture information strongly oriented in six different directions (45/spl deg/ apart from the complex wavelet transform). The 2-D RCWFs are nonseparable and oriented, which improves characterization of oriented textures. Most texture image retrieval systems are still incapable of providing retrieval results with both high accuracy and low computational complexity. To address this problem, we propose a novel approach for texture image retrieval by jointly using a set of dual-tree rotated complex wavelet filters (DT-RCWF) and the dual-tree complex wavelet transform (DT-CWT), which obtains texture features in 12 different directions. The information provided by the DT-RCWF complements the information generated by the DT-CWT. Features are obtained by computing the energy and standard deviation on each subband of the decomposed image. To check the retrieval performance, texture database D1 of 1856 textures from the Brodatz album and database D2 of 640 texture images from the VisTex image database are created. Experimental results indicate that the proposed method improves the retrieval rate from 69.61% to 77.75% on database D1, and from 64.83% to 82.81% on database D2, compared with the traditional discrete wavelet transform based approach. The proposed method also retains comparable levels of computational complexity.

259 citations
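The subband statistics described above (energy and standard deviation per decomposed subband) can be sketched as follows; a plain separable Haar DWT stands in for the paper's dual-tree filters, and the function names and 3-level depth are illustrative, not from the paper.

```python
import numpy as np

def haar_dwt2(img):
    """One level of a separable 2-D Haar transform: LL, LH, HL, HH subbands."""
    a = (img[:, 0::2] + img[:, 1::2]) / 2.0   # horizontal average
    d = (img[:, 0::2] - img[:, 1::2]) / 2.0   # horizontal detail
    ll = (a[0::2, :] + a[1::2, :]) / 2.0
    lh = (a[0::2, :] - a[1::2, :]) / 2.0
    hl = (d[0::2, :] + d[1::2, :]) / 2.0
    hh = (d[0::2, :] - d[1::2, :]) / 2.0
    return ll, lh, hl, hh

def texture_features(img, levels=3):
    """Energy (mean absolute value) and standard deviation of each detail
    subband, collected over a multilevel decomposition."""
    feats = []
    ll = np.asarray(img, dtype=float)
    for _ in range(levels):
        ll, lh, hl, hh = haar_dwt2(ll)
        for band in (lh, hl, hh):
            feats.append(np.mean(np.abs(band)))   # energy
            feats.append(np.std(band))            # standard deviation
    return np.array(feats)
```

For a 64x64 image and 3 levels this yields an 18-dimensional descriptor (3 levels x 3 detail subbands x 2 statistics) that can be compared with any vector distance.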


Proceedings ArticleDOI
20 Jun 2005
TL;DR: This work presents a method for efficiently comparing images based on their discrete distributions of distinctive local invariant features, without clustering descriptors, and evaluates the method with scene, object, and texture recognition tasks.
Abstract: Sets of local features that are invariant to common image transformations are an effective representation to use when comparing images; current methods typically judge feature sets' similarity via a voting scheme (which ignores co-occurrence statistics) or by comparing histograms over a set of prototypes (which must be found by clustering). We present a method for efficiently comparing images based on their discrete distributions (bags) of distinctive local invariant features, without clustering descriptors. Similarity between images is measured with an approximation of the Earth Mover's Distance (EMD), which quickly computes minimal-cost correspondences between two bags of features. Each image's feature distribution is mapped into a normed space with a low-distortion embedding of EMD. Examples most similar to a novel query image are retrieved in time sublinear in the number of examples via approximate nearest neighbor search in the embedded space. We evaluate our method with scene, object, and texture recognition tasks.

184 citations
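For equal-size, equal-weight bags, the minimal-cost correspondence that the EMD computes reduces to an optimal assignment. The sketch below solves it exactly with the Hungarian algorithm rather than the low-distortion embedding the paper uses, so it illustrates the distance being approximated, not the sublinear retrieval scheme.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def match_cost(bag_a, bag_b):
    """Minimal-cost one-to-one correspondence between two equal-size bags of
    local feature descriptors (rows); for equal-weight bags this equals the
    Earth Mover's Distance."""
    cost = cdist(bag_a, bag_b)                # pairwise Euclidean costs
    rows, cols = linear_sum_assignment(cost)  # optimal matching
    return cost[rows, cols].mean()
```

Identical bags have cost zero; the exact assignment is cubic in the bag size, which is precisely why the paper embeds EMD into a normed space for sublinear search.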


Proceedings ArticleDOI
10 Nov 2005
TL;DR: A localized CBIR system that uses labeled images in conjunction with a multiple-instance learning algorithm to first identify the desired object and weight the features accordingly, and then to rank images in the database using a similarity measure that is based upon only the relevant portions of the image.
Abstract: Classic Content-Based Image Retrieval (CBIR) takes a single non-annotated query image, and retrieves similar images from an image repository. Such a search must rely upon a holistic (or global) view of the image. Yet often the desired content of an image is not holistic, but is localized. Specifically, we define Localized Content-Based Image Retrieval as a CBIR task where the user is only interested in a portion of the image, and the rest of the image is irrelevant. Many classic CBIR systems use relevance feedback to obtain images labeled as desirable or not desirable. Yet, these labeled images are typically used only to re-weight the features used within a global similarity measure. In this paper we present a localized CBIR system, acciop, that uses labeled images in conjunction with a multiple-instance learning algorithm to first identify the desired object and re-weight the features, and then to rank images in the database using a similarity measure that is based upon individual regions within the image. We evaluate our system using a five-category natural scenes image repository, and a benchmark data set, SIVAL, that we have constructed with 25 object categories.

135 citations


Journal ArticleDOI
TL;DR: A comprehensive and well-normalized description of the ranking performance compared to the performance of an ideal retrieval system defined by ground-truth for a large number of predefined queries is proposed.
Abstract: The performance of a content-based image retrieval (CBIR) system, presented in the form of precision-recall or precision-scope graphs, offers an incomplete overview of the system under study: the influence of the irrelevant items (embedding) is obscured. We propose a comprehensive and well-normalized description of the ranking performance compared to the performance of an ideal retrieval system defined by ground-truth for a large number of predefined queries. We advocate normalization with respect to relevant class size and restriction to specific normalized scope values (the number of retrieved items). We also propose new three and two-dimensional performance graphs for total recall studies in a range of embeddings.

113 citations


Proceedings ArticleDOI
01 Jan 2005
TL;DR: Retrieval of images based on similarities between the feature vectors of a query image and those in the database is considered; images recognized by the user as the best matches to a query are labeled and used to update the query feature vector through an RBF (radial basis function) neural network.
Abstract: Retrieval of images based on similarities between the feature vectors of a query image and those in the database is considered. The search proceeds in two basic steps: an objective one, based on Euclidean distances, and a subjective one, based on the user's relevance feedback. Images recognized by the user as the best matches to a query are labeled and used to update the query feature vector through an RBF (radial basis function) neural network. The search is then repeated from the subjectively refined feature vector. In practice, a few iterative steps are sufficient, as confirmed by intensive simulations.

103 citations
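The objective/subjective loop above can be sketched as follows; a Rocchio-style centroid update stands in for the paper's RBF-network refinement, and the `alpha` blending weight is an assumption.

```python
import numpy as np

def nearest(db, query, k=3):
    """Objective step: rank database feature vectors by Euclidean distance."""
    d = np.linalg.norm(db - query, axis=1)
    return np.argsort(d)[:k]

def refine_query(query, relevant, alpha=0.6):
    """Subjective step: pull the query feature vector toward the centroid of
    the user-labeled relevant images (a simple stand-in for the RBF-network
    update described in the paper)."""
    centroid = np.mean(relevant, axis=0)
    return alpha * query + (1.0 - alpha) * centroid
```

Iterating `nearest` and `refine_query` a few times mirrors the paper's observation that several feedback rounds suffice.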


Journal Article
TL;DR: A salient (characteristic) point detection algorithm is presented so that texture parameters are computed only in a neighborhood of salient points; Gabor texture features are used as image content descriptors and efficiently employed to retrieve images.
Abstract: Content Based Image Retrieval (CBIR) is now a widely investigated issue that aims at allowing users of multimedia information systems to automatically retrieve images coherent with a sample image. A way to achieve this goal is the computation of image features such as the color, texture, shape, and position of objects within images, and the use of those features as query terms. We propose to use Gabor filtering properties in order to find such appropriate features. The article presents multichannel Gabor filtering and a hierarchical image representation. Then a salient (characteristic) point detection algorithm is presented so that texture parameters are computed only in a neighborhood of salient points. We use Gabor texture features as image content descriptors and efficiently employ them to retrieve images.

101 citations
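A minimal multichannel Gabor bank of the kind the article builds on might look like this; the filter size, `sigma`, and the frequency/orientation grid are illustrative choices, not the paper's parameters.

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(freq, theta, sigma=2.0, size=15):
    """Real part of a 2-D Gabor filter: a Gaussian envelope modulated by a
    cosine at the given spatial frequency and orientation."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return envelope * np.cos(2.0 * np.pi * freq * xr)

def gabor_features(img, freqs=(0.1, 0.2), n_orient=4):
    """Mean absolute filter response per channel of a small Gabor bank."""
    feats = []
    for f in freqs:
        for k in range(n_orient):
            g = gabor_kernel(f, np.pi * k / n_orient)
            resp = convolve2d(img, g, mode='same', boundary='symm')
            feats.append(np.mean(np.abs(resp)))
    return np.array(feats)
```

In the article's scheme these statistics would be computed only in neighborhoods of the detected salient points rather than over the whole image.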


Journal ArticleDOI
TL;DR: An image relevance reinforcement learning (IRRL) model for integrating existing RF techniques in a content-based image retrieval system is proposed and a concept digesting method is proposed to reduce the complexity of storage demand.
Abstract: Relevance feedback (RF) is an interactive process which refines the retrievals to a particular query by utilizing the user's feedback on previously retrieved results. Most researchers strive to develop new RF techniques and ignore the advantages of existing ones. In this paper, we propose an image relevance reinforcement learning (IRRL) model for integrating existing RF techniques in a content-based image retrieval system. Various integration schemes are presented and a long-term shared memory is used to exploit the retrieval experience from multiple users. Also, a concept digesting method is proposed to reduce the complexity of the storage demand. The experimental results show that the integration of multiple RF approaches gives better retrieval performance than using any one RF technique alone, and that sharing relevance knowledge between multiple query sessions significantly improves performance. Further, the storage demand is significantly reduced by the concept digesting technique. This shows the scalability of the proposed model with increasing database size.

100 citations


Journal ArticleDOI
TL;DR: A texture-based approach for palmprint feature representation is introduced to define both global and local features of a palmprint, which are characterized with high convergence of inner-palm similarities and good dispersion of inter-Palm discrimination.
Abstract: This paper presents a new approach to palmprint retrieval for personal identification. Three key issues in image retrieval are considered: feature extraction, similarity measurement and fast search for the best match of the queried image in an image database. We propose a texture-based approach for palmprint feature representation. The concept of texture energy is introduced to define both global and local features of a palmprint, which are characterized with high convergence of inner-palm similarities and good dispersion of inter-palm discrimination. The searching is carried out in a layered fashion: the global features are first used to guide the fast selection of a small set of similar candidates from the database and then the local features are applied to determine the final output from the selected set of similar candidates. The experimental results illustrate the effectiveness of the proposed approach.

84 citations
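The layered search above can be sketched directly: global features screen the database down to a shortlist, and local features settle the final order within it. The shortlist size is an assumption.

```python
import numpy as np

def layered_search(db_global, db_local, q_global, q_local, shortlist=10):
    """Two-stage coarse-to-fine search: global features select a small
    candidate set, local features decide the final ranking within it."""
    g = np.linalg.norm(db_global - q_global, axis=1)
    cand = np.argsort(g)[:shortlist]                  # fast global screening
    l = np.linalg.norm(db_local[cand] - q_local, axis=1)
    return cand[np.argsort(l)]                        # refined local ranking
```

The point of the layering is that the cheap global comparison touches every record, while the more discriminative local comparison touches only the shortlist.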


Journal ArticleDOI
TL;DR: This research demonstrates that the proposed DCP approach provides a new way of retrieving human faces in single-model databases that is robust to scale and environmental changes and efficient in computation.

82 citations


Journal ArticleDOI
TL;DR: A generalisation of the multiple-instance learning model in which a bag's label is not based on a single instance's proximity to a single target point, but on a collection of instances, each near one of a set of target points.
Abstract: We describe a generalisation of the multiple-instance learning model in which a bag's label is not based on a single instance's proximity to a single target point. Rather, a bag is positive if and only if it contains a collection of instances, each near one of a set of target points. We then adapt a learning-theoretic algorithm for learning in this model and present empirical results on data from robot vision, content-based image retrieval, and protein sequence identification.

Proceedings ArticleDOI
10 Nov 2005
TL;DR: A new method for automated large-scale gathering of Web images relevant to specified concepts, aimed at building a knowledge base covering as many concepts as possible for large-scale object recognition studies and at supporting more accurate text-based indexes for Web images.
Abstract: We propose a new method for automated large scale gathering of Web images relevant to specified concepts. Our main goal is to build a knowledge base associated with as many concepts as possible for large scale object recognition studies. A second goal is supporting the building of more accurate text-based indexes for Web images. In our method, good quality candidate sets of images for each keyword are gathered based on an analysis of the surrounding HTML text. The gathered images are then segmented into regions, and a model for the probability distribution of regions for the concept is computed using an iterative algorithm based on previous work on statistical image annotation. The learned model is then applied to identify which images are visually relevant to the concept implied by the keyword. Implicitly, this also determines which regions of the images are relevant. Our experiments reveal that the new method performs much better than Google Image Search and a simple method based on more standard content based image retrieval methods.

Book ChapterDOI
21 Mar 2005
TL;DR: Fractional distance measures proposed by Aggarwal et al. are applied to content-based image retrieval and show that retrieval performances of these measures consistently outperform the more usual Manhattan and Euclidean distance metrics when used with a wide range of high-dimensional visual features.
Abstract: We have applied the concept of fractional distance measures, proposed by Aggarwal et al. [1], to content-based image retrieval. Our experiments show that retrieval performances of these measures consistently outperform the more usual Manhattan and Euclidean distance metrics when used with a wide range of high-dimensional visual features. We used the parameters learnt from a Corel dataset on a variety of different collections, including the TRECVID 2003 and ImageCLEF 2004 datasets. We found that the specific optimum parameters varied but the general performance increase was consistent across all 3 collections. To squeeze the last bit of performance out of a system it would be necessary to train a distance measure for a specific collection. However, a fractional distance measure with parameter p = 0.5 will consistently outperform both L1 and L2 norms.
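The fractional measure is just the Minkowski form with p < 1; for p = 0.5 it is no longer a metric, since the triangle inequality fails, but it still serves as a dissimilarity for ranking.

```python
import numpy as np

def minkowski(a, b, p):
    """Minkowski dissimilarity between feature vectors; p=1 is Manhattan,
    p=2 Euclidean, and p < 1 gives the fractional measures of Aggarwal et al."""
    return np.sum(np.abs(a - b) ** p) ** (1.0 / p)
```

For a = (0, 0) and b = (1, 1): the L1 distance is 2, the L2 distance is sqrt(2), and the p = 0.5 fractional distance is (1 + 1)^2 = 4; fractional exponents penalize small per-dimension differences relatively more, which is what helps in high dimensions.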

Proceedings ArticleDOI
02 Apr 2005
TL;DR: A mobile image-based search system that takes images of objects as queries and finds relevant web pages by matching them to similar images on the web, and demonstrates the effectiveness of a simple interactive paradigm for obtaining a segmented object boundary.
Abstract: Finding information based on an object's visual appearance is useful when specific keywords for the object are not known. We have developed a mobile image-based search system that takes images of objects as queries and finds relevant web pages by matching them to similar images on the web. Image-based search works well when matching full scenes, such as images of buildings or landmarks, and for matching objects when the boundary of the object in the image is available. We demonstrate the effectiveness of a simple interactive paradigm for obtaining a segmented object boundary, and show how a shape-based image matching algorithm can use the object outline to find similar images on the web.


Book ChapterDOI
21 Sep 2005
TL;DR: The methods used in the 2005 ImageCLEF content-based image retrieval evaluation are described, which combined several low-level image features with textual information retrieval for the medical retrieval task, and showed clear improvements over the use of one of these sources alone.
Abstract: In this paper the methods we used in the 2005 ImageCLEF content-based image retrieval evaluation are described. For the medical retrieval task, we combined several low-level image features with textual information retrieval. Combining these two information sources yields clear improvements over the use of either source alone. Additionally, we participated in the automatic annotation task, where our content-based image retrieval system, FIRE, was used, as well as a second subimage-based method for object classification. The results we achieved are very convincing: our submissions ranked first and third in the automatic annotation task out of a total of 44 submissions from 12 groups.

Proceedings ArticleDOI
07 Nov 2005
TL;DR: A novel approach for describing image color semantic including regional and global semantic description is developed and presented, which allows the users to query images with emotional semantic words.
Abstract: Describing images in semantic terms is an important and challenging problem in content-based image retrieval. According to the strong relationship between colors and human emotions, an emotional semantic query model based on image color semantic description is proposed in this study. First, images are segmented into regions through a new color image segmentation algorithm. Then, term sets are generated through a fuzzy clustering algorithm so that colors can be interpreted in semantic terms. We extend the method to extract the color semantic of image regions, and develop a novel approach for describing image color semantic including regional and global semantic description. Finally, we present an image query scheme through image color semantic description, which allows the users to query images with emotional semantic words. Experimental results demonstrate the effectiveness of our approach.

Proceedings ArticleDOI
23 May 2005
TL;DR: A content based image retrieval system, called MPEG-7 image retrieval refinement based on relevance feedback (MIRROR), is developed for evaluating MPEG- 7 visual descriptors and developing new retrieval algorithms.
Abstract: A content-based image retrieval system, called MPEG-7 image retrieval refinement based on relevance feedback (MIRROR), is developed for evaluating MPEG-7 visual descriptors and developing new retrieval algorithms. The system core is based on the MPEG-7 eXperimentation Model (XM) with a Web-based user interface for query-by-image-example retrieval. A new merged color palette approach for the MPEG-7 dominant color descriptor similarity measure and relevance feedback are also developed in this system. Several MPEG-7 visual descriptors are adopted in MIRROR for performance comparison purposes.

Proceedings ArticleDOI
10 Nov 2005
TL;DR: A system named Photo-to-Search which allows users to input multimodal queries and can also search for very similar images on the Web, such as movie posters or photos of film stars, to find related information.
Abstract: Nowadays, mobile phones with digital cameras are becoming more and more popular. With the necessary technologies, they can become a powerful tool for searching the Web on the go. Most Web search engines only support text queries, so users have to convert their information needs into words. However, it is sometimes difficult to describe those needs in text, and text input is inconvenient on small devices. To solve this problem, we propose a system named Photo-to-Search which allows users to input multimodal queries. In particular, we study queries with captured images and optional text messages in this paper. For example, the user can simply take a photo of a flower and input a few terms like "flower". Textually relevant Web images are retrieved according to the query terms. Afterwards, the snapped picture is compared with these images by the CBIR (Content-Based Image Retrieval) method. According to the context of the visually similar images, related key phrases are extracted. Finally, the search results are returned in multiple forms. Our system can also search for very similar images on the Web, such as movie posters or photos of film stars, to find related information. Experimental results on large-scale data show that our system achieves satisfactory efficiency and performance.

Proceedings ArticleDOI
06 Jul 2005
TL;DR: An approach based on a One-Class Support Vector Machine (SVM) to solve the MIL problem in region-based Content-Based Image Retrieval (CBIR).
Abstract: Multiple Instance Learning (MIL) is a special kind of supervised learning problem that has been studied actively in recent years. In this paper, we propose an approach based on a One-Class Support Vector Machine (SVM) to solve the MIL problem in region-based Content-Based Image Retrieval (CBIR). A relevance feedback technique is incorporated to provide progressive guidance to the learning process. Performance is evaluated, and the effectiveness of our retrieval algorithm is shown through comparative studies.

Journal ArticleDOI
01 Jun 2005
TL;DR: Retrieval experiments on two benchmark image databases demonstrate the effectiveness of the proposed criterion for KBDA to achieve the best possible performance at the cost of a small fractional computational overhead.
Abstract: A criterion is proposed to optimize the kernel parameters in kernel-based biased discriminant analysis (KBDA) for image retrieval. Kernel parameter optimization is performed by optimizing the kernel space such that the positive images are well clustered while the negative ones are pushed far away from the positives. The proposed criterion measures the goodness of a kernel space, and the optimal kernel parameter set is obtained by maximizing this criterion. Retrieval experiments on two benchmark image databases demonstrate the effectiveness of the proposed criterion for KBDA to achieve the best possible performance at the cost of a small fractional computational overhead.

Journal ArticleDOI
TL;DR: A novel approach to relevance feedback which can return semantically related images in different visual clusters by merging the result sets of multiple queries is presented.
Abstract: Conventional approaches to image retrieval are based on the assumption that relevant images are physically near the query image in some feature space. This is the basis of the cluster hypothesis. However, semantically related images are often scattered across several visual clusters. Although traditional Content-based Image Retrieval (CBIR) technologies may utilize the information contained in multiple queries (obtained in one step or through a feedback process), this is often only a reformulation of the original query. As a result, most of these strategies only get the images in some neighborhood of the original query as the retrieval result. This severely restricts the system performance. Relevance feedback techniques are generally used to mitigate this problem. In this paper, we present a novel approach to relevance feedback which can return semantically related images in different visual clusters by merging the result sets of multiple queries. We also provide experimental results to demonstrate the effectiveness of our approach.

Journal ArticleDOI
TL;DR: Image informatics at the Communications Engineering Branch of the Lister Hill National Center for Biomedical Communications (LHNCBC), an R&D division of the National Library of Medicine (NLM), spans both document images and biomedical images.

Proceedings ArticleDOI
16 Sep 2005
TL;DR: The combination of edges and interest points brings efficient feature detection and high recognition ratio to the image retrieval system.
Abstract: This paper presents a novel approach using combined features to retrieve images containing specific objects, scenes or buildings. The content of an image is characterized by two kinds of features: Harris-Laplace interest points described by the SIFT descriptor, and edges described by the edge color histogram. Edges and corners contain the maximal amount of information necessary for image retrieval. The feature detection in this work is an integrated process: edges are detected directly based on the Harris function; Harris interest points are detected at several scales; and Harris-Laplace interest points are found using the Laplace function. The combination of edges and interest points brings efficient feature detection and a high recognition ratio to the image retrieval system. Experimental results show this system has good performance.

Journal ArticleDOI
TL;DR: A re-ranking algorithm using post-retrieval clustering for content-based image retrieval (CBIR) that achieves an improvement of retrieval effectiveness of over 10% on average in the average normalized modified retrieval rank (ANMRR) measure.
Abstract: In this paper, we propose a re-ranking algorithm using post-retrieval clustering for content-based image retrieval (CBIR). In conventional CBIR systems, it is often observed that images visually dissimilar to a query image are ranked high in retrieval results. To remedy this problem, we utilize the similarity relationship of the retrieved results via post-retrieval clustering. In the first step of our method, images are retrieved using visual features such as color histogram. Next, the retrieved images are analyzed using hierarchical agglomerative clustering methods (HACM) and the rank of the results is adjusted according to the distance of a cluster from a query. In addition, we analyze the effects of clustering methods, query-cluster similarity functions, and weighting factors in the proposed method. We conducted a number of experiments using several clustering methods and cluster parameters. Experimental results show that the proposed method achieves an improvement of retrieval effectiveness of over 10% on average in the average normalized modified retrieval rank (ANMRR) measure.
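The post-retrieval re-ranking step can be sketched with SciPy's agglomerative clustering; average linkage and the cluster count are illustrative choices, and clusters are simply ordered by centroid-to-query distance rather than the weighted query-cluster similarity functions the paper analyzes.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def rerank(query, retrieved, n_clusters=3):
    """Re-rank retrieved feature vectors: cluster them agglomeratively, then
    emit clusters in order of centroid distance from the query."""
    Z = linkage(retrieved, method='average')
    labels = fcluster(Z, t=n_clusters, criterion='maxclust')
    cents = {c: retrieved[labels == c].mean(axis=0) for c in np.unique(labels)}
    order = []
    # nearest cluster first; within a cluster keep the original retrieval order
    for c in sorted(cents, key=lambda c: np.linalg.norm(cents[c] - query)):
        order.extend(np.where(labels == c)[0].tolist())
    return order
```

This pushes whole groups of query-like images ahead of visually dissimilar outliers that happened to score well individually.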

Journal ArticleDOI
TL;DR: This paper proposes a soft relevance notion to integrate the users' fuzzy perception of visual contents into the framework of relevance feedback and proposes a progressive fuzzy radial basis function network to learn the user information need by optimizing a cost function.
Abstract: This paper presents a novel framework called fuzzy relevance feedback in interactive content-based image retrieval systems. Conventional binary labeling in relevance feedback requires crisp decisions to be made on the relevance of the retrieved images. This is restrictive as user interpretation of image similarity is imprecise and nonstationary in nature and may vary with respect to different information needs and perceptual subjectivity. It is, therefore, inadequate to model the user perception of image similarity with crisp binary logic. In view of this, we propose a soft relevance notion to integrate the users' fuzzy perception of visual contents into the framework of relevance feedback. A progressive fuzzy radial basis function network is proposed to learn the user information need by optimizing a cost function. An efficient gradient descent-based learning strategy is then employed to estimate the underlying network parameters. Experimental results based on a database of 10 000 images demonstrate the effectiveness of the proposed method.

Proceedings ArticleDOI
15 Apr 2005
TL;DR: In this paper, a general content-based image retrieval (CBIR) framework is proposed to aid radiologists in retrieving images with similar contents, since existing CBIR methods are usually developed for specific image features and are not readily inter-applicable among different kinds of medical images.
Abstract: In the field of medical imaging, content-based image retrieval (CBIR) techniques are employed to aid radiologists in the retrieval of images with similar contents. However, CBIR methods are usually developed for specific features of images, so those methods are not readily inter-applicable among different kinds of medical images. This work proposes a general CBIR framework in an attempt to alleviate this limitation. The framework consists of two parts: image analysis and image retrieval. In the image analysis part, normal and abnormal regions of interest (ROIs) in a number of images are selected to form a ROI dataset. These two groups of ROIs are used to analyze 11 textural features based on gray level co-occurrence matrices. The multivariate T test is then applied to identify the features with significant discriminating power for inclusion in a feature descriptor. In the image retrieval part, each feature of the descriptor is normalized by clipping the values of the largest 5% of the same feature component, and then projecting each normalized feature onto the unit sphere. The L2 norm is then employed to determine the similarity between the query image and each ROI in the dataset. The system works in a query-by-example (QBE) manner. Query images were selected from different classes of abnormal ROIs. A maximum precision of 51% and a maximum recall of 19% were obtained; the average precision and recall are 49% and 18%, respectively.
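The normalization step described above (clip the largest 5% of each feature component, project each descriptor onto the unit sphere, then rank by L2 distance) can be sketched as:

```python
import numpy as np

def normalize_features(F, clip_pct=95.0):
    """Clip each feature column at its 95th percentile (i.e. clip the largest
    5% of values), then project every descriptor onto the unit sphere."""
    caps = np.percentile(F, clip_pct, axis=0)
    F = np.minimum(F, caps)
    norms = np.linalg.norm(F, axis=1, keepdims=True)
    return F / np.where(norms == 0, 1.0, norms)

def query_by_example(F_norm, q_norm):
    """Rank ROI descriptors by L2 distance to a normalized query descriptor."""
    return np.argsort(np.linalg.norm(F_norm - q_norm, axis=1))
```

Clipping tames the heavy-tailed GLCM statistics before normalization, so no single outlier feature dominates the L2 comparison.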

Proceedings ArticleDOI
21 Sep 2005
TL;DR: The proposed algorithm can eliminate the conversion from Cartesian to log-polar coordinates, avoid the process of interpolation required in the conversion, and obtain a more significant improvement than the conventional method using cross-correlation.
Abstract: The Fourier-Mellin transform (FMT) is frequently used in content-based image retrieval and digital image watermarking. This paper extends the application of the FMT to image registration and proposes an improved registration algorithm based on the FMT for the alignment of images differing in translation, rotation angle, and uniform scale factor. The proposed algorithm eliminates the conversion from Cartesian to log-polar coordinates, avoids the interpolation required in that conversion, and obtains a more significant improvement than the conventional method using cross-correlation. Experiments show that the algorithm is accurate and robust in the presence of white noise.
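The translation stage of FMT-style registration is plain phase correlation; a minimal sketch follows (integer shifts only, with no rotation or scale recovery, so it covers only one component of the paper's algorithm).

```python
import numpy as np

def phase_correlation(a, b):
    """Estimate the integer translation (dy, dx) such that b equals a rolled
    by (dy, dx), from the peak of the normalized cross-power spectrum."""
    F = np.fft.fft2(b) * np.conj(np.fft.fft2(a))
    F /= np.abs(F) + 1e-12               # keep phase only
    corr = np.real(np.fft.ifft2(F))      # impulse at the shift
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return dy, dx
```

Rotation and scale are what the Mellin part handles, by turning them into translations in a log-polar magnitude spectrum, which is the conversion the paper's method manages to avoid.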

Proceedings ArticleDOI
TL;DR: The methodology uses techniques inspired by the information retrieval community to aid efficient indexing and retrieval in a content-based image retrieval and object recognition system whose query images have been degraded by noise and subjected to transformations through the imaging system.
Abstract: Given the large amount of research into content-based image retrieval currently taking place, new interfaces to systems that perform queries based on image content need to be considered. A new paradigm for content-based image retrieval is introduced, in which a mobile device is used to capture the query image and display the results. The system consists of a client-server architecture in which query images are captured on a mobile device and then transferred to a server for further processing. The server then returns the results of the query to the mobile device. The use of a mobile device as an interface to a content-based image retrieval or object recognition system presents a number of challenges because the query image from the device will have been degraded by noise and subjected to transformations through the imaging system. A methodology is presented that uses techniques inspired from the information retrieval community in order to aid efficient indexing and retrieval. In particular, a vector-space model is used in the efficient indexing of each image, and a two-stage pruning/ranking procedure is used to determine the correct matching image. The retrieval algorithm is shown to outperform existing algorithms when used with query images from the device.
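The vector-space indexing the paper borrows from text retrieval can be sketched with TF-IDF over quantized "visual words"; the toy vocabulary handling below is illustrative, and the two-stage pruning/ranking procedure is omitted.

```python
import numpy as np

def tfidf_matrix(docs):
    """TF-IDF matrix over lists of 'visual word' ids, one list per image,
    with rows unit-normalized so a dot product gives cosine similarity."""
    vocab = sorted({w for d in docs for w in d})
    widx = {w: i for i, w in enumerate(vocab)}
    tf = np.zeros((len(docs), len(vocab)))
    for r, d in enumerate(docs):
        for w in d:
            tf[r, widx[w]] += 1
    idf = np.log(len(docs) / np.count_nonzero(tf, axis=0))
    M = tf * idf
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    return M / np.where(norms == 0, 1.0, norms)
```

Images sharing rare visual words score high under cosine similarity, which is exactly the property that makes the text-retrieval machinery transfer to noisy camera-phone queries.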

Journal ArticleDOI
TL;DR: A new adaptive classification and cluster-merging method finds multiple regions of arbitrary shape in a complex image query, and achieves the same high retrieval quality regardless of the shapes of the query regions, since the measures used in the method are invariant under linear transformations.