Image Retrieval Using Textual Cues

doi:10.1109/ICCV.2013.378

Open Access Proceedings Article

Anand Mishra, +2 more

pp. 3040-3047

TLDR

The paper presents an approach to text-to-image retrieval based on the textual content present in images, and evaluates retrieval performance on public scene text datasets as well as three large datasets: IIIT scene text retrieval, Sports-10K and TV series-1M.

Abstract:

We present an approach for the text-to-image retrieval problem based on textual content present in images. Given the recent developments in understanding text in images, an appealing approach to this problem is to localize and recognize the text, and then query the database, as in a text retrieval problem. We show that such an approach, despite being based on state-of-the-art methods, is insufficient, and propose a method that does not rely on an exact localization and recognition pipeline. We take a query-driven search approach, in which we find approximate locations of the characters in the text query, and then impose spatial constraints to generate a ranked list of images in the database. The retrieval performance is evaluated on public scene text datasets as well as on three large datasets that we introduce, namely IIIT scene text retrieval, Sports-10K and TV series-1M.
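The query-driven idea in the abstract can be illustrated with a small sketch. This is not the authors' scoring function: the `Detection` tuple, `ordered_match_score` and `rank_images` are hypothetical names, and the left-to-right ordering check is only one simple stand-in for the spatial constraints the abstract mentions. It assumes each image already comes with approximate character detections carrying non-negative confidences.

```python
from typing import Dict, List, Tuple

# Hypothetical detection record (an assumption, not from the paper):
# (character label, x-coordinate of the detection centre, confidence >= 0)
Detection = Tuple[str, float, float]


def ordered_match_score(query: str, detections: List[Detection]) -> float:
    """Best total confidence of detections that spell `query` left to right.

    Query characters with no usable detection contribute 0, so the score
    tolerates missed characters instead of requiring exact recognition.
    """
    dets = sorted(detections, key=lambda d: d[1])  # sort by x position
    n, m = len(query), len(dets)
    # best[i][j]: best score for query[:i] using only the first j detections
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            char, _, conf = dets[j - 1]
            skip_detection = best[i][j - 1]
            skip_query_char = best[i - 1][j]
            use = best[i - 1][j - 1] + conf if char == query[i - 1] else 0.0
            best[i][j] = max(skip_detection, skip_query_char, use)
    # Normalise by query length so long and short queries are comparable
    return best[n][m] / n if n else 0.0


def rank_images(
    query: str, database: Dict[str, List[Detection]]
) -> List[Tuple[str, float]]:
    """Return (image id, score) pairs sorted from best to worst match."""
    scored = [(name, ordered_match_score(query, dets))
              for name, dets in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    toy_database = {
        "a.jpg": [("c", 10.0, 0.9), ("a", 20.0, 0.8), ("r", 30.0, 0.7)],
        "b.jpg": [("r", 10.0, 0.9), ("a", 20.0, 0.8), ("c", 30.0, 0.7)],
    }
    for name, score in rank_images("car", toy_database):
        print(name, round(score, 3))
```

In this toy setup, an image whose detections contain the query characters in the right order ranks above one that contains the same characters in the wrong order, which is the intuition behind combining character presence with spatial constraints.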

Citations

Journal Article

Cross-Modality Retrieval by Joint Correlation Learning

Shuo Wang, +4 more

- 03 Jul 2019 -

ACM Transactions on Multimedia Computing...

TL;DR: This article proposes a cross-modal learning model with joint correlative calculation learning that achieves comparable performance to the state-of-the-art on three benchmarks, i.e., Flickr8k, Flickr30k, and MS-COCO.

Journal Article

An Efficient Industrial System for Vehicle Tyre (Tire) Detection and Text Recognition Using Deep Learning

Wajahat Kazmi, +4 more

- 01 Feb 2021 -

IEEE Transactions on Intelligent Transpo...

TL;DR: This paper presents a first-of-its-kind, full-scale industrial system that can read tyre codes when installed along driveways, such as at gas stations or parking lots, with vehicles driving under 10 mph, and shows promise for the intended application.

Journal Article

Real-time Lexicon-free Scene Text Retrieval

Andres Mafla, +6 more

- 01 Feb 2021 -

Pattern Recognition

TL;DR: The proposed model uses a single shot CNN architecture that predicts bounding boxes and builds a compact representation of spotted words, which can be modeled as a nearest neighbor search of the textual representation of a query over the outputs of the CNN collected from the totality of an image database.

Posted Content

Robust Scene Text Recognition Using Sparse Coding based Features.

Da-Han Wang, +4 more

- 29 Dec 2015 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: In this article, the HSC features are extracted by computing sparse codes with dictionaries that are learned from data using K-SVD, and aggregating per-pixel sparse codes to form local histograms, and the final recognition results are obtained by searching for the words which correspond to the maximum value of the objective function.

Journal Article

A survey of methods, datasets and evaluation metrics for visual question answering

Himanshu Sharma, +1 more

- 01 Dec 2021 -

Image and Vision Computing

TL;DR: This paper has discussed some of the core concepts used in VQA systems and presented a comprehensive survey of efforts in the past to address this problem, and discussed some new datasets developed in 2019 and 2020.

References

Proceedings Article

Histograms of oriented gradients for human detection

Navneet Dalal, +1 more

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.

Proceedings Article

Rapid object detection using a boosted cascade of simple features

Paul A. Viola, +1 more

TL;DR: A machine learning approach for visual object detection which is capable of processing images extremely rapidly and achieving high detection rates and the introduction of a new image representation called the "integral image" which allows the features used by the detector to be computed very quickly.

Book

Introduction to Information Retrieval

Christopher D. Manning, +2 more

TL;DR: In this article, the authors present an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections.

Journal Article

Object Detection with Discriminatively Trained Part-Based Models

Pedro F. Felzenszwalb, +3 more

- 01 Sep 2010 -

IEEE Transactions on Pattern Analysis an...

TL;DR: An object detection system based on mixtures of multiscale deformable part models that is able to represent highly variable object classes and achieves state-of-the-art results in the PASCAL object detection challenges is described.

Proceedings Article

Video Google: a text retrieval approach to object matching in videos

Josef Sivic, +1 more

TL;DR: An approach to object and scene retrieval which searches for and localizes all the occurrences of a user outlined object in a video, represented by a set of viewpoint invariant region descriptors so that recognition can proceed successfully despite changes in viewpoint, illumination and partial occlusion.

International Journal of Computer Vision

VQA: Visual Question Answering

Stanislaw Antol, +6 more

Related Papers (5)

ICDAR 2013 Robust Reading Competition

ICDAR 2015 competition on Robust Reading

ImageNet: A large-scale hierarchical image database

Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations

VQA: Visual Question Answering