Book ChapterDOI

Info-Graphics Retrieval: A Multi-kernel Distance Based Hashing Scheme

19 Dec 2016-pp 288-298
TL;DR: This paper presents a multi-modal document image retrieval framework by learning an optimal fusion of information from text and info-graphics regions and demonstrates the evaluation of the proposed concept on documents collected from various sources.
Abstract: Information retrieval research has advanced significantly and has produced techniques that retrieve documents in image or text form. However, the retrieval of multi-modal documents has received very little attention. We aim to build a system for retrieving documents with embedded information graphics (info-graphics). Info-graphics are images of bar charts and line graphs that appear alongside textual components in magazines, newspapers, and journals. In this paper, we present a multi-modal document image retrieval framework that learns an optimal fusion of information from text and info-graphics regions. The proposed concept is evaluated on documents collected from various sources such as magazines and journals.
References
Proceedings ArticleDOI
28 Jul 2013
TL;DR: This paper presents the first steps toward a system for retrieving bar charts and line graphs that reasons about the content of the graphic itself in deciding its relevance to the user query, and achieves accuracy higher than 80% on a corpus of collected user queries.
Abstract: Information retrieval research has made significant progress in the retrieval of text documents and images. However, relatively little attention has been given to the retrieval of information graphics (non-pictorial images such as bar charts and line graphs) despite their proliferation in popular media such as newspapers and magazines. Our goal is to build a system for retrieving bar charts and line graphs that reasons about the content of the graphic itself in deciding its relevance to the user query. This paper presents the first steps toward such a system, with a focus on identifying the category of intended message of potentially relevant bar charts and line graphs. Our learned model achieves accuracy higher than 80% on a corpus of collected user queries.

12 citations

Journal ArticleDOI
01 Nov 2015
TL;DR: A novel methodology for retrieving infographics from a digital library that takes into account a graphic's structural and message content is presented, and it significantly outperforms a baseline method that treats queries and graphics as bags of words.
Abstract: Information graphics (infographics) in popular media are highly structured knowledge representations that are generally designed to convey an intended message. This paper presents a novel methodology for retrieving infographics from a digital library that takes into account a graphic's structural and message content. The retrieval methodology can be summarized thus: 1) hypothesize requisite structural and message content from a natural language query, 2) measure the relevance of each candidate infographic to the requisite structural and message content hypothesized from the user query, and 3) integrate these relevance measurements via a linear combination model in order to produce a ranked list of infographics in response to the user query. The methodology has been implemented and evaluated, and it significantly outperforms a baseline method that treats queries and graphics as bags of words.
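Step 3 of the methodology summarized above, integrating relevance measurements through a linear combination, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `rank_infographics`, the two score components, the weights, and the sample scores are all hypothetical, and the abstract does not say how many relevance measurements are combined or how the weights are chosen.

```python
from typing import Dict, List, Tuple


def rank_infographics(
    scores: Dict[str, Tuple[float, float]],
    w_struct: float = 0.5,
    w_msg: float = 0.5,
) -> List[Tuple[str, float]]:
    """Rank candidates by a weighted sum of two relevance scores.

    `scores` maps a (hypothetical) infographic id to a pair
    (structural_relevance, message_relevance). The weights are
    illustrative assumptions, not values taken from the paper.
    """
    combined = {
        graphic_id: w_struct * structural + w_msg * message
        for graphic_id, (structural, message) in scores.items()
    }
    # Highest combined relevance first.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Made-up relevance scores for three candidate graphics.
candidates = {
    "bar_chart_a": (0.9, 0.2),
    "line_graph_b": (0.4, 0.8),
    "bar_chart_c": (0.1, 0.3),
}

ranking = rank_infographics(candidates)
for graphic_id, score in ranking:
    print(f"{graphic_id}: {score:.2f}")
```

Changing the weights shifts the ranking toward structural or message relevance. In a learned setting the weights would be fitted to relevance judgments rather than fixed by hand.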

12 citations

01 Jan 2012
TL;DR: A learned model for segmenting a line graph into visually distinguishable trends and a Bayesian network inference model that hypothesizes the intended message of the graph based on communicative signals in the graphic are presented.
Abstract: Information graphics (line graphs, bar charts, etc.) are common in popular media and periodicals. They are usually included in such documents to convey a message. This dissertation discusses the processing of one kind of information graphic, namely a line graph. It presents a learned model for segmenting a line graph into visually distinguishable trends and a Bayesian network inference model that hypothesizes the intended message of the graph based on communicative signals in the graphic. Besides recognizing the intended message of line graphs, this dissertation also presents a method for identifying the paragraph in the document that is most relevant to its information graphic. The research results provided by this dissertation can be used for several purposes: to give blind individuals access to information graphics in an article, to provide the basis for a longer summary of the graphic, to build a summary that captures both the article and its containing information graphics, and to indicate a graphic's content when indexing it for retrieval in a digital library.

7 citations

Book ChapterDOI
28 Mar 2010
TL;DR: A probabilistic framework based on the assumption that images and their co-occurring textual data are generated by mixtures of latent topics is described, which shows performance gains over previously proposed approaches, despite the noisy nature of the dataset.
Abstract: Image annotation, the task of automatically generating description words for a picture, is a key component in various image search and retrieval applications. Creating image databases for model development is, however, costly and time consuming, since the keywords must be hand-coded and the process repeated for new collections. In this work we exploit the vast resource of images and documents available on the web for developing image annotation models without any human involvement. We describe a probabilistic framework based on the assumption that images and their co-occurring textual data are generated by mixtures of latent topics. Applications of this framework to image annotation and retrieval show performance gains over previously proposed approaches, despite the noisy nature of our dataset. We also discuss how the proposed model can be used for story picturing, i.e., to find images that appropriately illustrate a text and demonstrate its utility when interfaced with an image caption generator.

6 citations

Proceedings ArticleDOI
25 Aug 2013
TL;DR: A novel multi-modal document indexing framework for retrieval of old and degraded text documents by combining OCR'ed text and image based representation using learning is proposed.
Abstract: The paper proposes a novel multi-modal document image retrieval framework that exploits the information in text and graphics regions. The framework applies a multiple kernel learning based hashing formulation to generate composite document indexes from different modalities. Existing multimedia management methods for imaged text documents have not addressed the requirements of old and degraded documents. In a subsequent contribution, we propose a novel multi-modal document indexing framework for retrieval of old and degraded text documents that combines OCR'ed text and image based representation using learning. The proposed concepts are evaluated on sampled magazine cover pages and on documents in Devanagari script.

5 citations