Towards an All-Purpose Content-Based Multimedia Information Retrieval System

Open Access, Posted Content

Ralph Gasser, +2 more

- 11 Feb 2019 -

arXiv: Multimedia

TLDR

According to its authors, the full vitrivr stack is the first multimedia retrieval system to seamlessly integrate support for four different media types (images, audio, videos, and 3D models), paving the way towards an all-purpose, content-based multimedia information retrieval system.

Abstract:

The growth of multimedia collections - in terms of size, heterogeneity, and variety of media types - necessitates systems that are able to deal jointly with several forms of media, especially when it comes to searching for particular objects. However, existing retrieval systems are organized in silos and treat different media types separately. As a consequence, retrieval across media types is either not supported at all or subject to major limitations. In this paper, we present vitrivr, a content-based multimedia information retrieval stack. As opposed to the keyword search approach implemented by most media management systems, vitrivr makes direct use of the object's content to facilitate different types of similarity search, such as Query-by-Example or Query-by-Sketch, within and, most importantly, across different media types - namely, images, audio, videos, and 3D models. Furthermore, we introduce a new web-based user interface that enables easy-to-use, multimodal retrieval from, and browsing in, mixed media collections. The effectiveness of vitrivr is shown on the basis of a user study that involves different query and media types. To the best of our knowledge, the full vitrivr stack is unique in that it is the first multimedia retrieval system that seamlessly integrates support for four different types of media. As such, it paves the way towards an all-purpose, content-based multimedia information retrieval system.

Citations

Retrieval of Structured and Unstructured Data with vitrivr

Multimodal Multimedia Retrieval with vitrivr

Melody extraction from polyphonic music signals using pitch contour characteristics

A Multimodal End-to-End Deep Learning Architecture for Music Popularity Prediction

Multi-Stage Queries and Temporal Scoring in vitrivr

References

Distinctive Image Features from Scale-Invariant Keypoints

Object recognition from local scale-invariant features

Speeded-Up Robust Features (SURF)

On the use of windows for harmonic analysis with the discrete Fourier transform

Content-based image retrieval at the end of the early years