Topic

Multimedia database

About: Multimedia database is a research topic. Over its lifetime, 1404 publications have been published within this topic, receiving 19856 citations. The topic is also known as MMDB.


Papers
Journal ArticleDOI
TL;DR: The “adaptation-chain” of (MPEG-conformant) metadata-based adaptation is described: from the creation stage at the server side, through its usage in the network, up to the consumption at the client, including how the metadata are used to steer the adaptation processes.
Abstract: The ADMITS project (Adaptation in Distributed Multimedia IT Systems) is building an experimental distributed multimedia system for investigations into adaptation, which we consider an increasingly important tool for multimedia systems. A number of possible adaptation entities (server, proxy, clients, routers) are being explored, different algorithms for media, component and application-level adaptations are being implemented and evaluated, and experimental data are being derived to gain insight into when, where and how to adapt, and how individual, distributed adaptation steps interoperate and interact with each other. In this paper the “adaptation-chain” of (MPEG-conforming) metadata based adaptation is described: from the creation stage at the server side, through its usage in the network (actually in a proxy), up to the consumption at the client. The metadata are used to steer the adaptation processes. MPEG-conformant metadata, the so-called variation descriptions, are introduced; an example of a complete MPEG-7 document describing temporal scaling of an MPEG-4 video is given. The meta-database designed to store the metadata is briefly discussed. We describe how the metadata can be extracted from MPEG-4 visual elementary streams and initial results from a temporal video scaling experiment are given. We further present how the metadata can be utilized by enhanced cache replacement algorithms in a proxy server in order to realize quality-based caching; experimental results using these algorithms are also given. Finally, an adaptive query and presentation interface to the meta- and media database is outlined.
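The quality-based caching idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each cached video carries metadata listing its available variations (quality and size), and that the proxy prefers to downgrade the least recently used video to a smaller variation before evicting it. The class names, the LRU victim choice, and the downgrade policy are illustrative assumptions, not the replacement algorithms evaluated in the paper.

```python
from dataclasses import dataclass


@dataclass
class Variation:
    quality: float  # relative quality, 1.0 = original (hypothetical scale)
    size: int       # storage cost of this variation


@dataclass
class CachedVideo:
    name: str
    variations: list      # sorted from highest to lowest quality
    level: int = 0        # index of the variation currently stored
    last_access: int = 0  # logical timestamp used for LRU ordering

    @property
    def size(self):
        return self.variations[self.level].size


class QualityAwareCache:
    """Toy proxy cache: reduce quality first, evict only as a last resort."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.videos = {}
        self.clock = 0

    def used(self):
        return sum(v.size for v in self.videos.values())

    def insert(self, name, variations):
        self.clock += 1
        ordered = sorted(variations, key=lambda v: -v.quality)
        self.videos[name] = CachedVideo(name, ordered, last_access=self.clock)
        while self.used() > self.capacity:
            # Assumed policy: pick the least recently used video as the victim.
            victim = min(self.videos.values(), key=lambda v: v.last_access)
            if victim.level + 1 < len(victim.variations):
                victim.level += 1              # keep it, at lower quality
            else:
                del self.videos[victim.name]   # nothing smaller left: evict


cache = QualityAwareCache(capacity=100)
cache.insert("a", [Variation(1.0, 60), Variation(0.5, 30)])
cache.insert("b", [Variation(1.0, 60), Variation(0.5, 30)])
```

After the second insert, video "a" is kept at its lower-quality variation instead of being dropped, which is the behaviour the variation metadata is meant to enable in this sketch.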

48 citations

Journal ArticleDOI
TL;DR: An abstract semantic model based on an augmented transition network (ATN) is presented, which provides three major capabilities: multimedia presentations, temporal/spatial multimedia database searching, and multimedia browsing.
Abstract: As more information sources become available in multimedia systems, the development of abstract semantic models for video, audio, text, and image data is becoming very important. An abstract semantic model has two requirements: it should be rich enough to provide a friendly interface of multimedia presentation synchronization schedules to the users and it should be a good programming data structure for implementation in order to control multimedia playback. An abstract semantic model based on an augmented transition network (ATN) is presented. The inputs for ATNs are modeled by multimedia input strings. Multimedia input strings provide an efficient means for iconic indexing of the temporal/spatial relations of media streams and semantic objects. An ATN and its subnetworks are used to represent the appearing sequence of media streams and semantic objects. The arc label is a substring of a multimedia input string. In this design, a presentation is driven by a multimedia input string. Each subnetwork has its own multimedia input string. Database queries relative to text, image, and video can be answered via substring matching at subnetworks. Multimedia browsing allows users the flexibility to select any part of the presentation they prefer to see. This means that the ATN and its subnetworks can be included in multimedia database systems which are controlled by a database management system (DBMS). User interactions and loops are also provided in an ATN. Therefore, ATNs provide three major capabilities: multimedia presentations, temporal/spatial multimedia database searching, and multimedia browsing.
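As a rough illustration of answering queries by substring matching on multimedia input strings, the sketch below uses a hypothetical encoding: each whitespace-separated step names the media streams shown together, joined by "&" (for example "V1&T1" for video 1 with text 1). The encoding, the subnetwork names, and the function names are assumptions made for this example; the actual notation used with ATNs may differ.

```python
def parse(input_string):
    """Split an input string into steps; each step is a set of concurrent streams."""
    return [set(step.split("&")) for step in input_string.split()]


def matches(input_string, query):
    """True if the query steps occur as a contiguous run of the input string.

    A query step matches a presentation step when every stream it names
    is present in that step.
    """
    steps, wanted = parse(input_string), parse(query)
    n = len(wanted)
    return any(
        all(w <= s for w, s in zip(wanted, steps[i:i + n]))
        for i in range(len(steps) - n + 1)
    )


def search(subnetworks, query):
    """Return the names of subnetworks whose input string matches the query."""
    return [name for name, s in subnetworks.items() if matches(s, query)]


# Hypothetical subnetworks, each with its own multimedia input string.
subnetworks = {
    "intro": "V1&T1 V2&T1 I1",
    "demo": "V2&T1 I1 A1&I2",
    "credits": "T2 A2",
}
```

With these definitions, `search(subnetworks, "V2 I1")` returns the subnetworks in which video 2 is immediately followed by image 1.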

48 citations

Proceedings ArticleDOI
01 Feb 2000
TL;DR: This paper proposes 3 techniques, namely Reuse (RU), Full Reconstruction (FR) and Selective Reconstruction (SR), that cache certain information during the execution of the previous iterations of the query and use that cached information to save the execution cost (both I/O and CPU costs) during the subsequent iterations.
Abstract: ness of query refinement, there exists no work on how to implement query refinement efficiently in a multimedia database system. We explore such approaches in this paper. The proposed approaches are independent of the refinement model used (e.g., QPM or QEX) and hence work for all models. We assume that each feature is indexed using a multidimensional index structure (called the F-index). A similarity query can then be answered by executing a k-NN query on each F-index and merging the individual feature results to obtain the final results. Our first contribution is to generalize the notion of similarity queries and allow multiple query points in a query (referred to as multipoint queries). This generalization is necessary since refined queries cannot be always expressed as single point queries. We develop a k-NN algorithm that can handle multipoint queries and show that it performs significantly better than the naive approach (i.e. execute several single point queries using the ‘single-point’ kNN algorithm and merge results). The second and the main problem we address is how to evaluate refined queries efficiently. A naive approach is to treat a refined query just like a starting query and execute it from scratch. We observe that the refined queries are not modified drastically from one iteration to another. As a result, most of the execution cost can be saved by appropriately exploiting the information generated during the previous iterations of the query. We propose 3 techniques, namely Reuse (RU), Full Reconstruction (FR) and Selective Reconstruction (SR), that cache certain information during the execution of the previous iterations of the query and use that cached information to save the execution cost (both I/O and CPU costs) during the subsequent iterations. We define notions of I/O optimality and CPU optimality and evaluate the three schemes in terms of these criteria. 
We find that although RU is significantly better than the naive approach, it is not I/O optimal. So we propose FR and show that it is I/O optimal and is significantly more efficient compared to RU. However, FR is not CPU optimal. We finally propose SR and show that SR, like FR, is I/O optimal, and, under certain conditions, also CPU optimal. Even though SR is not always CPU optimal, it is always better than FR in terms of CPU cost and is hence the best technique. Our experiments show that the above techniques speed up the execution of refined queries by several orders of magnitude compared to the naive technique.
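To make the notion of a multipoint query concrete, here is a minimal sketch that ranks objects by an aggregate distance to several query points. It assumes the aggregate is a weighted sum of Euclidean distances and uses a linear scan instead of a multidimensional index, so it illustrates only the query semantics, not the index-based k-NN algorithm or the RU/FR/SR caching techniques from the paper.

```python
import heapq
import math


def multipoint_distance(obj, query_points, weights):
    """Aggregate distance: weighted sum of distances to all query points (assumed)."""
    return sum(w * math.dist(obj, q) for q, w in zip(query_points, weights))


def multipoint_knn(objects, query_points, weights, k):
    """Return the k objects closest to the multipoint query, nearest first."""
    return heapq.nsmallest(
        k, objects, key=lambda o: multipoint_distance(o, query_points, weights)
    )


# Hypothetical 2-D feature vectors and a refined query with two weighted points.
objects = [(0, 0), (2, 0), (4, 0), (2, 3), (10, 0)]
query_points = [(0, 0), (4, 0)]
weights = [0.75, 0.25]
nearest = multipoint_knn(objects, query_points, weights, k=2)
```

The naive alternative described in the abstract would run one single-point k-NN query per query point and merge the results; computing the aggregate distance directly avoids that merge step.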

47 citations

Book ChapterDOI
01 Jan 2005
TL;DR: In this chapter, an innovative shot boundary detection method using an unsupervised segmentation algorithm and object tracking based on the segmentation mask maps is presented. Results show that the method can obtain object-level information about the video frames as well as accurate shot boundary detection, both of which are very useful for video content indexing.
Abstract: Recently, multimedia information, especially video data, has been made overwhelmingly accessible with the rapid advances in communication and multimedia computing technologies. Video is popular in many applications, which makes the efficient management and retrieval of the growing amount of video information very important. Toward such a demand, an effective video shot boundary detection method is necessary, which is a fundamental operation required in many multimedia applications. In this chapter, an innovative shot boundary detection method using an unsupervised segmentation algorithm and the technique of object tracking based on the segmentation mask maps is presented. A series of experiments on various types of video are performed, and the experimental results show that our method can obtain object-level information of the video frames as well as accurate shot boundary detection, which are very useful for video content indexing.
This chapter (Chen, Shyu, & Zhang) appears in the book Video Data Management and Information Retrieval by Sagarmay Deb. Copyright © 2005, IRM Press, an imprint of Idea Group Inc. Copying or distributing in print or electronic forms without written permission of Idea Group Inc. is prohibited.
INTRODUCTION
Unlike traditional database systems that have text or numerical data, a multimedia database or information system may contain different media such as text, image, audio, and video. Video, in particular, has become more and more popular in many applications such as education and training, video conferencing, video-on-demand (VOD), and news services.
The traditional way for the users to search for certain content in a video is to fast-forward or rewind, which are sequential processes, making it difficult for the users to browse a video sequence directly based on their interests. Hence, it becomes important to be able to organize video data and provide the visual content in compact forms in multimedia applications (Zabih, Miller, & Mai, 1995). In many multimedia applications such as digital libraries and VOD, video shot boundary detection is fundamental and must be performed prior to all other processes (Shahraray, 1995; Zhang & Smoliar, 1994). A video shot is a video sequence that consists of continuous video frames for one action, and shot boundary detection is an operation to divide the video data into physical video shots. Many video shot boundary detection methods have been proposed in the literature. Most of them use low-level global features in the matching process between two consecutive frames for shot boundary detection, for example, using the luminance pixel-wise difference (Zhang, Kankanhalli, & Smoliar, 1993), luminance or color histogram difference (Swanberg, Shu, & Jain, 1993), edge difference (Zabih et al., 1995), and the orientation histogram (Ngo, Pong, & Chin, 2000). However, these low-level features cannot provide satisfactory results for shot boundary detection since luminance or color is sensitive to small changes. For example, Yeo and Liu (1995) proposed a method that uses the luminance histogram difference of DC images, which is very sensitive to luminance changes. There are also approaches focusing on the compressed video data domain. For example, Lee, Kim, and Choi (2000) proposed a fast scene/shot change detection method, and Hwang and Jeong (1998) proposed the directional information retrieving method by using the discrete cosine transform (DCT) coefficients in MPEG video data. 
In addition, dynamic and adaptive threshold determination is also applied to enhance the accuracy and robustness of the existing techniques in shot cuts detection (Alattar, 1997; Gunsel, Ferman, & Tekalp, 1998; Truong, Dorai, & Venkatesh, 2000). In Gunsel et al. (1998), the unsupervised clustering algorithm proposed a generic technique that does not need threshold setting and allows multiple features to be used simultaneously; while an adaptive threshold determination method that reduces the artifacts created by noise and motion in shot change detection was proposed by Truong et al. (2000). In this chapter, we present an innovative shot boundary detection method using an unsupervised image-segmentation algorithm and the object-tracking technique on the uncompressed video data. In our method, the image-segmentation algorithm extracts the segmentation mask map of each video frame automatically, which can be deemed as the clustering feature map of each frame and where the pixels in each frame have been grouped into different classes (e.g., two classes). Then the difference between the segmentation mask maps of two frames is checked. Moreover, due to camera panning and tilting, we propose an object-tracking method based on the segmentation results to enhance the matching. The cost for object tracking is almost trivial since the segmentation results are already available. In addition, the bounding boxes and the positions of
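The mask-map comparison step described above can be sketched as follows. This is a simplified stand-in, not the chapter's method: the two-class segmentation is approximated by thresholding each grayscale frame at its own mean intensity, the object-tracking refinement is omitted, and the function names and the 0.3 threshold are assumptions chosen for illustration.

```python
def segment(frame):
    """Two-class mask map: 1 where a pixel is brighter than the frame mean.

    Stand-in for an unsupervised segmentation algorithm (assumption).
    """
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return [[1 if p > mean else 0 for p in row] for row in frame]


def mask_difference(mask_a, mask_b):
    """Fraction of pixels whose class label differs between two mask maps."""
    total = sum(len(row) for row in mask_a)
    changed = sum(
        x != y for row_a, row_b in zip(mask_a, mask_b) for x, y in zip(row_a, row_b)
    )
    return changed / total


def detect_shot_boundaries(frames, threshold=0.3):
    """Indices i where frame i starts a new shot (mask maps differ strongly)."""
    masks = [segment(f) for f in frames]
    return [
        i for i in range(1, len(masks))
        if mask_difference(masks[i - 1], masks[i]) > threshold
    ]


# Tiny synthetic grayscale frames: two similar frames, then a different scene.
frame_a = [[10, 10, 200, 200], [10, 10, 200, 200]]
frame_a2 = [[12, 11, 198, 205], [12, 11, 198, 205]]
frame_b = [[200, 200, 10, 10], [200, 200, 10, 10]]
boundaries = detect_shot_boundaries([frame_a, frame_a2, frame_b, frame_b])
```

Small luminance changes between `frame_a` and `frame_a2` leave the mask map unchanged, which reflects why a class-level comparison can be less sensitive than pixel-wise or histogram differences.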

47 citations

Journal ArticleDOI
01 Sep 1996
TL;DR: This work fixes the query execution strategy and develops a site-independent MDO dependency graph representation to model the dependencies among the MDOs accessed by a query, and formulates the data allocation problem as an optimization problem.
Abstract: A major cost in retrieving multimedia data from multiple sites is the cost incurred in transferring multimedia data objects (MDOs) from different sites to the site where the query is initiated. The objective of a data allocation algorithm is to locate the MDOs at different sites so as to minimize the total data transfer cost incurred in executing a given set of queries. The optimal allocation of MDOs depends on the query execution strategy employed by a distributed multimedia system while the query execution strategy optimizes a query based on this allocation. We fix the query execution strategy and develop a site-independent MDO dependency graph representation to model the dependencies among the MDOs accessed by a query. Given the MDO dependency graphs as well as the set of multimedia database sites, data transfer costs between the sites, the allocation limit on the number of MDOs that can be allocated at a site, and the query execution frequencies from the sites, an allocation scheme is generated. We formulate the data allocation problem as an optimization problem. We solve this problem with a number of techniques that broadly belong to three classes: max-flow min-cut, state-space search, and graph partitioning heuristics. The max-flow min-cut technique formulates the data allocation problem as a network-flow problem, and uses a hill-climbing approach to try to find the optimal solution. For the state-space search approach, the problem is solved using a best-first search algorithm. The graph partitioning approach uses two clustering heuristics, the agglomerative clustering and divisive clustering. We evaluate and compare these approaches, and assess their cost-performance trade-offs. All algorithms are also compared with optimal solutions obtained through exhaustive search. Conclusions are also made on the suitability of these approaches to different scenarios.
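The optimization problem can be illustrated on a toy instance with the exhaustive-search baseline that the abstract mentions for comparison. The cost model here is a simplifying assumption: each query transfers every MDO it accesses from the MDO's site to the query's site, weighted by query frequency and MDO size, and the MDO dependency graph is ignored. All names and numbers are hypothetical.

```python
from itertools import product


def total_cost(allocation, queries, transfer_cost, sizes):
    """Total transfer cost of an allocation (dict: MDO -> site)."""
    cost = 0
    for site, frequency, mdos in queries:
        for m in mdos:
            cost += frequency * sizes[m] * transfer_cost[allocation[m]][site]
    return cost


def exhaustive_allocation(mdos, n_sites, limit, queries, transfer_cost, sizes):
    """Try every assignment that respects the per-site limit; return (cost, allocation)."""
    best = None
    for assignment in product(range(n_sites), repeat=len(mdos)):
        if any(assignment.count(s) > limit for s in range(n_sites)):
            continue  # violates the allocation limit at some site
        allocation = dict(zip(mdos, assignment))
        cost = total_cost(allocation, queries, transfer_cost, sizes)
        if best is None or cost < best[0]:
            best = (cost, allocation)
    return best


# Toy instance: two sites, three MDOs, at most two MDOs per site.
mdos = ["a", "b", "c"]
sizes = {"a": 10, "b": 5, "c": 1}
transfer_cost = [[0, 1], [1, 0]]  # cost per unit of data between sites
queries = [                       # (issuing site, frequency, MDOs accessed)
    (0, 3, ["a", "b"]),
    (1, 2, ["b", "c"]),
    (1, 1, ["a"]),
]
best = exhaustive_allocation(mdos, 2, 2, queries, transfer_cost, sizes)
```

The search space grows as the number of sites raised to the number of MDOs, which is why heuristics such as max-flow min-cut, best-first search, or graph partitioning are needed beyond small instances.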

46 citations


Network Information
Related Topics (5)
Server
79.5K papers, 1.4M citations
77% related
Graph (abstract data type)
69.9K papers, 1.2M citations
75% related
Wireless sensor network
142K papers, 2.4M citations
75% related
Mobile computing
51.3K papers, 1M citations
75% related
Feature extraction
111.8K papers, 2.1M citations
74% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    2
2022    4
2021    13
2020    6
2019    11
2018    24