scispace - formally typeset
Search or ask a question
JournalISSN: 1520-9210

IEEE Transactions on Multimedia 

Institute of Electrical and Electronics Engineers
About: IEEE Transactions on Multimedia is an academic journal published by Institute of Electrical and Electronics Engineers. The journal publishes majorly in the area(s): Computer science & Artificial intelligence. It has an ISSN identifier of 1520-9210. Over the lifetime, 4355 publications have been published receiving 175248 citations. The journal is also known as: Transactions on multimedia & Institute of Electrical and Electronics Engineers transactions on multimedia.


Papers
More filters
Journal ArticleDOI
Xiaojun Hei, Chao Liang1, Jian Liang1, Yong Liu1, Keith W. Ross1 
TL;DR: In this paper, an in-depth measurement study of one of the most popular P2P IPTV systems, namely, PPLive, has been conducted, which enables the authors to study the global characteristics of the mesh-pull peer-to-peer IPTV system.
Abstract: An emerging Internet application, IPTV, has the potential to flood Internet access and backbone ISPs with massive amounts of new traffic. Although many architectures are possible for IPTV video distribution, several mesh-pull P2P architectures have been successfully deployed on the Internet. In order to gain insights into mesh-pull P2P IPTV systems and the traffic loads they place on ISPs, we have undertaken an in-depth measurement study of one of the most popular IPTV systems, namely, PPLive. We have developed a dedicated PPLive crawler, which enables us to study the global characteristics of the mesh-pull PPLive system. We have also collected extensive packet traces for various different measurement scenarios, including both campus access networks and residential access networks. The measurement results obtained through these platforms bring important insights into P2P IPTV systems. Specifically, our results show the following. 1) P2P IPTV users have the similar viewing behaviors as regular TV users. 2) During its session, a peer exchanges video data dynamically with a large number of peers. 3) A small set of super peers act as video proxy and contribute significantly to video data uploading. 4) Users in the measured P2P IPTV system still suffer from long start-up delays and playback lags, ranging from several seconds to a couple of minutes. Insights obtained in this study will be valuable for the development and deployment of future P2P IPTV systems.

1,070 citations

Journal ArticleDOI
TL;DR: The Rotation Region Proposal Networks are designed to generate inclined proposals with text orientation angle information that are adapted for bounding box regression to make the proposals more accurately fit into the text region in terms of the orientation.
Abstract: This paper introduces a novel rotation-based framework for arbitrary-oriented text detection in natural scene images. We present the Rotation Region Proposal Networks , which are designed to generate inclined proposals with text orientation angle information. The angle information is then adapted for bounding box regression to make the proposals more accurately fit into the text region in terms of the orientation. The Rotation Region-of-Interest pooling layer is proposed to project arbitrary-oriented proposals to a feature map for a text region classifier. The whole framework is built upon a region-proposal-based architecture, which ensures the computational efficiency of the arbitrary-oriented text detection compared with previous text detection systems. We conduct experiments using the rotation-based framework on three real-world scene text detection datasets and demonstrate its superiority in terms of effectiveness and efficiency over previous approaches.

1,002 citations

Journal ArticleDOI
TL;DR: A novel watermarking algorithm based on singular value decomposition (SVD) is proposed and results show that the newwatermarking method performs well in both security and robustness.
Abstract: Digital watermarking has been proposed as a solution to the problem of copyright protection of multimedia documents in networked environments. There are two important issues that watermarking algorithms need to address. First, watermarking schemes are required to provide trustworthy evidence for protecting rightful ownership. Second, good watermarking schemes should satisfy the requirement of robustness and resist distortions due to common image manipulations (such as filtering, compression, etc.). In this paper, we propose a novel watermarking algorithm based on singular value decomposition (SVD). Analysis and experimental results show that the new watermarking method performs well in both security and robustness.

978 citations

Journal ArticleDOI
TL;DR: This paper addresses the problem of streaming packetized media over a lossy packet network in a rate-distortion optimized way, and derives a fast practical algorithm for nearly optimal streaming and a general purpose iterative descent algorithm for locally optimal streaming in arbitrary scenarios.
Abstract: This paper addresses the problem of streaming packetized media over a lossy packet network in a rate-distortion optimized way. We show that although the data units in a media presentation generally depend on each other according to a directed acyclic graph, the problem of rate-distortion optimized streaming of an entire presentation can be reduced to the problem of error-cost optimized transmission of an isolated data unit. We show how to solve the latter problem in a variety of scenarios, including the important common scenario of sender-driven streaming with feedback over a best-effort network, which we couch in the framework of Markov decision processes. We derive a fast practical algorithm for nearly optimal streaming in this scenario, and we derive a general purpose iterative descent algorithm for locally optimal streaming in arbitrary scenarios. Experimental results show that systems based on our algorithms have steady-state gains of 2-6 dB or more over systems that are not rate-distortion optimized. Furthermore, our systems essentially achieve the best possible performance: the operational distortion-rate function of the source at the capacity of the packet erasure channel.

736 citations

Journal ArticleDOI
TL;DR: SAF R-CNN as discussed by the authors introduces multiple built-in subnetworks which detect pedestrians with scales from disjoint ranges, and outputs from all of the sub-networks are then adaptively combined to generate the final detection results that are shown to be robust to large variance in instance scales.
Abstract: In this paper, we consider the problem of pedestrian detection in natural scenes. Intuitively, instances of pedestrians with different spatial scales may exhibit dramatically different features. Thus, large variance in instance scales, which results in undesirable large intracategory variance in features, may severely hurt the performance of modern object instance detection methods. We argue that this issue can be substantially alleviated by the divide-and-conquer philosophy. Taking pedestrian detection as an example, we illustrate how we can leverage this philosophy to develop a Scale-Aware Fast R-CNN (SAF R-CNN) framework. The model introduces multiple built-in subnetworks which detect pedestrians with scales from disjoint ranges. Outputs from all of the subnetworks are then adaptively combined to generate the final detection results that are shown to be robust to large variance in instance scales, via a gate function defined over the sizes of object proposals. Extensive evaluations on several challenging pedestrian detection datasets well demonstrate the effectiveness of the proposed SAF R-CNN. Particularly, our method achieves state-of-the-art performance on Caltech [P. Dollar, C. Wojek, B. Schiele, and P. Perona, “Pedestrian detection: An evaluation of the state of the art,” IEEE Trans. Pattern Anal. Mach. Intell. , vol. 34, no. 4, pp. 743–761, Apr. 2012], and obtains competitive results on INRIA [N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. , 2005, pp. 886–893], ETH [A. Ess, B. Leibe, and L. V. Gool, “Depth and appearance for mobile scene analysis,” in Proc. Int. Conf. Comput. Vis ., 2007, pp. 1–8], and KITTI [A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for autonomous driving? The KITTI vision benchmark suite,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit ., 2012, pp. 3354–3361].

716 citations

Performance
Metrics
No. of papers from the Journal in previous years
YearPapers
2023632
2022860
2021497
2020269
2019248
2018242