
Showing papers in "Signal Processing: Image Communication" in 2011


Journal ArticleDOI
TL;DR: The need for a color filter array is discussed and a survey of several techniques proposed for demosaicking is presented, discussing their performance.
Abstract: Demosaicking is the process of reconstructing a full-resolution color image from the sampled data acquired by a digital camera that applies a color filter array to a single sensor. This paper discusses the need for a color filter array and presents a survey of several techniques proposed for demosaicking. A comparison between the different methods is also provided, discussing their performance.
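As a rough illustration of the reconstruction step the abstract describes (not of any specific method in the survey), a minimal bilinear demosaicking sketch in Python, assuming an RGGB Bayer layout, might look like this:

```python
import numpy as np
from scipy.ndimage import convolve

def bilinear_demosaic(raw):
    """Bilinear demosaicking of a single-sensor capture, assuming an RGGB Bayer layout."""
    raw = raw.astype(float)
    h, w = raw.shape
    r_mask = np.zeros((h, w), bool); r_mask[0::2, 0::2] = True
    b_mask = np.zeros((h, w), bool); b_mask[1::2, 1::2] = True
    g_mask = ~(r_mask | b_mask)

    # Standard bilinear kernels: green is interpolated from its 4 axial neighbours,
    # red/blue from axial and diagonal neighbours.
    k_g  = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]], float) / 4.0
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], float) / 4.0

    out = np.zeros((h, w, 3))
    out[..., 0] = convolve(raw * r_mask, k_rb, mode='mirror')  # red plane
    out[..., 1] = convolve(raw * g_mask, k_g,  mode='mirror')  # green plane
    out[..., 2] = convolve(raw * b_mask, k_rb, mode='mirror')  # blue plane
    return out
```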

162 citations


Journal ArticleDOI
TL;DR: The commutative property of the proposed method makes it possible to cipher a watermarked image without interfering with the embedded signal, or to watermark an encrypted image while still allowing perfect deciphering.
Abstract: In this paper a commutative watermarking and ciphering scheme for digital images is presented. The commutative property of the proposed method makes it possible to cipher a watermarked image without interfering with the embedded signal, or to watermark an encrypted image while still allowing perfect deciphering. Both operations are performed in a parametric transform domain: the Tree Structured Haar transform. The key dependence of the adopted transform domain increases the security of the overall system. In fact, without knowledge of the generating key it is not possible to extract any useful information from the ciphered-watermarked image. Experimental results show the effectiveness of the proposed scheme.

136 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed algorithm can tolerate almost all the typical image processing manipulations, including JPEG compression, geometric distortion, blur, addition of noise, and enhancement.
Abstract: Image authentication has become a pressing issue in the digital world, as images can easily be tampered with using image editing techniques. In this paper, a novel robust hashing method for image authentication is proposed. The reported scheme first performs the Radon transform (RT) on the image and calculates the moment features, which are invariant to translation and scaling in the projection space. Then the discrete Fourier transform (DFT) is applied to the moment features to resist rotation. Finally, the magnitude of the significant DFT coefficients is normalized and quantized to form the image hash bits. Experimental results show that the proposed algorithm can tolerate almost all typical image processing manipulations, including JPEG compression, geometric distortion, blur, addition of noise, and enhancement. Compared with other approaches in the literature, the reported method is more effective for image authentication in terms of detection performance and hash size.
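The pipeline outlined above (Radon projections, invariant moment features, DFT, quantization) can be sketched roughly as follows; the particular moment, the number of angles and the quantizer are illustrative assumptions, not the paper's choices:

```python
import numpy as np
from skimage.transform import radon

def radon_moment_hash(image, n_angles=180, n_bits=64):
    """Toy robust hash: per-angle moment feature -> DFT magnitude -> binary quantization."""
    theta = np.linspace(0., 180., n_angles, endpoint=False)
    sinogram = radon(image.astype(float), theta=theta, circle=False)  # shape: (r, n_angles)

    # Centroid-centred second moment per projection, normalised across angles
    # (a crude stand-in for the paper's translation/scale-invariant moment features).
    r = np.arange(sinogram.shape[0])[:, None]
    mass = sinogram.sum(axis=0) + 1e-12
    centroid = (r * sinogram).sum(axis=0) / mass
    feat = ((r - centroid) ** 2 * sinogram).sum(axis=0) / mass
    feat /= feat.mean() + 1e-12

    # DFT over the angle axis; magnitudes are unaffected by a circular shift (rotation).
    mag = np.abs(np.fft.fft(feat))[1:n_bits + 1]
    return (mag > np.median(mag)).astype(np.uint8)  # hash bits
```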

124 citations


Journal ArticleDOI
TL;DR: A view-based 3D model retrieval algorithm in which a many-to-many matching method, weighted bipartite graph matching, is employed for comparison between two 3D models.
Abstract: In this paper, we propose a view-based 3D model retrieval algorithm in which a many-to-many matching method, weighted bipartite graph matching, is employed for comparison between two 3D models. In this work, each 3D model is represented by a set of 2D views. Representative views are first selected from the query model and the corresponding initial weights are provided. These initial weights are further updated based on the relationships among these representative views. The weighted bipartite graph is built with these selected 2D views, and the matching result is used to measure the similarity between two 3D models. Experimental results and comparison with existing methods show the effectiveness of the proposed algorithm.
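For the matching step only, a toy weighted bipartite comparison between two sets of view features could be sketched as below; the feature extraction, view selection and weight-update steps of the paper are not modelled:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def model_distance(query_feats, query_weights, target_feats):
    """Toy many-to-many comparison of two view sets via weighted bipartite matching.

    query_feats:   (m, d) feature vectors of the query model's representative views
    query_weights: (m,)   weights of those views (assumed already computed)
    target_feats:  (n, d) feature vectors of the target model's views
    """
    # Pairwise view-to-view distances, scaled by the query view weights.
    diff = query_feats[:, None, :] - target_feats[None, :, :]
    cost = np.linalg.norm(diff, axis=2) * query_weights[:, None]

    # Minimum-cost bipartite matching (Hungarian algorithm).
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].sum()   # smaller = more similar
```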

117 citations


Journal ArticleDOI
TL;DR: This paper proposes a method to compute a set of optimal client strategies that trade off overall video quality against continuous playback through proper selection of the next chunk from the encoded versions.
Abstract: In state-of-the-art adaptive streaming solutions, to cope with varying network conditions, the client side can switch between several video copies encoded at different bit-rates during streaming. Each video copy is divided into chunks of equal duration. To achieve continuous video playback, each chunk needs to arrive at the client before its playback deadline. The perceptual quality of a chunk increases with the chunk size in bits, whereas bigger chunks require more transmission time and, as a result, have a higher risk of missing the transmission deadline. Therefore, there is a trade-off between the overall video quality and continuous playback, which can be optimized by proper selection of the next chunk from the encoded versions. This paper proposes a method to compute a set of optimal client strategies for this purpose.
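The trade-off can be illustrated with a deliberately simple (greedy, non-optimal) client rule; the paper computes optimal strategies, which this sketch does not attempt:

```python
def pick_next_chunk(bitrates_kbps, chunk_sec, buffer_sec, est_bandwidth_kbps, safety=0.9):
    """Toy client strategy: pick the highest bitrate whose chunk is expected to finish
    downloading before the playback buffer drains. All parameters are illustrative."""
    for rate in sorted(bitrates_kbps, reverse=True):
        download_time = rate * chunk_sec / (safety * est_bandwidth_kbps)
        if download_time <= buffer_sec:
            return rate
    return min(bitrates_kbps)  # fall back to the lowest quality

# Example: 2-second chunks, 4 s of buffered video, ~3 Mbps estimated throughput.
print(pick_next_chunk([500, 1000, 2500, 5000], 2.0, 4.0, 3000))  # -> 5000
```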

82 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed scheme can discriminate malicious tampering from mild signal processing, and the tampered location can also be approximately determined according to the glide window and the predefined threshold.
Abstract: As H.264/AVC-based video products become more and more popular, issues of copyright protection and authentication that are appropriate for this standard will be very important. In this paper, a content-based authentication watermarking scheme for H.264/AVC video is proposed. Considering the new features of H.264/AVC, the content-based authentication code for spatial tampering is first generated using reliable features extracted from video frame blocks. The authentication code, which can detect malicious manipulations but allow recompression, is embedded into the DCT coefficients in diagonal positions using a novel modulation method. Spatial tampering can be located by comparing the extracted and the original feature-based watermarks. In addition, combining ECC and interleaving coding, the frame index of each video frame is used as watermark information and embedded in the residual coefficients. Temporal tampering can be detected by the mismatch between the extracted and the observed frame index. Experimental results show that the proposed scheme can discriminate malicious tampering from mild signal processing. The tampered location can also be approximately determined according to the glide window and the predefined threshold.

76 citations


Journal ArticleDOI
TL;DR: This work shows that the new video QA algorithms are highly responsive to packet loss errors, and proposes a general framework for constructing temporal video quality assessment (QA) algorithms that seek to assess transient temporal errors, such as packet losses.
Abstract: We examine the effect that variations in the temporal quality of videos have on global video quality. We also propose a general framework for constructing temporal video quality assessment (QA) algorithms that seek to assess transient temporal errors, such as packet losses. The proposed framework modifies simple frame-based quality assessment algorithms by incorporating a temporal quality variance factor. We use packet loss from channel errors as a specific study of practical significance. Using the PSNR and the SSIM index as exemplars, we are able to show that the new video QA algorithms are highly responsive to packet loss errors.
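A minimal sketch of the framework's idea, using PSNR as the frame-based exemplar and a variance-style temporal penalty with an assumed weight:

```python
import numpy as np

def psnr(ref, dist):
    """Frame-level PSNR for 8-bit frames."""
    mse = np.mean((ref.astype(float) - dist.astype(float)) ** 2)
    return 10 * np.log10(255.0 ** 2 / (mse + 1e-12))

def temporal_quality(ref_frames, dist_frames, lam=0.5):
    """Toy temporal pooling: penalize the mean frame quality by its temporal variability,
    so transient drops (e.g. packet-loss glitches) lower the overall score.
    The weight `lam` is an illustrative assumption, not the paper's factor."""
    scores = np.array([psnr(r, d) for r, d in zip(ref_frames, dist_frames)])
    return scores.mean() - lam * scores.std()
```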

67 citations


Journal ArticleDOI
TL;DR: Kurtosis-based NR quality measures for JPEG2000 compressed images are developed based on either 1-D or 2-D kurtosis in the discrete cosine transform (DCT) domain of general image blocks and demonstrate good consistency with subjective quality scores.
Abstract: No-reference (NR) image quality assessment (QA) presumes no prior knowledge of reference (distortion-free) images and seeks to quantitatively predict visual quality solely from the distorted images. We develop kurtosis-based NR quality measures for JPEG2000 compressed images in this paper. The proposed measures are based on either 1-D or 2-D kurtosis in the discrete cosine transform (DCT) domain of general image blocks. Comprehensive testing demonstrates their good consistency with subjective quality scores as well as satisfactory performance in comparison with both the representative full-reference (FR) and state-of-the-art NR image quality measures.
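A rough sketch of the kind of feature involved (block-wise DCT kurtosis); block size, DC handling and pooling are assumptions rather than the paper's exact measure:

```python
import numpy as np
from scipy.fft import dctn
from scipy.stats import kurtosis

def blockwise_dct_kurtosis(image, block=8):
    """Toy NR feature: average kurtosis of DCT coefficients over general image blocks."""
    h, w = image.shape
    ks = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            coeffs = dctn(image[y:y + block, x:x + block].astype(float), norm='ortho')
            ks.append(kurtosis(coeffs.ravel()[1:]))   # skip the DC coefficient
    return float(np.mean(ks))
```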

47 citations


Journal ArticleDOI
TL;DR: The experimental results show that TiBS does not provide high compression ratios, but it enables energy-efficient image communication, even for the source camera node, and even for high packet loss rates.
Abstract: This article presents a lightweight image compression algorithm explicitly designed for resource-constrained wireless camera sensors, called TiBS (tiny block-size image coding). TiBS operates on blocks of 2x2 pixels (this makes it easy for the end-user to conceal missing blocks due to packet losses) and is based on pixel removal. Furthermore, TiBS is combined with a chaotic pixel mixing scheme to reinforce the robustness of image communication against packet losses. For validation purposes, TiBS as well as a JPEG-like algorithm have been implemented on a real wireless camera sensor composed of a Mica2 mote and a Cyclops imager. The experimental results show that TiBS does not provide high compression ratios, but it enables energy-efficient image communication, even for the source camera node, and even for high packet loss rates. Considering an original 8-bpp grayscale image for instance, the amount of energy consumed by the Cyclops/Mica2 can be reduced by around 60% when the image is compressed using TiBS, compared to the scenario without compression. Moreover, the visual quality of reconstructed images is usually acceptable under packet loss conditions of up to 40-50%. In comparison, the JPEG-like algorithm results in clearly more energy consumption than TiBS at similar image quality and, of course, its resilience to packet losses is lower because of the larger size of encoded blocks. Adding redundant packets to the JPEG-encoded data packets may be considered to deal with packet losses, but the energy problem remains.
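To make the 2x2-block idea concrete, here is a toy sketch of splitting an image into independent 2x2 blocks and concealing lost ones at the decoder; TiBS's pixel removal and chaotic mixing are not modelled:

```python
import numpy as np

def split_2x2_blocks(img):
    """Split an image (H, W, both even) into independent 2x2 blocks, one per packet."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).swapaxes(1, 2).reshape(-1, 2, 2)

def reassemble(blocks, h, w, lost):
    """Rebuild the image; blocks flagged in `lost` are concealed by copying a
    neighbouring block (a crude stand-in for end-user concealment)."""
    grid = blocks.reshape(h // 2, w // 2, 2, 2).copy()
    lost = lost.reshape(h // 2, w // 2)
    for by in range(h // 2):
        for bx in range(w // 2):
            if lost[by, bx]:
                if by > 0:
                    grid[by, bx] = grid[by - 1, bx]   # copy block above
                elif bx > 0:
                    grid[by, bx] = grid[by, bx - 1]   # or block to the left
    return grid.swapaxes(1, 2).reshape(h, w)
```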

45 citations


Journal ArticleDOI
TL;DR: A simple visual quality metric is designed by considering the ABT-based JND masking properties; it performs comparably to state-of-the-art metrics and confirms that the ABT-based JND is consistent with the HVS.
Abstract: In this paper, we propose a novel Adaptive Block-size Transform (ABT) based Just-Noticeable Difference (JND) model for images/videos. Extension from the 8x8 Discrete Cosine Transform (DCT) based JND model to a 16x16 DCT based JND is first performed by considering both the spatial and temporal Human Visual System (HVS) properties. For still images or INTRA video frames, a new spatial selection strategy based on the Spatial Content Similarity (SCS) between a macroblock and its sub-blocks is proposed to determine the transform size to be employed to generate the JND map. For the INTER video frames, a temporal selection strategy based on the Motion Characteristic Similarity (MCS) between a macroblock and its sub-blocks is presented to decide the transform size for the JND. Compared with other JND models, our proposed scheme can tolerate more distortions while preserving better perceptual quality. In order to demonstrate the efficiency of the ABT-based JND in modeling the HVS properties, a simple visual quality metric is designed by considering the ABT-based JND masking properties. Evaluated on image and video subjective databases, the proposed metric delivers a performance comparable to the state-of-the-art metrics. This confirms that the ABT-based JND is consistent with the HVS. The proposed quality metric is also applied to ABT-based H.264/Advanced Video Coding (AVC) for perceptual video coding. The experimental results demonstrate that the proposed method can deliver video sequences with higher visual quality at the same bit-rates.

43 citations


Journal ArticleDOI
Lihua Tian1, Nanning Zheng1, Jianru Xue1, Ce Li1, Xiaofeng Wang1 
TL;DR: Experimental results on standard benchmarks demonstrate that, compared with state-of-the-art watermarking schemes, the proposed method is more robust to white noise, filtering and JPEG compression attacks, and can effectively detect tampering and locate forgery.
Abstract: This paper proposes an integrated visual saliency-based watermarking approach, which can be used for both synchronous image authentication and copyright protection. Firstly, regions of interest (ROIs), which are not of a fixed size and present the most important information of an image, are extracted automatically using the proto-object based saliency attention model. Secondly, to resist common signal processing attacks, for each ROI, an improved quantization method is employed to embed the copyright information into its DCT coefficients. Finally, the edge map of one ROI is chosen as the fragile watermark, and is then embedded into the DWT domain of the watermarked image to further resist tampering attacks. Using ROI-based visual saliency as a bridge, the proposed method can achieve image authentication and copyright protection synchronously, and it can also preserve much more robust information. Experimental results on standard benchmarks demonstrate that, compared with state-of-the-art watermarking schemes, the proposed method is more robust to white noise, filtering and JPEG compression attacks. Furthermore, the results show that the proposed method can effectively detect tampering and locate forgery.

Journal ArticleDOI
TL;DR: The experimental results demonstrate the feasibility of the proposed scheme as the embedded watermarks can survive the allowed transcoding processes while the edited segments in the tampered video can be located.
Abstract: In this research, we propose a practical digital video watermarking scheme mainly used for authenticating H.264/AVC compressed videos to ensure their correct content order. The watermark signals, which represent the serial numbers of video segments, are embedded into nonzero quantization indices of frames to achieve both the effectiveness of watermarking and the compact data size. The human visual characteristics are taken into account to guarantee the imperceptibility of watermark signals and to attain an efficient implementation in H.264/AVC. The issues of synchronized watermark detections are settled by selecting the shot-change frames for calculating the distortion-resilient hash, which helps to determine the watermark sequence. The experimental results demonstrate the feasibility of the proposed scheme as the embedded watermarks can survive the allowed transcoding processes while the edited segments in the tampered video can be located.

Journal ArticleDOI
TL;DR: A new algorithm has been proposed for reducing the number of search locations in block matching based motion estimation that uses spatial correlation to eliminate neighboring blocks having low probability of being best match to the candidate block.
Abstract: A new algorithm has been proposed for reducing the number of search locations in block matching based motion estimation. This algorithm uses spatial correlation to eliminate neighboring blocks having low probability of being best match to the candidate block. Existing fast BMAs use a fixed pattern to find the motion vector of a macroblock. On the contrary, the proposed algorithm is independent of any such initially fixed search patterns. The decision to eliminate the neighborhood is taken dynamically based on a preset threshold Th. The extent to which the neighborhood can be eliminated is configured using the shift parameter δ. Thus, reduction in the number of search positions changes dynamically depending on input content. Experiments have been carried out for comparing the performance of the proposed algorithm with other existing BMAs. In addition, an Adaptive Neighborhood Elimination Algorithm (ANEA) has been proposed whereby the Th and δ parameters are updated adaptively.

Journal ArticleDOI
TL;DR: By analysing an extensive dataset from an operational IPTV provider – comprising 255 thousand users, 150 TV channels, and covering a 6-month period – it is observed that most channel switching events are relatively predictable: users very frequently switch linearly, up or down to the next TV channel.
Abstract: One of the major concerns of IPTV network deployment is channel change delay (also known as zapping delay). This delay can add up to 2 s or more, and its main culprits are synchronisation and buffering of the media streams. Proving the importance of the problem is the already significant amount of literature addressing it. We start this paper with a survey of techniques proposed to reduce IPTV channel change delay. Then, by analysing an extensive dataset from an operational IPTV provider – comprising 255 thousand users, 150 TV channels, and covering a 6-month period – we have observed that most channel switching events are relatively predictable: users very frequently switch linearly, up or down to the next TV channel. This fact motivated us to use this dataset to analyse in detail a specific type of solutions to this problem, namely, predictive pre-joining of TV channels. In these schemes each set top box (STB) simultaneously joins additional multicast groups (TV channels) along with the one that is requested by the user. If the user switches to any of these channels the switching latency is virtually eliminated, therefore not affecting the user's experience. We start by evaluating a simple scheme, where the neighbouring channels (i.e., channels adjacent to the requested one) are pre-joined by the STB during zapping periods. Notwithstanding the simplicity of this scheme, trace-driven simulations show that the zapping delay can be virtually eliminated for a significant percentage of channel switching requests. For example, when sending the previous and the next channel concurrently with the requested one, for only 1 min after a zapping event, switching delay is eliminated for close to half of all channel switching requests. Importantly, this result is achieved with a negligible increase of bandwidth utilisation in the access link. Other more complex schemes where user behaviour is tracked were also evaluated, but the improvement over the simple scheme was insignificant.
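The benefit of pre-joining adjacent channels can be estimated from a zapping trace with a few lines of code; this toy counter ignores the timing window (e.g. the 1-minute limit) used in the paper's simulations:

```python
def prejoin_hit_rate(zap_trace, neighbours=1):
    """Fraction of channel switches whose target was already pre-joined, assuming the STB
    pre-joins the `neighbours` channels directly above and below the current one.
    `zap_trace` is a chronological list of channel numbers for one STB."""
    hits = 0
    for prev, nxt in zip(zap_trace, zap_trace[1:]):
        if 0 < abs(nxt - prev) <= neighbours:
            hits += 1
    return hits / max(len(zap_trace) - 1, 1)

# Example trace: mostly linear zapping up/down, one direct jump.
print(prejoin_hit_rate([10, 11, 12, 11, 35, 36]))  # 4 of 5 switches hit a pre-joined channel
```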

Journal ArticleDOI
TL;DR: The results show that there is no fairness between peers, which is an important issue for the scalability of P2P-TV systems, and point out the lack of locality-aware mechanisms in these systems.
Abstract: P2P-TV is an emerging alternative to classical television broadcast systems. Leveraging the possibilities offered by the Internet, several companies offer P2P-TV services to their customers. The overwhelming majority of these systems, however, are of a closed nature, offering little insight into their traffic properties. For a better understanding of the P2P-TV landscape, we performed measurement experiments in France, Japan, Spain, and Romania, using different commercial applications. By using multiple measurement points in different locations around the world, our results can paint a global picture of the measured networks, inferring their main properties. More precisely, we focus on the level of collaboration between peers, their location, and the effect of the traffic on the networks. Our results show that there is no fairness between peers, which is an important issue for the scalability of P2P-TV systems. Moreover, hundreds of Autonomous Systems are involved in the P2P-TV traffic, which points out the lack of locality-aware mechanisms in these systems. The geographic location of peers testifies to the widespread use of these applications in Asia and highlights their worldwide usage.

Journal ArticleDOI
Xin Jin1, Satoshi Goto1
TL;DR: A difference detection algorithm is proposed to reduce the computational complexity and power consumption in surveillance video compression by automatically distributing the video data to different modules of the video encoder according to their content similarity features.
Abstract: As a state-of-the-art video compression technique, H.264/AVC has been deployed in many surveillance cameras to improve the compression efficiency. However, it induces very high coding complexity, and thus high power consumption. In this paper, a difference detection algorithm is proposed to reduce the computational complexity and power consumption in surveillance video compression by automatically distributing the video data to different modules of the video encoder according to their content similarity features. Without any requirement in changing the encoder hardware, the proposed algorithm provides high adaptability to be integrated into the existing H.264 video encoders. An average of over 82% of overall encoding complexity can be reduced regardless of whether or not the H.264 encoder itself has employed fast algorithms. No loss is observed in both subjective and objective video quality.
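A toy version of the content-similarity gating idea (not the paper's algorithm) could look like this:

```python
import numpy as np

def gate_macroblocks(prev_frame, cur_frame, mb=16, thresh=2.0):
    """Toy content-similarity gate for surveillance encoding: macroblocks whose mean
    absolute difference from the previous frame is below `thresh` can be routed to a
    cheap path (e.g. forced SKIP), the rest to full motion estimation.
    The threshold and the routing rule are illustrative assumptions."""
    h, w = cur_frame.shape
    skip_map = np.zeros((h // mb, w // mb), bool)
    for by in range(h // mb):
        for bx in range(w // mb):
            a = cur_frame[by * mb:(by + 1) * mb, bx * mb:(bx + 1) * mb].astype(float)
            b = prev_frame[by * mb:(by + 1) * mb, bx * mb:(bx + 1) * mb].astype(float)
            skip_map[by, bx] = np.abs(a - b).mean() < thresh
    return skip_map  # True = "similar" block, candidate for the low-complexity path
```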

Journal ArticleDOI
TL;DR: The proposed visual signal fidelity metric, which is called sparse correlation coefficient (SCC), is motivated by the need to capture the correlation between two sets of outputs from a sparse model of simple cell receptive fields.
Abstract: Image quality assessment (IQA) is of fundamental importance to numerous image processing applications. Generally, image quality metrics (IQMs) regard image quality as fidelity or similarity with a reference image in some perceptual space. Such a full-reference IQA method is a kind of comparison that involves measuring the similarity or difference between two signals in a perceptually meaningful way. Modeling of the human visual system (HVS) has been regarded as the most suitable way to achieve perceptual quality predictions. In fact, natural image statistics can be an effective approach to simulate the HVS, since statistical models of natural images reveal some important response properties of the HVS. A useful statistical model of natural images is sparse coding, which is equivalent to independent component analysis (ICA). It provides a very good description of the receptive fields of simple cells in the primary visual cortex. Therefore, such a statistical model can be used to simulate the visual processing at the level of the visual cortex when designing IQMs. In this paper, we propose a fidelity criterion for IQA that relates image quality with the correlation between a reference and a distorted image in the form of sparse code. The proposed visual signal fidelity metric, which is called sparse correlation coefficient (SCC), is motivated by the need to capture the correlation between two sets of outputs from a sparse model of simple cell receptive fields. The SCC represents the correlation between two visual signals of images in a cortical visual space. The experimental results after both polynomial and logistic regression demonstrate that SCC is superior to recent state-of-the-art IQMs both in single-distortion and cross-distortion tests.

Journal ArticleDOI
TL;DR: This paper designs a novel omnidirectional color checker and presents a method for establishing global correspondences to facilitate automatic color calibration without manual adjustment and shows high performance in achieving inter-camera color consistency and high dynamic range.
Abstract: This paper proposes a collaborative color calibration method for multi-camera systems. The multi-camera color calibration problem is formulated as an overdetermined linear system, in which a dynamic range shaping is incorporated to ensure the high contrasts for captured images. The cameras are calibrated with the parameters obtained by solving the linear system. For non-planar multi-camera systems, we design a novel omnidirectional color checker and present a method for establishing global correspondences to facilitate automatic color calibration without manual adjustment. According to experimental results on both synthetic and real-system datasets, the proposed method shows high performance in achieving inter-camera color consistency and high dynamic range. Thanks to the generality of the linear system formulation and the flexibility of the designed color checker, the proposed method is applicable to various multi-camera systems.
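The "overdetermined linear system" idea can be illustrated per camera with an ordinary least-squares fit; the paper's joint multi-camera formulation and dynamic range shaping are not modelled here:

```python
import numpy as np

def fit_color_correction(cam_rgb, target_rgb):
    """Toy per-camera calibration: least-squares affine transform mapping the colors a
    camera measured on shared checker patches (N, 3) to target colors (N, 3)."""
    A = np.hstack([cam_rgb, np.ones((cam_rgb.shape[0], 1))])   # (N, 4) affine design matrix
    M, *_ = np.linalg.lstsq(A, target_rgb, rcond=None)         # (4, 3) correction matrix
    return M

def apply_correction(M, rgb):
    """Apply the fitted affine correction to (N, 3) colors."""
    return np.hstack([rgb, np.ones((rgb.shape[0], 1))]) @ M
```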

Journal ArticleDOI
TL;DR: An image-dependent color space transform (ID-CCT) that optimally exploits inter-channel redundancy and is well suited to compression is proposed; its comparative performance has been evaluated and a significant improvement has been observed.
Abstract: Various contemporary compression standards from the Joint Photographic Experts Group exploit the correlation among the color components using a color space transform before the subband transform stage. The transforms used to de-correlate the colors are primarily fixed-kernel transforms, which are not suitable for a large class of images. In this paper, an image-dependent color space transform (ID-CCT) that optimally exploits inter-channel redundancy and is well suited to compression is proposed. The comparative performance has been evaluated and a significant improvement has been observed, objectively as well as subjectively, over other quantifiable methods.
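A minimal example of an image-adaptive decorrelating color transform is the per-image KLT over the three channels; this is offered only as an illustration of the non-fixed-kernel idea, not as the proposed ID-CCT:

```python
import numpy as np

def image_dependent_color_transform(img):
    """Per-image KLT over the color channels: eigenvectors of this image's
    inter-channel covariance form the decorrelating basis."""
    x = img.reshape(-1, 3).astype(float)
    mean = x.mean(axis=0)
    cov = np.cov((x - mean).T)                      # 3x3 inter-channel covariance
    _, vecs = np.linalg.eigh(cov)                   # eigenvectors, ascending eigenvalues
    basis = vecs[:, ::-1]                           # principal component first
    coeffs = (x - mean) @ basis
    return coeffs.reshape(img.shape), basis, mean   # invertible: x = coeffs @ basis.T + mean
```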

Journal ArticleDOI
TL;DR: This research presents a multi-resolution reversible data-hiding algorithm to enable multi-scale marked images that are transmitted progressively to be exactly recovered at the receiver side once hidden data has been extracted.
Abstract: This research presents a multi-resolution reversible data-hiding algorithm to enable multi-scale marked images that are transmitted progressively to be exactly recovered at the receiver side once hidden data has been extracted. Based on the spatially hierarchical multi-layer structures of progressive-image transmission, the proposed algorithm first decimates the incoming image pixels into a pre-specified number of hierarchical layers of pixels. Then, it modifies pixel values in each hierarchical layer by shifting the interpolated-difference-values histogram between two neighboring layers of pixels to embed secret information into the corresponding hierarchical layer images. The proposed algorithm offers a reversible data-hiding ability for applications that use progressive image transmission to render progressive-image authentication, information-tagging, covert communications, etc. With progressive-reversible data-hiding, users of progressive image transmission can receive each original progressive image and complete hidden messages related to the received progressive image. This allows users to make real-time definite decisions according to an application's requirements. In contrast to other reversible data-hiding schemes, the algorithm proposed in this study features reversible data-hiding in progressive-image transmission based on a hierarchical decimation and interpolation technique. The interpolating process is used to reduce the difference values between the target pixel values in one progressive layer and their interpolated ones. This increases the hiding capacity of interpolation-differences histogram shifting. The experimental results demonstrate that the proposed method provides a greater embedding capacity and maintains marked images at a higher quality. Moreover, the proposed method has a low computational complexity as it requires only simple arithmetic computations.
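The core histogram-shifting mechanism on a vector of interpolation differences can be sketched as follows; the hierarchical layer decimation of the proposed algorithm is not modelled:

```python
import numpy as np

def embed_by_histogram_shift(diffs, bits):
    """Toy histogram-shifting embedder on a 1-D integer array of interpolation differences.
    Differences equal to the peak value carry one bit each; values above the peak are
    shifted by +1 to create an empty bin. Generic mechanism only."""
    diffs = diffs.copy()
    vals, counts = np.unique(diffs, return_counts=True)
    peak = int(vals[np.argmax(counts)])

    diffs[diffs > peak] += 1                            # open the bin at peak + 1
    carriers = np.flatnonzero(diffs == peak)[:len(bits)]
    diffs[carriers] += np.asarray(bits[:len(carriers)], dtype=diffs.dtype)
    return diffs, peak

def extract_and_restore(diffs, peak, n_bits):
    """Read the bits back and invert the shift, restoring the original differences."""
    diffs = diffs.copy()
    carriers = np.flatnonzero((diffs == peak) | (diffs == peak + 1))[:n_bits]
    bits = (diffs[carriers] == peak + 1).astype(int)
    diffs[diffs > peak] -= 1                            # undo the shift (restores carriers too)
    return diffs, bits
```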

Journal ArticleDOI
TL;DR: Simulation results show that improvements in peak signal-to-noise ratio (PSNR), mean structural similarity index measure (MSSIM) and Watson distance of about 30%, 12% and 77%, respectively, are gained by an authorized user when 50% of the high-high (HH) coefficients are modulated.
Abstract: This paper proposes a joint data-hiding and data modulation scheme to serve the purpose of quality access control of image(s) using quantization index modulation (QIM). The combined effect of external information embedding and data modulation causes visual degradation and may be used in access control through a reversible process. The degree of deterioration depends on the amount of external data insertion, the amount of data modulation as well as the step size used in QIM. A weight function (q) and the quantization step size (Δb) of JPEG 2000 are used for defining the step size of QIM. Lifting based discrete wavelet transform (DWT), instead of conventional DWT, is used to decompose the original image in order to achieve advantages, namely low loss in image quality due to QIM, better watermark decoding reliability and high embedding capacity for a given embedding distortion. At the decoder, data are demodulated first and watermark bits are then extracted using minimum distance decoding. The extracted watermark is used to suppress self-noise (SN), which provides better quality of image. Simulation results show that improvements in peak signal-to-noise ratio (PSNR), mean structural similarity index measure (MSSIM) and Watson distance of about 30%, 12% and 77%, respectively, are gained by an authorized user when 50% of the high-high (HH) coefficients are modulated.
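For reference, basic QIM embedding and minimum-distance extraction (the building block the scheme relies on) can be written as:

```python
import numpy as np

def qim_embed(coeffs, bits, step):
    """Quantization index modulation: each coefficient is quantized to the lattice
    associated with its bit (offset 0 for bit 0, step/2 for bit 1)."""
    offset = np.asarray(bits) * step / 2.0
    return np.round((coeffs - offset) / step) * step + offset

def qim_extract(received, step):
    """Minimum-distance decoding: pick the lattice whose nearest point is closer."""
    d0 = np.abs(received - np.round(received / step) * step)
    d1 = np.abs(received - (np.round((received - step / 2.0) / step) * step + step / 2.0))
    return (d1 < d0).astype(int)
```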

Journal ArticleDOI
TL;DR: The method groups 3D shape-adaptive patches, whose surrounding cubic neighborhoods along spatial and temporal dimensions have been found similar by patch clustering, into 4D data structures with arbitrary shapes that can be represented very sparsely with a 4Dshape- Adaptive DCT.
Abstract: We present an effective patch-based video denoising algorithm that exploits both local and nonlocal correlations. The method groups 3D shape-adaptive patches, whose surrounding cubic neighborhoods along spatial and temporal dimensions have been found similar by patch clustering. Such grouping results in 4D data structures with arbitrary shapes. Since the obtained 4D groups are highly correlated along all the dimensions, they can be represented very sparsely with a 4D shape-adaptive DCT. The noise can be effectively attenuated by transform shrinkage. Experimental results on a wide range of videos show that this algorithm provides significant improvement over the state-of-the-art denoising algorithms in terms of both objective metric and subjective visual quality.

Journal ArticleDOI
TL;DR: This paper presents an effective processor chip for integer motion estimation (IME) in H.264/AVC based on the full-search block-matching algorithm (FSBMA), which uses an architecture with a configurable 2D systolic array to obtain high data reuse of the search area.
Abstract: Motion estimation (ME) is the most critical component of a video coding standard. H.264/AVC adopts variable block size motion estimation (VBSME) to obtain excellent coding efficiency, but the high computational complexity makes the design difficult. This paper presents an effective processor chip for integer motion estimation (IME) in H.264/AVC based on the full-search block-matching algorithm (FSBMA). It uses an architecture with a configurable 2D systolic array to obtain high data reuse of the search area. This systolic array supports a three-direction scan format in which only one row of pixels is changed between two adjacent subblocks, thus reducing the memory accesses and saving clock cycles. A computing array of 64 PEs calculates the SAD of basic 4x4 subblocks, and a modified Lagrangian cost is used as the matching criterion to find the best 41 variable-size blocks by means of a tree pipeline parallel architecture. Finally, a mode decision module uses a serial data flow to find the best mode by comparing the total minimum Lagrangian costs. The IME processor chip was designed in UMC 0.18 μm technology, resulting in a circuit with only 32.3k gates and 6 RAMs (59 kbits of on-chip memory in total). In typical working conditions (25 °C, 1.8 V), a clock frequency of 300 MHz can be estimated, with a processing capacity for HDTV (1920x1088 @ 30fps) and a search range of 32x32.
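As a software reference for the FSBMA kernel the chip accelerates (without the VBSME, Lagrangian cost or mode decision), a plain full-search SAD looks like this:

```python
import numpy as np

def full_search_sad(cur, ref, y, x, block=4, search=16):
    """Exhaustive SAD search of a `block`x`block` current block located at (y, x)
    over a +/-`search` window in the reference frame."""
    h, w = ref.shape
    target = cur[y:y + block, x:x + block].astype(int)
    best = (None, np.inf)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = y + dy, x + dx
            if 0 <= ry and ry + block <= h and 0 <= rx and rx + block <= w:
                cand = ref[ry:ry + block, rx:rx + block].astype(int)
                sad = np.abs(target - cand).sum()
                if sad < best[1]:
                    best = ((dy, dx), sad)
    return best   # (motion vector, minimum SAD)
```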

Journal ArticleDOI
TL;DR: This work implements a cross media analysis scheme that takes advantage of both visual and textual information for detecting high-level concepts and defines a conceptual space where information originating from heterogeneous media types can be meaningfully combined and facilitate analysis decisions.
Abstract: Existing methods for the semantic analysis of multimedia, although effective for single-medium scenarios, are inherently flawed in cases where knowledge is spread over different media types. In this work we implement a cross media analysis scheme that takes advantage of both visual and textual information for detecting high-level concepts. The novel aspect of this scheme is the definition and use of a conceptual space where information originating from heterogeneous media types can be meaningfully combined and facilitate analysis decisions. More specifically, our contribution is on proposing a modeling approach for Bayesian Networks that defines this conceptual space and allows evidence originating from the domain knowledge, the application context and different content modalities to support or disprove a certain hypothesis. Using this scheme we have performed experiments on a set of 162 compound documents taken from the domain of the car manufacturing industry and 118581 video shots taken from the TRECVID2010 competition. The obtained results have shown that the proposed modeling approach exploits the complementary effect of evidence extracted across different media and delivers performance improvements compared to the single-medium cases. Moreover, by comparing the performance of the proposed approach with an approach using Support Vector Machines (SVM), we have verified that in a cross media setting generative models are more suitable than discriminative ones, mainly due to their ability to smoothly incorporate explicit knowledge and learn from a few examples.

Journal ArticleDOI
TL;DR: The proposed DEQA framework enables a real-time, non-intrusive assessment service by efficiently recognising and assessing individual quality violation events in the IPTV distribution network and also facilitates efficient network diagnosis and QoE management.
Abstract: Maintaining the quality of videos in resource-intensive IPTV services is challenging due to the nature of packet-based content distribution networks (CDN). Network impairments are unpredictable and highly detrimental to the quality of video content. Quality of the end user experience (QoE) has become a critical service differentiator. An efficient real-time quality assessment service in distribution networks is the foundation of service quality monitoring and management. The perceptual impact of individual impairments varies significantly and is influenced by complex impact factors. Without differentiating the impact of quality violation events to the user experience, existing assessment methodologies based on network QoS such as packet loss rate cannot provide adequate supports for the IPTV service assessment. A discrete perceptual impact evaluation quality assessment (DEQA) framework is introduced in this paper. The proposed framework enables a real-time, non-intrusive assessment service by efficiently recognising and assessing individual quality violation events in the IPTV distribution network. The discrete perceptual impacts to a media session are aggregated for the overall user level quality evaluation. With its deployment scheme the DEQA framework also facilitates efficient network diagnosis and QoE management. To realise the key assessment function of the framework and investigate the proposed advanced packet inspection mechanism, we also introduce the dedicated evaluation testbed—the LA2 system. A subjective experiment with data analysis is also presented to demonstrate the development of perceptual impact assessment functions using analytical inference, the tools of the LA2 system, subjective user tests and statistical modelling.

Journal ArticleDOI
TL;DR: Results show that the proposed MDC codec can adapt very well to changing transmission conditions; gains of up to 14 dB at high packet error rates (low SNR) were recorded when coupled with a MIMO architecture.
Abstract: This paper proposes a rate controlled redundancy-adaptive multiple description video coding method. The method adjusts the level of redundancy allocated to each macroblock based on an end to end distortion model that takes into account the effects of error propagation and concealment. Rate control at the macroblock level ensures that the total bit rate of all descriptions does not exceed the target rate. Results show that the proposed MDC codec can adapt very well to changing transmission conditions. Gains of up to 14dB at high packet error rates (low SNR) were recorded when coupled with a MIMO architecture, with no perceptible performance deficit at low packet error rates (high SNR).

Journal ArticleDOI
TL;DR: A new measure for image quality assessment (IQA), which supplies more flexibility than previous methods in using the pixel displacement in the assessment, is proposed.
Abstract: Image quality depends on many factors, such as the initial capture system and its image processing, compression, transmission, the output device, media and associated viewing conditions. The goal of quality assessment research is to design measures that can automatically evaluate the quality of images in a perceptually consistent manner. This paper proposes a new measure for image quality assessment (IQA), which supplies more flexibility than previous methods in using pixel displacement in the assessment. First, the distorted and original images are divided into overlapping 11x11 blocks; second, the distorted pixels and their displacement are calculated; then, visual regions of interest and edge information are computed and used to compute the global error. Experimental comparisons show the efficiency of the proposed method.

Journal ArticleDOI
TL;DR: A high-performance fractional motion estimation engine with three techniques is proposed in this paper; based on the high correlation between the motion vector of a block and that of its up-layer block, as well as the relationship among integer candidates, a one-step algorithm is proposed.
Abstract: The conventional two-step algorithm, the long latency of interpolation and the large number of motion vectors are three factors that mainly induce the high computational complexity of fractional motion estimation and also prevent it from encoding high-definition video. In order to overcome these obstacles, a high-performance fractional motion estimation engine with three techniques is proposed in this paper. First, based on the high correlation between the motion vector of a block and that of its up-layer block, as well as the relationship among integer candidates, a one-step algorithm is proposed. Second, 8x4-element block processing is adopted, which not only eliminates almost all redundancies in interpolation but also ensures hardware reusability. Finally, a scheme for processing 4x4 and 4x8 blocks without extra cycles is presented, so that the number of motion vectors can be reduced by up to 59%. Experimental results show that the proposed design needs only 50% of the gate count and 56% of the cycles compared with a previous design while nearly maintaining the coding performance.

Journal ArticleDOI
TL;DR: A low-computation deblocking filter with four modes is proposed, including three frequency-related modes (smooth, non-smooth, and intermediate) and a corner mode for the corner of four blocks; it achieves better detail preservation and artifact removal performance.
Abstract: Image compression plays a pivotal role in minimizing the data size and reduction in transmission costs. Many coding techniques have been developed, but the most effective is the JPEG compression. However, the reconstructed images from JPEG compression produce noticeable image degradations near block boundaries called blocking artifacts, particularly in highly compressed images. A method to detect and reduce these artifacts without smoothing images and without removing perceptual features has been presented in this paper. In this work, a low computational deblocking filter with four modes is proposed, including three frequency-related modes (smooth, non-smooth, and intermediate) and a corner mode for the corner of four blocks. Extensive experiments and comparison with other deblocking methods have been conducted on the basis of PSNR, MSSIM, SF, and MOS to justify the effectiveness of the proposed method. The proposed algorithm keeps the computation lower and achieves better detail preservation and artifact removal performance.

Journal ArticleDOI
TL;DR: Using the well-known spread transform (ST) combined with quantization-based embedding systems provides an ε-secure stego-system in the sense of Cachin's security definition, as long as the ratio between the quantization step and the square root of the spreading factor is small.
Abstract: Quantization-based embedding systems are widely used in information hiding applications, thanks to their efficiency and simplicity. However, they are known to be insecure in the steganography context according to Cachin's security definition, because they distort the stego-signal probability density function. In this paper, we show that using the well-known spread transform (ST) combined with quantization-based embedding systems provides an ε-secure stego-system in the sense of Cachin's security definition. In other words, we show theoretically that this system preserves, in the sense of relative entropy, the probability density function of the stego-signal as long as the ratio between the quantization step and the square root of the spreading factor is small. This highlights the fundamental tradeoff between these two quantities. Our theoretical conclusions are validated and illustrated on real images. Finally, a comparison with the Solanki et al. blind steganographic scheme is given.
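A sketch of spread-transform QIM on a host vector, illustrating why a small ratio between the quantization step and the square root of the spreading factor keeps the embedding perturbation per sample small (the quantization error of roughly Δ²/12 in the projection is spread over L samples); parameter choices are illustrative:

```python
import numpy as np

def st_qim_embed(host, bit, spread, step):
    """Spread-transform QIM: project the host vector onto a unit spreading vector,
    quantize the projection with QIM (step `step`), and add the correction back."""
    u = spread / np.linalg.norm(spread)
    p = host @ u
    offset = bit * step / 2.0
    p_q = np.round((p - offset) / step) * step + offset
    return host + (p_q - p) * u            # per-sample change shrinks as len(spread) grows

def st_qim_extract(received, spread, step):
    """Minimum-distance decoding of the embedded bit from the projection."""
    u = spread / np.linalg.norm(spread)
    p = received @ u
    d0 = abs(p - np.round(p / step) * step)
    d1 = abs(p - (np.round((p - step / 2.0) / step) * step + step / 2.0))
    return int(d1 < d0)
```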