
Showing papers in "Signal Processing-image Communication in 2014"


Journal ArticleDOI
TL;DR: It is found that SSEQ matches well with human subjective opinions of image quality, and is statistically superior to the full-reference IQA algorithm SSIM and several top-performing NR IQA methods: BIQI, DIIVINE, and BLIINDS-II.
Abstract: We develop an efficient general-purpose no-reference (NR) image quality assessment (IQA) model that utilizes local spatial and spectral entropy features on distorted images. Using a 2-stage framework of distortion classification followed by quality assessment, we utilize a support vector machine (SVM) to train an image distortion and quality prediction engine. The resulting algorithm, dubbed Spatial–Spectral Entropy-based Quality (SSEQ) index, is capable of assessing the quality of a distorted image across multiple distortion categories. We explain the entropy features used and their relevance to perception and thoroughly evaluate the algorithm on the LIVE IQA database. We find that SSEQ matches well with human subjective opinions of image quality, and is statistically superior to the full-reference (FR) IQA algorithm SSIM and several top-performing NR IQA methods: BIQI, DIIVINE, and BLIINDS-II. SSEQ also has considerably low complexity. We further test SSEQ on the TID2008 database to ascertain whether its performance is database independent.
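
The abstract describes two feature families computed on local patches: spatial entropies of pixel values and spectral entropies of block DCT coefficients. The sketch below illustrates that idea only and is not the authors' code: the 8×8 block size, the histogram binning and the DCT-based spectral entropy are assumptions, and SSEQ additionally pools such features across scales before the SVM stage.

```python
# Illustrative sketch of block-wise spatial and spectral entropy features in
# the spirit of SSEQ (block size, binning and pooling are assumptions).
import numpy as np
from scipy.fft import dctn

def block_entropies(gray, block=8):
    """Return an (n_blocks, 2) array of (spatial, spectral) entropies."""
    h, w = gray.shape
    feats = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            b = gray[y:y + block, x:x + block].astype(np.float64)
            # Spatial entropy: Shannon entropy of the pixel-value histogram.
            hist, _ = np.histogram(b, bins=256, range=(0, 256))
            p = hist[hist > 0] / hist.sum()
            spatial = -np.sum(p * np.log2(p))
            # Spectral entropy: entropy of normalized squared DCT coefficients,
            # DC term excluded.
            c = dctn(b, norm='ortho')
            c[0, 0] = 0.0
            e = c ** 2
            q = e[e > 0] / e.sum() if e.sum() > 0 else np.array([1.0])
            spectral = -np.sum(q * np.log2(q))
            feats.append((spatial, spectral))
    return np.asarray(feats)
```

The 2-stage framework described above then classifies the distortion type and maps pooled versions of these features to a quality score with the SVM-based prediction engine.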

562 citations


Journal ArticleDOI
TL;DR: The resulting algorithm, dubbed CurveletQA, correlates well with human subjective opinions of image quality, delivering performance that is competitive with popular full-reference IQA algorithms such as SSIM, and with top-performing NR IQA models.
Abstract: We study the efficacy of utilizing a powerful image descriptor, the curvelet transform, to learn a no-reference (NR) image quality assessment (IQA) model. A set of statistical features is extracted from a computed image curvelet representation, including the coordinates of the maxima of the log-histograms of the curvelet coefficient values, and the energy distributions of both orientation and scale in the curvelet domain. Our results indicate that these features are sensitive to the presence and severity of image distortion. Operating within a 2-stage framework of distortion classification followed by quality assessment, we train an image distortion and quality prediction engine using a support vector machine (SVM). The resulting algorithm, dubbed CurveletQA for short, was tested on the LIVE IQA database and compared to state-of-the-art NR/FR IQA algorithms. We found that CurveletQA correlates well with human subjective opinions of image quality, delivering performance that is competitive with popular full-reference (FR) IQA algorithms such as SSIM, and with top-performing NR IQA models. At the same time, CurveletQA has a relatively low complexity.

176 citations


Journal ArticleDOI
TL;DR: A novel RDH method based on pixel-value-ordering (PVO) and prediction-error expansion that outperforms Li et al.'s and some other state-of-the-art works.
Abstract: Recently, Li et al. proposed a reversible data hiding (RDH) method based on pixel-value-ordering (PVO) and prediction-error expansion. In their method, the maximum and the minimum of a pixel block are predicted and modified to embed data, and the reversibility is guaranteed by keeping the PVO of each block invariant after embedding. In this paper, a novel RDH method is proposed by extending Li et al.'s work. Instead of considering only a single pixel with the maximum (or minimum) value of a block, all maximum-valued (or minimum-valued) pixels are taken as a unit to embed data. Specifically, the maximum-valued (or minimum-valued) pixels are first predicted and then modified together such that they are either unchanged or increased by 1 (or decreased by 1) in value at the same time. Compared with Li et al.'s method, more blocks suitable for RDH are utilized and image redundancy is better exploited. Moreover, a mechanism of advisable payload partition and pixel-block-selection is adopted to optimize the embedding performance in terms of capacity-distortion behavior. Experimental results verify that our method outperforms Li et al.'s and some other state-of-the-art works.
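
As a reference point for the extension described above, here is a minimal sketch of the maximum-side embedding step of the original single-maximum PVO scheme of Li et al.; the minimum side, overflow protection and the payload partition mechanism are omitted, and the bit conventions are assumptions.

```python
# Minimal sketch of classic maximum-side PVO embedding (single maximum pixel,
# no overflow handling); the proposed method instead modifies all
# maximum-valued pixels of a block together.
import numpy as np

def pvo_embed_max(block, bit):
    """Embed at most one bit into the largest pixel of a block; return (block', used)."""
    x = block.astype(np.int32).ravel().copy()
    order = np.argsort(x, kind='stable')          # ascending pixel-value order
    i_max, i_second = order[-1], order[-2]
    d = x[i_max] - x[i_second]                    # prediction error of the maximum
    if d == 1:                                    # expandable: carries one data bit
        x[i_max] += bit
        used = True
    elif d > 1:                                   # shifted by 1, carries no data
        x[i_max] += 1
        used = False
    else:                                         # d == 0: block left unchanged
        used = False
    return x.reshape(block.shape), used
```

Reversibility follows because the ordering is preserved: the decoder recomputes the error on the marked block, reads bit 0 when it equals 1, reads bit 1 and subtracts 1 when it equals 2, and simply subtracts 1 when it is larger.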

140 citations


Journal ArticleDOI
TL;DR: A series of experiments and security analysis results demonstrate that this new image encryption algorithm is highly secure and efficient enough for most practical image encryption applications.
Abstract: A new image encryption algorithm based on a spatiotemporal chaotic system is proposed, in which a circular S-box and a key stream buffer are introduced to increase the security. The algorithm comprises a substitution process and a diffusion process. In the substitution process, the S-box is considered as a circular sequence with a head pointer, and each image pixel is replaced with an element of the S-box according to both the pixel value and the head pointer, while the head pointer varies with the previously substituted pixel. In the diffusion process, the key stream buffer is used to cache the random numbers generated by the chaotic system, and each image pixel is then enciphered by incorporating the previous cipher pixel and a random number dependently chosen from the key stream buffer. A series of experiments and security analysis results demonstrate that this new encryption algorithm is highly secure and efficient enough for most practical image encryption applications.
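
The substitution process described above is concrete enough to sketch. The snippet below is only an illustration of the circular S-box idea: the S-box is assumed to be a keyed permutation of 0-255, and the head-pointer update rule (following the previously substituted pixel) is an assumption, since the abstract does not give the exact formula.

```python
# Sketch of circular-S-box substitution (assumptions: sbox is a keyed
# permutation of 0..255; the head-pointer update rule is illustrative and
# may differ from the paper's).
import numpy as np

def substitute(pixels, sbox, head=0):
    """Replace each pixel with an S-box element chosen from the pixel value
    and a circular head pointer driven by the previous substituted pixel."""
    assert sorted(sbox) == list(range(256))
    flat = pixels.ravel()
    out = np.empty_like(flat)
    for i, p in enumerate(flat):
        idx = (int(p) + head) % 256       # circular indexing into the S-box
        c = sbox[idx]
        out[i] = c
        head = (head + int(c)) % 256      # pointer follows the previous output
    return out.reshape(pixels.shape)
```

The diffusion process would then combine each substituted pixel with the previous cipher pixel and a random number drawn from the key stream buffer, as described above.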

126 citations


Journal ArticleDOI
TL;DR: The results showed the superiority of the algorithm in terms of accuracy, security and recovery, with a tamper detection rate higher than 99%, security robustness, and self-recovery image quality maintained for tamper ratios up to 55%.
Abstract: In this paper, an effective tamper detection and self-recovery algorithm based on singular value decomposition (SVD) is proposed. This method generates two distinct tamper detection keys based on the singular value decomposition of the image blocks. Each generated tamper detection and self-recovery key is distinct for each image block and is encrypted using a secret key. A random block-mapping sequence and three unique optimizations are employed to improve the efficiency of the proposed tamper detection and the robustness against various security attacks, such as the collage attack and the constant-average attack. To improve tamper localization, a mixed block-partitioning technique for 4×4 and 2×2 blocks is utilized. The performance of the proposed scheme and its robustness against various tampering attacks are analyzed. The experimental results demonstrate that the proposed scheme is superior in terms of tamper detection efficiency, with a tamper detection rate higher than 99%, security robustness, and self-recovery image quality for tamper ratios up to 55%.
Highlights:
- A novel SVD-based image tamper detection and self-recovery method by active watermarking is proposed.
- A new aspect of the singular values of each image block is utilized to improve the detection rate.
- The combination of 4×4 and 2×2 block sizes improved the recovered image's quality.
- The proposed optimizations improved the scheme's security against several malicious attacks.
- The results showed the superiority of the algorithm in terms of accuracy, security and recovery.
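
For illustration only, here is one way a per-block authentication code could be derived from the singular values of 4×4 blocks, which is the core quantity the scheme builds its keys on; the quantization, the keyed hash and the 16-bit key length are assumptions and not the authors' exact construction.

```python
# Illustrative sketch (not the authors' construction): a short per-block
# authentication code derived from the block's singular values.
import hashlib
import numpy as np

def block_svd_keys(img, secret=b'secret-key', block=4):
    h, w = img.shape
    keys = {}
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            b = img[y:y + block, x:x + block].astype(np.float64)
            s = np.linalg.svd(b, compute_uv=False)      # singular values of the block
            q = np.round(s).astype(np.int64).tobytes()  # coarse quantization
            digest = hashlib.sha256(secret + q).digest()
            keys[(y, x)] = digest[:2]                   # 16-bit code per block
    return keys
```

In the scheme itself these per-block keys are paired with recovery information and distributed via the random block-mapping sequence described above.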

117 citations


Journal ArticleDOI
TL;DR: This paper presents a complex extension of the DIIVINE algorithm (called C-DIIVINE), which blindly assesses image quality based on the complex Gaussian scale mixture model corresponding to the complex version of the steerable pyramid wavelet transform.
Abstract: It is widely known that the wavelet coefficients of natural scenes possess certain statistical regularities which can be affected by the presence of distortions. The DIIVINE (Distortion Identification-based Image Verity and Integrity Evaluation) algorithm is a successful no-reference image quality assessment (NR IQA) algorithm, which estimates quality based on changes in these regularities. However, DIIVINE operates based on real-valued wavelet coefficients, whereas the visual appearance of an image can be strongly determined by both the magnitude and phase information. In this paper, we present a complex extension of the DIIVINE algorithm (called C-DIIVINE), which blindly assesses image quality based on the complex Gaussian scale mixture model corresponding to the complex version of the steerable pyramid wavelet transform. Specifically, we apply three commonly used distribution models to fit the statistics of the wavelet coefficients: (1) the complex generalized Gaussian distribution is used to model the wavelet coefficient magnitudes, (2) the generalized Gaussian distribution is used to model the coefficients' relative magnitudes, and (3) the wrapped Cauchy distribution is used to model the coefficients' relative phases. All these distributions have characteristic shapes that are consistent across different natural images but change significantly in the presence of distortions. We also employ the complex wavelet structural similarity index to measure degradation of the correlations across image scales, which serves as an important indicator of the image's energy distribution and the loss of alignment of local spectral components contributing to image structure. Experimental results show that these complex extensions allow C-DIIVINE to yield a substantial improvement in predictive performance as compared to its predecessor, and highly competitive performance relative to other recent no-reference algorithms.
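
A greatly simplified, real-valued sketch of the first fitting step is shown below: fitting a generalized Gaussian to wavelet subband coefficients with scipy's gennorm and taking the fitted shape and scale as features. C-DIIVINE itself works on the complex steerable pyramid with complex GGD and wrapped Cauchy models, which this sketch does not reproduce; pywt with a 'db2' wavelet is only a stand-in.

```python
# Simplified, real-valued sketch: fit a generalized Gaussian distribution to
# wavelet subband coefficients and keep the fitted shape/scale as features.
import numpy as np
import pywt
from scipy.stats import gennorm

def subband_ggd_features(img):
    feats = []
    _, details = pywt.dwt2(img.astype(np.float64), 'db2')
    for band in details:                                 # horizontal, vertical, diagonal
        beta, loc, scale = gennorm.fit(band.ravel(), floc=0.0)  # GGD shape and spread
        feats.extend([beta, scale])
    return np.asarray(feats)
```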

106 citations


Journal ArticleDOI
TL;DR: Experimental results and security analysis show that the scheme achieves a good encryption result with only one round of encryption and that the key space is large enough to resist common attacks, so the scheme is reliable for image encryption and secure communication.
Abstract: This paper proposes a color image encryption scheme using one-time keys based on coupled chaotic systems. The key stream has both key sensitivity and plaintext sensitivity. The Secure Hash Algorithm 3 (SHA-3) is combined with the initial keys to generate new keys, so that the key stream changes in each encryption process. Firstly, the SHA-3 hash value of the plain image is employed to generate six initial values of the chaotic systems. Secondly, the six state variables are combined and permuted, and three state variables are randomly selected from them to encrypt the red, green and blue components, respectively. Experimental results and security analysis show that the scheme achieves a good encryption result with only one round of encryption and that the key space is large enough to resist common attacks, so the scheme is reliable for image encryption and secure communication.
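
The one-time-key step (hashing the plain image with SHA-3 and turning the digest into six chaotic initial values) can be sketched directly with the standard library; the 4-byte slicing, the XOR with a user key and the mapping into (0, 1) below are illustrative assumptions rather than the paper's exact formulas.

```python
# Sketch of the key-derivation step (assumptions: SHA3-256, six 4-byte digest
# slices, XOR with a 32-byte user key, mapping into the open interval (0, 1)).
import hashlib

def initial_values(plain_image, user_key: bytes):
    digest = hashlib.sha3_256(plain_image.tobytes()).digest()    # 32-byte hash of the plain image
    mixed = bytes(d ^ k for d, k in zip(digest, user_key.ljust(32, b'\0')))
    vals = []
    for i in range(6):                                           # six chaotic seeds
        chunk = int.from_bytes(mixed[4 * i:4 * i + 4], 'big')
        vals.append((chunk % (2**32 - 1) + 1) / 2**32)           # strictly inside (0, 1)
    return vals
```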

104 citations


Journal ArticleDOI
TL;DR: A robust shuffling–masking image encryption scheme based on chaotic maps that has a large key space, strong sensitivity to the secret key, and is robust against differential attacks is proposed.
Abstract: An image encryption scheme provides means for securely transmitting images over public channels. In this work, we propose a robust shuffling–masking image encryption scheme based on chaotic maps. The shuffling phase permutes square blocks of bytes using a 3-dimensional chaotic cat map coupled with a zigzag scanning procedure. The masking phase then scrambles b-byte blocks of the shuffled image with combined outputs of three 1-dimensional chaotic skew tent maps, in such a way that the masking of every block is influenced by all previously masked blocks. Empirical results show that while the suggested scheme has good running speed, it generates ciphered images that exhibit (i) random-like behavior, (ii) almost flat histograms, (iii) almost no adjacent pixel correlation, and (iv) information entropy close to the ideal theoretical value. Furthermore, this scheme has a large key space, strong sensitivity to the secret key, and is robust against differential attacks. On the basis of these results, this scheme can be regarded as a secure and reliable scheme for use in secure communication applications.
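
A stripped-down sketch of the masking phase is given below: a keystream produced by a single 1-dimensional skew tent map is XOR-combined with the data, with chaining on the previous masked byte so that each output depends on all previous ones. The real scheme uses three tent maps over b-byte blocks plus a separate cat-map shuffling phase; the parameters x0 and p here are arbitrary assumptions.

```python
# Sketch of the masking phase only (shuffling omitted; one skew tent map
# instead of three; chaining rule is an illustrative assumption).
def skew_tent(x, p):
    return x / p if x < p else (1.0 - x) / (1.0 - p)

def mask(data: bytes, x0=0.37, p=0.61) -> bytes:
    x, prev, out = x0, 0, bytearray()
    for byte in data:
        x = skew_tent(x, p)
        ks = int(x * 256) & 0xFF        # keystream byte from the chaotic state
        c = byte ^ ks ^ prev            # masking influenced by the previous output
        out.append(c)
        prev = c
    return bytes(out)
```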

84 citations


Journal ArticleDOI
TL;DR: A new hyperchaotic map derived from the parametric equations of the serpentine curve is proposed, whose complex behavior was proven theoretically and numerically using Lyapunov exponents, the bifurcation diagram and the correlation dimension of the attractor.
Abstract: Recently, hyperchaotic maps have been investigated in order to develop more secure encryption schemes. In this paper we propose a new hyperchaotic map derived from the parametric equations of the serpentine curve. Its complex behavior was proven theoretically and numerically, using Lyapunov exponents, the bifurcation diagram and the correlation dimension of the attractor. The proposed map is then used in a new image encryption scheme with a classic bi-modular architecture: a diffusion stage, in which the pixels of the plain image are shuffled using a random permutation generated with a new algorithm, and a confusion stage, in which the pixels are modified with an XOR-scheme based on the proposed map. The results of its statistical analysis show that the proposed image encryption scheme provides an efficient and secure way for image encryption.

75 citations


Journal ArticleDOI
TL;DR: Comparison results with IMF-PVD showed that the proposed method had significantly higher payload and image quality, and that the embedded information is not easily detected by difference histogram analysis or the chi-square test.
Abstract: This paper proposes a Pixel Value Difference (PVD) based method to embed unequal amounts of secret information using pixel complexity. In previous PVD methods, embedding was sequential; therefore, secret information can easily be accessed by a third party who knows the sequence. These methods are also easily detected by difference histogram analysis, since their difference histograms show unnatural shapes when compared to that of the cover image. The IMF-PVD method has a smoother and more natural difference histogram, but its payload is not improved over the other PVD-based methods. In the proposed method, secret information is embedded in 2×2 embedding cells composed of randomized embedding units to reduce the falling-off-boundary problem and to eliminate sequential embedding. Comparison results with IMF-PVD showed that the proposed method had significantly higher payload and image quality. Furthermore, the payload size may be adjusted by the reference tables and threshold. Results also showed that the embedded information is not easily detected by difference histogram analysis and the chi-square test.
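
For context, the classical PVD idea that these methods build on embeds more bits where a pixel pair differs strongly, by moving the pair difference inside a range table. The sketch below illustrates only that basic idea with the common six-range table; it adjusts just the second pixel, ignores boundary overflow, and is not the proposed 2×2-cell method.

```python
# Sketch of basic PVD embedding in a pixel pair with the common six-range
# table (simplified: only p2 is adjusted, overflow/underflow is ignored).
RANGES = [(0, 7), (8, 15), (16, 31), (32, 63), (64, 127), (128, 255)]

def pvd_embed_pair(p1, p2, bits):
    """Embed the first t bits of `bits` (a 0/1 list, assumed long enough) into (p1, p2)."""
    d = abs(p2 - p1)
    lo, hi = next(r for r in RANGES if r[0] <= d <= r[1])
    t = (hi - lo + 1).bit_length() - 1              # capacity of this range in bits
    b = int(''.join(str(x) for x in bits[:t]), 2)   # bits to embed as an integer
    d_new = lo + b                                  # new difference carries the bits
    p2_new = p1 + d_new if p2 >= p1 else p1 - d_new # adjust the second pixel only
    return p1, p2_new, t
```

Extraction reverses the mapping: from the marked pair the difference identifies the same range, and b = d_new - lo recovers the embedded bits.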

69 citations


Journal ArticleDOI
TL;DR: This paper proposes a new feature descriptor, edge orientation difference histogram (EODH) descriptor, which is a rotation-invariant and scale-Invariant feature representation, and builds an effective image retrieval system based on weighted codeword distribution using the integrated feature descriptor.
Abstract: This paper proposes a new feature descriptor, the edge orientation difference histogram (EODH) descriptor, which is a rotation-invariant and scale-invariant feature representation. The main orientation of each edge pixel is obtained through steerable filters and vector summation. Based on the main orientation, we construct the EODH descriptor for each edge pixel. Finally, we integrate the EODH and Color-SIFT descriptors, and build an effective image retrieval system based on weighted codeword distribution using the integrated feature descriptor. Experiments show that the codebook-based image retrieval method achieves the best performance on the given benchmarks compared to state-of-the-art methods.

Journal ArticleDOI
TL;DR: A new pattern based feature, the local mesh peak valley edge pattern (LMePVEP), is proposed for biomedical image indexing and retrieval, and shows a significant improvement in terms of average retrieval precision (ARP) and average retrieval rate (ARR) as compared to LBP and LBP-variant features.
Abstract: In this paper, a new pattern based feature, the local mesh peak valley edge pattern (LMePVEP), is proposed for biomedical image indexing and retrieval. The standard LBP extracts the gray scale relationship between the center pixel and its surrounding neighbors in an image, whereas the proposed method extracts the gray scale relationship among the neighbors for a given center pixel. The relations among the neighbors are peak/valley edges, which are obtained by performing the first-order derivative. The performance of the proposed method (LMePVEP) is tested by conducting two experiments on two benchmark biomedical databases: the OASIS-MRI database, a magnetic resonance imaging (MRI) database, and the VIA/I-ELCAP-CT database, which includes region-of-interest computed tomography (CT) images. The results show a significant improvement in terms of average retrieval precision (ARP) and average retrieval rate (ARR) as compared to LBP and LBP-variant features.

Journal ArticleDOI
TL;DR: A fast encoder decision algorithm to encode the dependent texture views in 3D-HEVC using an early merge mode decision algorithm and an early CU splitting termination algorithm to avoid unnecessary checks for larger depths.
Abstract: As a 3D extension of the High Efficiency Video Coding (HEVC) standard, 3D-HEVC is developed to improve the coding efficiency of multi-view video. However, the improvement of the coding efficiency is obtained at the expense of a computational complexity increase. How to relieve the computational burden of the encoder is becoming a critical problem in applications. In this paper, a fast encoder decision algorithm to encode the dependent texture views is proposed, where two strategies to accelerate encoder decision by exploiting inter-view correlations are utilized. The first one is an early merge mode decision algorithm, and the second one is an early CU splitting termination algorithm. Experimental results show that the proposed algorithm can achieve 47.1% encoding time saving with overall 0.1% BD-rate reduction compared to HTM (3D-HEVC test model) version 7 under the common test condition (CTC). Both of the two strategies have been adopted into the 3D-HEVC reference software and enabled as a default encoding process under CTC.
Highlights:
- A fast encoder decision algorithm to encode the dependent texture views in 3D-HEVC.
- Early merge mode decision to avoid unnecessary examinations of inter and intra modes.
- Early CU splitting termination to avoid unnecessary checks for larger depths.
- Achieve 47.1% coding time saving with overall 0.1% BD-rate gain compared to HTM-7.0.

Journal ArticleDOI
TL;DR: This paper proposes a method that aims at preserving spatio-temporal brightness coherency when tone mapping video sequences, and computes HDR video zones which are constant throughout a sequence, based on the luminance of each pixel.
Abstract: Tone Mapping Operators (TMOs) compress High Dynamic Range (HDR) contents to address Low Dynamic Range (LDR) displays. While many solutions have been designed over the last decade, only a few of them can cope with video sequences. Indeed, these TMOs tone map each frame of a video sequence separately, which results in temporal incoherency. Two main types of temporal incoherency are usually considered: flickering artifacts and temporal brightness incoherency. While the reduction of flickering artifacts has been well studied, less work has been performed on brightness incoherency. In this paper, we propose a method that aims at preserving spatio-temporal brightness coherency when tone mapping video sequences. Our technique computes HDR video zones which are constant throughout a sequence, based on the luminance of each pixel. Our method aims at preserving the brightness coherency between the brightest zone of the video and each other zone. This technique adapts to any TMO and results show that it preserves spatio-temporal brightness coherency well. We validate our method using a subjective evaluation. In addition, unlike local TMOs, our method, when applied to still images, is capable of ensuring spatial brightness coherency. Finally, it also preserves video fade effects commonly used in post-production.

Journal ArticleDOI
TL;DR: A novel methodology based on Normalized Cuts (NC) criterion is presented and evaluated in comparison with other state-of-the-art techniques, presenting superior performances in terms of clustering accuracy and robustness as well as a reduced computational burden.
Abstract: Camera identification is a well known problem in image forensics, addressing the issue of identifying the camera a digital image has been shot by. In this paper, we turn our attention to the task of clustering images belonging to a heterogeneous set into groups coming from the same camera, and of doing this in a blind manner; this means that side information neither about the sources nor, above all, about the number of expected clusters is requested. A novel methodology based on the Normalized Cuts (NC) criterion is presented and evaluated in comparison with other state-of-the-art techniques, such as Multi-Class Spectral Clustering (MCSC) and Hierarchical Agglomerative Clustering (HAC). The proposed method fits the problem of blind image clustering well because it does not require a priori knowledge of the number of classes into which the dataset has to be divided, but needs only a stop threshold; such a threshold has been properly defined by means of a ROC-curve approach relying on the goodness of cluster aggregation. Several experimental tests have been carried out in different operative conditions, and the proposed methodology globally presents superior performance in terms of clustering accuracy and robustness as well as a reduced computational burden.

Journal ArticleDOI
TL;DR: Making use of some properties of CRT, the equivalent secret key of CECRT can be recovered efficiently; the required number of pairs of chosen plaintext and corresponding ciphertext is only 1 + ⌈(log2 L)/l⌉, and the attack complexity is only O(L).
Abstract: As a fundamental theorem in number theory, the Chinese Remainder Theorem (CRT) is widely used to construct cryptographic primitives. This paper investigates the security of a class of image encryption schemes based on CRT, referred to as CECRT. Making use of some properties of CRT, the equivalent secret key of CECRT can be recovered efficiently. The required number of pairs of chosen plaintext and corresponding ciphertext is only 1 + ⌈(log2 L)/l⌉, and the attack complexity is only O(L), where L is the plaintext length and l is the number of bits representing a plaintext symbol. In addition, other defects of CECRT, such as an invalid compression function and low sensitivity to plaintext, are reported. The work in this paper will help clarify the positive role of CRT in cryptology.

Journal ArticleDOI
TL;DR: It is believed that VA needs consideration for evaluating the overall perceptual impact of TMOs on HDR content, since the existing studies so far have only considered the quality or esthetic appeal angle.
Abstract: High Dynamic Range (HDR) content is visually more appealing since it can represent the real luminance of the scene. However, on the downside, this means that a large amount of data needs to be handled both during storage and processing. The other problem is that HDR content cannot be displayed on conventional display devices due to their limited dynamic range. To overcome these two problems, dynamic range compression (or range reduction) is often used, and this is accomplished by tone mapping operators (TMOs). As a result of tone mapping, the HDR content is not only fit to be displayed on a regular display device but also compressed. However, from an artistic intention point of view, TMOs are not necessarily transparent and might induce different viewing behavior. It is generally accepted that TMOs reduce visual quality, and there have been a number of studies reported in the literature which examine the impact of tone mapping from the viewpoint of perceptual quality. In contrast to this, it is largely unclear if tone mapping will induce changes in visual attention (VA) as well and whether these are significant enough to be accounted for in HDR content processing. To our knowledge, no systematic study exists which sheds light on this issue. Given that VA is a crucial visual perception mechanism which affects the way we perceive visual signals, it is important to study the effect of tone mapping on VA deployment. Towards this goal, this paper investigates and quantifies how TMOs modify VA. Comprehensive subjective tests in the form of eye-tracking experiments have been conducted on several HDR contents and using a large number of TMOs. Further, non-parametric statistical analysis has been carried out to ascertain the statistical significance of the results obtained. Our studies suggest that TMOs can indeed modify human attention and fixation behavior. Based on this, we believe that VA needs consideration for evaluating the overall perceptual impact of TMOs on HDR content. Since the existing studies so far have only considered the quality or esthetic appeal angle, this study brings in a new perspective regarding the importance of VA in HDR content processing for visualization on LDR displays.

Journal ArticleDOI
TL;DR: This paper proposes a distributed compressed sensing based multicast scheme (DCS-cast), where a block-wise compressed sensing is applied on video frames to obtain measurement data, which is then packed in an interleaved fashion and transmitted over OFDM channels.
Abstract: Multicasting of video signals over wireless networks has recently become a very popular application. Here, one major challenge is to accommodate heterogeneous users who have different channel characteristics and therefore will receive different noise-corrupted video packets of the same video source that is multicasted over the wireless network. This paper proposes a distributed compressed sensing based multicast scheme (DCS-cast), where block-wise compressed sensing (BCS) is applied on video frames to obtain measurement data. The measurement data are then packed in an interleaved fashion and transmitted over OFDM channels. At the decoder side, users with different channel characteristics receive a certain number of packets and then reconstruct video frames by exploiting motion-based information. Because the CS measuring and interleaved packing together produce equally important packets, users with good channel conditions receive more packets and so recover better quality, which gives our DCS-cast scheme very graceful degradation rather than cliff effects. As compared to the benchmark SoftCast scheme, our DCS-cast is able to provide better performance when some packets are lost during transmission.
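
The block-wise measurement step described above can be sketched directly; the 16×16 block size, the subrate and the Gaussian sensing matrix below are illustrative assumptions, and the interleaved packing, OFDM transmission and motion-based reconstruction are omitted.

```python
# Sketch of the block-wise compressed sensing (BCS) measurement step only.
import numpy as np

def bcs_measure(frame, block=16, subrate=0.5, seed=0):
    """Return (n_blocks, m) CS measurements of a grayscale frame and the sensing matrix."""
    rng = np.random.default_rng(seed)
    n = block * block
    m = int(round(subrate * n))
    phi = rng.standard_normal((m, n)) / np.sqrt(m)     # sensing matrix shared by all blocks
    h, w = frame.shape
    ys = []
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            x = frame[r:r + block, c:c + block].astype(np.float64).ravel()
            ys.append(phi @ x)                         # y = Phi x for this block
    return np.vstack(ys), phi
```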

Journal ArticleDOI
TL;DR: A new lossless chain code compression method based on move-to-front transform and an adaptive run-length encoding that compresses the entropy-reduced chain code by coding the repetitions of chain code symbols and their combinations using a variable-length model is considered.
Abstract: Chain codes are the most size-efficient representations of rasterised binary shapes and contours. This paper considers a new lossless chain code compression method based on move-to-front transform and an adaptive run-length encoding. The former reduces the information entropy of the chain code, whilst the latter compresses the entropy-reduced chain code by coding the repetitions of chain code symbols and their combinations using a variable-length model. In comparison to other state-of-the-art compression methods, the entropy-reduction is highly efficient, and the newly proposed method yields, on average, better compression.
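
The two stages named above are simple to illustrate: a move-to-front transform over the 8-directional chain-code alphabet followed by run-length encoding. The paper's adaptive, variable-length coding is simplified here to plain (symbol, run) pairs.

```python
# Sketch of move-to-front over the Freeman chain-code alphabet followed by a
# plain run-length encoding (the adaptive variable-length model is omitted).
def move_to_front(chain, alphabet=tuple(range(8))):
    table = list(alphabet)
    out = []
    for s in chain:
        i = table.index(s)
        out.append(i)                  # small indices dominate after the transform
        table.insert(0, table.pop(i))  # move the symbol to the front
    return out

def run_length_encode(seq):
    runs, prev, count = [], None, 0
    for s in seq:
        if s == prev:
            count += 1
        else:
            if prev is not None:
                runs.append((prev, count))
            prev, count = s, 1
    if prev is not None:
        runs.append((prev, count))
    return runs

# Example: a short Freeman chain code becomes a few (index, run) pairs.
encoded = run_length_encode(move_to_front([0, 0, 0, 1, 1, 0, 7, 7, 7, 7]))
```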

Journal ArticleDOI
TL;DR: A fast intra-encoding algorithm for HEVC, which is composed of the following four techniques, which can provide about 50% time savings with only 0.5% BD-rate loss on average when compared to HM 11.0 for the Main profile all-intra-configuration.
Abstract: The emerging High Efficiency Video Coding (HEVC) standard provides equivalent subjective quality with about 50% bit rate reduction compared to the H.264/AVC High profile. However, the improvement of coding efficiency is obtained at the expense of increased computational complexity. This paper presents a fast intra-encoding algorithm for HEVC, which is composed of the following four techniques. Firstly, an early termination technique for the coding unit (CU) depth decision is proposed, based on the depth of neighboring CUs and the comparison results of rate distortion (RD) costs between the parent CU and part of its child CUs. Secondly, the correlation of intra-prediction modes between neighboring PUs is exploited to accelerate the intra-prediction mode decision for HEVC intra-coding, and the impact of the number of mode candidates after the rough mode decision (RMD) process in HM is studied in our work. Thirdly, the TU depth range is restricted based on the probability of each TU depth, and one redundant process is removed in the TU depth selection process based on the analysis of the HEVC reference software. Finally, the probability of each case for the intra-transform skip mode is studied to accelerate the intra-transform skip mode decision. Experimental results show that the proposed algorithm can provide about 50% time savings with only 0.5% BD-rate loss on average when compared to HM 11.0 for the Main profile all-intra configuration. Parts of these techniques have been adopted into the HEVC reference software.
Highlights:
- An early termination technique for coding unit (CU) depth decision.
- The correlation of intra-prediction modes between neighboring PUs is exploited to accelerate the intra-mode decision.
- The TU depth range is restricted based on the probability of each TU depth.
- The probability of each case for the intra-transform skip mode is firstly studied.

Journal ArticleDOI
TL;DR: In this article, the authors present a unifying approach to perform HDR assembly directly from raw sensor data, which includes a camera noise model adapted to HDR video and an algorithm for spatially adaptive HDR reconstruction based on fitting of local polynomial approximations.
Abstract: One of the most successful approaches to modern high quality HDR-video capture is to use camera setups with multiple sensors imaging the scene through a common optical system. However, such systems pose several challenges for HDR reconstruction algorithms. Previous reconstruction techniques have considered debayering, denoising, resampling (alignment) and exposure fusion as separate problems. In contrast, in this paper we present a unifying approach, performing HDR assembly directly from raw sensor data. Our framework includes a camera noise model adapted to HDR video and an algorithm for spatially adaptive HDR reconstruction based on fitting of local polynomial approximations to observed sensor data. The method is easy to implement and allows reconstruction to an arbitrary resolution and output mapping. We present an implementation in CUDA and show real-time performance for an experimental 4 Mpixel multi-sensor HDR video system. We further show that our algorithm has clear advantages over existing methods, both in terms of flexibility and reconstruction quality.

Journal ArticleDOI
TL;DR: A framework is presented that utilizes two cameras to realize a spatial exposure bracketing, for which the different exposures are distributed among the cameras, and which enables the use of more complex camera setups with different sensors and provides robust camera responses.
Abstract: To overcome the dynamic range limitations in images taken with regular consumer cameras, several methods exist for creating high dynamic range (HDR) content. Current low-budget solutions apply a temporal exposure bracketing which is not applicable for dynamic scenes or HDR video. In this article, a framework is presented that utilizes two cameras to realize a spatial exposure bracketing, for which the different exposures are distributed among the cameras. Such a setup allows for HDR images of dynamic scenes and HDR video due to its frame by frame operating principle, but faces challenges in the stereo matching and HDR generation steps. Therefore, the modules in this framework are selected to alleviate these challenges and to properly handle under- and oversaturated regions. In comparison to existing work, the camera response calculation is shifted to an offline process and a masking with a saturation map before the actual HDR generation is proposed. The first aspect enables the use of more complex camera setups with different sensors and provides robust camera responses. The second one makes sure that only necessary pixel values are used from the additional camera view, and thus, reduces errors in the final HDR image. The resulting HDR images are compared with the quality metric HDR-VDP-2 and numerical results are given for the first time. For the Middlebury test images, an average gain of 52 points on a 0-100 mean opinion score is achieved in comparison to temporal exposure bracketing with camera motion. Finally, HDR video results are provided.

Journal ArticleDOI
TL;DR: In this paper, a sparse representation-based quality (SPARQ) metric is proposed to measure the visual quality of an image by comparing the perceptually important structural information in this image with that in its reference image.
Abstract: A highly promising approach to assess the quality of an image involves comparing the perceptually important structural information in this image with that in its reference image. The extraction of the perceptually important structural information is however a challenging task. This paper employs a sparse representation-based approach to extract such structural information. It proposes a new metric called the sparse representation-based quality (SPARQ) index that measures the visual quality of an image. The proposed approach learns the inherent structures of the reference image as a set of basis vectors. These vectors are obtained such that any structure in the image can be efficiently represented by a linear combination of only a few of these basis vectors. Such a sparse strategy is known to generate basis vectors that are qualitatively similar to the receptive field of the simple cells present in the mammalian primary visual cortex. To estimate the visual quality of the distorted image, structures in the visually important areas in this image are compared with those in the reference image, in terms of the learnt basis vectors. Our approach is evaluated on six publicly available subject-rated image quality assessment datasets. The proposed SPARQ index consistently exhibits high correlation with the subjective ratings of all datasets and overall, performs better than a number of popular image quality metrics.
Highlights:
- A sparse representation-based method for image quality assessment is proposed.
- The proposed method is designed to extract and compare perceptually important structures in images.
- Our method achieves competitive performance with the state-of-the-art.
- The proposed method consistently produces (statistically significant) lower error compared to the rival methods.

Journal ArticleDOI
TL;DR: An efficient general-purpose NR-IQA algorithm which is based on a new multiscale directional transform (shearlet transform) with a strong ability to localize distributed discontinuities, tested on several databases and shown to be suitable for many common distortions.
Abstract: Image and video quality measurements are crucial for many applications, such as acquisition, compression, transmission, enhancement, and reproduction. Nowadays, no-reference (NR) image quality assessment (IQA) methods have drawn extensive attention because they do not rely on any information from the original images. However, most conventional NR-IQA methods are designed only for one or a set of predefined specific image distortion types, and are unlikely to generalize for evaluating images/videos distorted with other types of distortions. In order to estimate a wide range of image distortions, in this paper we present an efficient general-purpose NR-IQA algorithm based on a new multiscale directional transform (the shearlet transform) with a strong ability to localize distributed discontinuities. The approach rests on the observation that distortion of a natural image leads to significant variation in the spread of discontinuities in all directions. Thus, the statistical properties of a distorted image differ significantly from those of natural images in the fine-scale shearlet coefficients, which are referred to as 'distorted parts', while some 'natural parts' are preserved in the coarse-scale shearlet coefficients. The algorithm relies on utilizing the natural parts to predict the natural behavior of the distorted parts. The predicted parts act as a 'reference', and the difference between the reference and distorted parts is used as an indicator to predict the image quality. To achieve this, we modify a general sparse autoencoder to serve as a predictor that obtains the predicted parts from the natural parts. By translating the NR-IQA problem into a classification problem, the predicted parts and distorted parts are utilized to form features, and the differences between them are identified by a softmax classifier. The resulting algorithm, which we name SHeArlet based No-reference Image quality Assessment (SHANIA), is tested on several databases (LIVE, Multiply Distorted LIVE and TID2008) and shown to be suitable for many common distortions, consistent with subjective assessment and comparable to full-reference IQA methods and state-of-the-art general-purpose NR-IQA algorithms.

Journal ArticleDOI
TL;DR: A novel and robust modus operandi for fast and accurate shot boundary detection where the whole design philosophy is based on human perceptual rules and the well-known ''Information Seeking Mantra''.
Abstract: In this paper, we propose a novel and robust modus operandi for fast and accurate shot boundary detection (SBD), where the whole design philosophy is based on human perceptual rules and the well-known ''Information Seeking Mantra''. By adopting a top-down approach, redundant video processing is avoided and, furthermore, high shot boundary detection accuracy is obtained at significantly low computational cost. Objects within shots are detected via local image features and used for revealing visual discontinuities among shots. The proposed method can be used for detecting all types of gradual transitions as well as abrupt changes. Another important feature is that the proposed method is fully generic and can be applied to any video content without requiring any training or tuning in advance. Furthermore, it allows user interaction to direct the SBD process to the user's ''Region of Interest'' or to stop it once satisfactory results are obtained. Experimental results demonstrate that the proposed algorithm achieves superior computational times compared to state-of-the-art methods without sacrificing performance.

Journal ArticleDOI
TL;DR: A reduced-reference image quality assessment metric is proposed, which measures the difference of the regularity of the phase congruency (PC) between the reference image and the distorted image.
Abstract: In this paper, a reduced-reference image quality assessment metric is proposed, which measures the difference in the regularity of the phase congruency (PC) between the reference image and the distorted image. The proposed model adopts a three-stage approach. The PC of the image is first extracted; then the fractal dimensions are computed on the PC map as image features that characterize the image structures in terms of their spatial distribution. Finally, the image features are pooled into the quality score using the l1 distance. The proposed approach is evaluated on seven public benchmark databases. Experimental results have demonstrated the excellent performance of the proposed approach.
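
The second stage (fractal dimension of the PC map) can be illustrated with a standard box-counting estimate; computing the phase congruency itself, the binarization threshold and the box sizes below are assumptions outside the abstract.

```python
# Sketch of a box-counting fractal-dimension estimate on a binarized
# phase-congruency map (assumes the map contains structure at every scale).
import numpy as np

def box_counting_dimension(binary_map, sizes=(2, 4, 8, 16, 32)):
    h, w = binary_map.shape
    counts = []
    for s in sizes:
        n = 0
        for y in range(0, h, s):
            for x in range(0, w, s):
                if binary_map[y:y + s, x:x + s].any():   # box contains structure?
                    n += 1
        counts.append(n)
    # Slope of log(count) vs log(1/size) approximates the fractal dimension.
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(sizes)), np.log(counts), 1)
    return slope
```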

Journal ArticleDOI
TL;DR: The theoretical framework allowing for the binary quantization index modulation (QIM) embedding techniques to be extended towards multiple-symbol QIM (m-QIM, where m stands for the number of symbols on which the mark is encoded prior to its embedding) is introduced.
Abstract: This paper introduces the theoretical framework allowing for the binary quantization index modulation (QIM) embedding techniques to be extended towards multiple-symbol QIM (m-QIM, where m stands for the number of symbols on which the mark is encoded prior to its embedding). The underlying detection method is optimized with respect to the minimization of the average error probability, under the hypothesis of white, additive Gaussian behavior for the attacks. This way, for prescribed transparency and robustness constraints, the data payload is increased by a factor of log2(m). m-QIM is experimentally validated under the frameworks of the MEDIEVALS French national project and of the SPY ITEA2 European project, related to MPEG-4 AVC robust and semi-fragile watermarking applications, respectively. The experiments are threefold and consider the data payload-robustness-transparency tradeoff. In the robust watermarking case, the main benefit is the increase of data payload by a factor of log2(m) while keeping fixed robustness (variations lower than 3% of the bit error rate after additive noise, transcoding and Stirmark random bending attacks) and transparency (set to average PSNR = 45 dB and 65 dB for SD and HD encoded content, respectively). The experiments consider 1 h of video content. In the semi-fragile watermarking case, the main advantage of m-QIM is a relative gain factor of 0.11 in PSNR for fixed robustness (against transcoding), fragility (to content alteration) and data payload. The experiments consider 1 h 20 min of video content.
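
The m-ary quantization index modulation itself is simple to write down. The sketch below shows embedding and minimum-distance detection of one symbol on a single host sample; the values of M and of the quantization step delta are arbitrary assumptions, and the actual scheme operates inside MPEG-4 AVC watermarking with the optimized detector described above.

```python
# Minimal sketch of m-ary QIM on one host sample: each symbol m in {0..M-1}
# uses a quantizer shifted by m*delta/M, and detection picks the closest
# shifted lattice. Delta controls the robustness/transparency trade-off.
def qim_embed(x, symbol, M=4, delta=8.0):
    offset = symbol * delta / M
    return delta * round((x - offset) / delta) + offset

def qim_detect(y, M=4, delta=8.0):
    best, best_err = 0, float('inf')
    for m in range(M):
        offset = m * delta / M
        err = abs(y - (delta * round((y - offset) / delta) + offset))
        if err < best_err:
            best, best_err = m, err
    return best

# Each sample carries log2(M) bits; detection survives noise below delta/(2*M).
watermarked = qim_embed(10.3, symbol=3)
assert qim_detect(watermarked + 0.4) == 3
```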

Journal ArticleDOI
TL;DR: A novel reversible information hiding method aiming to achieve scalable carrier capacity while progressively distorting the image quality and compared with the conventional methods in terms of carrier capacity and scalability in perceptual quality degradation.
Abstract: This paper proposes a novel reversible information hiding method aiming to achieve scalable carrier capacity while progressively distorting the image quality. Unlike the conventional methods, the proposed method HAM (Histogram Association Mapping) purposely degrades the perceptual quality of the input image through data embedding. To the best of our knowledge, there is no method that attempts to significantly increase the carrier capacity while introducing (tolerating) intentional perceptual degradation for avoiding unauthorized viewing. HAM eliminates the expensive pre-processing step(s) required by the conventional histogram shifting data embedding approach and improves its carrier capacity. In particular, the host image is divided into non-overlapping blocks and each block is classified into two classes. Each class undergoes different HAM process to embed the external data while distorting quality of the image to the desired level. Experiments were conducted to measure the performances of the proposed method by using standard test images and CalTech 101 dataset. In the best case scenario, an average of ~2.88 bits per pixel is achieved as the effective carrier capacity for the CalTech 101 dataset. The proposed method is also compared with the conventional methods in terms of carrier capacity and scalability in perceptual quality degradation.

Journal ArticleDOI
TL;DR: A low power and hardware efficient image compressor integrated circuit for wireless capsule endoscopy application that supports dual-band imaging and performs strongly with a compression ratio of 80.4% and a high reconstruction peak-signal-to-noise-ratio.
Abstract: In this paper, we present the design of a low power and hardware efficient image compressor integrated circuit for wireless capsule endoscopy applications. The proposed compression algorithm supports dual-band imaging, that is, it works on both white-band imaging (WBI) and narrow-band imaging (NBI). The scheme uses a novel color space and simple predictive coding for optimized performance. Based on the nature of the narrow- and white-band endoscopic images and video sequences, several sub-sampling schemes are introduced. The proposed dual-band compressor is designed in such a way that it can easily be interfaced with any commercial low power image sensor that outputs RGB image pixels in a raster scan fashion, eliminating the need for large buffer memory and temporary storage. Both NBI and WBI reconstructed images have been verified by medical doctors for acceptability. Compared to other designs targeted to video capsule endoscopy, the proposed algorithm performs strongly with a compression ratio of 80.4% (for WBI) and 79.2% (for NBI), and a high reconstruction peak-signal-to-noise-ratio (over 43.7 dB for both bands). The results of the fabricated chip are also presented.

Journal ArticleDOI
TL;DR: Experimental measurements validate the hypothesis that each of these manifolds can be decomposed into a small number of linear subspaces of very low dimension, which allows for computing the distance of a feature vector from each subspace and consequently from each one of the six manifolds.
Abstract: This paper presents a method for the recognition of the six basic facial expressions in images or in image sequences using landmark points. The proposed technique relies on the observation that the vectors formed by the landmark point coordinates belong to a different manifold for each of the expressions. In addition, experimental measurements validate the hypothesis that each of these manifolds can be decomposed into a small number of linear subspaces of very low dimension. This yields a parameterization of the manifolds that allows for computing the distance of a feature vector from each subspace and consequently from each one of the six manifolds. Two alternative classifiers are then proposed that use the corresponding distances as input: the first one is based on the minimum distance from the manifolds, while the second one uses SVMs that are trained with the vector of all distances from each subspace. The proposed technique is tested for two scenarios, the subject-independent and the subject-dependent one. Extensive experiments for each scenario have been performed on two publicly available datasets, yielding very satisfactory expression recognition accuracy.
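
The core geometric step (the distance of a landmark coordinate vector from a low-dimensional linear subspace) can be sketched with a PCA-style fit; splitting each expression manifold into several such subspaces and the SVM stage are not shown, and the subspace dimension below is an assumption.

```python
# Sketch of the distance-from-subspace computation used by both classifiers:
# fit a low-dimensional linear subspace to landmark vectors of one cluster and
# use the reconstruction residual as the distance.
import numpy as np

def fit_subspace(X, dim=3):
    """X: (n_samples, n_coords) landmark vectors belonging to one subspace."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:dim]                      # mean and orthonormal basis (dim x n_coords)

def subspace_distance(x, mean, basis):
    r = x - mean
    proj = basis.T @ (basis @ r)               # projection onto the subspace
    return np.linalg.norm(r - proj)            # residual = distance from the subspace

# Minimum-distance classification (the paper's first classifier) would assign x
# to the expression whose subspaces yield the smallest such distance.
```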