
Showing papers by "Guangming Shi published in 2015"


Journal ArticleDOI
TL;DR: A nonlocal extension of the Gaussian scale mixture (GSM) model is developed using simultaneous sparse coding (SSC), and its applications to image restoration are explored; it is shown that the variances of the sparse coefficients can be jointly estimated along with the unknown sparse coefficients via alternating optimization.
Abstract: In image processing, sparse coding has been known to be relevant to both variational and Bayesian approaches. The regularization parameter in variational image restoration is intrinsically connected with the shape parameter of the sparse coefficients' distribution in Bayesian methods. How to set those parameters in a principled yet spatially adaptive fashion turns out to be a challenging problem, especially for the class of nonlocal image models. In this work, we propose a structured sparse coding framework to address this issue; more specifically, a nonlocal extension of the Gaussian scale mixture (GSM) model is developed using simultaneous sparse coding (SSC), and its applications to image restoration are explored. It is shown that the variances of the sparse coefficients (the field of scalar multipliers of the Gaussians), if treated as latent variables, can be jointly estimated along with the unknown sparse coefficients via alternating optimization. When applied to image restoration, our experimental results show that the proposed SSC-GSM technique can both preserve the sharpness of edges and suppress undesirable artifacts. Thanks to its capability of achieving better spatial adaptation, SSC-GSM based image restoration often delivers reconstructed images with higher subjective/objective quality than other competing approaches.
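To make the alternating optimization concrete, here is a minimal numpy sketch under simplifying assumptions: the patches in a group share per-row coefficient variances, the dictionary is orthogonal (e.g., a PCA basis of the group), and a Wiener-type shrinkage stands in for the paper's closed-form updates. All names are ours, not the authors'.

```python
import numpy as np

def ssc_gsm_denoise_group(Y, D, sigma_n, n_iters=5, eps=1e-8):
    """Denoise a group of similar patches with a GSM-style alternating scheme.

    Y       : (n, m) matrix whose columns are m similar vectorized patches
    D       : (n, n) orthogonal dictionary (assumed; e.g., a PCA basis)
    sigma_n : noise standard deviation
    """
    B = D.T @ Y                  # noisy sparse coefficients
    A = B.copy()                 # current estimate of clean coefficients
    for _ in range(n_iters):
        # Step 1: re-estimate the variance field (the latent GSM
        # multipliers), shared across the group row by row.
        theta2 = np.mean(A**2, axis=1, keepdims=True) + eps
        # Step 2: given the variances, shrink the coefficients; rows with
        # small estimated variance are suppressed more strongly.
        A = B * (theta2 / (theta2 + sigma_n**2))
    return D @ A                 # denoised patch group
```

Sharing the variance estimate across all patches in a row is what makes the coding *simultaneous*: the group jointly decides which dictionary atoms are active.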

186 citations


Journal ArticleDOI
TL;DR: A beam splitter is placed in front of the objective lens of CASSI, allowing the same scene to be simultaneously captured by a grayscale camera; this uncoded measurement greatly eases the reconstruction problem and yields high-quality 3D spectral data.
Abstract: Coded aperture snapshot spectral imaging (CASSI) provides an efficient mechanism for recovering 3D spectral data from a single 2D measurement. However, since the reconstruction problem is severely underdetermined, the quality of recovered spectral data is usually limited. In this paper we propose a novel dual-camera design to improve the performance of CASSI while maintaining its snapshot advantage. Specifically, a beam splitter is placed in front of the objective lens of CASSI, which allows the same scene to be simultaneously captured by a grayscale camera. This uncoded grayscale measurement, in conjunction with the coded CASSI measurement, greatly eases the reconstruction problem and yields high-quality 3D spectral data. Both simulation and experimental results demonstrate the effectiveness of the proposed method.
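For readers who want the shape of the inverse problem, the dual-camera setup can be summarized by a joint linear model (the notation below is ours, not the paper's): x is the vectorized 3D spectral cube, H the coded-aperture/dispersion operator of CASSI, P the operator that integrates the cube over the spectral dimension to form the grayscale image, and Ψ a sparsity-promoting prior.

```latex
\hat{\mathbf{x}} \;=\; \arg\min_{\mathbf{x}}\;
\|\mathbf{y}_{\mathrm{cassi}} - \mathbf{H}\mathbf{x}\|_2^2
\;+\; \|\mathbf{y}_{\mathrm{gray}} - \mathbf{P}\mathbf{x}\|_2^2
\;+\; \lambda\,\Psi(\mathbf{x})
```

The second data term is what the extra grayscale camera contributes: it directly constrains the spectrally integrated image, making the joint system far less underdetermined than CASSI alone.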

118 citations


Proceedings ArticleDOI
07 Dec 2015
TL;DR: Experimental results on spectral and dynamic MRI images show that the proposed algorithm can better preserve the sharpness of important image structures and outperform several existing state-of-the-art multiframe denoising methods.
Abstract: Patch-based low-rank models have proven effective in exploiting the spatial redundancy of natural images, especially for image denoising. However, two-dimensional low-rank models cannot fully exploit the spatio-temporal correlation in larger data sets such as multispectral images and 3D MRIs. In this work, we propose a novel low-rank tensor approximation framework with Laplacian scale mixture (LSM) modeling for multi-frame image denoising. First, similar 3D patches are grouped to form a d-order tensor, and high-order singular value decomposition (HOSVD) is applied to the grouped tensor. Then the task of multi-frame image denoising is formulated as a maximum a posteriori (MAP) estimation problem with the LSM prior on the tensor coefficients. Both the unknown sparse coefficients and the hidden LSM parameters can be efficiently estimated by alternating optimization; specifically, we have derived closed-form solutions for both subproblems. Experimental results on spectral and dynamic MRI images show that the proposed algorithm can better preserve the sharpness of important image structures and outperform several existing state-of-the-art multi-frame denoising methods (e.g., BM4D and tensor dictionary learning).
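A compact numpy sketch of the per-group computation, under stated assumptions: HOSVD is computed from the mode unfoldings, and a hard threshold on the core coefficients stands in for the paper's LSM/MAP shrinkage (for which the authors derive closed-form solutions). All names and the threshold are illustrative.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move the given axis to the front and flatten."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def fold(M, mode, shape):
    """Inverse of unfold for a target tensor shape."""
    full = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(M.reshape(full), 0, mode)

def hosvd_denoise(T, tau):
    """Denoise a grouped patch tensor by HOSVD plus core thresholding."""
    # Factor matrices: left singular vectors of each mode unfolding.
    U = [np.linalg.svd(unfold(T, m), full_matrices=False)[0]
         for m in range(T.ndim)]
    # Core tensor: project T onto all factor matrices.
    S = T
    for m, Um in enumerate(U):
        S = fold(Um.T @ unfold(S, m), m,
                 S.shape[:m] + (Um.shape[1],) + S.shape[m + 1:])
    S = np.where(np.abs(S) > tau, S, 0.0)  # shrink small core coefficients
    # Reconstruct from the shrunken core.
    X = S
    for m, Um in enumerate(U):
        X = fold(Um @ unfold(X, m), m,
                 X.shape[:m] + (Um.shape[0],) + X.shape[m + 1:])
    return X
```

In the paper the thresholding step is replaced by MAP estimation under the LSM prior, with the hidden scale parameters and the coefficients updated alternately.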

72 citations


Proceedings ArticleDOI
07 Jun 2015
TL;DR: Experimental results demonstrate that, for the first time to the authors' knowledge, the hyperspectral video frame rate reaches up to 100 fps with decent quality, even when the incident light is not strong.
Abstract: We propose a novel dual-camera design to acquire 4D high-speed hyperspectral (HSHS) videos with high spatial and spectral resolution. Our work has two key technical contributions. First, we build a dual-camera system that simultaneously captures a panchromatic video at a high frame rate and a hyperspectral video at a low frame rate, which jointly provide reliable projections for the underlying HSHS video. Second, we exploit the panchromatic video to learn an over-complete 3D dictionary to represent each band-wise video sparsely, and a robust computational reconstruction is then employed to recover the HSHS video based on the joint videos and the self-learned dictionary. Experimental results demonstrate that, for the first time to our knowledge, the hyperspectral video frame rate reaches up to 100 fps with decent quality, even when the incident light is not strong.

70 citations


Journal ArticleDOI
TL;DR: Experimental results demonstrate that the orientation selectivity-based structure descriptor is robust to disturbance, and can effectively represent the structure degradation caused by different types of distortion.
Abstract: The human visual system is highly adaptive in extracting structure information for scene perception, and structural features are widely used in perception-oriented image processing. However, existing structure descriptors mainly describe the luminance contrast of a local region and cannot effectively represent the spatial correlation of structure. In this paper, we introduce a novel structure descriptor based on the orientation selectivity mechanism in the primary visual cortex. Research in cognitive neuroscience indicates that the arrangement of excitatory and inhibitory cortical cells gives rise to orientation selectivity in a local receptive field, within which the primary visual cortex extracts visual information for scene understanding. Inspired by this mechanism, we compute the correlations among pixels in a local region based on the similarity of their preferred orientations. By imitating the arrangement of the excitatory/inhibitory cells, the correlations between a central pixel and its local neighbors are binarized, and the spatial correlation is represented with a set of binary values, named the orientation selectivity-based pattern. Then, taking both the gradient magnitude and the orientation selectivity-based pattern into account, a rotation-invariant structure descriptor is introduced. The proposed structure descriptor is applied to texture classification and reduced-reference image quality assessment, two different application domains, to verify its generality and robustness. Experimental results demonstrate that the orientation selectivity-based structure descriptor is robust to disturbance and can effectively represent the structure degradation caused by different types of distortion.
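As an illustration of the binarization idea, here is a toy sketch (ours, not the paper's descriptor: the 8-neighborhood, the similarity threshold, and the bit packing are illustrative choices): orientation similarity between a pixel and each neighbor is binarized and packed into a code, much like a local binary pattern.

```python
import numpy as np

def osvp_map(img, angle_thresh=np.pi / 6):
    """Toy orientation-selectivity pattern: binarize the similarity between
    a pixel's gradient orientation and those of its 8 neighbors
    (1 = similar/'excitatory', 0 = dissimilar/'inhibitory'), then pack
    the 8 bits into one pattern code per pixel."""
    gy, gx = np.gradient(img.astype(float))
    theta = np.arctan2(gy, gx)                 # preferred-orientation proxy
    pattern = np.zeros(img.shape, dtype=np.uint8)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        shifted = np.roll(np.roll(theta, dy, axis=0), dx, axis=1)
        # angular difference wrapped into [0, pi]
        diff = np.abs(np.angle(np.exp(1j * (theta - shifted))))
        pattern |= (diff < angle_thresh).astype(np.uint8) << bit
    return pattern
```

The paper additionally weights the pattern with the gradient magnitude and makes the descriptor rotation invariant; neither refinement is shown here.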

56 citations


Journal ArticleDOI
TL;DR: This paper proposes an algorithm that is both parallel and fast for the GPU, which can utilize 3072 SPs in parallel to estimate the motion vector of every prediction unit (PU) in every combination of the coding unit (CU) and PU partitions, so that the speed optimization does not result in loss of coding efficiency.
Abstract: Although the High Efficiency Video Coding (HEVC) standard significantly improves the coding efficiency of video compression, it is unacceptable, even in offline applications, to spend several hours compressing 10 s of high-definition video. In this paper, we propose using a multicore central processing unit (CPU) and an off-the-shelf graphics processing unit (GPU) with 3072 streaming processors (SPs) for fast HEVC encoding, so that the speed optimization does not result in loss of coding efficiency. There are two key technical contributions in this paper. First, we propose a parallel and fast GPU algorithm that utilizes the 3072 SPs in parallel to estimate the motion vector (MV) of every prediction unit (PU) in every combination of the coding unit (CU) and PU partitions. Furthermore, the proposed GPU algorithm avoids the coding efficiency loss caused by the lack of a MV predictor (MVP). Second, we propose a fast CPU algorithm that fully utilizes the results from the GPU to significantly reduce the number of possible CU and PU partitions without any coding efficiency loss. Our experimental results show that, compared with the reference software, we can encode high-resolution video using only 1.9% of the CPU time and 1.0% of the GPU time, with only a 1.4% rate increase.

33 citations


Proceedings ArticleDOI
07 Dec 2015
TL;DR: The parameters underlying the sparse distributions of desirable HR image patches are learned from the LR image paired with retrieved HR images; the resulting hybrid approach can be interpreted as a first attempt at reconciling the difference between parametric and nonparametric models for low-level vision tasks.
Abstract: Existing approaches to image super-resolution (SR) are often either data-driven (e.g., based on internet-scale matching and web image retrieval) or model-based (e.g., formulated as a maximum a posteriori estimation problem). The former is conceptually simple yet heuristic, while the latter is constrained by the fundamental limit of frequency aliasing. In this paper, we propose a hybrid approach to SR that combines those two lines of ideas. More specifically, the parameters underlying the sparse distributions of desirable HR image patches are learned from the LR image paired with retrieved HR images. Our hybrid approach can be interpreted as a first attempt at reconciling the difference between parametric and nonparametric models for low-level vision tasks. Experimental results show that the proposed hybrid SR method performs much better than existing state-of-the-art methods in terms of both subjective and objective image quality.

25 citations


Journal ArticleDOI
Xiaodan Song, Xiulian Peng, Jizheng Xu, Guangming Shi, Feng Wu
TL;DR: Experimental results on a landmark image database show that the Cloud-DIC can largely enhance the coding efficiency both subjectively and objectively, and perform comparably at low bitrates with the intra coding of the High Efficiency Video Coding standard with a much lower encoder complexity.
Abstract: With multimedia flourishing on the Web, it is easy to find similar images for a query, especially landmark images. Traditional image coding, such as JPEG, cannot exploit correlations with external images. Existing vision-based approaches are able to exploit such correlations by reconstructing from local descriptors but cannot ensure the pixel-level fidelity of the reconstruction. In this paper, a cloud-based distributed image coding (Cloud-DIC) scheme is proposed to exploit external correlations for mobile photo uploading. For each input image, a thumbnail is transmitted to retrieve correlated images and reconstruct it in the cloud by geometric and illumination registration. This reconstruction serves as the side information (SI) in the Cloud-DIC. The image is then compressed by transform-domain syndrome coding to correct the disparity between the original image and the SI. Once a bitplane is received in the cloud, an iterative refinement process is performed between the final reconstruction and the SI. Moreover, a joint encoder/decoder mode decision at the block, frequency, and bitplane levels is proposed to adapt to different correlations. Experimental results on a landmark image database show that the Cloud-DIC can largely enhance the coding efficiency both subjectively and objectively, with up to 5-dB gains and 70% bit savings over JPEG with arithmetic coding, and performs comparably at low bitrates with the intra coding of the High Efficiency Video Coding standard at a much lower encoder complexity.

25 citations


Journal ArticleDOI
TL;DR: This article integrates the effective features of existing approaches, including adaptive support region selection, reliable depth selection, and color guidance, under an optimization framework for Kinect depth recovery, formulated as an energy minimization problem that solves depth hole filling and denoising simultaneously.
Abstract: Considering that the existing depth recovery approaches have different limitations when applied to Kinect depth data, in this article, we propose to integrate their effective features including adaptive support region selection, reliable depth selection, and color guidance together under an optimization framework for Kinect depth recovery. In particular, we formulate our depth recovery as an energy minimization problem, which solves the depth hole filling and denoising simultaneously. The energy function consists of a fidelity term and a regularization term, which are designed according to the Kinect characteristics. Our framework inherits and improves the idea of guided filtering by incorporating structure information and prior knowledge of the Kinect noise model. Through analyzing the solution to the optimization framework, we also derive a local filtering version that provides an efficient and effective way of improving the existing filtering techniques. Quantitative evaluations on our developed synthesized dataset and experiments on real Kinect data show that the proposed method achieves superior performance in terms of recovery accuracy and visual quality.
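A generic form of the energy described in the abstract (our notation; the paper designs the terms around the Kinect noise model):

```latex
E(D) \;=\; \sum_{p \in \Omega} \big(D_p - D^{0}_{p}\big)^2
\;+\; \lambda \sum_{p}\,\sum_{q \in N(p)} w_{pq}\,\big(D_p - D_q\big)^2
```

Here Ω is the set of pixels with reliable Kinect measurements D⁰, N(p) is an adaptive support region, and w_pq are color-guided weights. Because such an energy is quadratic in D, its minimizer solves a sparse linear system, and locally approximating that solution yields filter-style updates, which is consistent with the local filtering version the authors derive.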

25 citations


Journal ArticleDOI
TL;DR: This work focuses on the sparsity and uniqueness carried by the original image itself, the source of all features, to propose a nonlocal reconstruction-based saliency model, which is generalized to model global-aspect saliency.

25 citations


Journal ArticleDOI
TL;DR: By showing that expertise can affect baseline brain activity as indexed by ALFF, this study reveals a possible novel connection between the neuroplasticity mechanism and resting-state activity, which would upgrade our understanding of the central mechanism of learning.
Abstract: It is well established that expertise modulates evoked brain activity in response to specific stimuli. Recently, researchers have begun to investigate how expertise influences the resting brain. Most of these studies focused on connectivity features within/across regions, i.e., connectivity patterns/strength. However, little attention has been given to a more fundamental issue: whether or not expertise modulates baseline brain activity. We investigated this question using the amplitude of low-frequency (<0.08 Hz) fluctuation (ALFF) as the metric of brain activity and a novel expertise model, acupuncturists, chosen for their robust proficiency in tactile perception and emotion regulation. After a psychophysical and behavioral expertise screening procedure, 23 acupuncturists and 23 matched non-acupuncturists (NA) were enrolled. Our results revealed higher ALFF for acupuncturists in the left ventral medial prefrontal cortex (VMPFC) and the contralateral hand representation of the primary somatosensory area (SI) (corrected for multiple comparisons). Additionally, the ALFF of the VMPFC was negatively correlated with the outcomes of the emotion regulation task (corrected for multiple comparisons). We suggest that our study may reveal a novel connection between the neuroplasticity mechanism and resting-state activity, which would upgrade our understanding of the central mechanism of learning. Furthermore, by showing that expertise can affect baseline brain activity as indexed by ALFF, our findings may have profound implications for functional neuroimaging studies, especially those involving expert models, in that differences in baseline brain activity may either smear the spatial pattern of activations in task data or introduce biased results into connectivity-based analyses of resting data.

Journal ArticleDOI
Guanghui Zhao, Zicheng Liu, Jie Lin, Guangming Shi, Fangfang Shen
TL;DR: A novel sparse representation (SR)-based model for wideband direction-of-arrival (DOA) estimation that exhibits the sparsity through the fact that any individual wideband source is characterized as a unique oblique line passing through the origin in the 2-D frequency plane of the temporal-spatial array data.
Abstract: In this paper, we propose a novel sparse representation (SR)-based model for wideband direction-of-arrival (DOA) estimation. Different from the classical SR-based DOA model, which only explores the sparsity of sources in spatial space, the proposed model exhibits sparsity through the fact that any individual wideband source is characterized as a unique oblique line passing through the origin in the 2-D frequency plane of the temporal-spatial array data. To further reduce the computational complexity of using this sparsity property, the 2-D direction information is projected into a 1-D space. Simulations show the superior performance of our approach, even under noise corruption and with fewer temporal samples.

Journal ArticleDOI
Guanghui Zhao, Guangming Shi, Fangfang Shen, Luo Xi, Yi Niu
TL;DR: In this paper, a new efficient DOA estimation algorithm based on separable sparse representation (SSR-DOA for short) is derived, in which a separable structure for the spatial observation matrix is introduced to reduce the complexity.
Abstract: Conventional sparse representation (SR)-based direction-of-arrival (DOA) estimation algorithms suffer from high computational complexity. To be specific, a wide angular range and a large-scale array enlarge the spatial observation matrix, which results in huge computation cost for DOA estimation. In this letter, a new efficient DOA estimation algorithm based on separable sparse representation (SSR-DOA for short) is derived, in which a separable structure for the spatial observation matrix is introduced to reduce the complexity. In addition, a dual-sparsity strategy is employed to make the algorithm tractable. Experimental results show that high-resolution performance can be obtained efficiently by the proposed algorithm.
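One standard way to realize such a separable structure, given here only as an illustration of the idea (not necessarily the paper's exact construction), is a Kronecker factorization of the observation matrix:

```latex
\mathbf{A} \;=\; \mathbf{A}_1 \otimes \mathbf{A}_2,
\qquad
\mathbf{A}\,\operatorname{vec}(\mathbf{X}) \;=\; \operatorname{vec}\!\big(\mathbf{A}_2\,\mathbf{X}\,\mathbf{A}_1^{\mathsf{T}}\big)
```

A product with one large MN x PQ matrix is then replaced by two products with the much smaller M x P and N x Q factors, which is where the complexity reduction comes from.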

Journal ArticleDOI
Chao Chen, Baoming Bai, Guangming Shi, Xiaotian Wang, Xiaopeng Jiao
TL;DR: This paper finds that, in addition to those found previously, many cages can be used to construct structured LDPC codes, and shows that all cages with even girth can be structured as protograph-based codes, many of which have block-circulant Tanner graphs.
Abstract: A (v,g)-cage is a (not necessarily unique) smallest v-regular graph of girth g. On such a graph, a nonbinary (2,v)-regular low-density parity-check (LDPC) code can be defined such that the Tanner graph has girth 2g and the code length achieves the minimum possible. In this paper, we focus on two aspects of this class of codes: structural properties and code optimization. We find that, in addition to those found previously, many cages can be used to construct structured LDPC codes. We show that all cages with even girth can be structured as protograph-based codes, many of which have block-circulant Tanner graphs. We also find that four cages with odd girth can be structured as protograph-based codes with block-circulant Tanner graphs. For code optimization, we develop an ontology-based approach: all possible interconnected cycle patterns that lead to low symbol-weight codewords are identified to assemble the ontology. By doing so, it becomes tractable to estimate and optimize the distance spectrum of the equivalent binary image codes. We further analyze some known codes from the Consultative Committee for Space Data Systems recommendation and design several new codes. Numerical results show that these codes have reasonably good minimum bit distance and perform well under iterative decoding.

Journal ArticleDOI
TL;DR: Experimental results show that the proposed coding scheme can significantly outperform the existing standard codecs, H.264/AVC and HEVC, as well as state-of-the-art background-modeling-based coding schemes.

Journal ArticleDOI
Fangfang Shen, Guanghui Zhao, Zicheng Liu, Guangming Shi, Jie Lin
TL;DR: A novel structural SR-based SAR (SSR-SAR) imaging approach is proposed, in which the entries of an adaptive sparse space are derived from the PAR model, and the augmented Lagrangian multiplier technique is adopted to accelerate computation.
Abstract: Sparse representation (SR)-based SAR imaging approaches have shown superior performance compared with conventional approaches. However, for an image with rich spatial structures, a fixed global dictionary is usually ineffective at characterizing the local structures. The piecewise autoregressive (PAR) model indicates that each pixel can be linearly represented by its local neighboring pixels. Inspired by this, an adaptive sparse space that effectively characterizes the varying local structures of the image is designed, in which the entries are derived from the PAR model. By incorporating the adaptive SR into SAR imaging, a novel structural SR-based SAR (SSR-SAR) imaging approach is proposed. Because the adaptive sparse space depends heavily on prior information about the SAR image, updating the adaptive sparse space and SAR imaging together is a joint optimization problem, which we solve with an alternating minimization scheme. In addition, the augmented Lagrangian multiplier technique is adopted to accelerate computation. Finally, experimental results demonstrate the validity of the proposed approach.

Book ChapterDOI
01 Aug 2015
TL;DR: This paper takes advantage of heterogeneous big data sources, including both direct measurements and various indirect data, to reconstruct a high resolution spatial-temporal air pollutant concentration map and uses the Integrated Nested Laplace Approximation (INLA) algorithm to improve the computational efficiency.
Abstract: In order to better understand the formation of air pollution and assess its influence on human beings, the acquisition of high-resolution spatial-temporal air pollutant concentration maps has always been an important research topic. Existing air-quality monitoring networks require potential improvement due to their limitations on data sources. In this paper, we take advantage of heterogeneous big data sources, including both direct measurements and various indirect data, to reconstruct a high-resolution spatial-temporal air pollutant concentration map. First, we predict a preliminary 3D high-resolution air pollutant concentration map from measurements of both ground monitoring stations and mobile stations equipped with sensors, as well as various meteorological and geographical covariates. Our model is based on the stochastic partial differential equations (SPDE) approach, and we use the integrated nested Laplace approximation (INLA) algorithm as an alternative to Markov chain Monte Carlo (MCMC) methods to improve computational efficiency. Next, to further improve the accuracy of the predicted concentration map, we formulate the refinement as a convex and sparse optimization problem. In particular, we minimize the total variation subject to constraints involving satellite-observed low-resolution air pollutant data and the aforementioned measurements from ground monitoring stations and mobile platforms. We transform this optimization problem into a second-order cone program (SOCP) and solve it via the log-barrier method. Numerical simulations on real data show significant improvements in the reconstructed air pollutant concentration map.
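A sketch of the refinement step in cvxpy (our simplification: an off-the-shelf conic solver replaces the paper's log-barrier implementation, and all names, shapes, and tolerances are illustrative):

```python
import cvxpy as cp
import numpy as np
import scipy.sparse as sp

def refine_concentration_map(grid_shape, A_sat, y_sat, station_idx, y_station,
                             eps_sat=1.0, eps_station=0.5):
    """Refine a pollutant map by total-variation minimization subject to
    consistency with satellite data (A_sat maps the fine grid to the
    low-resolution satellite observation) and with in-situ measurements
    at the grid cells listed in station_idx."""
    H, W = grid_shape
    x = cp.Variable((H, W))
    xv = cp.vec(x)
    # Selection matrix picking out the measured grid cells.
    S = sp.csr_matrix((np.ones(len(station_idx)),
                       (np.arange(len(station_idx)), station_idx)),
                      shape=(len(station_idx), H * W))
    constraints = [cp.norm(A_sat @ xv - y_sat, 2) <= eps_sat,
                   cp.norm(S @ xv - y_station, 2) <= eps_station]
    # cp.tv is second-order-cone representable, so the whole problem is
    # dispatched to an SOCP-capable solver, mirroring the paper's SOCP form.
    prob = cp.Problem(cp.Minimize(cp.tv(x)), constraints)
    prob.solve()
    return x.value
```

Whether the measurements enter as hard-tolerance constraints (as here) or as penalty terms is a modeling choice; the paper's exact formulation may differ.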

Proceedings ArticleDOI
24 May 2015
TL;DR: Experimental results on several public image databases show that the proposed RR-IQA metric uses limited reference data (8 values) and performs highly consistently with human perception.
Abstract: Reduced-reference image quality assessment (RR-IQA) algorithms aim to automatically evaluate image quality using only partial information about the reference image. In this paper, we propose a new RR-IQA metric employing the entropy of each frequency band in the DCT domain. It is well known that human eyes have different sensitivities to different bands, and distortions on each band result in individual quality degradations. Therefore, we compute the visual information degradation on each band separately for quality assessment. The degradation on each DCT band is first measured by the entropy difference, and the quality score is then obtained as the weighted sum of the entropy differences of the bands from low frequency to high frequency. Experimental results on several public image databases show that the proposed method uses limited reference data (8 values) and performs highly consistently with human perception.
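A sketch of the pipeline under stated assumptions: grouping the 8x8 DCT coefficients into 8 bands by diagonal index, the histogram entropy estimator, and the uniform default weights are our illustrative choices; only the overall structure (per-band entropies on the reference side, a weighted sum of entropy differences as the score) follows the abstract.

```python
import numpy as np
from scipy.fft import dctn

def band_entropies(img, block=8, n_bands=8, n_bins=64):
    """Entropy of block-DCT coefficients per frequency band; bands are
    formed by the diagonal index u+v of each coefficient (illustrative)."""
    H, W = img.shape
    bands = [[] for _ in range(n_bands)]
    for i in range(0, H - block + 1, block):
        for j in range(0, W - block + 1, block):
            C = dctn(img[i:i + block, j:j + block].astype(float), norm='ortho')
            for u in range(block):
                for v in range(block):
                    b = (u + v) * n_bands // (2 * block - 1)
                    bands[b].append(C[u, v])
    ent = []
    for coeffs in bands:
        hist, _ = np.histogram(coeffs, bins=n_bins)
        p = hist / max(hist.sum(), 1)
        p = p[p > 0]
        ent.append(float(-(p * np.log2(p)).sum()))
    return np.array(ent)

def rr_iqa_score(ref_entropies, dist_img, weights=None):
    """Weighted sum of per-band entropy differences; only the n_bands
    reference entropies (8 values here) need to be transmitted."""
    e_dist = band_entropies(dist_img)
    w = np.ones_like(ref_entropies) if weights is None else weights
    return float(np.sum(w * np.abs(ref_entropies - e_dist)))
```

The 8 reference values match the abstract's claim that only limited reference data is needed; in practice the band weights would reflect the eye's band-dependent sensitivity.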

Journal ArticleDOI
Fu Li, Gao Shan, Guangming Shi, Li Qin, Lili Yang, Li Ruodai, Xuemei Xie
TL;DR: A single-shot dual-frequency structured light based method to achieve dense depth in dynamic scenes is proposed; it is suitable for dense depth acquisition of moving objects, can also acquire the depth of colored scenes, and is robust to object surface texture.
Abstract: Structured light techniques are widely used for depth sensing. In this paper, we propose a single-shot dual-frequency structured light based method to achieve dense depth in dynamic scenes. The projected pattern is a mixture of two different periodic waves whose phases are related to the change of color and intensity, respectively, which avoids the Fourier-spectra separation required by other multi-frequency patterns. A Gabor filter is adopted to interpolate the phases. Number theory is used to resolve the phase ambiguity of phase-based methods conveniently and quickly. A dense depth map can be achieved because of the phase-based encoding. The proposed method is suitable for dense depth acquisition of moving objects. Experimental results show higher accuracy in depth acquisition compared with the Kinect and higher resolution compared with a ToF (time-of-flight) depth camera. Meanwhile, the proposed method can also acquire the depth of colored scenes and is robust to the surface texture of objects.
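The number-theoretic step can be illustrated with a small sketch (ours: integer pixel periods, a brute-force Chinese-remainder search, and no Gabor-filter phase extraction, which the paper handles separately):

```python
import numpy as np

def unwrap_coprime(phi1, phi2, T1, T2):
    """Recover the absolute position x in [0, T1*T2) from two wrapped
    phases measured with coprime integer periods T1 and T2.
    phi1, phi2 are in [0, 2*pi)."""
    r1 = phi1 / (2 * np.pi) * T1          # residue of x modulo T1
    r2 = phi2 / (2 * np.pi) * T2          # residue of x modulo T2
    candidates = r1 + T1 * np.arange(T2)  # all x in range with x mod T1 == r1
    # Circular distance of each candidate's residue mod T2 from r2;
    # with exact data exactly one candidate matches (CRT uniqueness).
    d = np.abs((candidates - r2 + T2 / 2) % T2 - T2 / 2)
    return candidates[np.argmin(d)]
```

For example, with T1 = 7 and T2 = 9, unwrap_coprime(2*np.pi*5/7, 2*np.pi*4/9, 7, 9) returns 40.0, the only position in [0, 63) whose residues are 5 mod 7 and 4 mod 9; picking the nearest candidate rather than requiring an exact match keeps the search robust to mild phase noise.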

Journal ArticleDOI
12 Feb 2015-Sensors
TL;DR: A joint optimization scheme by iterative SAR imaging and updating of the weighted autoregressive model to solve the problem of adaptivity in characterizing varied image contents is proposed.
Abstract: Compressive sensing-based synthetic aperture radar (SAR) imaging has shown superior capability in high-resolution image formation. However, most such works focus on scenes that can be sparsely represented in fixed spaces. When dealing with complicated scenes, these fixed spaces lack the adaptivity to characterize varied image contents. To solve this problem, a new compressive sensing-based radar imaging approach with adaptive sparse representation is proposed. Specifically, an autoregressive model is introduced to adaptively exploit the structural sparsity of an image. In addition, similarity among pixels is integrated into the autoregressive model to further promote its capability, and thus an adaptive sparse representation facilitated by a weighted autoregressive model is derived. Since the weighted autoregressive model is inherently determined by the unknown image, we propose a joint optimization scheme that iterates between SAR imaging and updating of the weighted autoregressive model. Finally, experimental results demonstrate the validity and generality of the proposed approach.

Proceedings ArticleDOI
12 Jul 2015
TL;DR: This work introduces an orientation selectivity (OS) based visual pattern (OSVP) to extract visual content for reduced-reference image quality assessment (IQA); the correlation among neighboring pixels is investigated, and the OSVP is proposed to represent the visual content of an image.
Abstract: Reduced-reference (RR) image quality assessment (IQA) methods aim to accurately measure quality with only part of the reference data. The challenge for RR IQA is how to effectively represent the visual content of an image with limited data for quality measurement. Inspired by the orientation selectivity (OS) mechanism in the primary visual cortex, we introduce an OS-based visual pattern (OSVP) to extract visual content for RR IQA. OS arises from the arrangement of excitatory and inhibitory interactions among connected cortical neurons. Inspired by this, we investigate the correlation among neighboring pixels and propose the OSVP to represent the visual content of an image. Then, quality degradation is measured as the change of OSVP, and a novel RR IQA model is proposed. Experimental results demonstrate that the OSVP-based RR IQA model uses limited reference data (9 values) and performs highly consistently with subjective perception.

Journal ArticleDOI
TL;DR: By adopting a high-resolution projector and camera, the proposed method can acquire a denser and more precise depth map than the original Kinect and ToF cameras.
Abstract: In this paper, we propose a square wave encoded in three sinusoidal fringe patterns for depth sensing. The periods of the square wave and the sinusoidal wave are coprime. Because of this specific pattern design, a Gabor filter is used to obtain the wrapped phase, and the coprimality is utilized to reliably determine the absolute phase of the encoded sinusoidal wave. Quantitative analyses and practical experiments are presented to verify the performance of the proposed method. The precision of our method is close to that of the classic three-step phase-shifting method. In addition, the depth of discontinuous surfaces can be correctly measured. By adopting a high-resolution projector and camera, our proposed method can acquire a denser and more precise depth map than the original Kinect and ToF cameras.

Patent
03 Jun 2015
TL;DR: In this article, an object depth information acquisition method based on a single-frame compound template is proposed, which overcomes the long acquisition time of existing temporal encoding methods.
Abstract: The invention discloses a method for acquiring object depth information based on a single-frame compound template, which overcomes the long acquisition time of existing temporal encoding methods. The method comprises the following steps: 1) design the single-frame compound template P; 2) use the compound template to capture a deformed image I; 3) demodulate the deformed image I to obtain two square-wave templates I1' and I2' with coprime periods; 4) compute the truncation phases Phi1 and Phi2 of each pixel in the two square-wave templates with a Gabor filter; 5) from Phi1 and Phi2, solve the unwrapped phase phi; 6) using phi, find the matching point (x, y) in the compound template P for each pixel (x', y') in the deformed image I; 7) from the geometric relationship between each pixel and its matching point, compute the depth of the object under measurement. The invention increases the speed and accuracy of object depth information acquisition and can be used in industrial monitoring, medical, human-computer interaction, and 3D printing scenarios.

Proceedings ArticleDOI
01 Dec 2015
TL;DR: Simulations on landmark images and an unknown Gaussian channel show that a gain of up to 10 dB is achieved at low channel SNRs compared with the state-of-the-art uncoded image transmission scheme SoftCast, when highly correlated images are available at the decoder.
Abstract: This paper proposes a distributed compressive sensing (CS) scheme for robust image transmission over unknown or time-varying channels when highly correlated images are available at the decoder. A compressed thumbnail is first transmitted, after digital forward error correction (FEC) and modulation, to retrieve highly correlated images and generate side information (SI) at the decoder. The residual image after subtracting the decompressed thumbnail is then coded and transmitted by CS through a very dense constellation without FEC. The linear representation of the residual signal by CS measurements and rateless sampling enables graceful degradation and bandwidth scalability without channel feedback. Moreover, transform-domain power allocation is employed before random sampling to protect against channel errors. At the decoder, both the nonlocal correlations within the original image and the correlation with the SI are exploited in CS decoding via a low-rank regularization on similar patches. After CS decoding, a block-wise minimum mean-square-error (MMSE) reconstruction using the SI is further performed in the spatial domain to enhance the reconstruction quality. Simulations on landmark images and an unknown Gaussian channel show that a gain of up to 10 dB is achieved at low channel SNRs compared with the state-of-the-art uncoded image transmission scheme SoftCast, when highly correlated images are available at the decoder.

Proceedings ArticleDOI
06 Aug 2015
TL;DR: This paper investigates multilayer video representations, such as scalable videos and simulcast streaming, and proposes an unequal error protection scheme based on local reconstruction codes (LRC) for video storage, showing that a better tradeoff between storage and repair cost is achieved.
Abstract: Redundancy is necessary for a storage system to recover from errors. The frequent errors in large-scale systems, e.g., the cloud, make it desirable to reduce the recovery cost. Among all kinds of data stored in the cloud, video takes a large portion due to its large data volume. Another characteristic of video is that a certain amount of distortion can be tolerated. This paper investigates the use of scalable video representations and an unequal error protection scheme to reduce the storage and recovery costs in the cloud. By introducing more protection for the base layer and less for the enhancement layers, a better tradeoff between storage and reconstruction costs can be achieved, although the reliability of the enhancement layers is slightly sacrificed. Simulation results based on local reconstruction codes (LRC) show that, compared with the existing (12, 2, 2) LRC code in Windows Azure Storage, the reconstruction cost can be reduced from 6x to 3x at the same storage cost, at the expense of possible video quality loss.

Proceedings ArticleDOI
12 Jul 2015
TL;DR: Experimental results show that the proposed CS recovery of dynamic MRI from partially sampled k-t space using the nonlocal low-rank regularization (NLR) can outperform existing state-of-the-art dynamic MRI reconstruction methods.
Abstract: Compressive sensing (CS) based dynamic MRI techniques have been proposed to improve imaging speed and spatiotemporal resolution. However, existing CS recovery methods have not exploited the rich redundancy across the spatial and temporal dimensions. In this paper, we address the CS recovery of dynamic MRI from partially sampled k-t space using nonlocal low-rank regularization (NLR). To exploit the nonlocal redundancy in the spatial-temporal dimensions, the dynamic MRI sequence is divided into overlapping 3D patches along both the spatial and temporal directions. We exploit the fact that the matrix consisting of a sufficient number of similar patches is low-rank. To effectively approximate the low-rank matrix, the non-convex surrogate function log det(·) is used instead of the convex nuclear norm. Experimental results show that our proposed method can outperform existing state-of-the-art dynamic MRI reconstruction methods.
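The log det surrogate admits a simple iteratively-reweighted reading: each singular value is soft-thresholded with a weight inversely proportional to its current magnitude, so large (signal) singular values are shrunk less than the nuclear norm would shrink them. A minimal numpy sketch, with illustrative parameters:

```python
import numpy as np

def logdet_shrink(M, tau, eps=1e-3, n_iters=3):
    """Low-rank approximation of a patch-group matrix M via the log-det
    surrogate: iteratively reweighted singular value thresholding."""
    U, s0, Vt = np.linalg.svd(M, full_matrices=False)
    s = s0.copy()
    for _ in range(n_iters):
        w = 1.0 / (s + eps)                 # derivative of log(s + eps)
        s = np.maximum(s0 - tau * w, 0.0)   # weighted soft-thresholding
    return (U * s) @ Vt                     # U @ diag(s) @ Vt
```

In the full reconstruction, shrinkage of this kind is applied to every matrix of grouped similar patches inside an outer loop that also enforces k-t space data consistency.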

Journal ArticleDOI
TL;DR: Two classes of constant weight codes are constructed based on the incidence structures in projective and Euclidean geometries, respectively, and it is shown that some known optimal codes can be re-obtained in this way.
Abstract: This paper presents a geometric approach to the construction of constant weight codes from constant dimension codes. Two classes of constant weight codes are constructed based on the incidence structures in projective and Euclidean geometries, respectively. It is shown that some known optimal codes can be re-obtained in this way. As an application, the constructed codes are used in store-and-forward networks for packet loss recovery. Their decoding is translated into that of the corresponding constant dimension codes. The decoding capability is beyond that of the bounded-distance decoding. Remarkably, for constant weight codes from Koetter–Kschischang codes, the minimum-distance decoding is achieved.


Proceedings ArticleDOI
08 Jul 2015
TL;DR: A novel adaptive sparse representation based SAR image despeckling algorithm is proposed, in which the noise component is treated as the coefficient residual, which equals the difference between the actual image coefficient and the estimated coefficient.
Abstract: SR-based denoising methods have shown promising performance in image denoising. However, because of the degradation in the noisy image, conventional SR-based denoising models may not be accurate enough to reconstruct a clean image. Therefore, to reduce the noise corruption, a novel adaptive sparse representation based SAR image despeckling algorithm is proposed in this paper, in which the noise component is treated as the coefficient residual, i.e., the difference between the actual image coefficient and the estimated coefficient. By imposing a sparsity constraint on this residual, the noise corruption can be reduced to some extent. Furthermore, both the autoregressive model and nonlocal similarity are incorporated to better characterize image details. The experimental results demonstrate that the proposed algorithm outperforms other algorithms both subjectively and objectively.

Proceedings ArticleDOI
01 Dec 2015
TL;DR: A weighted RDO scheme for the intra coding of screen content videos, which first measures the importance of each block within the picture and then applies larger distortion weights to the more important blocks in the RDO to achieve a better rate-distortion trade-off.
Abstract: Screen content videos often have mixed content consisting of various types, such as natural content, text, and graphics, in the same picture. To achieve high coding efficiency for such content, new coding tools have been developed in the Screen Content Coding (SCC) extension of the High Efficiency Video Coding (HEVC) standard. Among them, intra block copy (IntraBC) allows a nonlocal intra prediction from the coded region of the same picture. However, the rate-distortion optimization (RDO) scheme of screen content coding still follows that of the HEVC reference software, and each block is optimized locally; the characteristics of screen content are not fully utilized in the current RDO scheme. This paper presents a weighted RDO scheme for the intra coding of screen content videos, which first measures the importance of each block within the picture and then applies larger distortion weights to the more important blocks in the RDO. In this way, a better rate-distortion trade-off can be achieved for the picture as a whole rather than for individual blocks. The experimental results show that up to 5.5% coding efficiency gain can be achieved compared with the reference software.
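In conventional RDO each block independently minimizes J = D + λR; the weighted scheme described here amounts to the following per-block cost (our notation, summarizing the abstract):

```latex
J_i \;=\; w_i\,D_i \;+\; \lambda\,R_i
```

with a larger weight w_i for blocks that matter more to the rest of the picture, e.g. regions that later blocks predict from via IntraBC, so that spending extra bits there pays off globally.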