
Showing papers by "Guangming Shi" published in 2009


Proceedings ArticleDOI
07 Nov 2009
TL;DR: Experimental results demonstrate that the proposed NLIBP faithfully reconstructs HR images with sharp edges and texture structures and outperforms state-of-the-art methods in both PSNR and visual perception.
Abstract: This paper presents a novel non-local iterative back-projection (NLIBP) algorithm for image enlargement. The iterative back-projection (IBP) technique iteratively reconstructs a high resolution (HR) image from its blurred and downsampled low resolution (LR) counterpart. However, conventional IBP methods often produce “jaggy” and “ringing” artifacts because the reconstruction errors are back-projected into the reconstructed image isotropically and locally. Natural images usually contain many non-local redundancies that can be exploited to improve reconstruction quality. We therefore propose to incorporate this non-local information adaptively into the IBP process so that the reconstruction errors are reduced. Experimental results demonstrate that the proposed NLIBP faithfully reconstructs HR images with sharp edges and texture structures, and that it outperforms state-of-the-art methods in both PSNR and visual perception.
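A minimal sketch of the idea, assuming a Gaussian blur kernel, bicubic resampling, and a simple non-local-means-style weighting of the back-projected error; the kernel, step size, and weighting function below are illustrative assumptions, not the parameters used in the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def nl_weighted(err, guide, patch=3, search=5, h=10.0):
    """Weight the back-projected error by patch similarity in the current HR
    estimate, so corrections follow image structure instead of spreading
    isotropically (a simple non-local-means-style filter)."""
    pr, sr = patch // 2, search // 2
    pad = pr + sr
    g = np.pad(guide, pad, mode='reflect')
    e = np.pad(err, pad, mode='reflect')
    out = np.zeros_like(err)
    H, W = err.shape
    for i in range(H):
        for j in range(W):
            ci, cj = i + pad, j + pad
            ref = g[ci - pr:ci + pr + 1, cj - pr:cj + pr + 1]
            wsum = acc = 0.0
            for di in range(-sr, sr + 1):
                for dj in range(-sr, sr + 1):
                    cand = g[ci + di - pr:ci + di + pr + 1,
                             cj + dj - pr:cj + dj + pr + 1]
                    w = np.exp(-np.sum((ref - cand) ** 2) / (h * h))
                    wsum += w
                    acc += w * e[ci + di, cj + dj]
            out[i, j] = acc / wsum
    return out

def nlibp(lr, scale=2, iters=10, sigma=1.0, step=0.5):
    """Iterative back-projection with a non-local correction of the error."""
    lr = np.asarray(lr, dtype=float)
    hr = zoom(lr, scale, order=3)                      # bicubic initial estimate
    for _ in range(iters):
        sim_lr = zoom(gaussian_filter(hr, sigma), 1.0 / scale, order=3)
        err_hr = zoom(lr - sim_lr, scale, order=3)     # back-project the error
        hr = hr + step * nl_weighted(err_hr, hr)       # adaptive non-local update
    return hr
```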

131 citations


Journal ArticleDOI
TL;DR: In this article, an efficient folded architecture (EFA) for lifting-based discrete wavelet transform (DWT) is presented, which is based on a novel form of the lifting scheme that is given in this brief.
Abstract: In this brief, an efficient folded architecture (EFA) for the lifting-based discrete wavelet transform (DWT) is presented. The proposed EFA is based on a novel form of the lifting scheme given in this brief. Owing to this form, the conventional serial operations of the lifting data flow can be optimized into parallel ones by employing parallel and pipeline techniques. The corresponding optimized architecture (OA) has short critical path latency and is repeatable. Further, utilizing this repeatability, the EFA is derived from the OA by employing the folding technique. For the proposed EFA, hardware utilization reaches 100%, and the number of required registers is reduced. Additionally, the shift-add operation is adopted to optimize the multiplications; thus, the proposed architecture is more suitable for hardware implementation. Performance comparisons and field-programmable gate array (FPGA) implementation results indicate that the proposed EFA achieves better critical path latency, hardware cost, and control complexity.
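The brief concerns a hardware architecture, but the lifting data flow it reorganizes is easy to illustrate in software. A minimal sketch of the reversible CDF 5/3 lifting steps (predict, then update) on a 1-D signal with simple symmetric boundary handling; this is an assumed example wavelet, not the paper's folded architecture.

```python
import numpy as np

def lift_53_forward(x):
    """One level of the integer 5/3 lifting DWT on an even-length 1-D signal."""
    x = np.asarray(x, dtype=np.int64)
    even, odd = x[0::2].copy(), x[1::2].copy()
    # Predict step: detail (high-pass) coefficients
    even_r = np.append(even[1:], even[-1])       # symmetric extension at the end
    d = odd - ((even + even_r) >> 1)
    # Update step: approximation (low-pass) coefficients
    d_l = np.insert(d[:-1], 0, d[0])             # symmetric extension at the start
    s = even + ((d_l + d + 2) >> 2)
    return s, d

def lift_53_inverse(s, d):
    """Undo the lifting steps in reverse order -- perfect reconstruction."""
    d_l = np.insert(d[:-1], 0, d[0])
    even = s - ((d_l + d + 2) >> 2)
    even_r = np.append(even[1:], even[-1])
    odd = d + ((even + even_r) >> 1)
    x = np.empty(even.size + odd.size, dtype=np.int64)
    x[0::2], x[1::2] = even, odd
    return x

# x = np.array([10, 12, 14, 13, 11, 9, 8, 8])
# s, d = lift_53_forward(x)
# assert np.array_equal(lift_53_inverse(s, d), x)
```

Because every lifting step is an in-place add of a rounded prediction, the transform is exactly invertible in integer arithmetic, which is what makes the serial predict/update chain a natural target for the parallel and pipelined reorganization described in the brief.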

74 citations


Journal ArticleDOI
TL;DR: A new transform scheme of multiplierless reversible time-domain lapped transform and Karhunen-Loeve transform (RTDLT/KLT) is proposed for lossy-to-lossless hyperspectral image compression; it realizes progressive lossy-to-lossless compression from a single embedded code-stream file.
Abstract: We propose a new transform scheme of multiplierless reversible time-domain lapped transform and Karhunen-Loeve transform (RTDLT/KLT) for lossy-to-lossless hyperspectral image compression. Instead of applying the discrete wavelet transform (DWT) in the spatial domain, the RTDLT is applied for decorrelation. The RTDLT can be realized with existing discrete cosine transforms and pre- and postfilters, while reversibility is guaranteed by a matrix factorization method. In the spectral direction, a reversible integer low-complexity KLT is used for decorrelation. Owing to the completely reversible transform, the proposed method realizes progressive lossy-to-lossless compression from a single embedded code-stream file. Numerical experiments on benchmark images show that the proposed transform scheme performs better than 5/3DWT-based methods in both lossy and lossless compression, and is comparable with the optimal 9/7DWT-FloatKLT-based lossy compression method.

48 citations


Journal ArticleDOI
TL;DR: It is suggested that top-down face processing contains not only regions for analyzing the visual appearance of faces, but also those involved in processing low spatial frequency (LSF) information, decision-making, and working memory.

48 citations


Proceedings ArticleDOI
24 May 2009
TL;DR: Experimental results show that the proposed scheme not only improves coding efficiency by more than 10 dB for compound images but also keeps performance similar to H.264 for natural images.
Abstract: Compound images consist of text, graphics, and natural images, which exhibit strong anisotropic features that make existing image coding standards inefficient at compressing them. To solve this problem, this paper proposes a novel coding scheme based on H.264 intra-frame coding. Two new intra modes are proposed to better exploit spatial correlations in compound images. The first is the residual scalar quantization (RSQ) mode, where intra-predicted residues are directly quantized and entropy coded. The second is the base colors and index map (BCIM) mode, which can be viewed as an adaptive vector quantization: an image block is represented by several representative colors, called base colors, and an index map. The two new modes, as well as the existing intra modes in H.264, are selected per block by rate-distortion optimization (RDO). Experimental results show that the proposed scheme not only improves coding efficiency by more than 10 dB for compound images but also keeps performance similar to H.264 for natural images.
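A minimal sketch of the base-colors-and-index-map (BCIM) idea for a single HxWxC color block, using plain k-means as the palette selector; the mode decision, quantization, and entropy coding of the paper are not reproduced, and k=4 is an illustrative choice.

```python
import numpy as np

def bcim_encode(block, k=4, iters=10, seed=0):
    """Represent an HxWxC block by k base colors plus an index map (k-means sketch)."""
    pix = block.reshape(-1, block.shape[-1]).astype(float)
    rng = np.random.default_rng(seed)
    centers = pix[rng.choice(len(pix), size=k, replace=False)].copy()
    for _ in range(iters):
        dist = ((pix[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        idx = dist.argmin(axis=1)
        for c in range(k):
            if np.any(idx == c):                 # leave empty clusters unchanged
                centers[c] = pix[idx == c].mean(axis=0)
    return centers, idx.reshape(block.shape[:-1])

def bcim_decode(centers, index_map):
    """Rebuild the block from its palette and index map."""
    return centers[index_map]
```

Only the k base colors and the per-pixel indices need to be coded, which is why this mode suits text and graphics regions with few distinct colors.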

20 citations


Book ChapterDOI
15 Dec 2009
TL;DR: A bounded observation field sensing model based on the sensing characteristics of visual sensors is proposed, whereby sensor positions and poses that enhance coverage can be effectively worked out by the placement algorithm, improving the visual sensor network's capability of obtaining regional information.
Abstract: Visual sensor networks have become a research focus as their application domains expand. How to achieve optimal coverage and thus improve a visual sensor network's capability of obtaining regional information is a critical issue. Because a visual sensor has a bounded field of view, a random deployment of sensors can hardly solve this issue. This paper proposes a bounded observation field sensing model based on the sensing characteristics of visual sensors. Based on this model, a sensor placement method is devised by means of a multi-agent genetic algorithm (MAGA). The placement algorithm effectively works out sensor positions and poses that enhance coverage, thereby improving the network's capability of obtaining regional information. Experimental results show that the proposed algorithm is effective in both 2D and 3D scenes.
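A minimal sketch of a bounded-observation-field coverage test in 2-D and of the coverage ratio that a placement optimizer such as the MAGA used in the paper would maximize; the field-of-view angle and sensing radius are illustrative assumptions.

```python
import numpy as np

def covered(target, sensor_pos, sensor_dir, fov_deg=60.0, radius=10.0):
    """A target is covered if it lies within the sensor's sensing radius and
    within its angular field of view around the pointing direction."""
    v = np.asarray(target, float) - np.asarray(sensor_pos, float)
    dist = np.linalg.norm(v)
    if dist == 0:
        return True
    if dist > radius:
        return False
    d = np.asarray(sensor_dir, float)
    d = d / np.linalg.norm(d)
    cos_ang = float(v @ d) / dist
    return cos_ang >= np.cos(np.radians(fov_deg / 2))

def coverage_ratio(targets, sensors):
    """Fraction of target points seen by at least one (position, direction) sensor --
    the fitness a placement algorithm would try to maximize."""
    return float(np.mean([any(covered(t, p, d) for p, d in sensors)
                          for t in targets]))
```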

15 citations


Journal ArticleDOI
TL;DR: This correspondence proposes a novel method for designing directional filter banks (DFBs) with an arbitrary number of subbands based on the pseudo-polar Fourier transform (PPFT) and one-dimensional filter banks (FBs), which can capture directional information more flexibly than existing methods.
Abstract: This correspondence proposes a novel method for designing directional filter banks (DFBs) with an arbitrary number of subbands. Its key feature is the ability to decompose images into arbitrarily oriented directional subbands. The proposed approach is based on the pseudo-polar Fourier transform (PPFT) and one-dimensional (1-D) filter banks (FBs). We make some modifications to the PPFT and then apply 1-D FBs to the modified PPFT. With these operations, two-dimensional (2-D) DFBs are obtained, and their design is converted to that of 1-D FBs plus a modified PPFT. Since the number of channels of the 1-D FBs can be arbitrary, 2-D DFBs with an arbitrary number of subbands can be achieved, which is highly desirable for directional representation of images. Two examples of directional feature extraction illustrate that the proposed non-2^n channel DFBs can capture directional information more flexibly than existing methods.

11 citations


Proceedings ArticleDOI
31 Dec 2009
TL;DR: Numerical experiments show that the proposed method not only has good anti-attack ability but also is robust to packet loss: the plain-image can still be recovered even when the packet loss ratio is up to 50%.
Abstract: In traditional image encryption systems, decryption is extremely sensitive to packet loss. However, in wireless networks, packet loss is inevitable. Compressed sensing (CS) theory shows that a sparse signal can be recovered from a few incomplete measurements of it. The strong randomness of the measurement matrix and the low correlation among the elements of the measurement vector imply that the measurement process can be regarded as an encryption process. Based on CS theory, this paper therefore presents a new image encryption scheme that is robust to packet loss. In the scheme, we design a Gaussian random measurement matrix as the key to realize data encryption. Moreover, to enhance the incoherence between the plain-image and the cipher-image, we add a random disturbance term to the measurements (cipher-image) and thus improve the security level of the cipher-image. Numerical experiments show that the proposed method not only has good anti-attack ability but also is robust to packet loss: the plain-image can still be recovered even when the packet loss ratio is up to 50%.
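A minimal sketch, assuming a 1-D sparse signal: the secret key seeds the Gaussian measurement matrix, a small random disturbance is added to the measurements, and recovery is shown with orthogonal matching pursuit as a stand-in for whichever CS solver the paper uses; the disturbance scale and sparsity level are illustrative.

```python
import numpy as np

def cs_encrypt(x, m, key, noise_scale=1e-3):
    """Encrypt a sparse signal x by CS measurement: the key seeds the Gaussian
    measurement matrix, and a small random disturbance masks the measurements."""
    rng = np.random.default_rng(key)
    phi = rng.normal(size=(m, x.size)) / np.sqrt(m)
    return phi @ x + rng.normal(scale=noise_scale, size=m)

def cs_decrypt(y, n, m, key, sparsity):
    """Rebuild the measurement matrix from the key and recover x with OMP."""
    rng = np.random.default_rng(key)
    phi = rng.normal(size=(m, n)) / np.sqrt(m)     # identical matrix, same key
    residual, support = y.copy(), []
    for _ in range(sparsity):
        support.append(int(np.argmax(np.abs(phi.T @ residual))))
        sub = phi[:, support]
        coef, *_ = np.linalg.lstsq(sub, y, rcond=None)
        residual = y - sub @ coef
    x_hat = np.zeros(n)
    x_hat[support] = coef
    return x_hat
```

In practice the image would first be made sparse (for example by a wavelet transform) and measured block by block; losing some measurements only weakens, rather than breaks, the reconstruction, which is the source of the robustness to packet loss.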

11 citations


Proceedings ArticleDOI
07 Nov 2009
TL;DR: Simulation results show that the proposed multiplierless Low-RKLT method performs much better in both lossless and lossy compression than the 3D-DWT-based method.
Abstract: A multiplierless low-complexity reversible integer Karhunen-Loeve transform (Low-RKLT) is proposed based on matrix factorization. Conventional KLT-based methods suffer from high computational complexity and cannot be applied to lossless medical image compression. To solve these two problems, a multiplierless Low-RKLT is investigated using multi-lifting in this paper. Combined with an ROI coding method, we propose a progressive lossy-to-lossless ROI compression method for three-dimensional (3D) medical images with high performance. In the proposed method, the Low-RKLT is used for inter-frame decorrelation after SA-DWT in the spatial domain. Simulation results show that the proposed method performs much better in both lossless and lossy compression than the 3D-DWT-based method.
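The reversible-transform idea rests on factoring each elementary rotation of the KLT into lifting steps whose outputs are rounded, so the integer mapping can be undone exactly. A minimal sketch for a single 2x2 rotation (theta not a multiple of pi); the paper's multi-lifting factorization of the full KLT matrix is more general.

```python
import numpy as np

def rot_lift_forward(x0, x1, theta):
    """Reversible integer rotation by theta via three rounded lifting steps, using
    R(theta) = [[1,p],[0,1]] [[1,0],[u,1]] [[1,p],[0,1]]
    with p = (cos(theta) - 1) / sin(theta) and u = sin(theta)."""
    p = (np.cos(theta) - 1.0) / np.sin(theta)
    u = np.sin(theta)
    x0 = x0 + int(round(p * x1))
    x1 = x1 + int(round(u * x0))
    x0 = x0 + int(round(p * x1))
    return x0, x1

def rot_lift_inverse(y0, y1, theta):
    """Undo the lifting steps in reverse order with the same rounding."""
    p = (np.cos(theta) - 1.0) / np.sin(theta)
    u = np.sin(theta)
    y0 = y0 - int(round(p * y1))
    y1 = y1 - int(round(u * y0))
    y0 = y0 - int(round(p * y1))
    return y0, y1

# a, b = rot_lift_forward(25, -7, 0.6)
# assert rot_lift_inverse(a, b, 0.6) == (25, -7)
```

Because each step only adds a rounded, multiplier-free (shift-add-approximable) prediction to one coordinate, the transform stays exactly invertible in integers, which is what permits lossless compression.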

9 citations


Proceedings ArticleDOI
07 Nov 2009
TL;DR: An image coding method based on adaptive downsampling is proposed that exploits not only pixel redundancy but also visual redundancy; it outperforms JPEG2000, SPECK, SPIHT, and LT+SPECK at low bit rates.
Abstract: This paper proposes an image coding method based on adaptive downsampling which exploits not only pixel redundancy but also visual redundancy. At the encoder side, the codec adaptively chooses smooth regions of the original image to downsample; then a selective overlapped transform, the block DCT, and the shape-adaptive DCT (SA-DCT) are applied to the downsampled image. The incompletely transformed image is coded with OB-SPECK. At the decoder side, to reduce computational complexity, we use simple cubic interpolation, which is well suited to the downsampled regions and greatly improves the real-time performance of the coding system. Experimental results show that the proposed method outperforms JPEG2000, SPECK, SPIHT, and LT+SPECK at low bit rates.

8 citations


Proceedings ArticleDOI
30 Oct 2009
TL;DR: A method based on 3D-SPECK and the theory of distributed source coding (DSC) is proposed for hyperspectral image compression, and experimental results show that it achieves competitive performance.
Abstract: In this paper, we propose a method based on 3D-SPECK (3D Set Partitioning Embedded Block coding) and the theory of DSC (Distributed Source Coding) to realize the compression of hyperspectral images. Experiments have been carried out to evaluate the compression performance, and the results show that our method achieves performance competitive with 3D-SPECK and 3D-SPIHT.

Proceedings ArticleDOI
23 Oct 2009
TL;DR: Experiments show that the learning-based CS recovery algorithm can significantly improve the performance of the previous CS-MDC technique in both PSNR and visual quality.
Abstract: The recently proposed compressive sensing (CS) theory provides a new solution for multiple description coding (MDC) with fine granularity, by treating each random CS measurement as a description. The performance of CS-based MDC (CS-MDC) depends on the efficacy of the CS recovery algorithm. Existing CS recovery algorithms recover the signal in a fixed space (e.g., wavelet, DCT, or gradient spaces) for the entire duration of the signal, even though a typical multimedia signal exhibits sparsity in time/space-variant spaces. To rectify this problem and develop a better CS recovery algorithm for CS-MDC, we propose a learning-based framework that conducts CS recovery in locally adaptive spaces, and carry out a case study on image MDC. A set of prior image models is learned offline from a training set to facilitate CS recovery in locally adaptive bases. Experiments show that the learning-based CS recovery algorithm can significantly improve the performance of the previous CS-MDC technique in both PSNR and visual quality.

Proceedings ArticleDOI
19 Apr 2009
TL;DR: In this paper, a method is proposed to detect the ROIs automatically based on the HVS; the properties of pixels such as contrast, location, and edges are analyzed, and the pixels are enhanced according to the sensitivity of the HVS.
Abstract: Detecting and segmenting the regions of interest (ROIs) is one of the foundations of image processing and analysis. Because the final information sink for images is the human observer, segmenting the ROIs effectively requires studying the human visual system (HVS) and imitating its behavior when viewing a scene. Researchers have identified several factors that affect human attention by studying eye movements while an image is viewed. In this paper, a method is proposed to detect the ROIs automatically based on the HVS. In the proposed algorithm, properties of pixels such as contrast, location, and edges are analyzed, and the pixels are enhanced according to the sensitivity of the HVS. These factors are then combined into a saliency map, which rates each pixel of the image according to its perceptual importance. Finally, the ROIs are segmented according to the saliency map. The algorithm is easy to implement and can efficiently segment objects from complex backgrounds.
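A minimal sketch of the combination step, assuming three cues (local contrast, edge strength, and a centre-location prior) merged with fixed weights into a saliency map and then thresholded; the actual cues, HVS sensitivity weighting, and segmentation in the paper are richer than this.

```python
import numpy as np
from scipy.ndimage import sobel, uniform_filter

def saliency_map(img, weights=(0.4, 0.4, 0.2), win=9):
    """Combine local contrast, edge strength, and a centre prior into one map."""
    img = np.asarray(img, dtype=float)

    def norm(a):
        return (a - a.min()) / (np.ptp(a) + 1e-9)

    mean = uniform_filter(img, size=win)                  # local contrast = local std
    sq_mean = uniform_filter(img ** 2, size=win)
    contrast = np.sqrt(np.maximum(sq_mean - mean ** 2, 0.0))

    edges = np.hypot(sobel(img, axis=0), sobel(img, axis=1))

    H, W = img.shape                                      # pixels near the centre
    yy, xx = np.mgrid[0:H, 0:W]                           # get a larger prior weight
    loc = np.exp(-(((yy - H / 2) / H) ** 2 + ((xx - W / 2) / W) ** 2) * 8)

    return weights[0] * norm(contrast) + weights[1] * norm(edges) + weights[2] * loc

def segment_roi(img, thresh=0.5):
    """Classify pixels whose saliency exceeds a fraction of the maximum as ROI."""
    s = saliency_map(img)
    return s > thresh * s.max()
```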

Proceedings ArticleDOI
07 Nov 2009
TL;DR: A new imager is proposed that reconstructs a high resolution image from a low resolution blurred image obtained by intentional movable random exposure, a process that can be regarded as the compressive sampling described in compressive sensing (CS) theory.
Abstract: Recently, compressive sensors developed as imagers for capturing images effectively have been studied extensively. In this paper, we design a new imager that reconstructs a high resolution image from a low resolution blurred image obtained by intentional movable random exposure. This imager grabs an image by moving a camera with a randomly fluttering shutter along a certain motion route. By analyzing this movable random exposure process, we find that it can be regarded as the compressive sampling described in compressive sensing (CS) theory. According to the CS theory, the exposure result of this imager can then be used to recover a high resolution image. Since the imager consists of only a movable camera and a fluttering shutter, it is relatively simple and easy to implement. Simulation results show that the proposed imager can recover high and even ultra-high resolution images with good reconstruction performance.

Proceedings ArticleDOI
07 Nov 2009
TL;DR: Experimental results show that the proposed object tracking approach based on the online color information fusion scheme can enhance discriminative characteristics in changing environments and hence produce robust tracking results.
Abstract: In this paper, an approach for object tracking is proposed based on an online color information fusion scheme. The fusion is performed by projecting multi-channel color images onto a one-dimensional pseudo gray-scale space. This dimensionality reduction simplifies the design of the tracking algorithm. The fusion coefficients are determined by Fisher linear discriminant analysis so as to maximize the discriminative capability after projection. The robustness of the approach lies in exploiting the appearance discrimination of the object in cluttered scenarios with varying illumination conditions. The scheme is embedded into a mean-shift tracking system, and experimental results show that it enhances discriminative characteristics in changing environments and hence produces robust tracking results.
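A minimal sketch of the fusion step, assuming labelled object and background pixel samples are available (for example from the initial frame): Fisher linear discriminant analysis gives the 3-to-1 colour projection, and each frame is then mapped to a pseudo gray-scale image; the mean-shift tracker itself is not shown.

```python
import numpy as np

def fisher_color_projection(obj_pixels, bg_pixels):
    """obj_pixels, bg_pixels: (N, 3) colour samples from object and background.
    Returns the unit projection vector maximizing the Fisher criterion,
    w proportional to inv(Sw) @ (m1 - m2)."""
    m1, m2 = obj_pixels.mean(axis=0), bg_pixels.mean(axis=0)
    sw = np.cov(obj_pixels, rowvar=False) + np.cov(bg_pixels, rowvar=False)
    w = np.linalg.solve(sw + 1e-6 * np.eye(3), m1 - m2)
    return w / np.linalg.norm(w)

def project_to_gray(frame, w):
    """Map an HxWx3 frame to a one-dimensional pseudo gray-scale image along w."""
    g = frame.reshape(-1, 3).astype(float) @ w
    g = (g - g.min()) / (np.ptp(g) + 1e-9)
    return (255.0 * g).reshape(frame.shape[:2])
```

Recomputing the projection online as the scene changes is what keeps the object-background contrast high under varying illumination.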

Proceedings ArticleDOI
30 Oct 2009
TL;DR: A novel framework is proposed that improves compression performance by reducing spatial resolution through Laplacian pyramid (LP) down-sampling and coding only the low-pass band.
Abstract: We propose a novel framework that improves compression performance by reducing spatial resolution through the LP (Laplacian pyramid) transform and coding only the low-pass band. Before compression, the method intentionally down-samples the image as a pre-processing step, and then applies directionlet interpolation to the compressed image as a post-processing step. At low bit rates, an appropriately down-sampled image compressed using a multi-codec and later interpolated via directionlets can be visually and objectively better than the image compressed directly with the codec at the same number of bits. Down-sampling and up-sampling play a key role in the proposed coding scheme, which supports spatial scalability. The scheme can be used to efficiently optimize the overall quality of the reconstructed image, especially the preservation of rich textures and edges. Compared with the conventional JPEG2000 coding method, our framework obtains comparably good results at low bit rates and is independent of the underlying coding method.

Patent
18 Nov 2009
TL;DR: In this article, a wavelet-based image compression and transmission method is proposed. The method offers high coding efficiency and good error tolerance, and can be used for compressing and transmitting still images.
Abstract: To address the problem of transmission errors that occur when a still image is transmitted wirelessly, the invention provides a wavelet-based image compression and transmission method. The method comprises the following steps: performing a wavelet transform on an input image; according to the obtained wavelet coefficients, performing set partitioning block coding to generate code streams; grouping the code streams and interleaving them after a cyclic redundancy check code and a Reed-Solomon code are added, improving the error detection and error correction capabilities of the code streams; transmitting the data packets generated at the coding end to the decoding end through a noisy channel; performing de-interleaving, Reed-Solomon decoding, grouping, and cyclic redundancy checking on the data packets at the decoding end so as to realize error correction and error detection of the code streams and obtain the correct code-stream information; performing set partitioning block decoding on the code-stream information to obtain the wavelet coefficients; and, according to the wavelet coefficients, performing the inverse wavelet transform to obtain a decoded image. The method has the advantages of high coding efficiency and good error tolerance, and can be used for compressing and transmitting still images.
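A minimal sketch of the packetization step, assuming the set-partitioning code-stream arrives as bytes: each group gets a CRC-32 for error detection, and corrupted packets are discarded rather than propagating errors. The Reed-Solomon coding and interleaving described in the patent would wrap these packets and are not reproduced here; the payload size is an illustrative assumption.

```python
import struct
import zlib

def packetize(code_stream: bytes, payload: int = 200):
    """Split the code-stream into fixed-size groups and append a CRC-32 to each."""
    packets = []
    for i in range(0, len(code_stream), payload):
        chunk = code_stream[i:i + payload]
        packets.append(chunk + struct.pack('>I', zlib.crc32(chunk) & 0xFFFFFFFF))
    return packets

def check_packet(packet: bytes):
    """Return the payload if its CRC-32 verifies, otherwise None (error detected)."""
    chunk, crc = packet[:-4], struct.unpack('>I', packet[-4:])[0]
    return chunk if (zlib.crc32(chunk) & 0xFFFFFFFF) == crc else None
```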

Proceedings ArticleDOI
23 Oct 2009
TL;DR: An edge-based region-of-interest (ROI) image coding technique for low bit-rate visual communication that outperforms the dynamic ROI coding of JPEG 2000 in both visual quality and PSNR and can be made compliant with any existing compression standard.
Abstract: An edge-based region-of-interest (ROI) image coding technique is proposed for low bit-rate visual communication. An image is compactly coded into two semantic levels: a background sketch and object textures. The background sketch is a down-sampled version of the input image. The decoder first reconstructs the background by adaptive interpolation and lets users identify the ROI, and then requests the ROI textures to be transmitted. A distinct advantage of the proposed ROI technique is a compact edge-based descriptor of natural object boundaries. The expensive ROI geometry computations are carried out at the encoder only on demand, keeping decoder complexity low to benefit wireless devices. The new system outperforms the dynamic ROI coding of JPEG 2000 in both visual quality and PSNR. Furthermore, both background and texture coding can be made compliant with any existing compression standard.

Book ChapterDOI
Lili Liang, Shihuo Ye, Guangming Shi, Xuemei Xie, Wei Zhong, Chao Wang 
15 Dec 2009
TL;DR: The results show that the proposed DFBs have higher PSNR than the conventional wavelet transform and the contourlet, and are suitable for applications requiring economical representations, such as image compression.
Abstract: In this paper, we propose a class of non-redundant directional filter banks (DFBs). It provides an arbitrary number of subbands and therefore offers effective image representation. Furthermore, the non-redundancy property makes the proposed DFB suitable for applications requiring economical representations, such as image compression. The proposed DFB is constructed using one-dimensional (1D) linear-phase M-channel filter banks and two-dimensional (2D) quadrant filter banks. Since only 1D operations are involved, the design complexity is low and the implementation is simple. To demonstrate the potential of the DFB, numerical experiments on nonlinear approximation are presented. The results show that the proposed DFBs achieve higher PSNR than the conventional wavelet transform and the contourlet.


Proceedings ArticleDOI
07 Nov 2009
TL;DR: A context modeling technique to estimate the expectation of each wavelet coefficient conditioned on the local signal structure is proposed, which can significantly improve the performance of existing wavelet-based image denoisers.
Abstract: Existing wavelet-based image denoising techniques all assume a probability model of wavelet coefficients that has zero mean, such as zero-mean Laplacian, Gaussian, or generalized Gaussian distributions. While such a zero-mean probability model fits a wavelet subband well overall, in areas of edges and textures the distribution of wavelet coefficients exhibits a significant bias. We propose a context modeling technique to estimate the expectation of each wavelet coefficient conditioned on the local signal structure. The estimated expectation is then used to shift the probability model of the wavelet coefficients back to zero mean. This bias removal technique can significantly improve the performance of existing wavelet-based image denoisers.
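A minimal sketch of the bias-removal idea on one wavelet subband, using a plain local average as a stand-in for the paper's context-based conditional expectation and local Wiener shrinkage as the zero-mean denoiser; the window size and shrinkage rule are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def denoise_subband(coeffs, noise_sigma, win=5):
    """Shift each coefficient by an estimate of its conditional mean, shrink the
    zero-mean residual, then restore the estimated mean."""
    coeffs = np.asarray(coeffs, dtype=float)
    mu = uniform_filter(coeffs, size=win)             # stand-in expectation estimate
    centered = coeffs - mu                            # residual is roughly zero-mean
    local_var = uniform_filter(centered ** 2, size=win)
    signal_var = np.maximum(local_var - noise_sigma ** 2, 0.0)
    shrink = signal_var / (signal_var + noise_sigma ** 2 + 1e-12)
    return mu + shrink * centered                     # add the bias back afterwards
```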

Proceedings ArticleDOI
TL;DR: The results indicate that decision-making, attention, episodic memory retrieval, and contextual associative processing networks cooperate with general face processing regions to process face information under top-down perception.
Abstract: Although top-down perceptual processing plays an important role in face processing, its neural substrate remains puzzling because the top-down stream is difficult to extract from activation patterns contaminated by bottom-up face perception input. In the present study, a novel paradigm is employed in which participants are instructed to detect faces in pure noise images, which efficiently eliminates the interference of bottom-up face perception in top-down face processing. By analyzing the map of functional connectivity with the right FFA, computed by conventional Pearson's correlation, a possible face processing pattern induced by top-down perception can be obtained. Apart from the bilateral fusiform gyrus (FG), left inferior occipital gyrus (IOG), and left superior temporal sulcus (STS), which are consistent with the core system in the distributed cortical network for face perception, activation induced by top-down face processing is also found in regions including the anterior cingulate gyrus (ACC), right orbitofrontal cortex (OFC), left precuneus, right parahippocampal cortex, left dorsolateral prefrontal cortex (DLPFC), right frontal pole, bilateral premotor cortex, left inferior parietal cortex, and bilateral thalamus. The results indicate that decision-making, attention, episodic memory retrieval, and contextual associative processing networks cooperate with general face processing regions to process face information under top-down perception.
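A minimal sketch of the seed-based functional connectivity computation mentioned above (Pearson correlation of the right-FFA time series with every other voxel); the array shapes are assumptions, and the paper's preprocessing and statistical thresholding are not shown.

```python
import numpy as np

def seed_connectivity(seed_ts, voxel_ts):
    """Pearson correlation between a seed time series (T,) and each voxel (T, V)."""
    s = (seed_ts - seed_ts.mean()) / seed_ts.std()
    v = (voxel_ts - voxel_ts.mean(axis=0)) / voxel_ts.std(axis=0)
    return (s @ v) / len(s)        # one correlation value per voxel
```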

Proceedings ArticleDOI
24 May 2009
TL;DR: By using the pseudo-polar Fourier transform, the design of DFBs is reduced to that of 1D FBs, leading to low design complexity and good design flexibility; by combining the proposed DFBs with the Laplacian pyramid, a multiscale and multidirectional system can be achieved.
Abstract: This paper studies a novel method for creating M-channel nonsubsampled directional filter banks (DFBs). It is based on the pseudo-polar Fourier transform (PPFT), which evaluates the Fourier transform at points along rays equispaced in slope. We modify the arrangement of the equispaced rays, mapping the two-dimensional (2D) PPFT grid onto the Cartesian 2D discrete Fourier transform grid. Then we apply one-dimensional (1D) filter banks (FBs) to the rearranged PPFT along the slope direction. This decomposes images into arbitrarily oriented directional subbands. With this method, the design of DFBs is reduced to that of 1D FBs, leading to low design complexity and good design flexibility. Furthermore, by combining the proposed DFBs with the Laplacian pyramid, we achieve a multiscale and multidirectional system. Experimental results are given to illustrate the proposed approach.

Proceedings ArticleDOI
30 Oct 2009
TL;DR: The novel packet division allows image reconstruction to continue when uncorrectable errors occur; the system makes full use of the bit-stream, and simulations show the scheme is better than an Equal Error Protection (EEP) scheme and an Unequal Error Protection (UEP) scheme.
Abstract: We propose a joint source and channel remote sensing image compression system for image transmission over Binary Symmetric Channels (BSC). As an effective wavelet-based compression scheme, the Set Partitioned Embedded Block Coder (SPECK) is quite fragile against bit errors in noisy channels. To avoid the failures caused by loss of synchronization and error propagation, we improve the error resilience of SPECK with an acceptable degradation of quality. The novel packet division allows image reconstruction to continue when uncorrectable errors occur. The system makes full use of the bit-stream, and simulations show that the scheme outperforms both an Equal Error Protection (EEP) scheme and an Unequal Error Protection (UEP) scheme.