Showing papers by "Lai-Man Po" published in 2007


Journal ArticleDOI
TL;DR: A fast bit rate estimation technique is proposed that avoids entropy coding during intra- and inter-mode decision in H.264/AVC. It reduces total encoding time by about 47% when only intra modes are used and by about 34% when both inter and intra modes are used with a fast motion search algorithm.
Abstract: To achieve the highest coding efficiency, H.264/AVC uses a rate-distortion optimization technique. This means that the encoder has to code the video by exhaustively trying all mode combinations, including the different intra- and inter-prediction modes. Therefore, the complexity and computational load of video coding in H.264/AVC increase drastically compared to previous standards. To reduce the complexity of rate-distortion cost computation, we propose a fast bit rate estimation technique that avoids entropy coding during intra- and inter-mode decision in H.264/AVC. The estimation method is based on the properties of context-based adaptive variable length coding (CAVLC). The proposed rate model predicts the rate of a 4×4 quantized residual block using five different tokens of CAVLC. Experimental results demonstrate that the proposed estimation method reduces total encoding time by about 47% when only intra modes are used and by about 34% when both inter and intra modes are used with a fast motion search algorithm, with negligible degradation of coding performance. When the full search motion estimation algorithm is used, the proposed algorithm reduces total encoding time by about 17%.

59 citations
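
The abstract above does not say which five CAVLC tokens the rate model uses or how they map to a bit count. As a rough illustration only, the sketch below extracts the standard CAVLC syntax quantities (total coefficients, trailing ones, levels, total zeros, run before) from one 4×4 quantized block. The names `ZIGZAG_4X4` and `cavlc_tokens` are hypothetical, and treating these quantities as the model's inputs is an assumption.

```python
# Zigzag scan order for a 4x4 block (frame coding), as (row, col) pairs.
ZIGZAG_4X4 = [
    (0, 0), (0, 1), (1, 0), (2, 0),
    (1, 1), (0, 2), (0, 3), (1, 2),
    (2, 1), (3, 0), (3, 1), (2, 2),
    (1, 3), (2, 3), (3, 2), (3, 3),
]


def cavlc_tokens(block):
    """Return CAVLC-style token statistics for a 4x4 quantized block.

    Hypothetical helper: which of these feed the paper's rate model,
    and with what weights, is not stated in the abstract.
    """
    coeffs = [block[r][c] for r, c in ZIGZAG_4X4]
    nonzero_pos = [i for i, v in enumerate(coeffs) if v != 0]
    total_coeff = len(nonzero_pos)
    if total_coeff == 0:
        return {"total_coeff": 0, "trailing_ones": 0, "levels": [],
                "total_zeros": 0, "run_before": []}

    # Trailing ones: up to three consecutive +/-1 values at the
    # high-frequency end of the nonzero coefficients.
    trailing_ones = 0
    for i in reversed(nonzero_pos):
        if abs(coeffs[i]) == 1 and trailing_ones < 3:
            trailing_ones += 1
        else:
            break

    # Levels: the remaining nonzero coefficients, in scan order.
    levels = [coeffs[i] for i in nonzero_pos[:total_coeff - trailing_ones]]

    # Total zeros: zeros located before the last nonzero coefficient.
    total_zeros = nonzero_pos[-1] + 1 - total_coeff

    # Run before: zeros directly preceding each nonzero coefficient,
    # listed from the highest-frequency coefficient downwards.
    runs, prev = [], -1
    for i in nonzero_pos:
        runs.append(i - prev - 1)
        prev = i

    return {"total_coeff": total_coeff, "trailing_ones": trailing_ones,
            "levels": levels, "total_zeros": total_zeros,
            "run_before": runs[::-1]}
```

A rate estimator in this spirit would combine such counts with fitted weights instead of running the entropy coder; those weights are not given here, so none are shown.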


Journal ArticleDOI
TL;DR: The proposed FSSD algorithm is based on the theoretical equivalence of the SSDs in the spatial and transform domains. It determines the distortion in the integer cosine transform domain using an iterative table-lookup quantization process, which avoids the inverse quantization/transform and pixel reconstruction processes with nearly no rate-distortion performance degradation.
Abstract: In H.264/AVC, rate-distortion optimization for mode decision plays a significant role in achieving its outstanding performance in terms of both compression efficiency and video quality. However, this mode decision process also introduces extremely high complexity into the encoding process, especially the computation of the sum of squared differences (SSD) between the original and reconstructed image blocks. In this paper, fast SSD (FSSD) algorithms are proposed to reduce the complexity of the rate-distortion cost function implementation. The proposed FSSD algorithm is based on the theoretical equivalence of the SSDs in the spatial and transform domains and determines the distortion in the integer cosine transform domain using an iterative table-lookup quantization process. This approach avoids the inverse quantization/transform and pixel reconstruction processes with nearly no rate-distortion performance degradation. In addition, the FSSD can also be used with efficient bit rate estimation algorithms to further reduce the cost function complexity. Experimental results show that the new FSSD can save up to 15% of total encoding time with less than 0.1% coding performance degradation, and up to 30% with negligible performance degradation when combined with a conventional bit rate estimation algorithm.

41 citations
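
The spatial/transform-domain equivalence that the abstract above relies on can be checked numerically. The sketch below uses the H.264/AVC 4×4 forward core transform matrix, whose rows are orthogonal with squared norms 4, 10, 4, 10, so the spatial SSD equals a weighted SSD of the transform coefficients. It illustrates the identity only: the table-lookup quantization step of the paper is not reproduced, and the function names are hypothetical.

```python
from fractions import Fraction

# H.264/AVC 4x4 forward core (integer) transform matrix.
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]

# Squared row norms of CF: [4, 10, 4, 10].
ROW_NORM_SQ = [sum(v * v for v in row) for row in CF]


def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def forward_transform(block):
    """Compute Y = CF * X * CF^T without any scaling."""
    return matmul(matmul(CF, block), transpose(CF))


def ssd_spatial(a, b):
    """Sum of squared differences between two 4x4 pixel blocks."""
    return sum((a[i][j] - b[i][j]) ** 2 for i in range(4) for j in range(4))


def ssd_transform(ya, yb):
    """SSD from unscaled transform coefficients.

    Each squared coefficient difference is divided by the product of the
    squared norms of the two basis rows involved, which compensates for
    CF not being orthonormal.
    """
    return sum(
        Fraction((ya[i][j] - yb[i][j]) ** 2, ROW_NORM_SQ[i] * ROW_NORM_SQ[j])
        for i in range(4) for j in range(4)
    )
```

Exact rational arithmetic is used so that the two values can be compared with `==`; a real encoder would fold the weights into integer scaling tables instead.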


Proceedings ArticleDOI
12 Nov 2007
TL;DR: Experimental results show that DCSD has a significant improvement on both retrieval performance and descriptor size over DCD, outperforming compact configurations of scalable color descriptor and color structure descriptor with smaller descriptor size.
Abstract: A new dominant color structure descriptor (DCSD) is proposed in this paper. It is designed to provide an efficient way to represent both color and spatial structure information with a single compact descriptor. The descriptor combines the compactness of the dominant color descriptor (DCD) and the retrieval accuracy of the color structure descriptor (CSD) to enhance retrieval performance in a highly efficient manner. The feature extraction and similarity measure of the descriptor are designed to address the problems of the existing descriptors while retaining their advantages. Experimental results show that DCSD significantly improves on DCD in both retrieval performance and descriptor size. An eight-color DCSD (DCSD 8) gives an averaged normalized modified retrieval rate (ANMRR) of 0.0993 on the MPEG-7 common color dataset, outperforming compact configurations of the scalable color descriptor and the color structure descriptor with a smaller descriptor size.

41 citations


Journal ArticleDOI
TL;DR: Experimental results show that the new HS and DS using point-oriented inner searches are up to 30% faster than the original algorithms with negligible peak signal-to-noise ratio degradation.
Abstract: Recently, an enhanced hexagon-based search (EHS) algorithm was proposed to speed up the original hexagon-based search (HS) using a 6-side-based fast inner search. However, this 6-side-based method is quite irregular in terms of the distance between the inner search points and the coarse search points, which lowers prediction accuracy. In this paper, a new point-oriented grouping strategy is proposed to develop fast inner search techniques for speeding up the HS and diamond search (DS) algorithms. Experimental results show that the new HS and DS using point-oriented inner searches are up to 30% faster than the original algorithms with negligible peak signal-to-noise ratio degradation.

36 citations


Proceedings ArticleDOI
01 Nov 2007
TL;DR: Simulation results indicate that IPBSS saves more than 13% of bit rate on average while maintaining almost the same video quality with QP = 10, 20 and 30, which is a better result than the previous work MEBSS.
Abstract: H.264 achieves higher compression efficiency by employing multiple-mode inter prediction, a rate-distortion (RD) optimization mechanism and other new techniques. The distortion metric plays an important role in video compression performance. Structural similarity (SSIM) is a new image quality assessment method that is more consistent with the human visual system (HVS). In this paper, we propose to adopt SSIM as the distortion metric in the inter prediction cost functions, named "improved inter prediction method based on SSIM" (IPBSS). It is an improvement on our previous work MEBSS. Simulation results indicate that IPBSS saves more than 13% of bit rate on average while maintaining almost the same video quality with QP = 10, 20 and 30. That is a better result than our previous work MEBSS.

31 citations
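
For reference, the widely used SSIM index between two pixel blocks can be computed as in the sketch below. It shows the standard formula with the usual constants K1 = 0.01 and K2 = 0.03 for 8-bit data. How IPBSS turns this value into an inter prediction cost is not described in the abstract above, so the sketch stops at the index; the function name `ssim_block` and the use of flat pixel lists are assumptions.

```python
def ssim_block(x, y, dynamic_range=255):
    """Structural similarity between two equally sized pixel blocks.

    x and y are flat sequences of pixel values with at least two
    samples each. Uses the standard constants K1 = 0.01 and K2 = 0.03.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("blocks must have the same size (>= 2 samples)")
    n = len(x)
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2

    # Block means.
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Unbiased variances and covariance.
    var_x = sum((a - mean_x) * (a - mean_x) for a in x) / (n - 1)
    var_y = sum((b - mean_y) * (b - mean_y) for b in y) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

One plausible way to use it as a distortion term is `1 - ssim_block(original, prediction)`, but that choice is an illustration, not something the abstract states.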


Patent
26 Mar 2007
TL;DR: A method and apparatus are proposed for calculating the sum of squared differences (SSD) between a source block and a reconstructed block of image or video data encoded according to an encoding scheme such as H.264/AVC.
Abstract: The present invention relates to a method and apparatus for calculating the Sum of Squared Differences (SSD) between a source block and a reconstructed block of image or video data encoded according to an encoding scheme such as H.264/AVC. In a preferred embodiment, the method computes the SSD by finding the SSD between coefficients of an integer transformed residual block and the corresponding inverse-quantized coefficients. Preferably the inverse-quantized coefficients are found with the aid of a look-up table. This method may save computing time and processing power compared to calculating the SSD directly from the source and reconstructed blocks. The SSD is related to the distortion caused by encoding, and the method may be used in calculating the rate-distortion cost of a particular encoding mode. One embodiment of the invention encodes a block of data by selecting the encoding mode with the least rate-distortion cost.

18 citations


Proceedings ArticleDOI
Zhong Gao, Lai-Man Po, Wu Jiang, Xin Zhao, Hao Dong
16 Dec 2007
TL;DR: This paper presents a novel computerized tongue inspection method based on a support vector machine (SVM) and shows that it classifies tongue images more accurately and gives a relatively reliable prediction of diseases based on the extracted features.
Abstract: Tongue diagnosis is an important diagnostic method in Traditional Chinese Medicine (TCM). In this paper, we present a novel computerized tongue inspection method based on a support vector machine (SVM). First, two kinds of quantitative features, chromatic and textural measures, are extracted from tongue images using popular image processing techniques. Then, a support vector machine and a Bayesian network (BN) are each employed to build the mapping between these features and diseases. Finally, we present a comparison between SVM and BN classification. The experimental results show that SVM classifies the tongue images more accurately and gives a relatively reliable prediction of diseases based on these features.

15 citations


Proceedings ArticleDOI
02 Jul 2007
TL;DR: Experimental results show that DCSD has a significant improvement in retrieval performance and descriptor size over DCD, and the feature extraction and similarity measure of the descriptor are designed to address the problems of the existing descriptors such as color inaccuracy of DCD and redundancy of CSD.
Abstract: An important problem in color-based image retrieval is the lack of an efficient way to represent both the color and the spatial structure information with a single descriptor. To solve this problem, a new dominant color structure descriptor (DCSD) is proposed. The descriptor combines the compactness of the dominant color descriptor (DCD) and the accuracy of the color structure descriptor (CSD) to enhance retrieval performance in a highly efficient manner. The feature extraction and similarity measure of the descriptor are designed to address the problems of the existing descriptors, such as the color inaccuracy of DCD and the redundancy of CSD. Experimental results show that DCSD significantly improves on DCD in retrieval performance and descriptor size. An eight-color DCSD (DCSD 8) gives an averaged normalized modified retrieval rate (ANMRR) of 0.0993 on the MPEG-7 common color dataset, outperforming compact configurations of the scalable color descriptor and the color structure descriptor with a smaller descriptor size.

15 citations


Proceedings ArticleDOI
02 Jul 2007
TL;DR: A simple classifier based on error descent rate (EDR) is proposed that uses very few search points to predict whether the global minimum is far from or near the center of the search window; the resulting search patterns switching algorithm performs well for all kinds of video content.
Abstract: Most fast motion estimation algorithms based on a search-point pattern, for example block-based gradient descent search, diamond search and hexagon-based search, are only good at handling videos with small motions. An adaptive motion estimation algorithm that can switch between search patterns for different video content should work better than a single-pattern algorithm. In this paper, a simple classifier based on error descent rate (EDR) is proposed. This classifier uses very few search points to predict whether the global minimum is far from or near the center of the search window. If it is far away, a search pattern that is good at searching large motions is used. Otherwise, a pattern that is good at searching small motions is applied. The proposed search patterns switching (SPS) algorithm performs well for all kinds of video content.

14 citations


Proceedings ArticleDOI
02 Jul 2007
TL;DR: A bit rate estimation technique is proposed that avoids entropy coding during intra prediction mode decision in H.264/AVC and reduces intra coding time by up to 53% with negligible degradation of coding performance.
Abstract: H.264/AVC is the newest international video coding standard and can achieve considerably higher coding efficiency than previous standards. This comes at the cost of a complex mode decision procedure using rate-distortion optimization, which makes real-time encoding difficult. To reduce the complexity of the rate-distortion cost computation, we propose a bit rate estimation technique that avoids entropy coding during mode decision for intra prediction. The estimation method is based on the properties of context-based adaptive variable length coding (CAVLC). Simulation results demonstrate that the proposed estimation method reduces the encoding time of intra coding by up to 53% with negligible degradation of coding performance.

11 citations


Book ChapterDOI
11 Dec 2007
TL;DR: A novel fast algorithm based on structural similarity (SSIM) in the motion estimation (ME) process (FMEBSS) greatly reduces the complexity of ME by eliminating unnecessary search positions and reducing complex prediction modes.
Abstract: H.264 achieves considerably higher coding efficiency than previous video coding standards, but its complexity is significantly higher. This paper proposes a novel fast algorithm based on structural similarity (SSIM) in the motion estimation (ME) process (FMEBSS), which greatly reduces the complexity of ME by eliminating unnecessary search positions and reducing complex prediction modes. Simulation results demonstrate that the proposed method reduces the coding time by about 50% on average and improves the compression ratio at the same time, while the degradation in video quality is negligible.

Proceedings ArticleDOI
02 Jul 2007
TL;DR: A new adaptive vector quantization (AVQ) algorithm which achieves rate-distortion performance superior to that of the conventional AVQ algorithms using the full codeword updating (FCU) scheme and can be combined with transform coding and entropy coding for higher compression ratio.
Abstract: In this paper, we propose a new adaptive vector quantization (AVQ) algorithm based on rate-distortion optimization. This algorithm employs a new partial codeword updating (PCU) scheme which achieves rate-distortion performance superior to that of conventional AVQ algorithms using the full codeword updating (FCU) scheme. Instead of replacing the whole codeword, the PCU-AVQ updates only those components of the codeword whose quantization error is higher than an optimal threshold. Additionally, a mathematical relation between the Lagrangian multiplier and the approximate optimal threshold is derived to reduce the rate-distortion cost computation. The experimental results show that the proposed PCU-AVQ algorithm improves the rate-distortion performance without much computational complexity penalty. The PCU-AVQ can be combined with transform coding and entropy coding for a higher compression ratio, and it can be widely implemented in specific AVQ algorithms for image, video and speech coding.
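
The partial update rule described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: the per-component error is taken as an absolute difference, an updated component is simply replaced by the input value, and the threshold is passed in directly because its relation to the Lagrangian multiplier is not given. The function name `partial_codeword_update` is hypothetical.

```python
def partial_codeword_update(codeword, vector, threshold):
    """Update only the codeword components with a large quantization error.

    Returns the updated codeword and the indices that changed. Components
    whose absolute error is at or below the threshold are left untouched,
    in contrast to a full codeword update that replaces every component.
    """
    if len(codeword) != len(vector):
        raise ValueError("codeword and vector must have the same length")
    updated = list(codeword)
    changed = []
    for i, (c, x) in enumerate(zip(codeword, vector)):
        if abs(x - c) > threshold:  # assumed error measure
            updated[i] = x          # assumed update: copy the input value
            changed.append(i)
    return updated, changed
```

Only the changed indices and their new values would need to be sent to the decoder, which is where a rate saving over a full update could come from.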

Journal Article
TL;DR: A second version of the enhanced hexagon-based search (EHS) algorithm is proposed, with a new point-oriented inner search technique that further speeds up the original hexagon-based search (HS).
Abstract: Recently, an enhanced hexagon-based search (EHS) algorithm was proposed to speed up the original hexagon-based search (HS) by exploiting the group-distortion information of some evaluated points. In this paper, a second version of the EHS is proposed with a new point-oriented inner search technique which can further speed up the HS in both large and small motion environments. Experimental results show that the enhanced hexagon-based search version-2 (EHS2) is up to 34% faster than the HS with negligible PSNR degradation.

24 Jun 2007
TL;DR: Experimental results show that the enhanced hexagon-based search version-2 (EHS2) is up to 34% faster than the HS with negligible PSNR degradation.
Abstract: Recently, an enhanced hexagon-based search (EHS) algorithm was proposed to speed up the original hexagon-based search (HS) by exploiting the group-distortion information of some evaluated points. In this paper, a second version of the EHS is proposed with a new point-oriented inner search technique which can further speed up the HS in both large and small motion environments. Experimental results show that the enhanced hexagon-based search version-2 (EHS2) is up to 34% faster than the HS with negligible PSNR degradation.