Author

Susanto Rahardja

Bio: Susanto Rahardja is an academic researcher from Northwestern Polytechnical University. The author has contributed to research in topics: Data compression & Speech coding. The author has an h-index of 36 and has co-authored 324 publications receiving 5224 citations. Previous affiliations of Susanto Rahardja include Agency for Science, Technology and Research & Nanyang Technological University.


Papers
Journal ArticleDOI
TL;DR: Experimental results show that the fast intraprediction mode decision scheme increases the speed of intracoding significantly with negligible loss of peak signal-to-noise ratio.
Abstract: The H.264/AVC video coding standard aims to enable significantly improved compression performance compared to all existing video coding standards. In order to achieve this, a robust rate-distortion optimization (RDO) technique is employed to select the best coding mode and reference frame for each macroblock. As a result, the complexity and computation load increase drastically. This paper presents a fast mode decision algorithm for H.264/AVC intraprediction based on local edge information. Prior to intraprediction, an edge map is created and a local edge direction histogram is then established for each subblock. Based on the distribution of the edge direction histogram, only a small subset of the intraprediction modes is chosen for RDO calculation. Experimental results show that the fast intraprediction mode decision scheme increases the speed of intracoding significantly with negligible loss of peak signal-to-noise ratio.

485 citations
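
The edge-direction histogram idea in the abstract above can be illustrated with a minimal sketch. It assumes a Sobel operator for the edge map, four direction bins (0°, 45°, 90°, 135°) and an amplitude of |dx| + |dy|; these choices, the function names and the `keep` parameter are hypothetical illustrations, not the paper's exact mode table or thresholds.

```python
import math

def sobel(img, y, x):
    """Return the (dx, dy) Sobel responses at an interior pixel."""
    dx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
          - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
    dy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
          - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
    return dx, dy

def edge_histogram(img, n_bins=4):
    """Sum gradient amplitude per quantized edge direction (assumed binning)."""
    hist = [0.0] * n_bins
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            dx, dy = sobel(img, y, x)
            amp = abs(dx) + abs(dy)
            if amp == 0:
                continue  # flat pixel: contributes to no direction
            # The edge runs perpendicular to the gradient; fold into [0, pi).
            theta = (math.atan2(dy, dx) + math.pi / 2) % math.pi
            b = int(round(theta / (math.pi / n_bins))) % n_bins
            hist[b] += amp
    return hist

def candidate_directions(hist, keep=1):
    """Indices of the `keep` strongest bins; empty if the block is flat."""
    if not any(hist):
        return []
    order = sorted(range(len(hist)), key=lambda b: hist[b], reverse=True)
    return order[:keep]
```

With this binning, index 0 stands for a horizontal edge and index 2 for a vertical edge. An encoder would then run RDO only on the prediction modes associated with the returned directions, plus whatever fallback modes it always keeps.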

Journal ArticleDOI
TL;DR: A fast intermode decision algorithm that selects the best intercoding mode using the spatial homogeneity and temporal stationarity of video objects, reducing encoding time by 30% on average.
Abstract: The new video coding standard, H.264/MPEG-4 AVC, uses variable block sizes ranging from 4×4 to 16×16 in interframe coding. This new feature has achieved significant coding gain compared to coding a macroblock (MB) using a fixed block size. However, this feature results in extremely high computational complexity when a brute-force rate-distortion optimization (RDO) algorithm is used. This paper proposes a fast intermode decision algorithm to decide the best mode in intercoding. It makes use of the spatial homogeneity and the temporal stationarity characteristics of video objects. Specifically, spatial homogeneity of an MB is decided based on the MB's edge intensity, and temporal stationarity is decided by the difference between the current MB and its colocated counterpart in the reference frame. Based on the homogeneity and stationarity of the video objects, only a small number of intermodes are selected in the RDO process. The experimental results show that the fast intermode decision algorithm reduces encoding time by 30% on average, with a negligible peak signal-to-noise ratio loss of 0.03 dB or, equivalently, a bit rate increment of 0.6%.

314 citations
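
The homogeneity and stationarity tests in the abstract above can be sketched as follows. The edge-intensity measure (largest neighbouring-pixel difference), both thresholds and the mapping from test results to candidate modes are assumptions made for illustration only; the abstract does not give the paper's actual values or mode table.

```python
ALL_INTER_MODES = ["SKIP", "16x16", "16x8", "8x16", "8x8", "8x4", "4x8", "4x4"]

def edge_intensity(block):
    """Largest absolute difference between adjacent pixels (assumed measure)."""
    h, w = len(block), len(block[0])
    best = 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                best = max(best, abs(block[y][x + 1] - block[y][x]))
            if y + 1 < h:
                best = max(best, abs(block[y + 1][x] - block[y][x]))
    return best

def mean_abs_diff(cur, ref):
    """Mean absolute difference between a block and its colocated block."""
    total = sum(abs(c - r)
                for row_c, row_r in zip(cur, ref)
                for c, r in zip(row_c, row_r))
    return total / (len(cur) * len(cur[0]))

def candidate_intermodes(cur, ref, edge_thr=8, diff_thr=2.0):
    """Return the intermodes to pass to RDO (hypothetical thresholds)."""
    homogeneous = edge_intensity(cur) < edge_thr
    stationary = mean_abs_diff(cur, ref) < diff_thr
    if homogeneous or stationary:
        # Assumed mapping: smooth or static blocks only try large partitions.
        return ["SKIP", "16x16"]
    return list(ALL_INTER_MODES)
```

A flat macroblock that matches its colocated block yields only the two large-partition candidates, while a textured, changing block falls back to the full mode list.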

Journal ArticleDOI
TL;DR: Evaluation results show that the proposed β-order minimum mean-square error speech enhancement approach can achieve a more significant noise reduction and a better spectral estimation of weak speech spectral components from a noisy signal as compared to many existing speech enhancement algorithms.
Abstract: This paper proposes a β-order minimum mean-square error (MMSE) speech enhancement approach for estimating the short time spectral amplitude (STSA) of a speech signal. We analyze the characteristics of the β-order STSA MMSE estimator and the relation between the value of β and the spectral amplitude gain function of the MMSE method. We further investigate the effectiveness of a range of fixed-β values in estimating STSA based on the MMSE criterion, and discuss how the β value could be adapted using the frame signal-to-noise ratio (SNR). The performance of the proposed speech enhancement approach is then evaluated through spectrogram inspection, objective speech distortion measures and subjective listening tests using several types of noise sources from the NOISEX-92 database. Evaluation results show that our approach can achieve a more significant noise reduction and a better spectral estimation of weak speech spectral components from a noisy signal as compared to many existing speech enhancement algorithms.

127 citations

Journal ArticleDOI
TL;DR: A new quadratic optimization-based method to extract fine details from a vector field that can enhance fine details to produce sharper images is introduced.
Abstract: In a typical processing chain of image enhancement, an exposure fusion scheme can be used to synthesize a more detailed low dynamic range (LDR) image directly from a set of differently exposed LDR images, without generation of an intermediate high dynamic range image. In this brief, we introduce a new quadratic optimization-based method to extract fine details from a vector field. The new method extracts fine details from a set of differently exposed LDR images simultaneously. The extracted fine details are then added to an intermediate LDR image which is fused by simply using an existing exposure fusion scheme. With this, the proposed scheme can enhance fine details to produce sharper images.

124 citations

Proceedings ArticleDOI
24 Oct 2004
TL;DR: A rate control scheme for H.264 is presented by introducing the concept of basic unit and a linear prediction model that is used to solve the chicken and egg dilemma existing in the rate control of H.264.
Abstract: This paper presents a rate control scheme for H.264 by introducing the concept of basic unit and a linear prediction model. The basic unit can be a macroblock (MB), a slice, or a frame. It can be used to obtain a trade-off between the overall coding efficiency and the bits fluctuation. The linear model is used to solve the chicken and egg dilemma existing in the rate control of H.264. Both constant bit rate (CBR) and variable bit rate (VBR) cases are studied. Our scheme has been adopted by H.264.

119 citations
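
The abstract above does not spell out the linear model. One common reading, assumed here, is that the complexity of the current basic unit (its mean absolute difference, MAD) is predicted from the MAD of the colocated basic unit in the previous frame, so that a quantization parameter can be chosen before the current unit has been encoded. The sketch below fits such a model by ordinary least squares; the function names and the fallback coefficients are hypothetical.

```python
def fit_linear(prev_mads, actual_mads):
    """Least-squares fit of actual ≈ a1 * prev + a2; returns (a1, a2)."""
    n = len(prev_mads)
    mean_x = sum(prev_mads) / n
    mean_y = sum(actual_mads) / n
    var_x = sum((x - mean_x) ** 2 for x in prev_mads)
    if var_x == 0:
        # Degenerate history: fall back to "same as previous" (assumption).
        return 1.0, 0.0
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(prev_mads, actual_mads))
    a1 = cov / var_x
    return a1, mean_y - a1 * mean_x

def predict_mad(prev_mad, a1, a2):
    """Predict the current basic unit's MAD from the colocated previous one."""
    return a1 * prev_mad + a2
```

In use, the encoder would refit `a1` and `a2` after each basic unit is coded, using the pairs of previous and actual MAD values observed so far, and feed the prediction into its rate-quantization model.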


Cited by
01 Mar 2001
TL;DR: Using singular value decomposition in transforming genome-wide expression data from genes × arrays space to reduced diagonalized "eigengenes" × "eigenarrays" space gives a global picture of the dynamics of gene expression, in which individual genes and arrays appear to be classified into groups of similar regulation and function, or similar cellular state and biological phenotype.
Abstract: We describe the use of singular value decomposition in transforming genome-wide expression data from genes × arrays space to reduced diagonalized "eigengenes" × "eigenarrays" space, where the eigengenes (or eigenarrays) are unique orthonormal superpositions of the genes (or arrays). Normalizing the data by filtering out the eigengenes (and eigenarrays) that are inferred to represent noise or experimental artifacts enables meaningful comparison of the expression of different genes across different arrays in different experiments. Sorting the data according to the eigengenes and eigenarrays gives a global picture of the dynamics of gene expression, in which individual genes and arrays appear to be classified into groups of similar regulation and function, or similar cellular state and biological phenotype, respectively. After normalization and sorting, the significant eigengenes and eigenarrays can be associated with observed genome-wide effects of regulators, or with measured samples, in which these regulators are overactive or underactive, respectively.

1,815 citations

Journal ArticleDOI
01 Oct 1980

1,565 citations

Book
01 Jan 1996
TL;DR: This text is the first to combine the study of neural networks and fuzzy systems, their basics and their use, along with symbolic AI methods to build comprehensive artificial intelligence systems.
Abstract: From the Publisher: "Covering the latest issues and achievements, this well documented, precisely presented text is timely and suitable for graduate and upper undergraduate students in knowledge engineering, intelligent systems, AI, neural networks, fuzzy systems, and related areas. The author's goal is to explain the principles of neural networks and fuzzy systems and to demonstrate how they can be applied to building knowledge-based systems for problem solving. Especially useful are the comparisons between different techniques (AI rule-based methods, fuzzy methods, connectionist methods, hybrid systems) used to solve the same or similar problems." -- Anca Ralescu, Associate Professor of Computer Science, University of Cincinnati. Neural networks and fuzzy systems are different approaches to introducing human-like reasoning into expert systems. This text is the first to combine the study of these two subjects, their basics and their use, along with symbolic AI methods to build comprehensive artificial intelligence systems. In a clear and accessible style, Kasabov describes rule-based and connectionist techniques and then their combinations, with fuzzy logic included, showing the application of the different techniques to a set of simple prototype problems, which makes comparisons possible. A particularly strong feature of the text is that it is filled with applications in engineering, business, and finance. AI problems that cover most of the application-oriented research in the field (pattern recognition, speech and image processing, classification, planning, optimization, prediction, control, decision making, and game simulations) are discussed and illustrated with concrete examples.
Intended both as a text for advanced undergraduate and postgraduate students as well as a reference for researchers in the field of knowledge engineering, Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering has chapters structured for various levels of teaching and includes original work by the author along with the classic material. Data sets for the examples in the book as well as an integrated software environment that can be used to solve the problems and do the exercises at the end of each chapter are available free through anonymous ftp.

977 citations

Journal ArticleDOI
TL;DR: A systematic, comprehensive and up-to-date review of perceptual visual quality metrics (PVQMs) to predict picture quality according to human perception.

895 citations

Journal ArticleDOI
TL;DR: The proposed opinion-unaware BIQA method needs neither distorted sample images nor subjective quality scores for training, yet extensive experiments demonstrate its superior quality-prediction performance to the state-of-the-art opinion-aware BIQA methods.
Abstract: Existing blind image quality assessment (BIQA) methods are mostly opinion-aware. They learn regression models from training images with associated human subjective scores to predict the perceptual quality of test images. Such opinion-aware methods, however, require a large number of training samples with associated human subjective scores and covering a variety of distortion types. The BIQA models learned by opinion-aware methods often have weak generalization capability, thereby limiting their usability in practice. By comparison, opinion-unaware methods do not need human subjective scores for training, and thus have greater potential for good generalization capability. Unfortunately, thus far no opinion-unaware BIQA method has shown consistently better quality prediction accuracy than the opinion-aware methods. Here, we aim to develop an opinion-unaware BIQA method that can compete with, and perhaps outperform, the existing opinion-aware methods. By integrating the features of natural image statistics derived from multiple cues, we learn a multivariate Gaussian model of image patches from a collection of pristine natural images. Using the learned multivariate Gaussian model, a Bhattacharyya-like distance is used to measure the quality of each image patch, and then an overall quality score is obtained by average pooling. The proposed BIQA method needs neither distorted sample images nor subjective quality scores for training, yet extensive experiments demonstrate its superior quality-prediction performance to the state-of-the-art opinion-aware BIQA methods. The MATLAB source code of our algorithm is publicly available at www.comp.polyu.edu.hk/~cslzhang/IQA/ILNIQE/ILNIQE.htm.

783 citations