Author

C.-C.J. Kuo

Bio: C.-C.J. Kuo is an academic researcher at the University of Southern California. The author has contributed to research on topics including fading and encoders, has an h-index of 35, and has co-authored 197 publications receiving 5,955 citations.


Papers
Journal ArticleDOI
TL;DR: A progressive texture classification algorithm is developed that is not only computationally attractive but also has excellent performance, and its performance is compared with that of several other methods.
Abstract: A multiresolution approach based on a modified wavelet transform called the tree-structured wavelet transform or wavelet packets is proposed. The development of this transform is motivated by the observation that a large class of natural textures can be modeled as quasi-periodic signals whose dominant frequencies are located in the middle frequency channels. With the transform, it is possible to zoom into any desired frequency channels for further decomposition. In contrast, the conventional pyramid-structured wavelet transform performs further decomposition in low-frequency channels. A progressive texture classification algorithm which is not only computationally attractive but also has excellent performance is developed. The performance of the present method is compared with that of several other methods.
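As a rough illustration of the idea (not the paper's exact algorithm), the sketch below uses the PyWavelets library to decompose an image recursively, descending only into subbands that carry a significant share of their parent's energy; the wavelet choice and energy threshold are illustrative assumptions.

```python
# Sketch of a tree-structured wavelet transform: unlike the pyramid
# transform, which always descends into the low-frequency band, we
# decompose any subband whose energy share exceeds a threshold.
import numpy as np
import pywt

def tree_structured_wt(image, wavelet="db4", max_level=3, energy_ratio=0.2):
    """Return a dict mapping channel paths (e.g. 'a', 'hd') to subbands."""
    leaves = {}

    def decompose(band, path, level):
        if level == max_level:
            leaves[path] = band
            return
        cA, (cH, cV, cD) = pywt.dwt2(band, wavelet)
        children = {"a": cA, "h": cH, "v": cV, "d": cD}
        energies = {k: np.mean(v ** 2) for k, v in children.items()}
        total = sum(energies.values())
        for k, sub in children.items():
            # Zoom into a channel only if it carries a significant
            # share of the parent's energy; otherwise keep it as a leaf.
            if energies[k] >= energy_ratio * total:
                decompose(sub, path + k, level + 1)
            else:
                leaves[path + k] = sub

    decompose(np.asarray(image, dtype=float), "", 0)
    return leaves
```

The per-leaf energies (or other statistics) of the returned subbands can then serve as the texture feature vector for classification.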

1,507 citations

Journal ArticleDOI
TL;DR: The main idea is to effectively exploit the information obtained from the corresponding block at a coarser resolution level and spatio-temporal neighboring blocks at the same level in order to select a good set of initial MV candidates and then perform further local search to refine the MV result.
Abstract: We propose a new fast algorithm for block motion vector (MV) estimation based on the correlations of the MVs existing in spatially and temporally adjacent as well as hierarchically related blocks. We first establish a basic framework by introducing new algorithms based on spatial correlation and then spatio-temporal correlations before integrating them with a multiresolution scheme for the ultimate algorithm. The main idea is to effectively exploit the information obtained from the corresponding block at a coarser resolution level and spatio-temporal neighboring blocks at the same level in order to select a good set of initial MV candidates and then perform further local search to refine the MV result. We show with experimental results that, in comparison with the full search algorithm, the proposed algorithm achieves a speed-up factor ranging from 150 to 310 with only 2-7% mean square error (MSE) increase and a similar rate-distortion performance when applied to typical test video sequences.
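The following sketch illustrates the general candidate-plus-refinement strategy in Python with NumPy; the candidate set and search parameters are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of candidate-based motion estimation: seed the search with MVs
# from neighboring blocks, keep the best by SAD, then refine locally.
import numpy as np

def sad(cur, ref, bx, by, mv, bs):
    """SAD between the current block and a shifted reference block;
    infinite cost if the candidate falls outside the frame."""
    dx, dy = mv
    x, y = bx + dx, by + dy
    h, w = ref.shape
    if x < 0 or y < 0 or x + bs > w or y + bs > h:
        return np.inf
    a = cur[by:by + bs, bx:bx + bs].astype(np.int64)
    b = ref[y:y + bs, x:x + bs].astype(np.int64)
    return np.abs(a - b).sum()

def block_mv(cur, ref, bx, by, bs, candidates, refine=1):
    """Evaluate candidate MVs (e.g. from spatial neighbors, the co-located
    block of the previous frame, and the coarser resolution level), keep
    the best, then refine with a small +/-refine local search."""
    best_mv, best_cost = (0, 0), sad(cur, ref, bx, by, (0, 0), bs)
    for mv in candidates:
        c = sad(cur, ref, bx, by, mv, bs)
        if c < best_cost:
            best_mv, best_cost = mv, c
    cx, cy = best_mv
    for dy in range(-refine, refine + 1):
        for dx in range(-refine, refine + 1):
            mv = (cx + dx, cy + dy)
            c = sad(cur, ref, bx, by, mv, bs)
            if c < best_cost:
                best_mv, best_cost = mv, c
    return best_mv
```

Because the candidates are usually close to the true motion, only a tiny local search remains, which is where the large speed-up over full search comes from.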

279 citations

Journal ArticleDOI
TL;DR: A new rate control scheme for H.264 video encoding with enhanced rate and distortion models is proposed, and experimental results show that the new algorithm controls bit rates accurately, with R-D performance significantly better than that of the rate control algorithm implemented in the H.264 software encoder JM8.1a.
Abstract: A new rate control scheme for H.264 video encoding with enhanced rate and distortion models is proposed in this work. Compared with existing H.264 rate control schemes, our scheme offers several new features. First, the inter-dependency between rate-distortion optimization (RDO) and rate control in H.264 is resolved via quantization parameter estimation and update. Second, since header bits may occupy a large portion of the total bit budget, especially at low bit rates, a rate model for the header information is developed to estimate header bits more accurately. The number of header bits is modeled as a function of the number of nonzero motion vector (MV) elements and the number of MVs. Third, a new source rate model and a distortion model are proposed. For this purpose, coded 4×4 blocks are identified, and the number of source bits and the distortion are modeled as functions of the quantization stepsize and the complexity of coded 4×4 blocks. Finally, an R-D optimized bit allocation scheme among macroblocks (MBs) is proposed to improve picture quality. Built upon the above ideas, a rate control algorithm is developed for the H.264 baseline-profile encoder under the constant bit rate constraint. It is shown by experimental results that the new algorithm can control bit rates accurately, with R-D performance significantly better than that of the rate control algorithm implemented in the H.264 software encoder JM8.1a.
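A hedged sketch of this style of model: a quadratic source-rate model R(Qstep) = c1·X/Qstep + c2·X/Qstep², with X a complexity measure such as the MAD of residuals, plus a linear header-bits term. All coefficients, and the exact form of the header model, are illustrative assumptions in the spirit of the paper rather than its actual formulation.

```python
# Sketch: solve a quadratic R-Q model for Qstep, then map Qstep to an
# H.264 QP using the standard approximation Qstep ~ 2^((QP - 4) / 6).
import math

def header_bits_model(nz_mv_elems, n_mvs, alpha=8.0, beta=12.0):
    """Header bits modeled as a function of the number of nonzero MV
    elements and the number of MVs (coefficients are illustrative)."""
    return alpha * nz_mv_elems + beta * n_mvs

def estimate_qp(target_bits, complexity, header_bits,
                c1=1.0, c2=5.0, qp_min=0, qp_max=51):
    """Allocate texture bits, solve c1*X/q + c2*X/q^2 = R for q = Qstep."""
    texture_bits = max(target_bits - header_bits, 1.0)
    # Quadratic in u = 1/q: (c2*X) u^2 + (c1*X) u - R = 0; take positive root.
    a, b, c = c2 * complexity, c1 * complexity, -texture_bits
    u = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)
    qstep = 1.0 / u
    qp = round(4 + 6 * math.log2(qstep))
    return min(max(qp, qp_min), qp_max)
```

In a full rate controller, the coefficients c1 and c2 would be re-estimated after each coded frame from the observed bits, stepsizes, and complexities.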

255 citations

Journal ArticleDOI
TL;DR: A novel, practical, low-complexity multicell orthogonal frequency-division multiple access (OFDMA) downlink channel-assignment method based on a graphic framework is proposed; it can be used in next-generation cellular systems such as 3GPP Long-Term Evolution and IEEE 802.16m.
Abstract: A novel practical low-complexity multicell orthogonal frequency-division multiple access (OFDMA) downlink channel-assignment method that uses a graphic framework is proposed in this paper. Our solution consists of two phases: 1) a coarse-scale intercell interference (ICI) management scheme and 2) a fine-scale channel-aware resource-allocation scheme. In the first phase, state-of-the-art ICI management techniques such as ICI coordination (ICIC) and base-station cooperation (BSC) are incorporated in our framework. In particular, the ICI information is acquired through inference from the diversity set of mobile stations and is represented by an interference graph. Then, ICIC or BSC is mapped to the MAX k-CUT problem in graph theory and is solved in the first phase. In the second phase, channel assignment is accomplished by taking instantaneous channel conditions into account. Heuristic algorithms are proposed to efficiently solve both phases of the problem. Extensive simulation is conducted for various practical scenarios to demonstrate the superior performance of the proposed solution compared with the conventional OFDMA allocation scheme. The proposed scheme can be used in next-generation cellular systems such as the 3GPP Long-Term Evolution and IEEE 802.16m.
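As an illustration of the MAX k-CUT step, the sketch below runs a simple greedy local search that repeatedly moves each node to the partition minimizing its uncut (same-partition) interference weight, which is equivalent to maximizing the cut; the graph representation and weights are illustrative assumptions.

```python
# Sketch: greedy local search for MAX k-CUT on an interference graph.
# Nodes are entities to be assigned to one of k resource partitions;
# edge weights encode pairwise interference to be "cut" apart.
import random

def greedy_max_k_cut(nodes, edges, k, iters=100, seed=0):
    """edges: dict mapping (u, v) -> interference weight.
    Returns a dict mapping each node to a partition index in [0, k)."""
    rng = random.Random(seed)
    color = {v: rng.randrange(k) for v in nodes}
    adj = {v: [] for v in nodes}
    for (u, v), w in edges.items():
        adj[u].append((v, w))
        adj[v].append((u, w))
    for _ in range(iters):
        changed = False
        for v in nodes:
            # Uncut (same-partition) weight for each candidate partition.
            load = [0.0] * k
            for u, w in adj[v]:
                load[color[u]] += w
            best = min(range(k), key=lambda c: load[c])
            if load[best] < load[color[v]]:
                color[v] = best
                changed = True
        if not changed:
            break
    return color
```

Each move strictly reduces the total uncut weight, so the search terminates at a local optimum; the fine-scale, channel-aware assignment would then run within each partition.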

172 citations

Proceedings ArticleDOI
15 Mar 1999
TL;DR: It is shown that the proposed system achieves an accuracy higher than 90% for coarse-level audio classification, and query-by-example audio retrieval is implemented, in which sounds similar to an input sample can be retrieved.

Abstract: A hierarchical system for audio classification and retrieval based on audio content analysis is presented in this paper. The system consists of three stages. The first stage is coarse-level audio classification and segmentation, where audio recordings are classified and segmented into speech, music, several types of environmental sounds, and silence, based on morphological and statistical analysis of the temporal curves of short-time features of the audio signal. In the second stage, environmental sounds are further classified into finer classes such as applause, rain, and bird sounds. This fine-level classification is based on time-frequency analysis of audio signals and the use of a hidden Markov model (HMM) for classification. In the third stage, query-by-example audio retrieval is implemented, where sounds similar to an input sample can be retrieved. It is shown that the proposed system achieves an accuracy higher than 90% for coarse-level audio classification. Examples of fine-level audio classification and audio retrieval are also provided.
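The sketch below illustrates the flavor of the coarse-level stage: frame-wise short-time energy and zero-crossing rate, with simple thresholds separating silence, speech-like, and music-like frames. The thresholds and decision rules are illustrative assumptions, not the paper's actual classifier.

```python
# Sketch: short-time features for coarse audio segmentation.
import numpy as np

def short_time_features(x, sr, frame_ms=20):
    """Frame-wise energy and zero-crossing rate of a mono signal x."""
    n = int(sr * frame_ms / 1000)
    frames = x[: len(x) // n * n].reshape(-1, n)
    energy = np.mean(frames ** 2, axis=1)
    # Fraction of adjacent samples whose sign changes, per frame.
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    return energy, zcr

def coarse_label(energy, zcr, e_sil=1e-4, z_speech=0.1):
    """Toy thresholds: low energy -> silence; high ZCR -> speech-like."""
    labels = []
    for e, z in zip(energy, zcr):
        if e < e_sil:
            labels.append("silence")
        elif z > z_speech:
            labels.append("speech")  # speech tends to have high, fluctuating ZCR
        else:
            labels.append("music")   # music tends to have steadier, lower ZCR
    return labels
```

In the actual system, the morphology of these temporal curves (not just per-frame thresholds) drives the classification, and HMMs over time-frequency features handle the fine-level classes.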

170 citations


Cited by
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories.

First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules.

Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, handwriting recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs.

Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules.

Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically.

Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).
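As a toy illustration of the mail-filtering example, the following sketch learns a user's keep/reject decisions from a handful of labeled messages using a bag-of-words Naive Bayes classifier from scikit-learn; the messages and labels are, of course, made up.

```python
# Sketch: a per-user mail filter learned from accept/reject feedback.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

messages = ["win a free prize now", "meeting moved to 3pm",
            "cheap loans click here", "lunch tomorrow?"]
labels = ["reject", "keep", "reject", "keep"]

# Bag-of-words counts feeding a multinomial Naive Bayes model.
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(messages, labels)
print(clf.predict(["free prize meeting"]))
```

As the user keeps marking messages, the filter can simply be refit on the growing labeled set, keeping the rules up-to-date without any hand programming.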

13,246 citations

Journal ArticleDOI
TL;DR: Comparisons with other multiresolution texture features using the Brodatz texture database indicate that the Gabor features provide the best pattern retrieval accuracy.
Abstract: Image content-based retrieval is emerging as an important research area with applications to digital libraries and multimedia databases. The focus of this paper is on the image processing aspects, and in particular on using texture information for browsing and retrieval of large image data. We propose the use of Gabor wavelet features for texture analysis and provide a comprehensive experimental evaluation. Comparisons with other multiresolution texture features using the Brodatz texture database indicate that the Gabor features provide the best pattern retrieval accuracy. An application to browsing large air photos is illustrated.
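A minimal sketch of Gabor texture features in this spirit, using scikit-image: filter the image with a small bank of Gabor filters over a few frequencies and orientations, and take the mean and standard deviation of each response magnitude as the feature vector. The bank parameters are illustrative, not the paper's tuned values.

```python
# Sketch: Gabor filter-bank texture features (mean/std of magnitudes).
import numpy as np
from skimage.filters import gabor

def gabor_features(image, frequencies=(0.1, 0.2, 0.4), n_orient=4):
    feats = []
    for f in frequencies:
        for i in range(n_orient):
            theta = i * np.pi / n_orient
            # Real and imaginary responses of one Gabor channel.
            real, imag = gabor(image, frequency=f, theta=theta)
            mag = np.hypot(real, imag)
            feats += [mag.mean(), mag.std()]  # two stats per channel
    return np.asarray(feats)
```

Retrieval then reduces to nearest-neighbor search on these feature vectors, typically with per-dimension normalization so no single channel dominates the distance.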

4,017 citations

Journal ArticleDOI
TL;DR: The survey includes 100+ papers covering the research aspects of image feature representation and extraction, multidimensional indexing, and system design, three of the fundamental bases of content-based image retrieval.

2,197 citations

Journal ArticleDOI
TL;DR: A relevance feedback based interactive retrieval approach that effectively takes into account the subjectivity of human perception of visual content and the gap between high-level concepts and low-level features in CBIR.
Abstract: Content-based image retrieval (CBIR) has become one of the most active research areas in the past few years. Many visual feature representations have been explored and many systems built. While these research efforts establish the basis of CBIR, the usefulness of the proposed approaches is limited. Specifically, these efforts have relatively ignored two distinct characteristics of CBIR systems: (1) the gap between high-level concepts and low-level features, and (2) the subjectivity of human perception of visual content. This paper proposes a relevance feedback based interactive retrieval approach, which effectively takes into account the above two characteristics in CBIR. During the retrieval process, the user's high-level query and perception subjectivity are captured by dynamically updated weights based on the user's feedback. The experimental results over more than 70,000 images show that the proposed approach greatly reduces the user's effort in composing a query and captures the user's information need more precisely.
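One classic way to realize such dynamically updated weights, offered here as an illustration rather than the paper's exact update rule, is inverse-variance reweighting: feature dimensions on which the user's relevant examples agree (low variance) receive high weight in the next round's distance computation.

```python
# Sketch: relevance-feedback reweighting and weighted re-ranking.
import numpy as np

def update_weights(relevant_feats, eps=1e-6):
    """relevant_feats: (n_relevant, n_dims) array of feature vectors
    for images the user marked as relevant."""
    sigma = relevant_feats.std(axis=0)
    w = 1.0 / (sigma + eps)   # agreement (low variance) -> high weight
    return w / w.sum()        # normalize to sum to 1

def weighted_rank(db_feats, query, w):
    """Rank database images by weighted Euclidean distance to the query."""
    d = np.sqrt(((db_feats - query) ** 2 * w).sum(axis=1))
    return np.argsort(d)      # indices of nearest images first
```

Each feedback round tightens the weights around what the user actually cares about, which is how the system narrows the gap between low-level features and the user's high-level intent.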

1,933 citations