Author

Shiuh-Ku Weng

Other affiliations: National Defense University
Bio: Shiuh-Ku Weng is an academic researcher from the United States Naval Academy. The author has contributed to research in topics: Search algorithm & Motion estimation. The author has an h-index of 2 and has co-authored 7 publications receiving 322 citations. Previous affiliations of Shiuh-Ku Weng include National Defense University.

Papers
Journal Article
TL;DR: The proposed method robustly tracks the moving object across consecutive frames in complex real-world situations, such as total or partial occlusion of the moving object by other objects, fast motion, changing lighting, changes in the object's direction and orientation, and sudden changes in its velocity.

314 citations

Journal Article
TL;DR: This paper presents a novel motion estimate scheme, called correlation search, which attempts to find the highest motion-correlation neighbor block from the spatial and temporal neighbor blocks.
Abstract: Block-matching motion estimation plays an important role in video coding. In general, there exists a high motion correlation between neighbor blocks in spatial and temporal directions. This paper presents a novel motion estimate scheme, called correlation search, which attempts to find the highest motion-correlation neighbor block from the spatial and temporal neighbor blocks. The motion vector of the highest motion-correlation neighbor block is regarded as the motion estimate of the current block. The correlation search scheme can be based on any existing block matching algorithms. When the scheme is based on a full-search algorithm, it achieves almost the same estimate accuracy with a significant reduction of computational complexity. If the scheme is based on a fast search algorithm, it obtains better estimate accuracy in addition to the improvements in computation.
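The idea of reusing a neighbor's motion vector can be sketched roughly as follows. This is only an illustration, not the authors' implementation: it assumes that "highest motion correlation" is measured by the lowest sum of absolute differences (SAD), that frames are plain 2-D lists of intensities, and that the zero vector serves as a fallback. The names `sad` and `best_neighbor_mv` are hypothetical.

```python
def sad(cur, ref, bx, by, mv, n):
    """SAD between the n x n block of `cur` at column bx, row by and the
    block of `ref` displaced by mv = (dx, dy).
    Returns None when the displaced block leaves the reference frame."""
    dx, dy = mv
    h, w = len(ref), len(ref[0])
    if not (0 <= bx + dx and bx + dx + n <= w and 0 <= by + dy and by + dy + n <= h):
        return None
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_neighbor_mv(cur, ref, bx, by, n, candidate_mvs):
    """Among the motion vectors of spatial and temporal neighbor blocks,
    return the one that matches the current block best (assumption: lowest
    SAD), together with its cost. The zero vector is the starting fallback."""
    best_mv, best_cost = (0, 0), sad(cur, ref, bx, by, (0, 0), n)
    for mv in candidate_mvs:
        cost = sad(cur, ref, bx, by, mv, n)
        if cost is not None and cost < best_cost:
            best_mv, best_cost = mv, cost
    return best_mv, best_cost
```

Here `candidate_mvs` would hold the vectors already estimated for neighboring blocks, for example the left and upper blocks in the current frame and the co-located block in the previous frame. Only a handful of candidates are evaluated per block, which is where the reduction in computation would come from.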

37 citations

Journal Article
TL;DR: A new block-based motion estimation algorithm for image sequence coding that exploits the motion correlation in two dimensions in a simple manner and achieves a significant reduction in computation with better performance as compared with the conventional full search algorithm.
Abstract: A new block-based motion estimation algorithm for image sequence coding is proposed. The motion vectors between neighboring blocks are highly correlated, and the correlation exists in two dimensions in general. We model the 2-D motion correlation with two separate 1-D models in the horizontal and vertical directions, respectively. The 1-D Kalman filtering is then used to obtain the estimates in the two directions. By linearly combining the two estimates, we develop a motion estimation algorithm that exploits the motion correlation in two dimensions in a simple manner. The algorithm overcomes the difficulty of the general 2-D Kalman filter, which is very complicated and computationally intensive. The results indicate that the proposed algorithm achieves a significant reduction in computation with better performance as compared with the conventional full search algorithm.
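As a rough illustration of the two-direction idea, the sketch below runs one scalar Kalman step per direction and combines the results linearly. It assumes a first-order autoregressive state model with coefficient `a`, process noise variance `q`, and measurement noise variance `r`; these parameters, the equal default weighting, and the function names are assumptions for illustration and are not taken from the paper.

```python
def kalman_1d_step(x_prev, p_prev, z, a=1.0, q=0.5, r=1.0):
    """One predict/update cycle of a scalar Kalman filter.

    x_prev, p_prev: estimate and error variance carried over from the
                    neighboring block (left neighbor for the horizontal
                    model, upper neighbor for the vertical model).
    z:              measured motion-vector component for the current block.
    """
    x_pred = a * x_prev                # state prediction
    p_pred = a * a * p_prev + q        # predicted error variance
    k = p_pred / (p_pred + r)          # Kalman gain
    x_new = x_pred + k * (z - x_pred)  # corrected estimate
    p_new = (1.0 - k) * p_pred         # corrected error variance
    return x_new, p_new


def combine(est_h, est_v, weight_h=0.5):
    """Linear combination of the horizontal and vertical 1-D estimates."""
    return weight_h * est_h + (1.0 - weight_h) * est_v
```

Each motion-vector component would be filtered once along the row and once along the column, and the two outputs passed to `combine`. Every step is scalar arithmetic, which avoids the matrix operations a general 2-D Kalman filter would need.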

2 citations

Journal Article
TL;DR: An efficient image compression technique using block grouping, where similar blocks in an image are merged into a group in a recursive manner, which improves the compression efficiency significantly compared with the JPEG-like algorithm.
Abstract: An efficient image compression technique using block grouping is presented. The similar blocks in an image are merged into a group in a recursive manner. The center of a group is then encoded with the Joint Photographic Experts Group (JPEG)-like algorithm. Simulation results indicate that the proposed algorithm improves the compression efficiency significantly, compared with the JPEG-like algorithm. © 1997 Society of Photo-Optical Instrumentation Engineers. (S0091-3286(97)03108-5)
Proceedings Article
15 Oct 1996
TL;DR: Simulation results indicate that the proposed DCT-based intraframe coding algorithm improves the compression efficiency significantly at low bit rate, as compared to JPEG.
Abstract: This paper presents a new DCT-based intraframe coding algorithm. The algorithm exploits the interblock correlation by grouping the similar blocks in an image into a category. The center of a category is then encoded with the JPEG algorithm. Simulation results indicate that the proposed algorithm improves the compression efficiency significantly at low bit rate, as compared to JPEG.

Cited by
Journal Article
TL;DR: The search speed of the proposed ARPS-ZMP is about two to three times faster than that of the diamond search (DS), and the method even achieves higher peak signal-to-noise ratio (PSNR) particularly for those video sequences containing large and/or complex motion contents.
Abstract: We propose a novel and simple fast block-matching algorithm (BMA), called adaptive rood pattern search (ARPS), which consists of two sequential search stages: (1) initial search and (2) refined local search. For each macroblock (MB), the initial search is performed only once at the beginning in order to find a good starting point for the follow-up refined local search. By doing so, unnecessary intermediate search and the risk of being trapped into local minimum matching error points could be greatly reduced in long search case. For the initial search stage, an adaptive rood pattern (ARP) is proposed, and the ARP's size is dynamically determined for each MB, based on the available motion vectors (MVs) of the neighboring MBs. In the refined local search stage, a unit-size rood pattern (URP) is exploited repeatedly, and unrestrictedly, until the final MV is found. To further speed up the search, zero-motion prejudgment (ZMP) is incorporated in our method, which is particularly beneficial to those video sequences containing small motion contents. Extensive experiments conducted based on the MPEG-4 Verification Model (VM) encoding platform show that the search speed of our proposed ARPS-ZMP is about two to three times faster than that of the diamond search (DS), and our method even achieves higher peak signal-to-noise ratio (PSNR) particularly for those video sequences containing large and/or complex motion contents.
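The two search stages and the zero-motion prejudgment described above can be sketched as follows. This is a simplified illustration, not the published implementation: the matching error is supplied as a caller-defined function `cost(mv)`, the arm length is assumed to be the larger absolute component of the predicted motion vector, and a default arm of 2 is assumed when no neighbor vector is available. The function and parameter names are hypothetical.

```python
def arps_search(cost, predicted_mv=None, zmp_threshold=None):
    """Sketch of an adaptive rood pattern search.

    cost(mv) returns the matching error for candidate vector mv = (dx, dy);
    it should return a large value for candidates outside the search window.
    """
    center = (0, 0)
    best_cost = cost(center)

    # Zero-motion prejudgment: stop early if the zero vector matches well.
    if zmp_threshold is not None and best_cost < zmp_threshold:
        return center, best_cost

    # Initial search: rood pattern whose arm length adapts to the
    # predicted vector (assumed rule), plus the predicted vector itself.
    if predicted_mv is None:
        arm, candidates = 2, []
    else:
        arm = max(abs(predicted_mv[0]), abs(predicted_mv[1]))
        candidates = [predicted_mv]
    if arm > 0:
        candidates += [(arm, 0), (-arm, 0), (0, arm), (0, -arm)]
    for mv in candidates:
        c = cost(mv)
        if c < best_cost:
            center, best_cost = mv, c

    # Refined local search: repeat a unit-size rood pattern until the
    # center is no worse than all four of its neighbors.
    while True:
        moved = False
        cx, cy = center
        for mv in [(cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)]:
            c = cost(mv)
            if c < best_cost:
                center, best_cost, moved = mv, c, True
        if not moved:
            return center, best_cost
```

In practice `cost` would typically compute a block-matching error such as SAD between the current macroblock and the displaced reference block. The initial stage jumps close to the likely motion in one step, so the unit-size refinement usually needs only a few iterations.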

605 citations

Posted Content
TL;DR: The Encoder-Recurrent-Decoder (ERD) model is a recurrent neural network that incorporates nonlinear encoder and decoder networks before and after recurrent layers that extends previous Long Short Term Memory models in the literature to jointly learn representations and their dynamics.
Abstract: We propose the Encoder-Recurrent-Decoder (ERD) model for recognition and prediction of human body pose in videos and motion capture. The ERD model is a recurrent neural network that incorporates nonlinear encoder and decoder networks before and after recurrent layers. We test instantiations of ERD architectures in the tasks of motion capture (mocap) generation, body pose labeling and body pose forecasting in videos. Our model handles mocap training data across multiple subjects and activity domains, and synthesizes novel motions while avoiding drifting for long periods of time. For human pose labeling, ERD outperforms a per frame body part detector by resolving left-right body part confusions. For video pose forecasting, ERD predicts body joint displacements across a temporal horizon of 400ms and outperforms a first order motion model based on optical flow. ERDs extend previous Long Short Term Memory (LSTM) models in the literature to jointly learn representations and their dynamics. Our experiments show such representation learning is crucial for both labeling and prediction in space-time. We find this is a distinguishing feature between the spatio-temporal visual domain in comparison to 1D text, speech or handwriting, where straightforward hard coded representations have shown excellent results when directly combined with recurrent units.

570 citations

Proceedings Article
07 Dec 2015
TL;DR: In this paper, the Encoder-Recurrent-Decoder (ERD) model is proposed for recognition and prediction of human body pose in videos and motion capture, which is a recurrent neural network that incorporates nonlinear encoder and decoder networks before and after recurrent layers.
Abstract: We propose the Encoder-Recurrent-Decoder (ERD) model for recognition and prediction of human body pose in videos and motion capture. The ERD model is a recurrent neural network that incorporates nonlinear encoder and decoder networks before and after recurrent layers. We test instantiations of ERD architectures in the tasks of motion capture (mocap) generation, body pose labeling and body pose forecasting in videos. Our model handles mocap training data across multiple subjects and activity domains, and synthesizes novel motions while avoiding drifting for long periods of time. For human pose labeling, ERD outperforms a per frame body part detector by resolving left-right body part confusions. For video pose forecasting, ERD predicts body joint displacements across a temporal horizon of 400ms and outperforms a first order motion model based on optical flow. ERDs extend previous Long Short Term Memory (LSTM) models in the literature to jointly learn representations and their dynamics. Our experiments show such representation learning is crucial for both labeling and prediction in space-time. We find this is a distinguishing feature between the spatio-temporal visual domain in comparison to 1D text, speech or handwriting, where straightforward hard coded representations have shown excellent results when directly combined with recurrent units [31].

546 citations

Journal Article
TL;DR: Two techniques are proposed, the generalized motion vector predictor and the adaptive threshold calculation, that can be used to significantly improve the performance of many existing fast ME algorithms and create two new algorithms, named advanced predictive diamond zonal search and predictive MV field adaptive search technique.
Abstract: Motion estimation (ME) is an important part of any video encoding system since it could significantly affect the output quality of an encoded sequence. Unfortunately, this feature requires a significant part of the encoding time especially when using the straightforward full search (FS) algorithm. We propose two techniques, the generalized motion vector (MV) predictor and the adaptive threshold calculation, that can be used to significantly improve the performance of many existing fast ME algorithms. In particular, we apply them to create two new algorithms, named advanced predictive diamond zonal search and predictive MV field adaptive search technique, respectively, which can considerably reduce, if not essentially remove, the computational cost of ME at the encoder, while at the same time give similar, and in many cases better, visual quality with the brute force full search algorithm. The proposed algorithms mainly rely upon very robust and reliable predictive techniques and early termination criteria with parameters adapted to the local characteristics combined with the zonal based patterns. Our experiments verify the considerable superiority of the proposed algorithms versus the performance of possibly all other known fast algorithms, and FS.

215 citations

Proceedings Article
20 Jun 2010
TL;DR: A feature-based algorithm using a Kalman filter motion model is proposed to handle multiple-object tracking, and results show that it efficiently tracks multiple moving objects in confusing situations.
Abstract: It is important to maintain the identity of multiple targets while tracking them in some applications such as behavior understanding. However, unsatisfactory tracking results may be produced under different real-time conditions. These conditions include inter-object occlusion, occlusion of the objects by background obstacles, and splits and merges, which are observed when objects are being tracked in real time. In this paper, a feature-based algorithm using a Kalman filter motion model is proposed to handle multiple-object tracking. The system is fully automatic and requires no manual input of any kind for initialization of tracking. A Kalman filter motion model is established with the centroid and area of moving objects as features in a single fixed-camera monitoring scene. Information obtained by detection is used to judge whether a merge or split has occurred, and a cost function is computed to solve the correspondence problem after a split. The proposed algorithm is validated on human and vehicle image sequences. The results show that the proposed algorithm achieves efficient tracking of multiple moving objects in confusing situations.

185 citations