Showing papers by "Masayuki Nishiguchi" published in 2006


Patent
23 Aug 2006
TL;DR: A video processing apparatus detects face areas in the frames of video data, generates frame identifications for the start and end of each trace (the set of frames from the appearance of a face area to its disappearance), and selects a representative face area from the frames of each trace to generate representative face-area information describing its contents.
Abstract: A video processing apparatus includes: face-area detection means for detecting a face area included in a frame forming video data; trace-generation means for generating a frame identification corresponding to a start and an end of a trace including, as a unit, a set of frames from an appearance of the face area to a disappearance on the basis of the detection; representative face-area information generation means for selecting a representative face area from the face area included in frames forming the trace and generating representative face-area information representing contents of the representative face area; and video-data appended information generation means for generating video-data appended information relating the frame identification corresponding to a start and an end of the trace to the representative face-area information for the video data.
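The trace generation described in the abstract can be illustrated with a minimal sketch. It assumes at most one face per frame and picks the largest face box as the representative; the abstract states neither, so both are assumptions made for illustration. The names `Box`, `Trace`, and `build_traces` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical face-area representation: (x, y, width, height).
Box = Tuple[int, int, int, int]


@dataclass
class Trace:
    start_frame: int           # frame identification of the trace start
    end_frame: int             # frame identification of the trace end
    representative_frame: int  # frame holding the representative face area
    representative_box: Box    # the representative face area itself


def build_traces(detections: List[Optional[Box]]) -> List[Trace]:
    """Group consecutive frames that contain a face into traces.

    detections[i] is the face box found in frame i, or None if no face
    was detected. Assumption: the representative is the largest box.
    """
    traces: List[Trace] = []
    start: Optional[int] = None
    best = None  # (area, frame_index, box) of the largest face so far

    for i, box in enumerate(detections):
        if box is not None:
            if start is None:      # face appears: a new trace begins
                start, best = i, None
            area = box[2] * box[3]
            if best is None or area > best[0]:
                best = (area, i, box)
        elif start is not None:    # face disappears: close the trace
            traces.append(Trace(start, i - 1, best[1], best[2]))
            start = None

    if start is not None:          # video ended while a face was visible
        traces.append(Trace(start, len(detections) - 1, best[1], best[2]))
    return traces
```

The returned list plays the role of the video-data appended information: each entry relates the start and end frame of a trace to its representative face area.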

13 citations


Journal ArticleDOI
TL;DR: The encoder and decoder algorithms for HVXC, including fast harmonic synthesis, time scale modification, and pitch-change decoding, are discussed. HVXC provides near toll-quality speech at 4.0 kbit/s and communication-quality speech at 2.0 kbit/s, thus outperforming FS1016 4.8-kbit/s CELP.
Abstract: A coding algorithm for speech called harmonic vector excitation coding (HVXC) has been developed that encodes speech at very low bit rates (2.0–4.0 kbit/s). It breaks speech signals down into two types of segments: voiced segments, for which a parametric representation of harmonic spectral magnitudes of LPC residual signals is used; and unvoiced segments, for which the CELP coding algorithm is used. This combination provides near toll-quality speech at 4.0 kbit/s, and communication-quality speech at 2.0 kbit/s, thus outperforming FS1016 4.8-kbit/s CELP. This paper discusses the encoder and decoder algorithms for HVXC, including fast harmonic synthesis, time scale modification, and pitch-change decoding. Due to its high coding efficiency and new functionality, HVXC has been adopted as the ISO/IEC International Standard for MPEG-4 audio.

4 citations


Patent
27 Sep 2006
TL;DR: To detect cut changes with high precision, a histogram generation part 101 generates a histogram of the luminance or color of each of a previous frame and a current frame, and a space correlation image generation part 102 generates a space correlation image showing the correlation of the spatial layout of the previous frame and the current frame.
Abstract: PROBLEM TO BE SOLVED: To detect cut changes with high precision. SOLUTION: A histogram generation part 101 generates a histogram of the luminance or color of each of a previous frame and a current frame, and a space correlation image generation part 102 generates a space correlation image, an image showing the correlation of the spatial layout of each of the previous frame and the current frame. A histogram similarity calculation part 104 calculates histogram similarity, the similarity between the histogram of the previous frame and the histogram of the current frame, and a space correlation image similarity calculation part 105 calculates space correlation image similarity, the similarity between the space correlation image of the previous frame and the space correlation image of the current frame. A decision part 64 decides, based on the histogram similarity and the space correlation image similarity, whether the change of the image between the previous frame and the current frame is a cut change. This invention may be applied to a personal computer for image processing. COPYRIGHT: (C)2008,JPO&INPIT
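A rough sketch of a two-measure cut decision follows. The abstract does not define how the space correlation image is computed, so this example substitutes block-averaged luminance as a simple stand-in for spatial layout. The histogram-intersection measure, the thresholds, and the rule that both similarities must be low are likewise assumptions, and all function names are hypothetical.

```python
from typing import List

# Hypothetical frame format: rows of grayscale values in 0..255.
Frame = List[List[int]]


def histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Normalized luminance histogram of one frame."""
    counts = [0] * bins
    total = 0
    for row in frame:
        for v in row:
            counts[v * bins // 256] += 1
            total += 1
    return [c / total for c in counts]


def histogram_similarity(prev: Frame, cur: Frame, bins: int = 16) -> float:
    """Histogram intersection in [0, 1]; 1 means identical histograms."""
    hp, hc = histogram(prev, bins), histogram(cur, bins)
    return sum(min(a, b) for a, b in zip(hp, hc))


def block_means(frame: Frame, block: int = 2) -> List[float]:
    """Mean luminance of each block x block tile (stand-in for layout)."""
    h, w = len(frame), len(frame[0])
    means = []
    for y in range(0, h - h % block, block):
        for x in range(0, w - w % block, block):
            s = sum(frame[y + dy][x + dx]
                    for dy in range(block) for dx in range(block))
            means.append(s / (block * block))
    return means


def layout_similarity(prev: Frame, cur: Frame, block: int = 2) -> float:
    """1 minus the normalized mean absolute difference of block means."""
    mp, mc = block_means(prev, block), block_means(cur, block)
    diff = sum(abs(a - b) for a, b in zip(mp, mc)) / len(mp)
    return 1.0 - diff / 255.0


def is_cut(prev: Frame, cur: Frame,
           hist_thresh: float = 0.5, layout_thresh: float = 0.7) -> bool:
    """Assumed rule: report a cut only when both similarities are low."""
    return (histogram_similarity(prev, cur) < hist_thresh
            and layout_similarity(prev, cur) < layout_thresh)
```

Requiring both measures to agree is one way to reduce false detections from a single measure; the patent's actual decision logic may combine the two similarities differently.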

1 citation


Patent
24 Apr 2006
TL;DR: An audio signal expansion and compression device comprises a cross-fade signal generation section 131 for generating a cross-fade signal from an audio signal, a time-axis-reversed differential signal generation section 132 for generating a differential signal from the audio signal and reversing its time axis, and an adder 133 for adding the time-axis-reversed differential signal to the cross-fade signal.
Abstract: PROBLEM TO BE SOLVED: To provide an audio signal expansion and compression method and device, capable of obtaining excellent sound quality. SOLUTION: The device comprises: a cross fade signal generation section 131 for generating a cross fade signal from an audio signal; a time axis reversed difference signal generation section 132 for generating a differential signal from the audio signal and generating a time axis reversed differential signal in which a time axis of the differential signal is reversed; and an adder 133 for adding the time axis reversed differential signal to the cross fade signal. COPYRIGHT: (C)2008,JPO&INPIT

1 citation


Journal ArticleDOI
TL;DR: A weighted vector quantization method for spectral vectors composed of a variable number of harmonic magnitudes is presented; it is based on simple, efficient linear dimension conversion and employs a weighted distortion measure that exploits the human auditory sense.
Abstract: Harmonic coding is a very powerful technique for the coding of speech at very low bit rates; and the efficient coding of spectral magnitudes sampled at harmonic frequencies is the key to obtaining good coded-speech quality. This paper presents a weighted vector quantization method for spectral vectors composed of a variable number of harmonic magnitudes. It is based on simple, efficient linear dimension conversion and employs a weighted distortion measure that exploits the human auditory sense. A codebook training algorithm using the weighting matrix is also presented. Finally, a low-complexity VQ codebook search technique based on pre-selection is described that reduces the computational complexity to less than 10% of that of an exhaustive search, without perceptible loss of quality. The proposed quantization scheme is used in Harmonic Vector eXcitation Coding (HVXC), which is a very low-bit-rate speech coding algorithm that combines harmonic and stochastic vector representations of LPC residual signals. Due to the high efficiency of this VQ scheme, HVXC provides good communication-quality speech at bit rates as low as 2–4 kbit/s, and was adopted as the ISO/IEC International Standard for MPEG-4 Audio.
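The three ingredients named in the abstract (dimension conversion, a weighted distortion measure, and a pre-selection search) can be sketched as follows. This is not the HVXC implementation: linear interpolation is assumed for the dimension conversion, the weighting matrix is assumed to be diagonal, and plain unweighted squared error is used as a stand-in for whatever pre-selection measure the paper actually uses. All function names are hypothetical.

```python
from typing import List, Sequence


def convert_dimension(x: Sequence[float], target_dim: int) -> List[float]:
    """Resample a variable-length vector to target_dim points
    by linear interpolation (assumed conversion method)."""
    n = len(x)
    if n == 1:
        return [x[0]] * target_dim
    if target_dim == 1:
        return [x[0]]
    out = []
    for k in range(target_dim):
        pos = k * (n - 1) / (target_dim - 1)  # position in the source
        i = min(int(pos), n - 2)
        frac = pos - i
        out.append((1 - frac) * x[i] + frac * x[i + 1])
    return out


def weighted_distortion(x: Sequence[float], c: Sequence[float],
                        w: Sequence[float]) -> float:
    """Weighted squared error with a diagonal weighting (assumption)."""
    return sum(wi * (xi - ci) ** 2 for xi, ci, wi in zip(x, c, w))


def search_codebook(harmonics: Sequence[float],
                    codebook: Sequence[Sequence[float]],
                    weights: Sequence[float],
                    preselect: int = 2) -> int:
    """Two-stage search: shortlist, then weighted search on the shortlist."""
    x = convert_dimension(harmonics, len(codebook[0]))
    # Stage 1: rank all codevectors with a simple measure
    # (unweighted squared error here, as a stand-in).
    ranked = sorted(
        range(len(codebook)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(x, codebook[j])))
    candidates = ranked[:preselect]
    # Stage 2: apply the weighted measure only to the shortlist.
    return min(candidates,
               key=lambda j: weighted_distortion(x, codebook[j], weights))
```

For example, a two-harmonic vector `[0, 2]` is converted to `[0.0, 1.0, 2.0]` before being matched against a three-dimensional codebook, so a single fixed-dimension codebook can serve frames with different harmonic counts.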

1 citation