Showing papers on "Noise" published in 2003


Proceedings Article
01 Jan 2003
TL;DR: The algorithm is noise and distortion resistant, computationally efficient, and massively scalable, capable of quickly identifying a short segment of music captured through a cellphone microphone in the presence of foreground voices and other dominant noise, out of a database of over a million tracks.
Abstract: We have developed and commercially deployed a flexible audio search engine. The algorithm is noise and distortion resistant, computationally efficient, and massively scalable, capable of quickly identifying a short segment of music captured through a cellphone microphone in the presence of foreground voices and other dominant noise, and through voice codec compression, out of a database of over a million tracks. The algorithm uses a combinatorially hashed time-frequency constellation analysis of the audio, yielding unusual properties such as transparency, in which multiple tracks mixed together may each be identified. Furthermore, for applications such as radio monitoring, search times on the order of a few milliseconds per query are attained, even on a massive music database.

683 citations
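
The combinatorial hashing of a time-frequency constellation described in the abstract above can be illustrated with a minimal Python sketch. It assumes spectrogram peaks have already been extracted as (time frame, frequency bin) pairs. The fan-out, the target-zone width `max_dt`, and the names `pair_hashes`, `build_index`, and `best_match` are hypothetical choices for illustration, not details taken from the paper.

```python
from collections import Counter, defaultdict

def pair_hashes(peaks, fan_out=3, max_dt=10):
    """Pair each anchor peak with up to `fan_out` later peaks.

    peaks: list of (time_frame, freq_bin) tuples.
    Returns a list of ((f_anchor, f_target, dt), t_anchor) entries.
    """
    peaks = sorted(peaks)  # sort by time, then frequency
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt <= 0:
                continue  # skip peaks in the same frame
            if dt > max_dt:
                break  # peaks are time-sorted, so later ones are farther away
            hashes.append(((f1, f2, dt), t1))
            paired += 1
            if paired == fan_out:
                break
    return hashes

def build_index(tracks):
    """Map each hash to the (track_id, anchor_time) pairs where it occurs."""
    index = defaultdict(list)
    for track_id, peaks in tracks.items():
        for h, t in pair_hashes(peaks):
            index[h].append((track_id, t))
    return index

def best_match(index, query_peaks):
    """Vote on (track, time offset); a true match piles up at one offset."""
    votes = Counter()
    for h, t_query in pair_hashes(query_peaks):
        for track_id, t_db in index.get(h, ()):
            votes[(track_id, t_db - t_query)] += 1
    if not votes:
        return None
    (track_id, offset), count = votes.most_common(1)[0]
    return track_id, offset, count
```

Because each hash depends only on two peak frequencies and their time difference, a query excerpt taken from the middle of a track still produces the same hashes, and the consistent time offset among the matches is what separates a real hit from chance collisions.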


Journal ArticleDOI
TL;DR: The spectral smoothness principle is proposed as an efficient new mechanism in estimating the spectral envelopes of detected sounds and works robustly in noise, and is able to handle sounds that exhibit inharmonicities.
Abstract: A new method for estimating the fundamental frequencies of concurrent musical sounds is described. The method is based on an iterative approach, where the fundamental frequency of the most prominent sound is estimated, the sound is subtracted from the mixture, and the process is repeated for the residual signal. For the estimation stage, an algorithm is proposed which utilizes the frequency relationships of simultaneous spectral components, without assuming ideal harmonicity. For the subtraction stage, the spectral smoothness principle is proposed as an efficient new mechanism in estimating the spectral envelopes of detected sounds. With these techniques, multiple fundamental frequency estimation can be performed quite accurately in a single time frame, without the use of long-term temporal features. The experimental data comprised recorded samples of 30 musical instruments from four different sources. Multiple fundamental frequency estimation was performed for random sound source and pitch combinations. Error rates for mixtures ranging from one to six simultaneous sounds were 1.8%, 3.9%, 6.3%, 9.9%, 14%, and 18%, respectively. In musical interval and chord identification tasks, the algorithm outperformed the average of ten trained musicians. The method works robustly in noise, and is able to handle sounds that exhibit inharmonicities. The inharmonicity factor and spectral envelope of each sound are estimated along with the fundamental frequency.

356 citations


Journal ArticleDOI
TL;DR: Distortion discriminant analysis (DDA) is proposed for mapping audio data to feature vectors for classification, retrieval, or identification tasks; it reduces the input dimensionality while yielding features that are robust to distortion, informative, and computationally efficient to extract.
Abstract: Mapping audio data to feature vectors for classification, retrieval or identification tasks presents four principal challenges. The dimensionality of the input must be significantly reduced; the resulting features must be robust to likely distortions of the input; the features must be informative for the task at hand; and the feature extraction operation must be computationally efficient. We propose distortion discriminant analysis (DDA), which fulfills all four of these requirements. DDA constructs a linear, convolutional neural network out of layers, each of which performs an oriented PCA dimensional reduction. We demonstrate the effectiveness of DDA on two audio fingerprinting tasks: searching for 500 audio clips in 36 h of audio test data; and playing over 10 days of audio against a database with approximately 240 000 fingerprints. We show that the system is robust to kinds of noise that are not present in the training procedure. In the large test, the system gives a false positive rate of 1.5 × 10⁻⁸ per audio clip, per fingerprint, at a false negative rate of 0.2% per clip.

155 citations


Patent
21 Feb 2003
TL;DR: A segmentation method based on analysis of the variation of audio signal statistical features localizes segments with stationary properties, which can be used as input data for classification, speech/music/noise attribution and so on.
Abstract: Disclosed herein is a segmentation method, which divides an input audio stream into segments containing different homogeneous signals. The main objective of this method is localization of segments with stationary properties. This method seeks all non-stationary points or intervals in the audio stream and creates a list of segments. The obtained list of segments can be used as input data for subsequent procedures, such as classification, speech/music/noise attribution and so on. The proposed segmentation method is based on the analysis of audio signal statistical features variation and comprises three main stages: a stage of first-grade characteristics calculation, a stage of second-grade characteristics calculation, and a stage of decision-making.

130 citations


Journal ArticleDOI
TL;DR: Bibliographic entry for the edited volume Global Noise: Rap and Hip-Hop outside the USA (2001).
Abstract: Global Noise: Rap and Hip-Hop outside the USA. Daniel Miller. ed. Oxford: Berg, Wesleyan University Press, 2001. ix. 336 pp., photographs, index.

128 citations


Journal ArticleDOI
TL;DR: Two freshwater gobies Padogobius martensii and Gobius nigricans live in shallow stony streams, and males of both species produce courtship sounds, and the equivalent relationship between auditory sensitivity and maximum ambient noise levels in both species suggests that ambient noise shapes hearing sensitivity.
Abstract: Two freshwater gobies Padogobius martensii and Gobius nigricans live in shallow (5–70 cm) stony streams, and males of both species produce courtship sounds. A previous study demonstrated high noise levels near waterfalls, a quiet window in the noise around 100 Hz at noisy locations, and extremely short-range propagation of noise and goby signals. To investigate the relationship of this acoustic environment to communication, we determined audiograms for both species and measured parameters of courtship sounds produced in the streams. We also deflated the swimbladder in P. martensii to determine its effect on frequency utilization in sound production and hearing. Both species are maximally sensitive at 100 Hz and produce low-frequency sounds with main energy from 70 to 100–150 Hz. Swimbladder deflation does not affect auditory threshold or dominant frequency of courtship sounds and has no or minor effects on sound amplitude. Therefore, both species utilize frequencies for hearing and sound production that fall within the low-frequency quiet region, and the equivalent relationship between auditory sensitivity and maximum ambient noise levels in both species further suggests that ambient noise shapes hearing sensitivity.

123 citations


Journal ArticleDOI
TL;DR: The basic spatial and spectral characteristics of blue- and green-noise halftoning models are reviewed, and some of the earlier work done to improve error diffusion as a noise generator is reviewed.
Abstract: In this article, we review the spatial and spectral characteristics of blue- and green-noise halftoning models. In the case of blue noise, dispersed-dot dither patterns are constructed by isolating minority pixels as homogeneously as possible, and by doing so, a pattern composed exclusively of high-frequency spectral components is produced. Blue-noise halftoning is preferred for display devices that can accommodate isolated dots such as various video displays and some print technologies such as ink-jet. For print marking engines that cannot support isolated pixels, dispersed-dot halftoning is inappropriate. For such cases, clustered-dot halftoning is used to avoid dot-gain instability. Green-noise halftones are clustered-dot, blue-noise patterns. Such patterns enjoy the blue-noise properties of homogeneity and lack low-frequency texture but have clusters of minority pixels on blue-noise centers. Green noise is composed exclusively of midfrequency spectral components. In addition to the basic spatial and spectral characteristics of the halftoning models, this article also reviews some of the earlier work done to improve error diffusion as a noise generator. We also discuss processes to generate threshold arrays to achieve blue and green noise with the computationally efficient process of ordered dither.

105 citations


Patent
24 Sep 2003
TL;DR: In this paper, a variable amplifier adjusts an audio input signal to generate an audio output signal with an appropriate level so that the audio output signals are audible over noise in a listening area.
Abstract: Systems and methods for ambient noise compensation are disclosed. One example of a system includes a variable amplifier, a source sound processor, an area sound processor, and an adjustment circuit. The variable amplifier adjusts an audio input signal to generate an audio output signal with an appropriate level so that the audio output signal is audible over noise in a listening area. The source sound processor and the area sound processor may split the audio output signal and a monitoring signal into frequency bands, and may compare these signals band by band to find differences that represent time-varying noise in the monitoring signal. These differences may be modified to account for the acoustic response of the listening area and for constant-level background noise in the listening area. The adjustment circuit controls the variable amplifier in response to these differences.

89 citations


Journal ArticleDOI
TL;DR: To determine environmental constraints on the communication of two freshwater gobies Padogobius martensii and Gobius nigricans, numerous noise spectra were measured from quiet areas and ones adjacent to waterfalls and rapids in two shallow stony streams.
Abstract: Noise is an important theoretical constraint on the evolution of signal form and sensory performance. In order to determine environmental constraints on the communication of two freshwater gobies Padogobius martensii and Gobius nigricans, numerous noise spectra were measured from quiet areas and ones adjacent to waterfalls and rapids in two shallow stony streams. Propagation of goby sounds and waterfall noise was also measured. A quiet window around 100 Hz is present in many noise spectra from noisy locations. The window lies between two noise sources, a low-frequency one attributed to turbulence, and a high-frequency one (200–500 Hz) attributed to bubble noise from water breaking the surface. Ambient noise from a waterfall (frequencies below 1 kHz) attenuates as much as 30 dB between 1 and 2 m, after which values are variable without further attenuation (i.e., buried in the noise floor). Similarly, courtship sounds of P. martensii attenuate as much as 30 dB between 5 and 50 cm. Since gobies are known to court in noisy as well as quiet locations in these streams, their acoustic communication system (sounds and auditory system) must be able to cope with short-range propagation dictated by shallow depths and ambient noise in noisy locations.

89 citations


Journal ArticleDOI
TL;DR: It is suggested that amplitude regulation of vocalizations contributes to signal transmission distance along with the established relationships between singing behaviour, acoustic structure and habitat.

76 citations


Journal ArticleDOI
TL;DR: An efficient noise estimation algorithm for speech enhancement is proposed, which gives reliable results even at very low SNRs.
Abstract: An efficient noise estimation algorithm for speech enhancement is proposed. The noisy speech is decomposed into subband signals and the subband noise estimate is updated by adaptively smoothing the noisy signal power. The smoothing parameter is chosen as a function of the estimated signal-to-noise ratio (SNR). This noise estimation technique gives reliable results even at very low SNRs.
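
The update described in the abstract above, in which a subband noise estimate is smoothed with a parameter that depends on the estimated SNR, can be sketched for a single subband as follows. The abstract does not give the mapping from SNR to smoothing parameter, so the sigmoid in `smoothing_factor` and its `slope` and `midpoint` values are assumptions made purely for illustration, as are the function names.

```python
import math

def smoothing_factor(snr, slope=1.0, midpoint=3.0):
    """Map an SNR estimate (power ratio, not dB) to a factor in (0, 1).

    Assumed sigmoid: high SNR (speech likely present) gives a factor near 1,
    which freezes the noise estimate; low SNR gives a small factor, which
    lets the estimate follow the observed power.
    """
    return 1.0 / (1.0 + math.exp(-slope * (snr - midpoint)))

def track_noise(power_frames, init_noise):
    """Recursively smooth the noisy-signal power of one subband."""
    noise = init_noise
    history = []
    for p in power_frames:
        snr = p / max(noise, 1e-12)  # ratio of observed power to current estimate
        alpha = smoothing_factor(snr)
        noise = alpha * noise + (1.0 - alpha) * p
        history.append(noise)
    return history
```

With this choice a short high-power burst barely moves the estimate, while a modest, sustained rise in the background level is absorbed within a few frames.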

Journal ArticleDOI
TL;DR: A two-microphone speech enhancer designed to remove noise in hands-free car kits using speech correlation and noise decorrelation to separate speech from noise, showing the superiority of the two-sensor approach to single microphone techniques.
Abstract: This paper presents a two-microphone speech enhancer designed to remove noise in hands-free car kits. The algorithm, based on the magnitude squared coherence, uses speech correlation and noise decorrelation to separate speech from noise. The remaining correlated noise is reduced using cross-spectral subtraction. Particular attention is focused on the estimation of the different spectral densities (noise and noisy signals power spectral densities) which are critical for the quality of the algorithm. We also propose a continuous noise estimation, avoiding the need for a vocal activity detector. Results on recorded signals are provided, showing the superiority of the two-sensor approach to single microphone techniques.

Patent
10 Mar 2003
TL;DR: A signal-to-noise ratio dependent adaptive spectral subtraction process eliminates noise from noise-corrupted speech signals, using a single microphone and obtaining the noise estimate during frames classified as unvoiced.
Abstract: A signal-to-noise ratio dependent adaptive spectral subtraction process eliminates noise from noise-corrupted speech signals. The process first pre-emphasizes the frequency components of the input sound signal which contain the consonant information in human speech. Next, a signal-to-noise ratio is determined and a spectral subtraction proportion adjusted appropriately. After spectral subtraction, low amplitude signals can be squelched. A single microphone is used to obtain both the noise-corrupted speech and the average noise estimate. This is done by determining if the frame of data being sampled is a voiced or unvoiced frame. During unvoiced frames an estimate of the noise is obtained. A running average of the noise is used to approximate the expected value of the noise. Spectral subtraction may be performed on a composite noise-corrupted signal, or upon individual sub-bands of the noise-corrupted signal. Pre-averaging of the input signal's magnitude spectrum over multiple time frames may be performed to reduce musical noise.
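
The core loop of the process in the abstract above (a running noise average updated during unvoiced frames, then subtraction with a proportion that depends on the SNR) might look like the sketch below. It works on precomputed magnitude spectra and takes the voiced/unvoiced decision as an input. The breakpoints in `subtraction_factor`, the `decay` constant, and the spectral `floor` standing in for the squelch step are all assumed values, and pre-emphasis, sub-band processing, and pre-averaging are omitted.

```python
import math

def subtraction_factor(snr_db, lo=1.0, hi=4.0):
    """Assumed mapping: subtract more at low SNR, less at high SNR."""
    if snr_db <= -5.0:
        return hi
    if snr_db >= 20.0:
        return lo
    return hi - (hi - lo) * (snr_db + 5.0) / 25.0

def enhance(frames, voiced_flags, decay=0.9, floor=0.01):
    """frames: list of magnitude spectra; voiced_flags: one bool per frame."""
    noise = None
    out = []
    for mag, voiced in zip(frames, voiced_flags):
        if not voiced:
            # Update the running average of the noise during unvoiced frames.
            if noise is None:
                noise = list(mag)
            else:
                noise = [decay * n + (1.0 - decay) * m for n, m in zip(noise, mag)]
        if noise is None:
            out.append(list(mag))  # no noise estimate yet: pass through
            continue
        sig_pow = sum(m * m for m in mag)
        noise_pow = sum(n * n for n in noise)
        snr_db = 10.0 * math.log10(max(sig_pow, 1e-12) / max(noise_pow, 1e-12))
        k = subtraction_factor(snr_db)
        # Subtract the scaled noise, keeping a small floor instead of negatives.
        out.append([max(m - k * n, floor * m) for m, n in zip(mag, noise)])
    return out
```

The floor keeps every bin non-negative and is one simple way to limit the isolated spectral peaks that are heard as musical noise.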

Journal ArticleDOI
TL;DR: The noise level found in this ICU is above that recommended in the literature during all the periods examined, and excessive noise sources need to be better identified so that proper steps may be taken to reduce this noise and make this environment quieter, thus improving the professionals' work and the patients' recovery.
Abstract: Noise levels in hospitals are excessively high, especially in the ICU environment, because of the numerous alarms and equipment, in addition to the conversation of the hospital staff itself. For this reason, this environment, which should be quiet and calm, has become noisy, thus becoming a major stress factor, likely to cause physiological and psychological disorders in both inpatients and the unit personnel. AIM: The purpose of this study was to assess the equivalent noise pressure level in a general ICU, in an attempt to establish the period of greatest exposure and to compare the results to both domestic and international recommendations. STUDY DESIGN: Observational study. MATERIAL AND METHOD: Ambient noise in the ICU of Hospital Sao Paulo was measured using a noise analyzer model 2260 (Bruel & Kjaer) for a total period of 6,000 minutes, at a rate of one reading every 27 seconds, with the following configuration: fast response time, measuring the noise pressure level in decibels with A-frequency weighting, from September 2001 to June 2002, without the knowledge of the sector personnel. RESULTS: The average equivalent noise pressure level (Leq) was 65.36 dB(A), ranging from 62.9 to 69.3 dB(A). During the day, the average Leq was 65.23 dB(A), and at night 63.89 dB(A). LFMax was 108.4 dB(A) and LFMin was 40 dB(A). CONCLUSIONS: The noise level found in this ICU is above that recommended in the literature during all the periods examined. Thus, excessive noise sources need to be better identified so that proper steps may be taken to reduce this noise and make this environment quieter, thus improving the professionals' work and the patients' recovery.
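
For readers unfamiliar with Leq, the short sketch below shows how an equivalent level is obtained from a series of decibel readings: the readings are averaged as energies, not as decibel values. It assumes equally spaced readings; the abstract does not state how the analyzer aggregated its readings, and the function name is hypothetical.

```python
import math

def leq(levels_db):
    """Equivalent continuous level of equally spaced dB readings."""
    # Convert each level to relative energy, average, and convert back to dB.
    mean_energy = sum(10 ** (level / 10.0) for level in levels_db) / len(levels_db)
    return 10.0 * math.log10(mean_energy)
```

Because the average is taken over energies, loud events dominate: readings of 60 and 70 dB(A) give an Leq of about 67.4 dB(A), not 65.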

Journal ArticleDOI
TL;DR: A novel approach for real-time multichannel speech enhancement in environments of nonstationary noise and time-varying acoustical transfer functions (ATFs) that integrates adaptive beamforming, ATF identification, soft signal detection, and multichannel postfiltering.
Abstract: We present a novel approach for real-time multichannel speech enhancement in environments of nonstationary noise and time-varying acoustical transfer functions (ATFs). The proposed system integrates adaptive beamforming, ATF identification, soft signal detection, and multichannel postfiltering. The noise canceller branch of the beamformer and the ATF identification are adaptively updated online, based on hypothesis test results. The noise canceller is updated only during stationary noise frames, and the ATF identification is carried out only when desired source components have been detected. The hypothesis testing is based on the nonstationarity of the signals and the transient power ratio between the beamformer primary output and its reference noise signals. Following the beamforming and the hypothesis testing, estimates for the signal presence probability and for the noise power spectral density are derived. Subsequently, an optimal spectral gain function that minimizes the mean square error of the log-spectral amplitude (LSA) is applied. Experimental results demonstrate the usefulness of the proposed system in nonstationary noise environments.

Proceedings ArticleDOI
08 Nov 2003
TL;DR: This paper presents a multi-expert person identification system based on the integration of three separate systems employing audio features, static face images and lip motion features respectively that improves the person identification accuracies for both clean and noisy audio conditions.
Abstract: This paper presents a multi-expert person identification system based on the integration of three separate systems employing audio features, static face images and lip motion features respectively. Audio person identification was carried out using a text dependent Hidden Markov Model methodology. Modeling of the lip motion was carried out using Gaussian probability density functions. The static image based identification was carried out using the FaceIt system. Experiments were conducted with 251 subjects from the XM2VTS audio-visual database. Late integration using automatic weights was employed to combine the three experts. The integration strategy adapts automatically to the audio noise conditions. It was found that the integration of the three experts improved the person identification accuracies for both clean and noisy audio conditions compared with the audio only case. For audio, FaceIt, lip motion, and tri-expert identification, maximum accuracies achieved were 98%, 93.22%, 86.37% and 100% respectively. Maximum bi-expert integration of the two visual experts achieved an identification accuracy of 96.8% which is comparable to the best audio accuracy of 98%.

Proceedings ArticleDOI
10 Jul 2003
TL;DR: The proposed method performs substantially better than spectral subtraction because of the absence of any "musical noise" artifacts in the processed speech, which makes it suitable for speech recognition.
Abstract: In this paper, we propose an improved spectral subtraction method for enhancing noise-corrupted speech. This implementation uses an over-subtraction method for power spectral subtraction and a modified residual noise reduction algorithm to eliminate musical noise. Performance of the algorithm was compared with previous spectral subtraction methods using both Gaussian white noise and car noise. The proposed method performs substantially better than spectral subtraction because of the absence of any "musical noise" artifacts in the processed speech, which makes it suitable for speech recognition.

Journal Article
TL;DR: This paper found that 60 decibel (dB(A)) helicopter noise resulted in lower ratings of scenic beauty, solitude, tranquility, freedom, naturalness, and preference, and higher ratings of annoyance.
Abstract: Aircraft overflight noise from helicopter tours is frequently encountered in such national parks as Grand Canyon, Hawaii Volcanoes, Haleakala, and Bryce Canyon. Noise is an environmental stressor and is associated with a variety of physiological and psychological effects, some of which are long-lasting. Psychologically, attributing a stressor to a nonhostile origin (e.g., a helicopter rescue mission) could mitigate stress effects. In this study, 200 undergraduates rated National Park scenes while exposed to either natural sounds (birds, brooks, wind), helicopter noise attributed to tourist overflights, helicopter noise attributed to back country maintenance operations, or helicopter noise attributed to the rescue of a back country hiker. Regardless of the source, 60 decibel (dB(A)) helicopter noise resulted in lower ratings of scenic beauty, solitude, tranquility, freedom, naturalness, and preference, and higher ratings of annoyance. These effects occurred across all types of scenery. Results suggest that park management-related overflight noise is just as disturbing as tourist aircraft noise, and that the impact of this noise is substantial across demographic variables and across types of vistas.

Journal ArticleDOI
TL;DR: In this paper, the authors address the problem of predictability for the NAO index series, and show that the spectral analysis, completed with a bootstrap procedure, shows a rather featureless structure of the index.
Abstract: In this paper the authors address the problem of predictability for the NAO index series. The spectral analysis, completed with a bootstrap procedure, shows a rather featureless structure of the index. In other words, the actual time series could be a realisation of many different stochastic processes. An analysis of the Hurst exponent does suggest a slightly red noise as a model for the index, which is interpreted as the NAO being driven by meteorological noise. A nonlinear study of the series (embedding dimension, fractal correlation dimension and leading Lyapunov exponent) shows little predictive performance as well. © 2003 Published by Elsevier Science B.V.

Journal ArticleDOI
TL;DR: The results demonstrate the extent to which communication ranges in the field can vary depending on call type, signal directivity, ambient noise conditions, and receiver capabilities.
Abstract: Acoustic communication range estimates for four northern elephant seal (Mirounga angustirostris) vocalization types are presented for this species. Maximum signal detection ranges are determined using an integrated approach involving: field measurements of vocalization source levels and spectral characteristics, signal directivity patterns, natural ambient noise measurements, and previously collected laboratory audiometric data. Signals and masking noise were analyzed using two filter bandwidths believed to approximate the upper and lower limit of auditory filter widths for the northern elephant seal auditory system. Signal detection ranges are estimated for representative pup ‘female attraction calls’ (FAC), adult female ‘pup attraction calls’ (PAC), adult female ‘threat calls’ (AFT), and adult male ‘clap threat calls’ (AMCT) in each of three intensity categories for biotic noise, wave noise, and wind noise. Signal detection ranges in these nine natural masking noise conditions vary from 5–70 m for FAC, 10–105 m for PAC, 41–479 m for AFT, and 59–507 m for AMCT. The results demonstrate the extent to which communication ranges in the field can vary depending on call type, signal directivity, ambient noise conditions, and receiver capabilities. These data are also useful in considering natural constraints on acoustic communication in northern elephant seals, selective pressures on signal production and reception systems, and potential negative effects of anthropogenic noise.

Patent
10 Nov 2003
TL;DR: In this article, a device for automatically controlling audio volume based on vehicle speed and a method for operating the same keep the car audio volume at a uniform level relative to noise occurring at a level proportional to vehicle speed.
Abstract: A device for automatically controlling audio volume based on vehicle speed and a method for operating the same keep the car audio volume at a uniform level relative to noise occurring at a level proportional to vehicle speed. The device includes a vehicle speed calculation unit, a car audio device including a speaker, a mode-setting button, and a microcomputer. The mode-setting button is formed on the car audio device at a portion thereof so as to allow a user to set an automatic volume control mode of the car audio device. The microcomputer automatically controls volume of the car audio device so as to output an audio volume corresponding to vehicle speed calculated in the vehicle calculation unit, according to the automatic volume control mode set through the mode-setting button.

Patent
10 Nov 2003
TL;DR: In this paper, a plurality of microphones are spaced apart, attached to or within a PC housing, in such a way that noise from PC components can be subtracted or reduced from an input audio signal, which increases the signal to noise ratio and improves audio processing accuracy.
Abstract: Systems and methods for improving the signal to noise ratio for audio input in a computing system are provided. Signal to noise ratio improvements are enabled for an input audio signal by incorporating a plurality of microphones into the PC environment, for example, to improve the processing of voice or speech input or other audio capture applications. In various embodiments, a plurality of microphones are spaced apart, e.g., attached to or within a PC housing, in such a way that noise from PC components can be “subtracted” or reduced from an input audio signal, which increases the signal to noise ratio and improves audio processing accuracy. The “subtraction” or reduction techniques applied by the present invention are unique to the PC environment wherein different noises having particular characteristics can be identified, including, but not limited to, rattle of device components, fan noise and disk noise.

PatentDOI
Lloyd Watts
TL;DR: A system and method for processing an audio signal including separating the audio signal into a plurality of streams which group sounds from the same source prior to classification and analyzing each separate stream to determine phoneme-level classification is described in this article.
Abstract: A system and method are disclosed for processing an audio signal including separating the audio signal into a plurality of streams which group sounds from a same source prior to classification and analyzing each separate stream to determine phoneme-level classification. One or more words of the audio signal may then be outputted.

Journal ArticleDOI
TL;DR: The aims of this study were to characterize the impulse noise environment at a law enforcement firing range; document the insufficiencies found at the range from a health and safety standpoint; and provide noise abatement recommendations to reduce the overall health hazard to the auditory system.
Abstract: Exposure to hazardous impulse noise is common during the firing of weapons at indoor firing ranges. The aims of this study were to characterize the impulse noise environment at a law enforcement firing range; document the insufficiencies found at the range from a health and safety standpoint; and provide noise abatement recommendations to reduce the overall health hazard to the auditory system. Ten shooters conducted a typical live-fire exercise using three different weapons: the Beretta .40 caliber pistol, the Remington .308 caliber shotgun, and the M4 .223 caliber assault rifle. Measurements were obtained at 12 different positions throughout the firing range and adjacent areas using dosimeters and sound level meters. Personal and area measurements were recorded to a digital audio tape (DAT) recorder for further spectral analysis. Peak pressure levels inside the firing range reached 163 decibels (dB). Equivalent sound levels (Leq) ranged from 78 decibels, A-weighted (dBA), in an office area adjacent to the range to 122 dBA inside the range. Noise reductions from wall structures ranged from 29 to 44 dB. Noise abatement strategies ranged from simple noise control measures (such as sealing construction joints and leaks) to elaborate design modifications to eliminate structure-borne sounds using acoustical treatments. Further studies are needed to better characterize the effects of firing weapons in enclosed spaces on hearing and health in general.

Journal Article
TL;DR: The proportion of persons who reported that they were very or extremely annoyed indoors from noise from installations was more than twice as high as for traffic noise, indicating the importance of also regulating the noise exposure on the "quiet side" of buildings.
Abstract: In order to improve the living conditions for respondents highly exposed to traffic noise, it has been recommended that one side of the building should face a "quiet side". Quiet may, however, be spoilt by noise from installations such as ventilation and air-conditioning systems. The noises generated by installations of this kind often have a dominant portion of low frequencies (20-200 Hz) and may be a source of great annoyance and sleep disturbance. This paper describes the cross-sectional part of an intended intervention study among residents exposed to traffic noise on one side of the building and to low frequency noise from installations on the other side of the building. A questionnaire masked as a general living environment study was delivered to a randomly selected person in each household. In total 41 respondents answered the questionnaire (71% response rate). Noise from installations was measured indoors in a bedroom facing the courtyard in a selection of apartments and outdoors in the yard. 24h traffic noise outdoor and indoor levels were calculated. The noise levels from installations were slightly above or at the Swedish recommendations for low frequency noise indoors with the window closed and exceeded the recommendations by about 10 dB SPL when the window was slightly opened. The proportion of persons who reported that they were very or extremely annoyed indoors from noise from installations was more than twice as high as for traffic noise. Installation noise also affected respondents' willingness to have their windows open and to sleep with an open window. The high disturbance of installation noises found in this study indicates the importance of also regulating the noise exposure on the "quiet side" of buildings. Further studies will give a better base for the extent of annoyance and acceptable levels of installation noises.

Patent
20 Mar 2003
TL;DR: A method and device for interleaving a bit stream so that multimedia data including digital picture data, audio data, and sub-video data are reproduced seamlessly, with video and audio switched smoothly without disturbing the video, introducing noise into the audio, or interrupting the audio at angle-switching sections during multi-angle reproduction.
Abstract: PROBLEM TO BE SOLVED: To provide a method and device for interleaving a bit stream by which multimedia data including digital picture data, audio data, and sub-video data are reproduced seamlessly, with video and audio switched smoothly without disturbing the video, introducing noise into the audio, or interrupting the audio at angle-switching sections during multi-angle reproduction from an optical disk on which the multimedia data are recorded. SOLUTION: A multi-angle system stream is constituted of a plurality of system streams composed of picture data and audio data created from different viewpoints. In the multi-angle system stream, the system stream corresponding to an angle can be freely switched to another at every predetermined unit during reproduction; the display time of the picture data contained in the system stream corresponding to an angle and the display time of the audio data are made equal to each other for every angle at every prescribed unit at which the angle can be switched.

Patent
Tae-Gyu Chang, Jang Heung Yeop
25 Nov 2003
TL;DR: A method and apparatus for shaping the quantization noise generated when compressing audio data at a low bit rate is disclosed, in which a predetermined quantization noise threshold allowed during quantization of sampled audio data and the quantization noise energy information of a quantized MDCT coefficient are received in all frequency bands of an audio frequency.
Abstract: A method and apparatus for shaping quantization noise generated when compressing audio data at a low bit rate is disclosed. A predetermined quantization noise threshold allowed during quantization of sampled audio data and quantization noise energy information of a quantized MDCT coefficient are received in all frequency bands of an audio frequency. The quantization noise energy of the quantized MDCT coefficient is attenuated in a predetermined number of frequency bands in which a difference between the predetermined quantization noise threshold and the quantization noise energy of the quantized MDCT coefficient is large.
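
The band-selection step described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `shape_quantization_noise`, the parameters `num_bands` and `attenuation_db`, and the reading of "difference" as noise energy minus threshold are all assumptions made for illustration.

```python
import numpy as np

def shape_quantization_noise(noise_energy, threshold, num_bands=2, attenuation_db=3.0):
    """Attenuate noise energy in the bands that exceed the threshold the most.

    Hypothetical helper: per-band energies are plain linear-power arrays, and
    the attenuation is a fixed amount in dB (both are assumptions).
    """
    noise_energy = np.asarray(noise_energy, dtype=float)
    threshold = np.asarray(threshold, dtype=float)

    # Assumed sign convention: positive excess means noise above the allowed level.
    excess = noise_energy - threshold

    # Indices of the num_bands bands with the largest excess.
    worst = np.argsort(excess)[::-1][:num_bands]

    # Scale the noise energy down in the selected bands only.
    shaped = noise_energy.copy()
    shaped[worst] *= 10.0 ** (-attenuation_db / 10.0)
    return shaped, np.sort(worst)

if __name__ == "__main__":
    shaped, bands = shape_quantization_noise(
        [1.0, 4.0, 2.0, 8.0], [2.0, 2.0, 2.0, 2.0], num_bands=2, attenuation_db=10.0
    )
    print(bands, shaped)
```

In a real codec the attenuation would be realized by changing the quantizer step size or bit allocation for those bands; scaling the energy values directly only illustrates the selection logic.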

01 Nov 2003
TL;DR: A forward-swept fan, designated the Quiet High Speed Fan (QHSF), was tested in the NASA Glenn 9- by 15-Foot Low Speed Wind Tunnel to investigate its noise reduction relative to a baseline fan of the same aerodynamic performance.
Abstract: A forward-swept fan, designated the Quiet High Speed Fan (QHSF), was tested in the NASA Glenn 9- by 15-Foot Low Speed Wind Tunnel to investigate its noise reduction relative to a baseline fan of the same aerodynamic performance. The objective of the Quiet High Speed Fan was a 6 decibel reduction in the Effective Perceived Noise relative to the baseline fan at the takeoff condition. The intent of the Quiet High Speed Fan design was to provide both a multiple pure tone noise reduction from the forward sweep of the fan rotor and a rotor-stator interaction blade passing tone noise reduction from a leaned stator. The tunnel noise data indicated that the Quiet High Speed Fan was quieter than the baseline fan for a significant portion of the operating line and was 6 dB quieter near the takeoff condition. Although reductions in the multiple pure tones were observed, the vast majority of the EPNdB reduction was a result of the reduction in the blade passing tone and its harmonics. The baseline fan's blade passing tone was dominated by the rotor-strut interaction mechanism. The observed blade passing tone reduction could be the result of either the redesign of the Quiet High Speed Fan rotor or the redesigned stator. The exact cause of this rotor-strut noise reduction, whether from the rotor or stator redesign, was not discernible from this experiment.

Proceedings ArticleDOI
06 Jul 2003
TL;DR: An adaptive noise suppression system that mitigates or eliminates processing artifacts common to Wiener filtering without decreasing speech recognition performance is proposed.
Abstract: Removal of ambient noise from a single-channel audio signal is becoming an increasingly important problem due to the proliferation of portable communication devices. Furthermore, in applications such as wireless telephony and phonetic data mining, it is desired that noise suppression be robust to changing noise conditions and that processing take place in real time or faster. This paper proposes an adaptive noise suppression system that mitigates or eliminates processing artifacts common to Wiener filtering without decreasing speech recognition performance. Results of one implementation of such a structure demonstrate significant improvements in both perceptual quality and speech recognition performance under noisy conditions.
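
For context, the Wiener-filtering baseline mentioned in this abstract can be sketched per frequency bin as below. This is a generic textbook form, not the adaptive system the paper proposes; the function name `wiener_gain`, the power-subtraction estimate of speech power, and the `floor` parameter are assumptions made for illustration.

```python
import numpy as np

def wiener_gain(noisy_power, noise_power, floor=0.05):
    """Per-bin Wiener-style gain from noisy-signal and noise power estimates.

    Hypothetical helper: speech power is estimated by subtracting the noise
    power from the noisy power and clipping at zero (an assumption).
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)

    # Estimated clean-speech power, never negative.
    speech_power = np.maximum(noisy_power - noise_power, 0.0)

    # Gain = speech / (speech + noise), with the denominator guarded against zero.
    gain = speech_power / np.maximum(noisy_power, 1e-12)

    # A gain floor limits how far any bin is suppressed.
    return np.maximum(gain, floor)

if __name__ == "__main__":
    print(wiener_gain([4.0, 1.0, 2.0], [1.0, 1.0, 4.0]))
```

Bins whose gain fluctuates between the floor and larger values from frame to frame are one common source of the processing artifacts associated with this kind of filtering.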

Journal ArticleDOI
TL;DR: A stereo audio delta-sigma A/D converter is implemented to support both the standard pulse-code modulation audio and the direct stream digital (DSD) output format.
Abstract: A stereo audio delta-sigma A/D converter is implemented to support both the standard pulse-code modulation audio and the direct stream digital (DSD) output format. It provides all the standard audio rates up to 192 kHz. A sixth-order, single-bit modulator is employed to achieve the noise performance as well as the bitstream output required by the DSD format. A novel density-modulated dithering scheme is utilized to dramatically reduce the tone level in the signal band without compromising the stability of the high-order modulator. This analog-to-digital converter achieves a dynamic range of 113 dB and a total harmonic distortion plus noise (THD+N) of 105 dB. It is fabricated in a 0.35-μm CMOS process with a die size of 10.5 mm².