
Showing papers in "Journal of the Acoustical Society of America in 1998"



Journal ArticleDOI
TL;DR: The most important parameter of the microperforated panel (MPP) is found to be the perforate constant k, which is proportional to the ratio of the perforation radius to the viscous boundary layer thickness inside the holes, as discussed by the authors; this, together with the relative (to the characteristic acoustic impedance in air) acoustic resistance r and the frequency f0 of maximum absorption of the MPP absorber, determines the entire structure and its frequency characteristics.
Abstract: Many applications have been found for the microperforated panel (MPP) absorber, on which the perforations are reduced to submillimeter size so that they themselves will provide enough acoustic resistance and also sufficiently low acoustic mass reactance necessary for a wide-band sound absorber. The most important parameter of the MPP is found to be the perforate constant k which is proportional to the ratio of the perforation radius to the viscous boundary layer thickness inside the holes. This, together with the relative (to the characteristic acoustic impedance in air) acoustic resistance r and the frequency f0 of maximum absorption of the MPP absorber, decides the entire structure of the MPP absorber and its frequency characteristics. In other words, the MPP absorber may be designed according to the required absorbing characteristics in terms of the parameters k, r, and f0. Formulas and curves are presented toward this end. It is shown that the MPP absorber has tremendous potential for wide-band absorp...
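
For illustration, here is a minimal Python sketch of the kind of design calculation the abstract describes: the perforate constant, a Maa-style normalized panel impedance, and the normal-incidence absorption of an MPP backed by an air cavity. The coefficient values and the default hole size, porosity, and cavity depth are assumptions quoted from the general literature, not taken from this paper.

```python
import numpy as np

def mpp_absorption(f, d=0.4e-3, t=0.4e-3, sigma=0.01, D=0.05,
                   rho=1.21, c=343.0, eta=1.8e-5):
    """Normal-incidence absorption of a microperforated panel backed by an
    air cavity, using Maa-style design formulas (coefficients and defaults
    are assumptions from the general literature, not from this paper).
    f: frequency [Hz]; d, t: hole diameter and panel thickness [m];
    sigma: perforation ratio; D: cavity depth [m]."""
    w = 2.0 * np.pi * np.asarray(f, dtype=float)
    k = d * np.sqrt(w * rho / (4.0 * eta))                # perforate constant
    # normalized acoustic resistance and mass reactance of the perforated panel
    r = (32.0 * eta * t) / (sigma * rho * c * d**2) * (
        np.sqrt(1.0 + k**2 / 32.0) + np.sqrt(2.0) / 32.0 * k * d / t)
    m = (w * t) / (sigma * c) * (
        1.0 + 1.0 / np.sqrt(9.0 + k**2 / 2.0) + 0.85 * d / t)
    # add the reactance of the backing air cavity, then the absorption coefficient
    x = m - 1.0 / np.tan(w * D / c)
    return 4.0 * r / ((1.0 + r)**2 + x**2)

f = np.linspace(100.0, 4000.0, 400)
alpha = mpp_absorption(f)
print("peak absorption %.2f near %.0f Hz" % (alpha.max(), f[alpha.argmax()]))
```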

832 citations


Journal ArticleDOI
TL;DR: A method is presented for using a small number of bandpass filters and banks of parallel comb filters to analyze the tempo of, and extract the beat from, musical signals of arbitrary polyphonic complexity containing arbitrary timbres; the analysis can be used predictively to guess when beats will occur in the future.
Abstract: A method is presented for using a small number of bandpass filters and banks of parallel comb filters to analyze the tempo of, and extract the beat from, musical signals of arbitrary polyphonic complexity and containing arbitrary timbres. This analysis is performed causally, and can be used predictively to guess when beats will occur in the future. Results in a short validation experiment demonstrate that the performance of the algorithm is similar to the performance of human listeners in a variety of musical situations. Aspects of the algorithm are discussed in relation to previous high-level cognitive models of beat tracking.
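
As a toy illustration of the comb-filter idea (not a reimplementation of the paper's multi-band, causal system), the sketch below runs an onset-strength envelope through feedback comb filters tuned to candidate tempos and picks the tempo whose filter resonates most strongly. All parameter values are arbitrary.

```python
import numpy as np

def estimate_tempo(envelope, fs, bpm_range=(60, 180), alpha=0.9):
    """Pick the tempo whose feedback comb filter resonates most strongly
    with an onset-strength envelope (illustrative, single band only)."""
    bpms = np.arange(bpm_range[0], bpm_range[1] + 1)
    energies = []
    for bpm in bpms:
        delay = int(round(fs * 60.0 / bpm))      # lag of one beat period in samples
        y = np.zeros_like(envelope)
        for n in range(len(envelope)):
            fb = y[n - delay] if n >= delay else 0.0
            y[n] = (1 - alpha) * envelope[n] + alpha * fb
        energies.append(np.sum(y ** 2))          # resonance energy for this tempo
    return bpms[int(np.argmax(energies))]

# toy onset envelope: impulses every 0.5 s (120 BPM) at fs = 100 Hz
fs = 100
env = np.zeros(fs * 10)
env[::fs // 2] = 1.0
print(estimate_tempo(env, fs))   # expected near 120 (tempo quantization may shift it a step)
```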

682 citations


Journal ArticleDOI
TL;DR: In this paper, the authors presented synthesis curves for the relationship between DNL and percentage highly annoyed for three transportation noise sources, including aircraft, road traffic, and railway noise, based on all 21 datasets examined by Schultz and Fidell et al. and augmented with 34 datasets.
Abstract: This article presents synthesis curves for the relationship between DNL and percentage highly annoyed for three transportation noise sources. The results are based on all 21 datasets examined by Schultz [J. Acoust. Soc. Am. 64, 377-405 (1978)] and Fidell et al. [J. Acoust. Soc. Am. 89, 221-233 (1991)] for which acceptable DNL and percentage highly annoyed measures could be derived, augmented with 34 datasets. Separate, nonidentical curves were found for aircraft, road traffic, and railway noise. A difference between sources was found using data for all studies combined and for only those studies in which respondents evaluated two sources. The latter outcome strengthens the conclusion that the differences between sources cannot be explained by differences in study methodology.
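
A synthesis curve of this kind can be illustrated with a small curve-fitting sketch. The functional form and the data points below are placeholders only; the paper's actual curves and datasets are not reproduced here.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(dnl, a, b):
    """Percentage highly annoyed as a logistic function of DNL (illustrative form only)."""
    return 100.0 / (1.0 + np.exp(-(dnl - a) / b))

# Placeholder observations -- replace with real (DNL, %HA) pairs for one source.
dnl = np.array([45, 50, 55, 60, 65, 70, 75], dtype=float)
pct_ha = np.array([2, 5, 10, 18, 30, 45, 60], dtype=float)

params, _ = curve_fit(logistic, dnl, pct_ha, p0=(70.0, 8.0))
print("fitted midpoint %.1f dB, slope scale %.1f dB" % tuple(params))
```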

590 citations


Journal ArticleDOI
TL;DR: In this paper, a phase conjugate array was implemented to spatially and temporally refocus an incident acoustic field back to its origin in the Mediterranean Sea by transmitting a 50-ms pulse from the SRT to the SRA, digitizing the received signal and retransmitting the time-reversed signals from all the sources of the SRA.
Abstract: An experiment conducted in the Mediterranean Sea in April 1996 demonstrated that a time-reversal mirror (or phase conjugate array) can be implemented to spatially and temporally refocus an incident acoustic field back to its origin. The experiment utilized a vertical source–receiver array (SRA) spanning 77 m of a 125-m water column with 20 sources and receivers and a single source/receiver transponder (SRT) colocated in range with another vertical receive array (VRA) of 46 elements spanning 90 m of a 145-m water column located 6.3 km from the SRA. Phase conjugation was implemented by transmitting a 50-ms pulse from the SRT to the SRA, digitizing the received signal and retransmitting the time reversed signals from all the sources of the SRA. The retransmitted signal then was received at the VRA. An assortment of runs was made to examine the structure of the focal point region and the temporal stability of the process. The phase conjugation process was extremely robust and stable, and the experimental results were consistent with theory.
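
The refocusing mechanism can be illustrated with a toy one-dimensional simulation, entirely separate from the sea experiment: each array element records the probe pulse through its own multipath channel, time-reverses the recording, and retransmits it through the same channel; by reciprocity the contributions add coherently only at the original source. The channel model and sizes below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# One random sparse multipath impulse response per array element (toy channel model).
n_elements, ir_len = 20, 400
channels = [np.zeros(ir_len) for _ in range(n_elements)]
for h in channels:
    taps = rng.choice(ir_len, size=8, replace=False)
    h[taps] = rng.normal(size=8)

probe = np.hanning(25)                               # short pulse sent from the probe source

# Step 1: the array records the probe pulse through each channel.
received = [np.convolve(probe, h) for h in channels]
# Step 2: each element time-reverses its recording and retransmits it;
# reciprocity means it propagates back through the same channel.
back = [np.convolve(r[::-1], h) for r, h in zip(received, channels)]
# Step 3: contributions add coherently at the original source location.
focus = np.sum(back, axis=0)

print("peak / rms of refocused field: %.1f" %
      (np.max(np.abs(focus)) / np.sqrt(np.mean(focus ** 2))))
```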

568 citations


Journal ArticleDOI
TL;DR: It was found that neither the absolute level of feedback intensity nor the presence of pink masking noise significantly affects the magnitude or latency of the voice F0 response, and the existence of a second F0 response with a longer latency than the first was suggested.
Abstract: Recent studies have shown that when phonating subjects hear their voice pitch feedback shift upward or downward, they respond with a change in voice fundamental frequency (F0) output. Three experiments were performed to improve our understanding of this response and to explore the effects of different stimulus variables on voice F0 responses to pitch-shift stimuli. In experiment 1, it was found that neither the absolute level of feedback intensity nor the presence of pink masking noise significantly affect magnitude or latency of the voice F0 response. In experiment 2, changes in stimulus magnitude led to no systematic differences in response magnitudes or latencies. However, as stimulus magnitude was increased from 25 to 300 cents, the proportion of responses that changed in the direction opposite that of the stimulus (“opposing” response) decreased. A corresponding increase was observed in the proportion of same direction responses (“following” response). In experiment 3, increases in pitch-shift stimulus durations from 20 to 100 ms led to no differences in the F0 response. Durations between 100 and 500 ms led to longer duration voice F0 responses with greater response magnitude, and suggested the existence of a second F0 response with a longer latency than the first.

482 citations


PatentDOI
TL;DR: In this paper, high intensity focused ultrasound (HIFU) is used to form cauterized tissue regions prior to surgical incision, for instance, forming a cauterized tissue shell around a tumor to be removed.
Abstract: Methods and apparatus for enabling substantially bloodless surgery and for stemming hemorrhaging. High intensity focused ultrasound (“HIFU”) is used to form cauterized tissue regions prior to surgical incision, for example, forming a cauterized tissue shell around a tumor to be removed. The procedure is referred to as “presurgical volume cauterization.” In one embodiment, the method is particularly effective for use in surgical lesion removal or resection of tissue having a highly vascularized constitution, such as the liver or spleen, and thus a propensity for hemorrhaging. In further embodiments, methods and apparatus for hemostasis using HIFU are useful in surgical, presurgical, and medical emergency situations. In an apparatus embodiment, a telescoping acoustic coupler is provided such that the depth of focus of the HIFU energy is controllable. In other embodiments, apparatus characterized by portability are demonstrated, useful for emergency medical situations.

434 citations


PatentDOI
TL;DR: An underfluid ultrasound imaging catheter system as discussed by the authors includes a catheter having a distal end inserted into an underfluid structure, an ultrasonic transducer array mounted proximate the distal end of the catheter, wherein the array has a row of individual transducers, and a lens mounted on the array for defocusing ultrasound beams in a direction perpendicular to an axis of the array.
Abstract: An underfluid ultrasound imaging catheter system includes a catheter having a distal end inserted into an underfluid structure, an ultrasonic transducer array mounted proximate the distal end of the catheter, wherein the array has a row of individual transducer crystals, and a lens mounted on the array for defocusing ultrasound beams in a direction perpendicular to an axis of the array so as to provide a volumetric field of view within which the underfluid features are imaged. Alternatively, the single row of transducer crystals is replaced by multiple rows of transducer crystals so as to provide a volumetric field of view. This imaging catheter system helps an operator see 3-dimensional images of an underfluid environment, such as the fluid-filled cavities of the heart, blood vessels, urinary bladder, etc. Features in such a wide volumetric field of view can be imaged, measured, or intervened upon by an underfluid therapeutic device with the aid of the real-time image.

410 citations


Patent
TL;DR: In this paper, a system and method for developing interactive speech applications stores a plurality of dialogue modules in a speech processing system, wherein each dialogue module includes computer readable instructions for accomplishing a predefined interactive dialogue task in an interactive speech application.
Abstract: The disclosed system and method for developing interactive speech applications stores a plurality of dialogue modules in a speech processing system, wherein each dialogue module includes computer readable instructions for accomplishing a predefined interactive dialogue task in an interactive speech application. In response to user input (Figure 7, S1), a subset of the plurality of dialogue modules (Figure 7, 710, 720, 730) are selected to accomplish their respective interactive dialogue tasks and are interconnected in an order defining the call flow of the application (Figure 1, 110-180). A graphical user interface is disclosed, representing the stored plurality of dialogue modules as icons in a graphical display (Figure 7) in which icons are selected in the graphical display in response to user input, the icons for the subset of dialogue modules are graphically interconnected and the interactive speech application is generated based upon the graphical representation.
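
The modular idea can be sketched in a few lines of Python (hypothetical names, not the patent's interfaces): each dialogue module encapsulates one predefined dialogue task, and a call flow is simply an ordered interconnection of such modules.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class DialogueModule:
    """One reusable dialogue task (names and fields are illustrative only)."""
    name: str
    prompt: str
    handler: Callable[[str], str]

    def run(self) -> str:
        answer = input(self.prompt + " ")
        return self.handler(answer)

def run_call_flow(modules: List[DialogueModule]) -> dict:
    """Execute modules in order; the ordering defines the application's call flow."""
    return {m.name: m.run() for m in modules}

flow = [
    DialogueModule("greeting", "Welcome. Say 'agent' or 'balance':", str.lower),
    DialogueModule("confirm", "Please confirm (yes/no):", lambda s: s.strip().lower()),
]
# results = run_call_flow(flow)   # uncomment to run interactively
```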

398 citations


Journal ArticleDOI
TL;DR: In this article, a theoretical and experimental investigation of the head-related transfer function (HRTF) for an ideal rigid sphere was performed, and an algorithm was developed for computing the variation in sound pressure at the surface of the sphere as a function of direction and range.
Abstract: The head-related transfer function (HRTF) varies with range as well as with azimuth and elevation. To better understand its close-range behavior, a theoretical and experimental investigation of the HRTF for an ideal rigid sphere was performed. An algorithm was developed for computing the variation in sound pressure at the surface of the sphere as a function of direction and range to the sound source. The impulse response was also measured experimentally. The results may be summarized as follows. First, the experimental measurements were in close agreement with the theoretical solution. Second, the variation of low-frequency interaural level difference with range is significant for ranges smaller than about five times the sphere radius. Third, the impulse response reveals the source of the ripples observed in the magnitude response, and provides direct evidence that the interaural time difference is not a strong function of range. Fourth, the time delay is well approximated by the well-known ray-tracing formula due to Woodworth and Schlosberg. Finally, except for this time delay, the HRTF for the ideal sphere appears to be minimum-phase, permitting exact recovery of the impulse response from the magnitude response in the frequency domain.
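
The calculation the abstract refers to can be sketched with the classical series solution for a point source and a rigid sphere. In the snippet below, the sign and normalization conventions, the default head radius, and the truncation order are assumptions, so treat the output as illustrative rather than as the paper's exact algorithm.

```python
import numpy as np
from scipy.special import spherical_jn, spherical_yn, eval_legendre

def sphere_hrtf(f, theta_deg, r, a=0.0875, c=343.0, n_terms=60):
    """Pressure on a rigid sphere of radius a due to a point source at
    distance r and angle theta from the measurement point, normalized by
    the free-field pressure at the sphere centre (classical series solution;
    sign/normalization conventions vary between references)."""
    mu = 2.0 * np.pi * f * a / c            # ka
    rho = r / a                             # normalized source distance
    x = np.cos(np.radians(theta_deg))
    total = 0j
    for m in range(n_terms):
        h_far = spherical_jn(m, mu * rho) + 1j * spherical_yn(m, mu * rho)
        dh = (spherical_jn(m, mu, derivative=True)
              + 1j * spherical_yn(m, mu, derivative=True))
        total += (2 * m + 1) * eval_legendre(m, x) * h_far / dh
    return (rho / mu) * np.exp(-1j * mu * rho) * total

for theta in (0, 90, 180):
    H = sphere_hrtf(1000.0, theta, r=1.0)
    print("theta=%3d deg  |H| = %.2f" % (theta, abs(H)))
```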

385 citations


Journal ArticleDOI
TL;DR: It is suggested that age-related factors other than peripheral hearing loss contribute to temporal processing deficits of elderly listeners.
Abstract: Measures of monaural temporal processing and binaural sensitivity were obtained from 12 young (mean age = 26.1 years) and 12 elderly (mean age = 70.9 years) adults with clinically normal hearing (pure-tone thresholds < or = 20 dB HL from 250 to 6000 Hz). Monaural temporal processing was measured by gap detection thresholds. Binaural sensitivity was measured by interaural time difference (ITD) thresholds. Gap and ITD thresholds were obtained at three sound levels (4, 8, or 16 dB above individual threshold). Subjects were also tested on two measures of speech perception, a masking level difference (MLD) task, and a syllable identification/discrimination task that included phonemes varying in voice onset time (VOT). Elderly listeners displayed poorer monaural temporal analysis (higher gap detection thresholds) and poorer binaural processing (higher ITD thresholds) at all sound levels. There were significant interactions between age and sound level, indicating that the age difference was larger at lower stimulus levels. Gap detection performance was found to correlate significantly with performance on the ITD task for young, but not elderly adult listeners. Elderly listeners also performed more poorly than younger listeners on both speech measures; however, there was no significant correlation between psychoacoustic and speech measures of temporal processing. Findings suggest that age-related factors other than peripheral hearing loss contribute to temporal processing deficits of elderly listeners.

Journal ArticleDOI
TL;DR: This collection of essays explores neural network applications in signal and image processing, function estimation, robotics and control, associative memories, and electrical and optical networks.

Journal ArticleDOI
TL;DR: Results indicate that audibility cannot adequately explain the speech recognition of many hearing-impaired listeners, and suggest that for people with severe or profound losses at the high frequencies, amplification should achieve only a low or zero sensation level in this region, contrary to the implications of the unmodified SII.
Abstract: Two experiments were conducted to examine the relationship between audibility and speech recognition for individuals with sensorineural hearing losses ranging from mild to profound degrees. Speech scores measured using filtered sentences were compared to predictions based on the Speech Intelligibility Index (SII). The SII greatly overpredicted performance at high sensation levels, and for many listeners, it underpredicted performance at low sensation levels. To improve predictive accuracy, the SII needed to be modified. Scaling the index by a multiplicative proficiency factor was found to be inappropriate, and alternative modifications were explored. The data were best fitted using a method that combined the standard level distortion factor (which accounted for decrease in speech intelligibility at high presentation levels based on measurements of normal-hearing people) with individual frequency-dependent proficiency. This method was evaluated using broadband sentences and nonsense syllables tests. Results indicate that audibility cannot adequately explain speech recognition of many hearing-impaired listeners. Considerable variations from audibility-based predictions remained, especially for people with severe losses listening at high sensation levels. The data suggest that, contrary to the basis of the SII, information contained in each frequency band is not strictly additive. The data also suggest that for people with severe or profound losses at the high frequencies, amplification should only achieve a low or zero sensation level at this region, contrary to the implications of the unmodified SII.
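
To make the role of audibility concrete, here is a heavily simplified sketch of an SII-style calculation: per-band audibility (assuming speech peaks 15 dB above the rms spectrum and a 30-dB dynamic range) weighted by band importance. It omits the level-distortion factor and the frequency-dependent proficiency factor discussed in the paper, and every number is a placeholder.

```python
import numpy as np

def sii(speech_spectrum_db, noise_spectrum_db, threshold_db, band_importance):
    """Very simplified Speech Intelligibility Index: per-band audibility
    (speech peaks assumed 15 dB above the rms band level, 30-dB dynamic
    range above the masked threshold) weighted by band importance."""
    effective_floor = np.maximum(noise_spectrum_db, threshold_db)
    audibility = np.clip((speech_spectrum_db + 15.0 - effective_floor) / 30.0, 0.0, 1.0)
    return float(np.sum(band_importance * audibility))

# illustrative 4-band example (values are placeholders, not calibrated data)
speech = np.array([55.0, 50.0, 45.0, 40.0])
noise = np.array([30.0, 35.0, 45.0, 50.0])
thresh = np.array([20.0, 20.0, 25.0, 40.0])
imp = np.array([0.25, 0.30, 0.25, 0.20])
print("SII ~ %.2f" % sii(speech, noise, thresh, imp))
```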

Journal ArticleDOI
TL;DR: The best cochlear implant user showed performance with the CIS strategy, in quiet and in noise, similar to that of normal-hearing listeners listening to correspondingly spectrally degraded speech, suggesting that the noise susceptibility of cochlear implant users is at least partly due to the loss of spectral resolution.
Abstract: Current multichannel cochlear implant devices provide high levels of speech performance in quiet. However, performance deteriorates rapidly with increasing levels of background noise. The goal of this study was to investigate whether the noise susceptibility of cochlear implant users is primarily due to the loss of fine spectral information. Recognition of vowels and consonants was measured as a function of signal-to-noise ratio in four normal-hearing listeners in conditions simulating cochlear implants with both CIS and SPEAK-like strategies. Six conditions were evaluated: 3-, 4-, 8-, and 16-band processors (CIS-like), a 6/20 band processor (SPEAK-like), and unprocessed speech. Recognition scores for vowels and consonants decreased as the S/N level worsened in all conditions, as expected. Phoneme recognition threshold (PRT) was defined as the S/N at which the recognition score fell to 50% of its level in quiet. The unprocessed speech had the best PRT, which worsened as the number of bands decreased. Recognition of vowels and consonants was further measured in three Nucleus-22 cochlear implant users using either their normal SPEAK speech processor or a custom processor with a four-channel CIS strategy. The best cochlear implant user showed similar performance with the CIS strategy in quiet and in noise to that of normal-hearing listeners when listening to correspondingly spectrally degraded speech. These findings suggest that the noise susceptibility of cochlear implant users is at least partly due to the loss of spectral resolution. Efforts to improve the effective number of spectral information channels should improve implant performance in noise.
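
Noise-band simulations of this sort are commonly built as channel vocoders. The sketch below is a generic version (band edges, filter orders, and envelope extraction are assumptions, not the processing used in the study): the signal is split into a few bands, each band's envelope modulates band-limited noise, and the bands are summed.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocoder(x, fs, n_bands=4, lo=200.0, hi=6000.0):
    """Noise-band simulation of a CIS-like processor: split the input into
    log-spaced bands, take each band's envelope, and use it to modulate
    band-limited noise (a common simulation technique; parameters are
    illustrative, not the paper's exact processing)."""
    edges = np.logspace(np.log10(lo), np.log10(hi), n_bands + 1)
    rng = np.random.default_rng(0)
    out = np.zeros_like(x)
    for f1, f2 in zip(edges[:-1], edges[1:]):
        sos = butter(4, [f1, f2], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)
        env = np.abs(hilbert(band))                           # band envelope
        carrier = sosfiltfilt(sos, rng.normal(size=len(x)))   # band-limited noise
        out += env * carrier
    return out

fs = 16000
t = np.arange(fs) / fs
test = np.sin(2 * np.pi * 440 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 3 * t))
print(noise_vocoder(test, fs).shape)
```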

Journal ArticleDOI
TL;DR: There was a clear pattern in the results suggesting that as the degree of hearing loss at a given frequency increased beyond 55 dB HL, the efficacy of providing additional audibility to that frequency region was diminished, especially when this degree of hearing loss was present at frequencies of 4000 Hz and above.
Abstract: The present study was a systematic investigation of the benefit of providing hearing-impaired listeners with audible high-frequency speech information. Five normal-hearing and nine high-frequency hearing-impaired listeners identified nonsense syllables that were low-pass filtered at a number of cutoff frequencies. As a means of quantifying audibility for each condition, Articulation Index (AI) was calculated for each condition for each listener. Most hearing-impaired listeners demonstrated an improvement in speech recognition as additional audible high-frequency information was provided. In some cases for more severely impaired listeners, increasing the audibility of high-frequency speech information resulted in no further improvement in speech recognition, or even decreases in speech recognition. A new measure of how well hearing-impaired listeners used information within specific frequency bands called "efficiency" was devised. This measure compared the benefit of providing a given increase in speech audibility to a hearing-impaired listener to the benefit observed in normal-hearing listeners for the same increase in speech audibility. Efficiencies were calculated using the old AI method and the new AI method (which takes into account the effects of high speech presentation levels). There was a clear pattern in the results suggesting that as the degree of hearing loss at a given frequency increased beyond 55 dB HL, the efficacy of providing additional audibility to that frequency region was diminished, especially when this degree of hearing loss was present at frequencies of 4000 Hz and above. A comparison of analyses from the "old" and "new" AI procedures suggests that some, but not all, of the deficiencies of speech recognition in these listeners was due to high presentation levels.

Journal ArticleDOI
TL;DR: Integration modeling results suggested that speechreading and AV integration training could be useful for some individuals, potentially providing as much as 26% improvement in AV consonant recognition.
Abstract: Factors leading to variability in auditory-visual (AV) speech recognition include the subject's ability to extract auditory (A) and visual (V) signal-related cues, the integration of A and V cues, and the use of phonological, syntactic, and semantic context. In this study, measures of A, V, and AV recognition of medial consonants in isolated nonsense syllables and of words in sentences were obtained in a group of 29 hearing-impaired subjects. The test materials were presented in a background of speech-shaped noise at 0-dB signal-to-noise ratio. Most subjects achieved substantial AV benefit for both sets of materials relative to A-alone recognition performance. However, there was considerable variability in AV speech recognition both in terms of the overall recognition score achieved and in the amount of audiovisual gain. To account for this variability, consonant confusions were analyzed in terms of phonetic features to determine the degree of redundancy between A and V sources of information. In addition, a measure of integration ability was derived for each subject using recently developed models of AV integration. The results indicated that (1) AV feature reception was determined primarily by visual place cues and auditory voicing + manner cues, (2) the ability to integrate A and V consonant cues varied significantly across subjects, with better integrators achieving more AV benefit, and (3) significant intra-modality correlations were found between consonant measures and sentence measures, with AV consonant scores accounting for approximately 54% of the variability observed for AV sentence recognition. Integration modeling results suggested that speechreading and AV integration training could be useful for some individuals, potentially providing as much as 26% improvement in AV consonant recognition.
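
The paper relies on formal models of audiovisual integration; as a much simpler reference point, the snippet below computes the AV score expected if the auditory and visual channels were used independently (probability summation). This is only a benchmark against which measured AV benefit can be compared, not one of the integration models used in the study.

```python
def av_independent_prediction(p_a, p_v):
    """Benchmark auditory-visual proportion correct assuming independent use
    of the two channels (probability summation); a reference point only."""
    return p_a + p_v - p_a * p_v

print(av_independent_prediction(0.45, 0.30))   # 0.615 predicted AV proportion correct
```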

Journal ArticleDOI
TL;DR: In this paper, a simplified expression is presented for predicting the absorption of sound in sea water, retaining the essential dependence on temperature, pressure, salinity, and acidity of the more complicated formula on which it is based.
Abstract: A simplified expression is presented for predicting the absorption of sound in sea water, retaining the essential dependence on temperature, pressure, salinity, and acidity of the more complicated formula on which it is based [R. E. Francois and G. R. Garrison, “Sound absorption based on ocean measurements. Part II: Boric acid contribution and equation for total absorption,” J. Acoust. Soc. Am. 72, 1879–1890 (1982)]. The accuracy of the simplified formula is demonstrated by comparison with the original one for a range of oceanographic conditions and frequencies between 100 Hz and 1 MHz.
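
For orientation, the sketch below implements a simplified seawater absorption formula of the same boric-acid plus magnesium-sulphate plus pure-water form. The specific coefficients are quoted from the commonly cited version of the simplified expression and should be verified against the paper before use.

```python
import numpy as np

def seawater_absorption_db_per_km(f_khz, T=10.0, S=35.0, z_km=0.0, pH=8.0):
    """Simplified seawater absorption: boric acid + magnesium sulphate +
    pure-water contributions. Coefficients follow the widely quoted form of
    the simplified expression; check them against the paper before relying
    on them. f_khz in kHz, T in deg C, S in ppt, z_km depth in km."""
    fsq = f_khz ** 2
    f1 = 0.78 * np.sqrt(S / 35.0) * np.exp(T / 26.0)   # boric acid relaxation frequency [kHz]
    f2 = 42.0 * np.exp(T / 17.0)                       # MgSO4 relaxation frequency [kHz]
    boric = 0.106 * (f1 * fsq) / (fsq + f1 ** 2) * np.exp((pH - 8.0) / 0.56)
    mgso4 = 0.52 * (1 + T / 43.0) * (S / 35.0) * (f2 * fsq) / (fsq + f2 ** 2) * np.exp(-z_km / 6.0)
    water = 0.00049 * fsq * np.exp(-(T / 27.0 + z_km / 17.0))
    return boric + mgso4 + water

for f in (0.1, 1.0, 10.0, 100.0):
    print("%7.1f kHz : %10.4f dB/km" % (f, seawater_absorption_db_per_km(f)))
```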

Patent
Leon Lumelsky
TL;DR: In this paper, an information signal content authoring system is presented, which includes a speech analyzer, responsive to a spoken utterance signal provided by a narrator, generating a speech signal representative of one or more prosodic parameters associated with the narrator.
Abstract: An information signal content authoring system is provided. The authoring system includes a speech analyzer, responsive to a spoken utterance signal provided by a narrator. The spoken utterance signal is representative of information available to the narrator. The speech analyzer generates a speech signal representative of one or more prosodic parameters associated with the narrator. A text-to-speech converter, responsive to a text signal representative of the information available to the narrator, generates a phonetic representation signal from the text signal and synthesizes a speech signal from the text signal. The text-to-speech converter also generates one or more prosodic parameters from the text signal. A spectrum comparator, operatively coupled to the speech analyzer and the text-to-speech converter, compares the spectral parameters of the speech signal generated by the speech analyzer to the speech signal synthesized by the converter and generates a variance signal indicative of a spectral distance between the two speech signals. The variance signal is provided to the text-to-speech converter to adjust the prosodic parameters. An output portion, operatively coupled to the text-to-speech converter, outputs the phonetic representation signal and the prosodic parameters from the converter as a composite encoded signal representative of information content available to the narrator. The output portion further preferably includes an editor, responsive to editing commands issued by the narrator, for editing at least a portion of the composite encoded signal.

Journal ArticleDOI
TL;DR: In this paper, a novel exact mixed displacement–pressure (ū, p) formulation is presented, which derives directly from Biot's poroelasticity equations and has the form of a classical coupled fluid-structure problem involving the dynamic equations of the skeleton in vacuo and the equivalent fluid in the rigid-skeleton limit.
Abstract: Recently, finite element models based on Biot’s displacement (ū, Ū) formulation for poroelastic materials have been extensively used to predict the acoustical and structural behavior of multilayer structures. These models, while accurate, lead to large frequency-dependent matrices for three-dimensional problems, necessitating significant setup time, computer storage, and solution time. In this paper, a novel exact mixed displacement–pressure (ū, p) formulation is presented. The formulation derives directly from Biot’s poroelasticity equations. It has the form of a classical coupled fluid-structure problem involving the dynamic equations of the skeleton in vacuo and the equivalent fluid in the rigid-skeleton limit. The governing (ū, p) equations and their weak integral form are given together with the coupling conditions with acoustic media. The numerical implementation of the presented approach in a finite element code is discussed. Examples are presented to show the accuracy and effectiveness of the presented formulation.

Journal ArticleDOI
TL;DR: A class of cochlear models which account for much of the characteristic variation with frequency of human otoacoustic emissions and hearing threshold microstructure is presented and successfully describes in particular the characteristic quasiperiodic frequency variations (fine structures) of the hearing threshold.
Abstract: A class of cochlear models which account for much of the characteristic variation with frequency of human otoacoustic emissions and hearing threshold microstructure is presented. The models are based upon wave reflections via distributed spatial cochlear inhomogeneities and tall and broad cochlear activity patterns, as suggested by Zweig and Shera [J. Acoust. Soc. Am. 98, 2018–2047 (1995)]. They successfully describe in particular the following features: (1) the characteristic quasiperiodic frequency variations (fine structures) of the hearing threshold, synchronous and click-evoked emissions, distortion-product emissions, and spontaneous emissions; (2) the relationships between these fine structures; and (3) the distortion product emission filter shape. All of the characteristic frequency spacings are approximately the same (0.4 bark) and are mainly determined by the phase behavior of the apical reflection function. The frequency spacings for spontaneous emissions and threshold microstructure are predicted to be the same, but some deviations from these values are predicted for synchronous and click-evoked and distortion-product emissions. The analysis of the models is aided considerably by the use of apical- and basal-moving solutions (basis functions) of the cochlear wave equation in the absence of inhomogeneities.

Journal ArticleDOI
TL;DR: The performance of two techniques, dynamic time warping (DTW) and hidden Markov models (HMMs), is compared for automated recognition of bird song units from continuous recordings; the DTW-based technique gives excellent to satisfactory performance but requires careful selection of templates under challenging conditions.
Abstract: The performance of two techniques is compared for automated recognition of bird song units from continuous recordings. The advantages and limitations of dynamic time warping (DTW) and hidden Markov models (HMMs) are evaluated on a large database of male songs of zebra finches (Taeniopygia guttata) and indigo buntings (Passerina cyanea), which have different types of vocalizations and have been recorded under different laboratory conditions. Depending on the quality of recordings and complexity of song, the DTW-based technique gives excellent to satisfactory performance. Under challenging conditions such as noisy recordings or presence of confusing short-duration calls, good performance of the DTW-based technique requires careful selection of templates that may demand expert knowledge. Because HMMs are trained, equivalent or even better performance of HMMs can be achieved based only on segmentation and labeling of constituent vocalizations, albeit with many more training examples than DTW templates. One weakness in HMM performance is the misclassification of short-duration vocalizations or song units with more variable structure (e.g., some calls, and syllables of plastic songs). To address these and other limitations, new approaches for analyzing bird vocalizations are discussed.
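
Of the two techniques, DTW is the easier to show in a few lines. The sketch below is the textbook dynamic-programming recurrence on toy feature sequences, not the paper's full template-matching pipeline (which operates on spectral features of real song units).

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two feature sequences
    (frames x features) with unit step costs -- the basic algorithm only."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# toy example: the same "syllable" at two tempos should match itself more
# closely than it matches a different pattern
syll_a = np.column_stack([np.sin(2 * np.pi * 3 * np.linspace(0, 1, 50))])
syll_a_slow = np.column_stack([np.sin(2 * np.pi * 3 * np.linspace(0, 1, 80))])
syll_b = np.column_stack([np.sign(np.sin(2 * np.pi * 7 * np.linspace(0, 1, 50)))])
print(dtw_distance(syll_a, syll_a_slow) < dtw_distance(syll_a, syll_b))  # expected True
```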

Journal ArticleDOI
TL;DR: It is shown here how to extend the time reversal process to aberrating and absorbing layers, like the skull bone, located at any distance from the array of transducers.
Abstract: The time-reversal process is applied to focus pulsed ultrasonic waves through the human skull bone. The aim here is to treat brain tumors, which are difficult to reach with classical surgery means. Such a surgical application requires precise control of the size and location of the therapeutic focal beam. The severe ultrasonic attenuation in the skull reduces the efficiency of the time reversal process. Nevertheless, an improvement of the time reversal process in absorbing media has been investigated and applied to the focusing through the skull [J.-L. Thomas and M. Fink, IEEE Trans. Ultrason. Ferroelectr. Freq. Control 43, 1122–1129 (1996)]. Here an extension of this technique is presented in order to focus on a set of points surrounding an initial artificial source implanted in the tissue volume to treat. From the knowledge of the Green’s function matched to this initial source location a new Green’s function matched to various points of interest is deduced in order to treat the whole volume. In a homog...

Journal ArticleDOI
TL;DR: It is concluded that for a clinical application and prediction of the hearing threshold, DPs should be measured not only at high, but also at lower primary tone levels.
Abstract: The 2 f1-f2 distortion product otoacoustic emission (DP) was measured in 20 normal hearing subjects and 15 patients with moderate cochlear hearing loss and compared to the pure-tone hearing threshold, measured with the same probe system at the f2 frequencies. DPs were elicited over a wide primary tone level range between L2 = 20 and 65 dB SPL. With decreasing L2, the L1-L2 primary tone level difference was continuously increased according to L1 = 0.4L2 + 39 dB, to account for differences of the primary tone responses at the f2 place. Above 1.5 kHz, DPs were measurable with that paradigm on average within 10 dB of the average hearing threshold in both subject groups. The growth of the DP was compressive in normal hearing subjects, with strong saturation at moderate primary tone levels. In cases of cochlear impairment, reductions of the DP level were greatest at lowest, but smallest at highest stimulus levels, such that the growth of the DP became linearized. The correlation of the DP level to the hearing threshold was found to depend on the stimulus level. Maximal correlations were found in impaired ears at moderate primary tone levels around L2 = 45 dB SPL, but at lowest stimulus levels in normal hearing (L2 = 25 dB SPL). At these levels, 17/20 impaired ears and 14/15 normally hearing ears showed statistically significant correlations. It is concluded that for a clinical application and prediction of the hearing threshold, DPs should be measured not only at high, but also at lower primary tone levels.

PatentDOI
TL;DR: In this article, a communications earpiece is described which can be used by hearing impaired and non-hearing impaired users so as to be able to communicate with an external device such as personal communications node or cellular phone via a wireless link.
Abstract: A communications earpiece (20) is disclosed which can be used by hearing impaired and non-hearing impaired users so as to be able to communicate with an external device such as personal communications node or cellular phone (110) via a wireless link. A communications earpiece comprises an ear canal tube (22) sized for positioning in an ear canal of a user so that the ear canal is at least partially open for directly receiving ambient sounds. A sound processor (32) amplifies the received ambient sounds to produce a processed analog signal. The processed analog signal is then converted into digital signals (37) and transmitted to a remote unit via a wireless link. The earpiece also receives signals from the remote unit which are then processed and applied to a speaker (36) in the earpiece.

Journal ArticleDOI
TL;DR: In this paper, the authors measured the sound produced by individual snapping shrimp in a small cage located 1 m from an H-52 broadband hydrophone and calculated the acoustic power produced by a typical snap.
Abstract: Snapping shrimp are among the major sources of biological noise in shallow bays, harbors, and inlets, in temperate and tropical waters. Snapping shrimp sounds can severely limit the use of underwater acoustics by humans and may also interfere with the transmission and reception of sounds by other animals such as dolphins, whales, and pinnipeds. The shrimp produce sounds by rapidly closing one of their frontal chelae (claws), snapping the ends together to generate a loud click. The acoustics of the species Synalpheus paraneomeris was studied by measuring the sound produced by individual shrimp housed in a small cage located 1 m from an H-52 broadband hydrophone. Ten clicks from 40 specimens were digitized at a 1-MHz sample rate and the data stored on computer disk. A low-frequency precursor signature was observed; this previously unreported signature may be associated with a “plunger” structure which directs a jet of water forward of the claw during a snap. The peak-to-peak sound pressure level and energy flux density at 1 m (source level and source energy flux density) varied linearly with claw size and body length. Peak-to-peak source levels varied from 183 to 189 dB re: 1 μPa. The acoustic power produced by a typical snap was calculated to be about 3 W. A typical spectrum of a click had a low-frequency peak between 2 and 5 kHz and energy extending out to 200 kHz. The spectrum of a click is very broad with only a 20-dB difference between the peak and minimum amplitudes across 200 kHz. A physical model of the snapping mechanism is used to estimate the velocity, acceleration, and force produced by a shrimp closing its claws.
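
As a back-of-the-envelope check on the order of magnitude, the snippet below converts a peak-to-peak source level into a radiated-power estimate assuming spherical spreading and a roughly sinusoidal pulse. The paper reports about 3 W for a typical snap; this crude estimate only shows that a few watts is a plausible figure, and it is not the paper's calculation.

```python
import numpy as np

def snap_power_estimate(spl_pp_db_re_1uPa_at_1m, rho_c=1.5e6):
    """Order-of-magnitude acoustic power of a snap from its peak-to-peak
    source level, assuming spherical spreading and a roughly sinusoidal
    pulse (p_rms ~ p_pp / (2*sqrt(2))). A crude estimate only."""
    p_pp = 1e-6 * 10 ** (spl_pp_db_re_1uPa_at_1m / 20.0)   # Pa, peak-to-peak at 1 m
    p_rms = p_pp / (2.0 * np.sqrt(2.0))
    intensity = p_rms ** 2 / rho_c                          # W/m^2 at 1 m
    return 4.0 * np.pi * 1.0 ** 2 * intensity               # W radiated over a 1-m sphere

print("%.1f W" % snap_power_estimate(186.0))   # mid-range source level -> a few watts
```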

PatentDOI
TL;DR: In this paper, an apparatus and method in a video conference system provides accurate determination of the position of a speaking participant by measuring the difference in arrival times of a sound originating from the speaking participant, using as few as four microphones in a 3D configuration.
Abstract: An apparatus and method in a video conference system provides accurate determination of the position of a speaking participant by measuring the difference in arrival times of a sound originating from the speaking participant, using as few as four microphones in a 3-dimensional configuration. In one embodiment, a set of simultaneous equations relating the position of the sound source and each microphone and relating to the distance of each microphone to each other are solved off-line and programmed into a host computer. In one embodiment, the set of simultaneous equations provide multiple solutions and the median of such solutions is picked as the final position. In another embodiment, an average of the multiple solutions is provided as the final position.
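
A generic way to recover a position from measured arrival-time differences (not necessarily the closed-form solution described in the patent) is nonlinear least squares on the TDOA equations, sketched below with a hypothetical four-microphone layout and noiseless measurements.

```python
import numpy as np
from scipy.optimize import least_squares

C = 343.0  # speed of sound, m/s

mics = np.array([[0.0, 0.0, 0.0],
                 [1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0],
                 [0.0, 0.0, 1.0]])           # four microphones in a 3-D arrangement

def tdoa_residuals(source, mics, measured_tdoa):
    """Residuals between measured and predicted arrival-time differences
    (each microphone relative to microphone 0)."""
    dists = np.linalg.norm(mics - source, axis=1)
    predicted = (dists[1:] - dists[0]) / C
    return predicted - measured_tdoa

true_source = np.array([2.0, 1.5, 0.8])
d = np.linalg.norm(mics - true_source, axis=1)
measured = (d[1:] - d[0]) / C                # noiseless TDOAs for the demo

# A reasonable initial guess helps avoid the mirror solutions of the hyperbolae.
fit = least_squares(tdoa_residuals, x0=np.array([1.0, 1.0, 1.0]),
                    args=(mics, measured))
print(np.round(fit.x, 3))                    # should recover ~[2.0, 1.5, 0.8]
```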

Journal ArticleDOI
TL;DR: The spatial variation in scala tympani pressure indicated that the pressure is composed of two modes, which can be identified with fast and slow waves, and the impedance was found to be tuned in frequency.
Abstract: Intracochlear pressure was measured in vivo in the base of the gerbil cochlea. The measurements were made over a wide range of frequencies simultaneously in scalae vestibuli and tympani. Pressure was measured just adjacent to the stapes in scala vestibuli and at a number of positions spaced by tens of micrometers, including a position within several micrometers of the basilar membrane, in scala tympani. Two findings emerged from the basic results. First, the spatial variation in scala tympani pressure indicated that the pressure is composed of two modes, which can be identified with fast and slow waves. Second, at frequencies between 2 and 46 kHz (the upper frequency limit of the measurements) the scala vestibuli pressure adjacent to the stapes had a gain of approximately 30 dB with respect to the pressure in the ear canal, and a phase which decreased linearly with frequency. Thus, over these frequencies the middle ear and its termination in the cochlea operate as a frequency independent transmission line. A subset of the data was analyzed further to derive the velocity of the basilar membrane, the pressure difference across the organ of Corti complex (defined to include the tectorial and basilar membranes) and the specific acoustic impedance of the organ of Corti complex. The impedance was found to be tuned in frequency.

Journal ArticleDOI
TL;DR: In this paper, the authors used acoustic attenuation spectra to determine resonance frequencies of shell-encapsulated gas bubbles with a bulk modulus of 700 kPa and found that when exposed to hydrostatic overpressures mimicking those found in vivo during the systolic heart cycle, the resonance frequency increased, as expected from the particles' increased stiffness.
Abstract: Nycomed’s ultrasound contrast agent NC100100 has been investigated by in vitro acoustic measurements. Acoustic attenuation spectra were used to determine resonance frequencies of the particles. The spectra were correlated with size distributions, and it was found that the shell‐encapsulated gas bubbles can be described as viscoelastic particles with bulk modulus 700 kPa. When exposed to hydrostatic overpressures mimicking those found in vivo during the systolic heart cycle, the resonance frequency increased, as expected by the particles’ increased stiffness. This effect was reversible: After the pressure was released, the particles went back to giving the original attenuation spectrum. This shows that the particles are not destroyed or otherwise changed by the pressure. Acoustic backscatter measured as a function of distance through a contrast agent was used to estimate the backscatter efficiency of the particles, that is, the ratio between scattered and absorbed ultrasound. Results from these measurements agree with theoretical estimates based on the attenuation spectra. Measurements on NC100100 were compared with earlier results from measurements on Albunex® and measurements on an experimental polymer‐encapsulated contrast agent, showing how different shell materials cause differences in particle stability and stiffness.

PatentDOI
TL;DR: In this article, a system and method of operating an automatic speech recognition service using a client-server architecture is used to make ASR services accessible at a client location remote from the location of the main ASR engine.
Abstract: A system and method of operating an automatic speech recognition service using a client-server architecture is used to make ASR services accessible at a client location remote from the location of the main ASR engine. The present invention utilizes client-server communications over a packet network, such as the Internet, where the ASR server receives a grammar from the client, receives information representing speech from the client, performs speech recognition, and returns information based upon the recognized speech to the client.

PatentDOI
Hsiao-Wuen Hon, Dong Li, Xuedong Huang, Yun-Chen Ju, Xianghui Sean Zhang
TL;DR: A computer-implemented system and method of proofreading text in a computer system, as discussed by the authors, includes receiving text from a user into a text editing module; at least a portion of the text is converted to an audio signal upon detection of an indicator, the indicator defining a boundary in the text either by being embodied therein or by delays in receiving text.
Abstract: A computer implemented system and method of proofreading text in a computer system includes receiving text from a user into a text editing module. At least a portion of the text is converted to an audio signal upon the detection of an indicator, the indicator defining a boundary in the text by either being embodied therein or comprising delays in receiving text. The audio signal is played through a speaker to the user to provide feedback.