
Showing papers in "Journal of the Acoustical Society of America in 2002"


Journal ArticleDOI
TL;DR: An algorithm is presented for the estimation of the fundamental frequency (F0) of speech or musical sounds, based on the well-known autocorrelation method with a number of modifications that combine to prevent errors.
Abstract: An algorithm is presented for the estimation of the fundamental frequency (F0) of speech or musical sounds. It is based on the well-known autocorrelation method with a number of modifications that combine to prevent errors. The algorithm has several desirable features. Error rates are about three times lower than the best competing methods, as evaluated over a database of speech recorded together with a laryngograph signal. There is no upper limit on the frequency search range, so the algorithm is suited for high-pitched voices and music. The algorithm is relatively simple and may be implemented efficiently and with low latency, and it involves few parameters that must be tuned. It is based on a signal model (periodic signal) that may be extended in several ways to handle various forms of aperiodicity that occur in particular applications. Finally, interesting parallels may be drawn with models of auditory processing.
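
As an illustration only (this is not the paper's algorithm), the baseline autocorrelation estimator that the method builds on can be sketched in a few lines of Python/NumPy; the published algorithm adds several modifications to this baseline to prevent the errors mentioned above.

    import numpy as np

    def estimate_f0_autocorrelation(x, fs, fmin=60.0, fmax=1000.0):
        # Plain autocorrelation F0 estimate: pick the lag of the strongest
        # autocorrelation peak inside the search range and convert it to Hz.
        x = x - np.mean(x)
        r = np.correlate(x, x, mode="full")[len(x) - 1:]   # r[k] for lags k >= 0
        lag_min = int(fs / fmax)                            # shortest candidate period
        lag_max = min(int(fs / fmin), len(r) - 1)           # longest candidate period
        if lag_max <= lag_min:
            return None
        lag = lag_min + int(np.argmax(r[lag_min:lag_max + 1]))
        return fs / lag

    # Example: a 220 Hz tone sampled at 16 kHz is recovered to within one lag step.
    fs = 16000
    t = np.arange(0, 0.1, 1.0 / fs)
    print(estimate_f0_autocorrelation(np.sin(2 * np.pi * 220 * t), fs))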

1,975 citations


PatentDOI
TL;DR: In this paper, an ultrasonic dissection and coagulation system for surgical use is described, comprising an ultrasonic instrument, a control module, and a remote actuator.
Abstract: An ultrasonic dissection and coagulation system for surgical use is provided. The system includes an ultrasonic instrument, a control module, and a remote actuator. The ultrasonic instrument has a housing and an elongated body portion extending from the housing. An ultrasonic transducer supported within the housing is operatively connected to a cutting jaw by a vibration coupler. The vibration coupler conducts high frequency vibration from the ultrasonic transducer to the cutting jaw. The cutting jaw has a blade surface which is curved downwardly and outwardly in the distal direction with respect to the longitudinal axis of the elongated body portion along its length such that an angle defined by a line drawn tangent to the blade surface and the longitudinal axis of the elongated body portion varies between 5 degrees and 75 degrees. A clamp member having a tissue contact surface is positioned adjacent to the cutting jaw and is movable from an open position in which the tissue contact surface is spaced from the blade surface to a clamped position in which the tissue contact surface is in close juxtaposed alignment with the blade surface to clamp tissue therebetween. The clamp member and the curved cutting jaw combine to enhance contact between tissue and the blade surface of the cutting jaw during cutting. Further, the continuously varying angle of the blade surface with respect to the longitudinal axis of the elongated body portion facilitates selective user control over the application of force on tissue during a cutting or dissecting procedure.

608 citations


Journal ArticleDOI
TL;DR: A model in which the acoustic speech signal is processed to yield a discrete representation of the speech stream in terms of a sequence of segments, each of which is described by a set (or bundle) of binary distinctive features.
Abstract: This article describes a model in which the acoustic speech signal is processed to yield a discrete representation of the speech stream in terms of a sequence of segments, each of which is described by a set (or bundle) of binary distinctive features. These distinctive features specify the phonemic contrasts that are used in the language, such that a change in the value of a feature can potentially generate a new word. This model is a part of a more general model that derives a word sequence from this feature representation, the words being represented in a lexicon by sequences of feature bundles. The processing of the signal proceeds in three steps: (1) Detection of peaks, valleys, and discontinuities in particular frequency ranges of the signal leads to identification of acoustic landmarks. The type of landmark provides evidence for a subset of distinctive features called articulator-free features (e.g., [vowel], [consonant], [continuant]). (2) Acoustic parameters are derived from the signal near the landmarks to provide evidence for the actions of particular articulators, and acoustic cues are extracted by sampling selected attributes of these parameters in these regions. The selection of cues that are extracted depends on the type of landmark and on the environment in which it occurs. (3) The cues obtained in step (2) are combined, taking context into account, to provide estimates of “articulator-bound” features associated with each landmark (e.g., [lips], [high], [nasal]). These articulator-bound features, combined with the articulator-free features in (1), constitute the sequence of feature bundles that forms the output of the model. Examples of cues that are used, and justification for this selection, are given, as well as examples of the process of inferring the underlying features for a segment when there is variability in the signal due to enhancement gestures (recruited by a speaker to make a contrast more salient) or due to overlap of gestures from neighboring segments.
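
As a rough, hypothetical illustration of step (1) only (the paper's actual landmark detectors are not reproduced here), one could flag abrupt rises or falls of energy in a frequency band as candidate landmark times; the band edges and threshold below are arbitrary placeholders.

    import numpy as np

    def candidate_landmarks(x, fs, band=(800.0, 2500.0), frame_s=0.01, jump_db=9.0):
        # Frame-by-frame energy in one band; abrupt changes mark landmark candidates.
        hop = int(frame_s * fs)
        n_frames = len(x) // hop
        freqs = np.fft.rfftfreq(hop, 1.0 / fs)
        in_band = (freqs >= band[0]) & (freqs <= band[1])
        energy_db = np.empty(n_frames)
        for i in range(n_frames):
            spectrum = np.abs(np.fft.rfft(x[i * hop:(i + 1) * hop])) ** 2
            energy_db[i] = 10.0 * np.log10(np.sum(spectrum[in_band]) + 1e-12)
        jumps = np.abs(np.diff(energy_db)) >= jump_db
        return np.flatnonzero(jumps) * frame_s   # candidate landmark times in seconds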

514 citations


Patent
TL;DR: A hand-held ultrasound system includes integrated electronics within an ergonomic housing, as described in this paper. The electronics communicate with a host computer over an industry-standard high-speed serial bus, allowing a user to gather ultrasonic data on a standard computing device such as a PC and to employ the gathered data via an independent external application without requiring a custom system, expensive hardware modifications, or system rebuilds.
Abstract: A hand-held ultrasound system includes integrated electronics within an ergonomic housing. The electronics include control circuitry, beamforming circuitry, and transducer drive circuitry. The electronics communicate with a host computer using an industry standard high speed serial bus. The ultrasonic imaging system is operable on a standard, commercially available, user computing device without specific hardware modifications, and is adapted to interface with an external application without modification to the ultrasonic imaging system to allow a user to gather ultrasonic data on a standard user computing device such as a PC, and employ the data so gathered via an independent external application without requiring a custom system, expensive hardware modifications, or system rebuilds. An integrated interface program allows such ultrasonic data to be invoked by a variety of such external applications having access to the integrated interface program via a standard, predetermined platform such as Visual Basic or C++.

503 citations


PatentDOI
TL;DR: In this article, the authors present a method for simultaneous use of ultrasound on a probe for imaging and therapeutic purposes, where the probe limits the effects of undesirable interference noise in a display by synchronizing high intensity focused ultrasound (HIFU) waves with an imaging transducer to cause the noise to be displayed in an area of the image that does not overlap the treatment site.
Abstract: Method and apparatus for the simultaneous use of ultrasound on a probe for imaging and therapeutic purposes. The probe limits the effects of undesirable interference noise in a display by synchronizing high intensity focused ultrasound (HIFU) waves with an imaging transducer to cause the noise to be displayed in an area of the image that does not overlap the treatment site. In one embodiment, the HIFU is first energized at a low power level that does not cause tissue damage, so that the focal point of the HIFU can be identified by a change in the echogenicity of the tissue caused by the HIFU. Once the focal point is properly targeted on a desired treatment site, the power level is increased to a therapeutic level. The location of each treatment site is stored and displayed to the user to enable a plurality of spaced-apart treatment sites to be achieved. As the treatment progresses, any changes in the treatment site can be seen in the real time, noise-free image. A preferred application of the HIFU waves is to cause lesions in blood vessels, so that the supply of nutrients and oxygen to a region, such as a tumor, is interrupted. The tumor will thus eventually be destroyed. In a preferred embodiment, the HIFU is used to treat disorders of the female reproductive system, such as uterine fibroids. The HIFU treatment can be repeated at spaced-apart intervals, until any remaining fibroid tissue is destroyed.

491 citations


Journal ArticleDOI
TL;DR: An overview of the acoustics of friction is presented by covering friction sounds, friction-induced vibrations and waves in solids, and descriptions of other frictional phenomena related to acoustics.
Abstract: This article presents an overview of the acoustics of friction by covering friction sounds, friction-induced vibrations and waves in solids, and descriptions of other frictional phenomena related to acoustics. Friction, resulting from the sliding contact of solids, often gives rise to diverse forms of waves and oscillations within solids which frequently lead to radiation of sound to the surrounding media. Among the many everyday examples of friction sounds, violin music and brake noise in automobiles represent the two extremes in terms of the sounds they produce and the mechanisms by which they are generated. Of the multiple examples of friction sounds in nature, insect sounds are prominent. Friction also provides a means by which energy dissipation takes place at the interface of solids. Friction damping that develops between surfaces, such as joints and connections, in some cases requires only microscopic motion to dissipate energy. Modeling of friction-induced vibrations and friction damping in mechanical systems requires an accurate description of friction for which only approximations exist. While many of the components that contribute to friction can be modeled, computational requirements become prohibitive for their contemporaneous calculation. Furthermore, quantification of friction at the atomic scale still remains elusive. At the atomic scale, friction becomes a mechanism that converts the kinetic energy associated with the relative motion of surfaces to thermal energy. However, the description of the conversion to thermal energy represented by a disordered state of oscillations of atoms in a solid is still not well understood. At the macroscopic level, friction interacts with the vibrations and waves that it causes. Such interaction sets up a feedback between the friction force and waves at the surfaces, thereby making friction and surface motion interdependent. Such interdependence forms the basis for friction-induced motion as in the case of ultrasonic motors and other examples. Last, when considered phenomenologically, friction and boundary layer turbulence exhibit analogous properties and, when compared, each may provide clues to a better understanding of the other.

481 citations


PatentDOI
TL;DR: In this article, a distributed voice user interface system includes a local device which receives speech input issued from a user, such speech input may specify a command or a request by the user, and the local device performs preliminary processing of the speech input and determines whether it is able to respond to the command or request by itself.
Abstract: A distributed voice user interface system includes a local device which receives speech input issued from a user. Such speech input may specify a command or a request by the user. The local device performs preliminary processing of the speech input and determines whether it is able to respond to the command or request by itself. If not, the local device initiates communication with a remote system for further processing of the speech input.
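
A minimal sketch of the local-first dispatch described in the abstract; the names local_commands and remote_process are hypothetical, and a real device would run a speech recognizer before this step.

    def handle_request(text, local_commands, remote_process):
        # Respond locally if the device can handle the command or request by itself;
        # otherwise initiate communication with the remote system.
        key = text.strip().lower()
        if key in local_commands:
            return local_commands[key]()
        return remote_process(text)

    # Example: handle_request("volume up", {"volume up": lambda: "ok"}, print)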

441 citations


PatentDOI
TL;DR: In this article, a medical imaging and navigation system including a processor, a display unit, a database, a medical positioning system (MPS), a two-dimensional imaging system, an inspected organ monitor interface, and a superimposing processor is described.
Abstract: Medical imaging and navigation system including a processor, a display unit, a database, a medical positioning system (MPS), a two-dimensional imaging system, an inspected organ monitor interface, and a superimposing processor, the MPS including a transducer MPS sensor and a surgical tool MPS sensor, the two-dimensional imaging system including an imaging transducer, the processor being connected to the display unit, to the database, to the MPS, to the two-dimensional imaging system, to the inspected organ monitor interface, and to the superimposing processor, the inspected organ monitor interface being connected to an organ monitor, the surgical tool MPS sensor being firmly attached to a surgical tool, the transducer MPS sensor being firmly attached to the imaging transducer, the organ monitor monitoring an organ timing signal associated with an inspected organ.

428 citations


PatentDOI
TL;DR: In this article, a single transducer is configured such that when connected to the subsystems, the therapy subsystem can generate high power acoustic energy to heat the treatment region, and the temperature monitoring subsystem can map and monitor the temperature of the treatment region and display the temperature on the display, all through the use of the single transducer.
Abstract: An ultrasonic system useful for providing imaging, therapy and temperature monitoring generally comprises an acoustic transducer assembly configured to enable the ultrasound system to perform the imaging, therapy and temperature monitoring functions. The acoustic transducer assembly comprises a single transducer that is operatively connected to an imaging subsystem, a therapy subsystem and a temperature monitoring subsystem. The ultrasound system may also include a display for imaging and temperature monitoring functions. An exemplary single transducer is configured such that when connected to the subsystems, the imaging subsystem can generate an image of a treatment region on the display, the therapy subsystem can generate high power acoustic energy to heat the treatment region, and the temperature monitoring subsystem can map and monitor the temperature of the treatment region and display the temperature on the display, all through the use of the single transducer. Additionally, the acoustic transducer assembly can be configured to provide three-dimensional imaging, temperature monitoring or therapeutic heating through the use of adaptive algorithms and/or rotational or translational movement. Moreover, a plurality of the exemplary single transducers can be provided to facilitate enhanced treatment capabilities.

405 citations


Journal ArticleDOI
TL;DR: The virtual auditory space technique was used to quantify the relative strengths of interaural time difference (ITD), Interaural level difference (ILD), and spectral cues in determining the perceived lateral angle of wideband, low-pass, and high-pass noise bursts.
Abstract: The virtual auditory space technique was used to quantify the relative strengths of interaural time difference (ITD), interaural level difference (ILD), and spectral cues in determining the perceived lateral angle of wideband, low-pass, and high-pass noise bursts. Listeners reported the apparent locations of virtual targets that were presented over headphones and filtered with listeners' own directional transfer functions. The stimuli were manipulated by delaying or attenuating the signal to one ear (by up to 600 μs or 20 dB) or by altering the spectral cues at one or both ears. Listener weighting of the manipulated cues was determined by examining the resulting localization response biases. In accordance with the Duplex Theory defined for pure tones, listeners gave high weight to ITD and low weight to ILD for low-pass stimuli, and high weight to ILD for high-pass stimuli. Most (but not all) listeners gave low weight to ITD for high-pass stimuli. This weight could be increased by amplitude-modulating the stimuli or reduced by lengthening stimulus onsets. For wideband stimuli, the ITD weight was greater than or equal to that given to ILD. Manipulations of monaural spectral cues and the interaural level spectrum had little influence on lateral angle judgements.

405 citations


Journal ArticleDOI
TL;DR: The phenomenon of super-resolution in time-reversal acoustics is analyzed theoretically and with numerical simulations that confirm the theory.
Abstract: The phenomenon of super-resolution in time-reversal acoustics is analyzed theoretically and with numerical simulations. A signal that is recorded and then retransmitted by an array of transducers, propagates back though the medium, and refocuses approximately on the source that emitted it. In a homogeneous medium, the refocusing resolution of the time-reversed signal is limited by diffraction. When the medium has random inhomogeneities the resolution of the refocused signal can in some circumstances beat the diffraction limit. This is super-resolution. A theoretical treatment of this phenomenon is given, and numerical simulations which confirm the theory are presented.
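
The refocusing idea can be illustrated with a monochromatic phase-conjugation sketch in a homogeneous medium, with a simplified exp(ikr)/r kernel standing in for the true Green's function; this shows only diffraction-limited refocusing, not the random-medium super-resolution analyzed in the paper.

    import numpy as np

    def back_propagated_field(array_pts, source_pt, grid_pts, k):
        # Record the field from a point source at the array, phase-conjugate it,
        # and re-emit it toward a grid of candidate focal points.
        def kernel(r):
            return np.exp(1j * k * r) / np.maximum(r, 1e-9)
        recorded = kernel(np.linalg.norm(array_pts - source_pt, axis=1))
        field = np.zeros(len(grid_pts), dtype=complex)
        for n, element in enumerate(array_pts):
            field += np.conj(recorded[n]) * kernel(np.linalg.norm(grid_pts - element, axis=1))
        return np.abs(field)   # peaks near the original source location

    # Example: a 64-element line array refocusing on a source 50 wavelengths away.
    k = 2 * np.pi                      # wavelength = 1 in these units
    array_pts = np.stack([np.linspace(-8, 8, 64), np.zeros(64)], axis=1)
    grid_pts = np.stack([np.linspace(-3, 3, 301), np.full(301, 50.0)], axis=1)
    focus = np.argmax(back_propagated_field(array_pts, np.array([0.0, 50.0]), grid_pts, k))
    print(grid_pts[focus])             # close to (0, 50)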

Journal ArticleDOI
TL;DR: The minimum standard deviations achievable for concurrent estimates of thresholds and psychometric function slopes as well as the optimal target values for adaptive procedures are calculated as functions of stimulus level and track length on the basis of the binomial theory.
Abstract: The minimum standard deviations achievable for concurrent estimates of thresholds and psychometric function slopes as well as the optimal target values for adaptive procedures are calculated as functions of stimulus level and track length on the basis of the binomial theory. The optimum pair of targets for a concurrent estimate is found at the correct response probabilities p1 = 0.19 and p2 = 0.81 for the logistic psychometric function. An adaptive procedure that converges at these optimal targets is introduced and tested with Monte Carlo simulations. The efficiency increases rapidly when each subject's response consists of more than one statistically independent Bernoulli trial. Sentence intelligibility tests provide more than one Bernoulli trial per sentence when each word is scored separately. The number of within-sentence trials can be quantified by the j factor [Boothroyd and Nittrouer, J. Acoust. Soc. Am. 84, 101-114 (1988)]. The adaptive procedure was evaluated with 10 normal-hearing and 11 hearing-impaired listeners using two German sentence tests that differ in j factors. The expected advantage of the sentence test with the higher j factor was not observed, possibly due to training effects. Hence, the number of sentences required for a reliable speech reception threshold (approximately 1 dB standard deviation) concurrently with a slope estimate (approximately 20%-30% relative standard deviation) is at least N = 30 if word scoring for short, meaningful sentences (j approximately 2) is performed.
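
The convergence idea can be illustrated with a simpler rule than the paper's procedure: a weighted up-down track whose step sizes satisfy p·(down step) = (1 − p)·(up step) is stationary where the probability of a correct response equals the target p, as the Monte Carlo sketch below shows for a logistic psychometric function.

    import numpy as np

    def weighted_up_down(p_target, threshold, slope, n_trials=400, step_down=1.0, seed=0):
        # Step sizes chosen so the expected level change is zero where
        # P(correct) = p_target on the logistic psychometric function.
        rng = np.random.default_rng(seed)
        step_up = step_down * p_target / (1.0 - p_target)
        level, track = 0.0, []
        for _ in range(n_trials):
            p_correct = 1.0 / (1.0 + np.exp(-slope * (level - threshold)))
            if rng.random() < p_correct:
                level -= step_down        # correct response: make the task harder
            else:
                level += step_up          # incorrect response: make it easier
            track.append(level)
        return np.mean(track[n_trials // 2:])   # crude estimate of the converged level

    # Tracks targeting p = 0.81 and p = 0.19 land on opposite sides of the 50% point.
    print(weighted_up_down(0.81, threshold=0.0, slope=1.0),
          weighted_up_down(0.19, threshold=0.0, slope=1.0))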

Patent
TL;DR: In this paper, a variable multi-dimensional apodization control (200) for an ultrasonic transducer array (202) is disclosed. The apodization control is applicable to both piezoelectric-based and MUT-based transducers.
Abstract: Variable multi-dimensional apodization control (200) for an ultrasonic transducer array (202) is disclosed. The variable multi-dimensional apodization control (200) is applicable to both piezoelectric-based and MUT-based transducers and allows control of the apodization profile of an ultrasonic transducer array (202) having elements arranged in more than one dimension.
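
The patent does not specify a particular weighting; as a generic illustration, a separable two-dimensional apodization profile (here a Hann taper in each dimension) for an array with elements arranged in more than one dimension could be computed as follows.

    import numpy as np

    def apodization_profile(n_rows, n_cols):
        # One weight per element of an n_rows x n_cols array (separable Hann taper).
        return np.outer(np.hanning(n_rows), np.hanning(n_cols))

    # Example: weights = apodization_profile(8, 64); weights[i, j] scales element (i, j).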

Journal ArticleDOI
TL;DR: The results suggest that, in these conditions, the advantage due to spatial separation of sources is greater for informational masking than for energetic masking.
Abstract: The effect of spatial separation of sources on the masking of a speech signal was investigated for three types of maskers, ranging from energetic to informational. Normal-hearing listeners performed a closed-set speech identification task in the presence of a masker at various signal-to-noise ratios. Stimuli were presented in a quiet sound field. The signal was played from 0° azimuth and a masker was played either from the same location or from 90° to the right. Signals and maskers were derived from sentences that were preprocessed by a modified cochlear-implant simulation program that filtered each sentence into 15 frequency bands, extracted the envelopes from each band, and used these envelopes to modulate pure tones at the center frequencies of the bands. In each trial, the signal was generated by summing together eight randomly selected frequency bands from the preprocessed signal sentence. Three maskers were derived from the preprocessed masker sentences: (1) different-band sentence, which was generated by summing together six randomly selected frequency bands out of the seven bands not present in the signal (resulting in primarily informational masking); (2) different-band noise, which was generated by convolving the different-band sentence with Gaussian noise; and (3) same-band noise, which was generated by summing the same eight bands from the preprocessed masker sentence that were used in the signal sentence and convolving the result with Gaussian noise (resulting in primarily energetic masking). Results revealed that in the different-band sentence masker, the effect of spatial separation averaged 18 dB (at 51% correct), while in the different-band and same-band noise maskers the effect was less than 10 dB. These results suggest that, in these conditions, the advantage due to spatial separation of sources is greater for informational masking than for energetic masking.
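
A compressed sketch of the band-envelope-on-tone processing described above; FFT band-pass filters and an FFT-based envelope stand in for whatever filters the modified cochlear-implant simulation actually used.

    import numpy as np

    def envelope(signal):
        # Magnitude of the analytic signal, computed via the FFT.
        n = len(signal)
        spectrum = np.fft.fft(signal)
        h = np.zeros(n)
        h[0] = 1.0
        if n % 2 == 0:
            h[n // 2] = 1.0
            h[1:n // 2] = 2.0
        else:
            h[1:(n + 1) // 2] = 2.0
        return np.abs(np.fft.ifft(spectrum * h))

    def tone_vocode(x, fs, band_edges):
        # Band-pass each band, extract its envelope, and use the envelope to
        # modulate a pure tone at the band's center frequency.
        spectrum = np.fft.rfft(x)
        freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
        t = np.arange(len(x)) / fs
        bands = []
        for lo, hi in zip(band_edges[:-1], band_edges[1:]):
            band = np.fft.irfft(np.where((freqs >= lo) & (freqs < hi), spectrum, 0.0), n=len(x))
            carrier = np.sin(2 * np.pi * 0.5 * (lo + hi) * t)
            bands.append(envelope(band) * carrier)
        return bands   # e.g., sum a random subset of 8 of 15 bands to build a stimulus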

Journal ArticleDOI
TL;DR: Simulations and experiments indicate that microbubbles translate significant distances with clinically relevant parameters, and radiation force should be considered during an ultrasonic examination.
Abstract: High-speed photography of insonified bubbles with a time resolution of 10 ns allows observations of translation due to radiation force, in addition to the visualization of radial oscillations. A modified version of the Rayleigh-Plesset equation is used to estimate the radius-time behavior of insonified microbubbles, and the accuracy of this model is verified experimentally. The translation of insonified microbubbles is calculated using a differential equation relating the acceleration of the bubble to the forces due to acoustic radiation and the drag imposed by the fluid. Simulations and experiments indicate that microbubbles translate significant distances with clinically relevant parameters. A 1.5 micron radius contrast agent can translate over 5 microns during a single 20-cycle, 2.25 MHz, 380 kPa acoustic pulse, achieving velocities over 0.5 m/s. Therefore, radiation force should be considered during an ultrasonic examination because of the possibility of influencing the position and flow velocity of the contrast agents with the interrogating acoustic beam.
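
For orientation only, the sketch below integrates the standard, shell-free Rayleigh-Plesset equation for the radial response of a small bubble to a sinusoidal drive; the paper's modified equation, the radiation-force/drag translation model, and the 380 kPa pulse are not reproduced here, and the drive level is kept low so this toy fixed-step integrator stays well-behaved.

    import numpy as np

    def rp_rhs(state, t, R0=1.5e-6, p0=101325.0, rho=998.0, sigma=0.072,
               mu=1.0e-3, kappa=1.07, pa=20e3, f=2.25e6):
        # Standard Rayleigh-Plesset equation with a polytropic gas core:
        # rho*(R*R'' + 1.5*R'^2) = pg(R) - p0 - pa*sin(2*pi*f*t) - 2*sigma/R - 4*mu*R'/R
        R, Rd = state
        pg = (p0 + 2.0 * sigma / R0) * (R0 / R) ** (3.0 * kappa)
        p_drive = pa * np.sin(2.0 * np.pi * f * t)
        Rdd = ((pg - p0 - p_drive - 2.0 * sigma / R - 4.0 * mu * Rd / R) / rho
               - 1.5 * Rd ** 2) / R
        return np.array([Rd, Rdd])

    # Fixed-step RK4 over about six acoustic cycles (accuracy is not guaranteed).
    dt, n_steps = 1.0 / (2.25e6 * 500), 3000
    state, t, radii = np.array([1.5e-6, 0.0]), 0.0, []
    for _ in range(n_steps):
        k1 = rp_rhs(state, t)
        k2 = rp_rhs(state + 0.5 * dt * k1, t + 0.5 * dt)
        k3 = rp_rhs(state + 0.5 * dt * k2, t + 0.5 * dt)
        k4 = rp_rhs(state + dt * k3, t + dt)
        state, t = state + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0, t + dt
        radii.append(state[0])
    print(min(radii), max(radii))   # radial excursion of the driven bubble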

Patent
TL;DR: In this paper, a transverse mode ultrasonic probe is provided which creates a cavitation area along its longitudinal length, increasing the working surface of the probe, and accessory sheaths are also provided for use with the probe to enable a user to select from features most suited to an individual medical procedure.
Abstract: A transverse mode ultrasonic probe is provided which creates a cavitation area along its longitudinal length, increasing the working surface of the probe. Accessory sheaths are also provided for use with the probe to enable a user to select from features most suited to an individual medical procedure. The sheaths provide acoustic enhancing and aspiration enhancing properties, and/or can be used as surgical tools or as medical access devices, protecting tissue from physical contact with the probe.

Journal ArticleDOI
TL;DR: Comparison between NAQ and its counterpart among the conventional time-domain parameters, the closing quotient, shows that the proposed parameter is more robust against distortion, such as measurement noise, that makes the extraction of conventional time-based parameters of the glottal flow problematic.
Abstract: Normalized amplitude quotient (NAQ) is presented as a method to parametrize the glottal closing phase using two amplitude-domain measurements from waveforms estimated by inverse filtering. In this technique, the ratio between the amplitude of the ac flow and the negative peak amplitude of the flow derivative is first computed using the concept of equivalent rectangular pulse, a hypothetical signal located at the instant of the main excitation of the vocal tract. This ratio is then normalized with respect to the length of the fundamental period. Comparison between NAQ and its counterpart among the conventional time-domain parameters, the closing quotient, shows that the proposed parameter is more robust against distortion such as measurement noise that make the extraction of conventional time-based parameters of the glottal flow problematic. Experiments with breathy, normal, and pressed vowels indicate that NAQ is also able to separate the type of phonation effectively.
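
A single-cycle sketch of the quotient as described above; the equivalent-rectangular-pulse derivation is not reproduced, and the input is assumed to be one period of an inverse-filtered glottal flow waveform.

    import numpy as np

    def normalized_amplitude_quotient(flow_cycle, fs, f0):
        # NAQ = (AC flow amplitude / negative peak of the flow derivative) / period.
        d_flow = np.diff(flow_cycle) * fs            # flow derivative, per second
        f_ac = flow_cycle.max() - flow_cycle.min()   # peak-to-peak (AC) flow amplitude
        d_peak = -d_flow.min()                       # magnitude of the negative peak
        period = 1.0 / f0                            # length of the fundamental period
        return (f_ac / d_peak) / period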

Patent
TL;DR: In this article, an ultrasonic surgical instrument is described which incorporates an articulating shearing end-effector and a substantially solid ultrasonic waveguide that connects the ultrasonic transducer to the end effector.
Abstract: A therapeutic ultrasound instrument is described for cutting, dissecting, or cauterizing tissue. Ultrasonic vibrations, when transmitted to organic tissue at suitable energy levels and using a suitable end-effector, may be used for the safe and effective treatment of many medical conditions. An ultrasonic surgical instrument is described which incorporates an articulating shearing end-effector. The instrument comprises an ultrasonic transducer, an ultrasonically activatable end-effector, and a substantially solid ultrasonic waveguide that connects the ultrasonic transducer to the end-effector. The waveguide comprises a transmission section extending from the transducer to a fixed node, and an articulation section extending from the fixed node to a pivoting node. The end-effector includes a bifurcated waveguide segment. A handle is adapted to hold the transducer. An outer sheath extends from the handle to the end-effector and surrounds the waveguide. An actuation trigger is rotatably positioned on the handle, and an actuation arm extends from a distal end of the actuation trigger to the pivoting node. Such instruments are particularly suited for use in minimally invasive procedures, such as endoscopic or laparoscopic procedures.

Journal ArticleDOI
TL;DR: An investigation of the extent to which naturally produced clear speech is an effective intelligibility enhancement strategy for non-native listeners found that clear speech is only beneficial to listeners with extensive experience with the sound structure of the target language.
Abstract: Previous work has established that naturally produced clear speech is more intelligible than conversational speech for adult hearing-impaired listeners and normal-hearing listeners under degraded listening conditions. The major goal of the present study was to investigate the extent to which naturally produced clear speech is an effective intelligibility enhancement strategy for non-native listeners. Thirty-two non-native and 32 native listeners were presented with naturally produced English sentences. Factors that varied were speaking style (conversational versus clear), signal-to-noise ratio (-4 versus -8 dB) and talker (one male versus one female). Results showed that while native listeners derived a substantial benefit from naturally produced clear speech (an improvement of about 16 rau units on a keyword-correct count), non-native listeners exhibited only a small clear speech effect (an improvement of only 5 rau units). This relatively small clear speech effect for non-native listeners is interpreted as a consequence of the fact that clear speech is essentially native-listener oriented, and therefore is only beneficial to listeners with extensive experience with the sound structure of the target language.

PatentDOI
TL;DR: In this article, the authors propose a speech recognition technique for video and audio signals that consists of processing a video signal associated with an arbitrary content video source, processing an audio signal associated to the video signal, and recognizing at least a portion of the processed audio signal using at least the processed video signal to generate output signal representative of the audio signal.
Abstract: Techniques for providing speech recognition comprise the steps of processing a video signal associated with an arbitrary content video source, processing an audio signal associated with the video signal, and recognizing at least a portion of the processed audio signal, using at least a portion of the processed video signal, to generate an output signal representative of the audio signal.

PatentDOI
Martin Shetter
TL;DR: In this paper, a virtual keyboard displayed on a touch sensitive screen allows a user to do touch-typing thereon to enter textual data into a computer; the keyboard is semi-transparently displayed over a background image, with individual keys shown with shaded edges so that they can be easily distinguished from features in the background image.
Abstract: A virtual keyboard displayed on a touch sensitive screen allows a user to do touch-typing thereon to enter textual data into a computer. The keyboard image has a standard key layout for typewriting, and the keys are sized to allow the fingers of the user to take the positions necessary for “ten-finger” touch-typing in the standard fashion. The virtual keyboard image is semi-transparently displayed over a background image, with the individual keys shown with shaded edges so that they can be easily distinguished from features in the background image. When a key is touched, a sound is generated. The sound generated when the touch is away from a target portion of the key is different from the sound generated when the touch is on or adjacent to the target portion of the key, thereby providing audio feedback to enable the user to adjust finger positions to maintain proper alignment with the virtual keys.

Journal ArticleDOI
TL;DR: The algorithm is designed to minimize the error between measured signals and signals calculated from the reconstructed image and requires less time and instrumentation for signal acquisition than conventional methods using filtered backprojection.
Abstract: Optoacoustic imaging is based on the generation of thermoelastic stress waves by heating an object in an optically heterogeneous medium with a short laser pulse. The stress waves contain information about the distribution of structures with preferential optical absorption. Detection of the waves with an array of broadband ultrasound detectors at the surface of the medium and applying a backprojection algorithm is used to create a map of absorbed energy inside the medium. With conventional reconstruction methods a large number of detector elements and filtering of the signals are necessary to reduce backprojection artifacts. As an alternative this study proposes an iterative procedure. The algorithm is designed to minimize the error between measured signals and signals calculated from the reconstructed image. In experiments using broadband optical ultrasound detectors and in simulations the algorithm was used to obtain three-dimensional images of multiple optoacoustic sources. With signals from a planar ar...
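
As a generic stand-in for the iterative idea (not the paper's exact algorithm): given a linear forward model A that maps an absorbed-energy image to detector signals, the image can be updated repeatedly to shrink the misfit between measured and predicted signals, for example with a Landweber/gradient iteration.

    import numpy as np

    def iterative_reconstruction(A, measured, n_iter=200, step=None):
        # Minimize ||A x - measured||^2 by gradient descent starting from x = 0.
        if step is None:
            step = 1.0 / np.linalg.norm(A, 2) ** 2   # stable step from the largest singular value
        x = np.zeros(A.shape[1])
        for _ in range(n_iter):
            residual = measured - A @ x              # measured minus predicted signals
            x = x + step * (A.T @ residual)
        return x

    # Usage: image = iterative_reconstruction(A, measured_signals) for any forward
    # matrix A (n_detector_samples x n_voxels) and measured signal vector.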

Journal ArticleDOI
TL;DR: While vowel intelligibility was significantly higher in clear speech than in conversational speech for the YNH listeners, no clear speech advantage was found for the EHI group, which suggests that hearing loss alters the way acoustic cues are used for identifying vowels.
Abstract: Several studies have demonstrated that when talkers are instructed to speak clearly, the resulting speech is significantly more intelligible than speech produced in ordinary conversation. These speech intelligibility improvements are accompanied by a wide variety of acoustic changes. The current study explored the relationship between acoustic properties of vowels and their identification in clear and conversational speech, for young normal-hearing (YNH) and elderly hearing-impaired (EHI) listeners. Monosyllabic words excised from sentences spoken either clearly or conversationally by a male talker were presented in 12-talker babble for vowel identification. While vowel intelligibility was significantly higher in clear speech than in conversational speech for the YNH listeners, no clear speech advantage was found for the EHI group. Regression analyses were used to assess the relative importance of spectral target, dynamic formant movement, and duration information for perception of individual vowels. For both listener groups, all three types of information emerged as primary cues to vowel identity. However, the relative importance of the three cues for individual vowels differed greatly for the YNH and EHI listeners. This suggests that hearing loss alters the way acoustic cues are used for identifying vowels.

Journal ArticleDOI
TL;DR: Though listeners' judgments of sound source distance are found to consistently and exponentially underestimate true distance, the perceptual weight assigned to two primary distance cues varies substantially as a function of both sound source type and angular position.
Abstract: In most naturally occurring situations, multiple acoustic properties of the sound reaching a listener's ears change as sound source distance changes. Because many of these acoustic properties, or cues, can be confounded with variation in the acoustic properties of the source and the environment, the perceptual processes subserving distance localization likely combine and weight multiple cues in order to produce stable estimates of sound source distance. Here, this cue-weighting process is examined psychophysically, using a method of virtual acoustics that allows precise measurement and control of the acoustic cues thought to be salient for distance perception in a representative large-room environment. Though listeners' judgments of sound source distance are found to consistently and exponentially underestimate true distance, the perceptual weight assigned to two primary distance cues (intensity and direct-to-reverberant energy ratio) varies substantially as a function of both sound source type (noise and speech) and angular position (0 degrees and 90 degrees relative to the median plane). These results suggest that the cue-weighting process is flexible, and able to adapt to individual distance cues that vary as a result of source properties and environmental conditions.
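
One simple way to estimate such weights (not necessarily the paper's exact analysis) is to regress log judged distance on the two cue values across trials and read the weights off the fitted coefficients.

    import numpy as np

    def estimate_cue_weights(intensity_db, dr_ratio_db, judged_distance_m):
        # Least-squares fit: log(judged distance) ~ w1*intensity + w2*(direct-to-reverberant ratio) + bias.
        X = np.column_stack([intensity_db, dr_ratio_db, np.ones(len(intensity_db))])
        coeffs, *_ = np.linalg.lstsq(X, np.log(judged_distance_m), rcond=None)
        return coeffs[0], coeffs[1]   # weights on the intensity and D/R cues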

Journal ArticleDOI
TL;DR: In this paper, threshold ITDs were measured using three types of stimuli: (1) low-frequency pure tones; (2) 100% sinusoidally amplitude-modulated (SAM) high-frequency tones; and (3) special, “transposed” high-frequency stimuli whose envelopes were designed to provide the high-frequency channels with information similar to that available in low-frequency channels.
Abstract: It is well-known that thresholds for ongoing interaural temporal disparities (ITDs) at high frequencies are larger than threshold ITDs obtained at low frequencies. These differences could reflect true differences in the binaural mechanisms that mediate performance. Alternatively, as suggested by Colburn and Esquissaud [J. Acoust. Soc. Am. Suppl. 1 59, S23 (1976)], they could reflect differences in the peripheral processing of the stimuli. In order to investigate this issue, threshold ITDs were measured using three types of stimuli: (1) low-frequency pure tones; (2) 100% sinusoidally amplitude-modulated (SAM) high-frequency tones, and (3) special, “transposed” high-frequency stimuli whose envelopes were designed to provide the high-frequency channels with information similar to that available in low-frequency channels. The data and their interpretation can be characterized by two general statements. First, threshold ITDs obtained with the transposed stimuli were generally smaller than those obtained with SAM tones and, at modulation frequencies of 128 and 64 Hz, were equal to or smaller than threshold ITDs obtained with their low-frequency pure-tone counterparts. Second, quantitative analyses revealed that the data could be well accounted for via a model based on normalized interaural correlations computed subsequent to known stages of peripheral auditory processing augmented by low-pass filtering of the envelopes within the high-frequency channels of each ear. The data and the results of the quantitative analyses appear to be consistent with the general ideas comprising Colburn and Esquissaud’s hypothesis.

Journal ArticleDOI
TL;DR: Dutch subjects listening to German and English speech, with reasonable to excellent proficiency in these languages, were found to require a 1-7 dB better speech-to-noise ratio than native listeners to obtain 50% sentence intelligibility.
Abstract: The intelligibility of speech pronounced by non-native talkers is generally lower than speech pronounced by native talkers, especially under adverse conditions, such as high levels of background noise. The effect of foreign accent on speech intelligibility was investigated quantitatively through a series of experiments involving voices of 15 talkers, differing in language background, age of second-language (L2) acquisition and experience with the target language (Dutch). Overall speech intelligibility of L2 talkers in noise is predicted with a reasonable accuracy from accent ratings by native listeners, as well as from the self-ratings for proficiency of L2 talkers. For non-native speech, unlike native speech, the intelligibility of short messages (sentences) cannot be fully predicted by phoneme-based intelligibility tests. Although incorrect recognition of specific phonemes certainly occurs as a result of foreign accent, the effect of reduced phoneme recognition on the intelligibility of sentences may range from severe to virtually absent, depending on (for instance) the speech-to-noise ratio. Objective acoustic-phonetic analyses of accented speech were also carried out, but satisfactory overall predictions of speech intelligibility could not be obtained with relatively simple acoustic-phonetic measures.

Journal ArticleDOI
TL;DR: An efficient recursive algorithm, the stiffness matrix method, has been developed for wave propagation in multilayered generally anisotropic media and it is shown both numerically and analytically that for a thick structure the solution approaches the solution for a semispace.
Abstract: An efficient recursive algorithm, the stiffness matrix method, has been developed for wave propagation in multilayered generally anisotropic media. This algorithm has the computational efficiency and simplicity of the standard transfer matrix method and is unconditionally computationally stable for high frequency and layer thickness. In this algorithm, the stiffness (compliance) matrix is calculated for each layer and recursively applied to generate a stiffness (compliance) matrix for a layered system. Next, reflection and transmission coefficients are calculated for layered media bounded by liquid or solid semispaces. The results show that the method is stable for arbitrary number and thickness of layers and the computation time is proportional to the number of layers. It is shown both numerically and analytically that for a thick structure the solution approaches the solution for a semispace. This algorithm is easily adaptable to laminates with periodicity, such as multiangle lay-up composites. The repetition and symmetry of the unit cell are naturally incorporated in the recursive scheme. As an example the angle beam time domain pulse reflections from fluid-loaded multilayered composites have been computed and compared with experiment. Based on this method, characteristic equations for Lamb waves and Floquet waves in periodic media have also been determined.
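
A schematic NumPy version of the recursive step, under the convention that each layer's stiffness matrix maps the displacements at its top and bottom surfaces to the tractions there, partitioned into 2x2 blocks; enforcing continuity of traction and displacement at the interface between the upper stack A and the next layer B yields the combined matrix. The block algebra below follows from that convention, and the paper's own notation and sign choices may differ.

    import numpy as np

    def combine_stiffness(KA, KB):
        # KA, KB: (2n x 2n) stiffness matrices of the upper stack and the next layer,
        # in block form [[K11, K12], [K21, K22]] relating (top, bottom) displacements
        # to (top, bottom) tractions. Returns the stiffness matrix of the combined stack.
        n = KA.shape[0] // 2
        A11, A12, A21, A22 = KA[:n, :n], KA[:n, n:], KA[n:, :n], KA[n:, n:]
        B11, B12, B21, B22 = KB[:n, :n], KB[:n, n:], KB[n:, :n], KB[n:, n:]
        M = np.linalg.inv(B11 - A22)          # from traction continuity at the interface
        return np.block([
            [A11 + A12 @ M @ A21, -A12 @ M @ B12],
            [B21 @ M @ A21,        B22 - B21 @ M @ B12],
        ])

    # A laminate is built by folding this step over its layers, so the cost grows
    # linearly with the number of layers, consistent with the abstract.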

Journal ArticleDOI
TL;DR: An overview of boundary element methods in linear acoustics, covering the Helmholtz integral equation, two- and three-dimensional problems, acoustic eigenvalue analysis, time-domain analysis, and extended Kirchhoff integral formulations.
Abstract: Fundamentals of Linear Acoustics; The Helmholtz Integral Equation; Two-Dimensional Problems; Three-Dimensional Problems; The Normal-Derivative Integral Equation; Indirect Variational Boundary Element Method in Acoustics; Acoustic Eigenvalue Analysis by Boundary Element Methods; Time Domain Three-Dimensional Analysis; Extended Kirchhoff Integral Formulations.

PatentDOI
TL;DR: In this article, a document is presented to the user via an audio output device (310) and provides the user with the ability to annotate the document by speaking into an audio input device(310).
Abstract: Apparatus and methods allowing users to review and add annotations (512,910) to a digital document (305,306). The document is presented to the user via an audio output device (310) and provides the user with the ability to annotate the document by speaking into an audio input device (310). The user may access the document from multiple locations using multiple types of devices.

PatentDOI
TL;DR: In this article, a method of navigating textual information via auditory indicators is provided, through pervasive and immediate articulation of virtually all textual elements in response to a comparatively passive action, such as a selection device rollover, a user may quickly peruse titles, headings, list items, and so on, as well as emphasized text, paragraphs, captions and virtually any unit of visually contiguous text.
Abstract: A method of navigating textual information via auditory indicators is provided. Through the pervasive and immediate articulation of virtually all textual elements in response to a comparatively passive action, such as a selection device rollover, a user may quickly peruse titles, headings, list items, and so on, as well as emphasized text, paragraphs, captions, and virtually any unit of visually contiguous text. A user may also hear a particular selected word via a slightly more active action, such as clicking a mouse button. Via use of this method, a child, or other user who may understand a language, but not be able to recognize its orthography, may successfully and easily navigate textual documents. The method also resumes articulation of text that has been stopped due to a passive deindication. The articulation is restarted approximately at the word where it was stopped.