
Showing papers in "Journal of the Acoustical Society of America in 1982"


Journal ArticleDOI
TL;DR: It is suggested that most of the frequency response and nonlinear behavior of inner hair cells and afferent fibers may be found in basilar membrane motion.
Abstract: Basilar membrane motion was measured at the 16–19 kHz place of the guinea pig cochlea using the Mossbauer technique. The threshold of the gross cochlear action potential (CAP) evoked by pure‐tone bursts was used as an indication of neural threshold. CAP threshold deteriorated progressively after the cochlea was opened and the Mossbauer source placed on the basilar membrane. A close relationship was found between the amplitude of basilar membrane motion at the source place frequency and CAP threshold. Basilar membrane velocity at CAP threshold SPL was about 0.04 mm/s over a 60‐dB range of CAP threshold. Intensity functions for basilar membrane motion were linear for frequencies more than an octave below the source place frequency but demonstrated progressive saturation for frequencies within an octave of the CF. This nonlinear behavior was eliminated as the CAP threshold became less sensitive and was absent post mortem. Isovelocity curves at the 0.04 mm/s criterion were remarkably similar to frequency threshold curves from primary afferent fibers innervating a similar place on the basilar membrane. The isovelocity curve was a better fit than the isoamplitude curve, suggesting that inner hair cells respond to basilar membrane velocity. As the CAP threshold deteriorated, the isovelocity curves lost sensitivity around the best frequency, whereas sensitivity to frequencies below 10 kHz remained constant even after the animal was killed. We suggest that most of the frequency response and nonlinear behavior of inner hair cells and afferent fibers may be found in basilar membrane motion.

783 citations


Journal ArticleDOI
TL;DR: The harmonics-to-noise (H/N) ratio proved useful in quantitatively assessing the results of treatment for hoarseness and showed a highly significant agreement between H/N calculations and the subjective evaluations of the spectrograms.
Abstract: Degree of hoarseness can be evaluated by judging the extent to which noise replaces the harmonic structure in the spectrogram of a sustained vowel. However, this visual method is subjective. The present study was undertaken to develop the harmonics‐to‐noise (H/N) ratio as an objective and quantitative evaluation of the degree of hoarseness. The computation is conceptually straightforward; 50 consecutive pitch periods of a sustained vowel /a/ are averaged; H is the energy of the averaged waveform, while N is the mean energy of the differences between the individual periods and the averaged waveform. Recordings of 42 normal voices and 41 samples with varying degrees of hoarseness were analyzed. Two experts rated the spectrogram of each voice sample, based on the amount of noise relative to that of the harmonic component. The results showed a highly significant agreement (the rank correlation coefficient = 0.849) between H/N calculations and the subjective evaluations of the spectrograms. The H/N ratio also proved useful in quantitatively assessing the results of treatment for hoarseness.
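The H/N computation described in the abstract translates almost directly into code. A minimal sketch, assuming the 50 pitch periods have already been extracted and resampled to a common length (the pitch-tracking and segmentation steps are not shown):

```python
import math

def harmonics_to_noise_ratio_db(periods):
    """H/N ratio from equal-length pitch periods: H is the energy of the
    period-averaged waveform, N the mean energy of each period's
    deviation from that average. Returns the ratio in dB."""
    n = len(periods)
    length = len(periods[0])
    # Average waveform across periods (the "harmonic" component H).
    avg = [sum(p[i] for p in periods) / n for i in range(length)]
    h = sum(a * a for a in avg)
    # Mean energy of the per-period deviations (the "noise" component N).
    noise = sum(
        sum((p[i] - avg[i]) ** 2 for i in range(length)) for p in periods
    ) / n
    return 10.0 * math.log10(h / noise)
```

Perfectly periodic input makes N zero (infinite H/N in dB), so real use would clamp or guard that case.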

548 citations


Journal ArticleDOI
TL;DR: Auditory filter shapes derived from the tone-in-noise data show that the passband of the filter broadens progressively with age, and that the dynamic range of thefilter ages like the audiogram, which means that the range changes little with age before 55, but beyond this point there is an accelerating rate of loss.
Abstract: The frequency selectivity of the auditory system was measured by masking a sinusoidal signal (0.5, 2.0, or 4.0 kHz) or a filtered‐speech signal with a wideband noise having a notch, or stopband, centered on the signal. As the notch was widened performance improved for both types of signal but the rate of improvement decreased as the age of the 16 listeners increased from 23 to 75 years, indicating a loss in frequency selectivity with age. Auditory filter shapes derived from the tone‐in‐noise data show (a) that the passband of the filter broadens progressively with age, and (b) that the dynamic range of the filter ages like the audiogram. That is, the range changes little with age before 55, but beyond this point there is an accelerating rate of loss. The speech experiment shows comparable but smaller effects. The filter‐width measurements show that the critical ratio is a poor estimator of frequency selectivity because it confounds the tuning of the system with the efficiency of the signal‐detection and speech‐processing mechanisms that follow the filter. An alternative, one‐point measure of frequency selectivity, which is both sensitive and reliable, is developed via the filter‐shape model of masking.

529 citations


Journal ArticleDOI
TL;DR: In this article, the absorption of sound in seawater is considered as the sum of three contributions: those from pure water, magnesium sulfate, and boric acid, and the three contributions are then combined to form an equation with both a theoretical basis and a satisfactory empirical fit that will be useful to researchers and engineers in the field of underwater sound.
Abstract: The absorption of sound in seawater is considered as the sum of three contributions: those from pure water, magnesium sulfate, and boric acid. Contributions from other reactions are small and are not included. The pure water and magnesium sulfate contributions obtained from analyses of extensive oceanic measurements, including many in the Arctic, were discussed in Part I. In Part II, an analysis is made of all reported measurements in the low‐frequency region (0.2–10 kHz) to evaluate the contribution of boric acid. This is done by subtracting the pure water and magnesium sulfate contributions determined in Part I from the total absorption to give a more accurate estimate of the boric acid contribution than previously obtained. The three contributions are then combined to form an equation with both a theoretical basis and a satisfactory empirical fit that will be useful to researchers and engineers in the field of underwater sound. The equation applies to all oceanic conditions and frequencies from 200 Hz to 1 MHz.

494 citations


Journal ArticleDOI
TL;DR: In this paper, new laboratory measurements of sediment properties in cores from the Bering Sea, North Sea, Mediterranean Sea, equatorial Pacific, and other areas have been combined with older measurements, and the results, with statistical analyses, are presented (for various sediment types in three general environments) in tables, diagrams, and regression equations.
Abstract: New laboratory measurements of sediment properties in cores from the Bering Sea, North Sea, Mediterranean Sea, equatorial Pacific, and other areas, have been combined with older measurements and the results, with statistical analyses, are presented (for various sediment types in three general environments) in tables, diagrams, and regression equations. The measured properties are sound velocity, density, porosity, grain density, and grain size; computed properties are velocity ratios (sediment velocity/water velocity) and impedance. Mineral‐grain microstructures of sediments are critical in determining density, porosity, and sound velocity; compressibility of pore water is the critical factor in determining sound velocity. New regression equations are provided for important empirical relationships between properties. Corrections of laboratory values to sea‐floor values are discussed. It is concluded that sound velocity and density are about the same for a given sediment type in the same environment in any...

483 citations


Journal ArticleDOI
TL;DR: The cochlear frequency map derived from these single-neuron labeling experiments is compared to maps derived by a number of different physiological and psychophysical techniques, and the significance of the similarities and differences is discussed.
Abstract: Iontophoresis of horseradish peroxidase was used to label single auditory nerve fibers after determination of threshold tuning curves and rates of spontaneous discharge. The relation between characteristic frequency (CF) and cochlear longitudinal location is reconstructed from 52 labeled neurons in 16 cochleas. The length of the organ of Corti allotted to an octave of stimulus frequency increases steadily from low to high frequencies. Thus there is not a simple linear-distance-to-log frequency conversion. When comparing cochleas of different total length, the best predictor of CF at a given location is the distance from base or apex expressed as a percentage of the total length. The cochlear frequency map derived from these single-neuron labeling experiments is compared to maps derived by a number of different physiological and psychophysical techniques, and the significance of the similarities and differences is discussed.

434 citations


Journal ArticleDOI
TL;DR: In this paper, a procedure for the automatic extraction of the various pitch percepts which may be simultaneously evoked by complex tonal stimuli is described, based on the theory of virtual pitch.
Abstract: A procedure is described for the automatic extraction of the various pitch percepts which may be simultaneously evoked by complex tonal stimuli. The procedure is based on the theory of virtual pitch, and in particular on the principle that the whole pitch percept depends both on analytic listening (yielding spectral pitch) and on holistic perception (yielding virtual pitch). The more or less ambiguous pitch percept governed by these two pitch modes is described by two pitch patterns: the spectral‐pitch pattern and the virtual‐pitch pattern. Each of these patterns consists of a number of pitch (height) values and associated weights, which account for the relative prominence of each individual pitch. The spectral‐pitch pattern is constructed by spectral analysis, extraction of tonal components, evaluation of masking effects (masking and pitch shifts), and weighting according to the principle of spectral dominance. The virtual‐pitch pattern is obtained from the spectral‐pitch pattern by an advanced...

391 citations


Journal ArticleDOI
TL;DR: Measurements confirm theoretical expectations and earlier observations that atmospheric attenuation is progressively more severe at higher frequencies and that the atmosphere acts as a low-pass filter for conducting sounds in the frequency range used for echolocation by bats.
Abstract: The absorption of sound propagating through the atmosphere under laboratory conditions of 25 degrees C and 50% relative humidity was measured at frequencies from 30 to 200 kHz. The attenuating effect on the passage of ultrasonic sounds through air ranged from 0.7 dB/m at 30 kHz to over 3 dB/m at frequencies above 100 kHz. These measurements confirm theoretical expectations and earlier observations that atmospheric attenuation is progressively more severe at higher frequencies and that the atmosphere acts as a low-pass filter for conducting sounds in the frequency range used for echolocation by bats. Different species of bats use different portions of this range of frequencies, and bats emitting sonar signals predominantly above 100 kHz encounter especially severe attenuation of over 3 dB/m. With the greatly restricted operating distances for echolocation at such high frequencies, bats using these higher frequencies must be under compelling ecological pressures of a higher priority than long-range detection of targets.

373 citations


Journal ArticleDOI
TL;DR: In this paper, the contribution from MgSO4 is treated extensively using the authors' measurements of absorption in the ocean to construct a quantitative equation for absorption as a function of temperature, salinity, and depth.
Abstract: Between 10 and 1000 kHz the absorption of sound in sea water can be considered as the sum of contributions from pure water and magnesium sulfate. Near 10 kHz there is also a small contribution from boric acid. In this paper, the contribution from MgSO4 is treated extensively using the authors’ measurements of absorption in the ocean to construct a quantitative equation for absorption as a function of temperature, salinity, and depth. The frequency region below 10 kHz, where the boric acid contribution predominates, will be reviewed in Part II, which includes an equation for the total absorption in the frequency range 400 Hz to 1 MHz.

351 citations


Journal ArticleDOI
TL;DR: In this paper, the authors measured extensional and shear wave velocities in Massilon sandstone and Vycor porous glass as a function of continuously varying partial water saturation and relative humidity.
Abstract: The advent of new high resolution seismic reflection and borehole sonic techniques has stimulated renewed interest in what information stress wave propagation may carry about rock properties and pore fluids in situ. We have measured extensional and shear wave velocities, Ve and Vs, and their specific attenuations, Qe−1 and Qs−1, in Massilon sandstone and Vycor porous glass as a function of continuously varying partial water saturation and relative humidity. Measurements were made at frequencies from 300 Hz to 14 kHz using a resonant bar technique and from 25–400 Hz using a torsional pendulum technique. Energy loss is very sensitive to partial water saturation. In Massilon sandstone, Qs−1 is maximum and greater than Qe−1 only at full saturation. Qe−1 rises to a strong peak at 85% water saturation. Energy loss drops significantly as the Massilon becomes "very dry." Qe−1 and Qs−1 in partially saturated Massilon and Vycor are strongly frequency dependent throughout the acoustic range, exhibiting peaks betwee...

321 citations


Journal ArticleDOI
TL;DR: The forward masking of a sinusoidal signal by asinusoid of the same frequency was investigated for frequencies ranging from 125 to 4000 Hz and the frequency effect is not large enough to change the interpretation of forward-masking data in studies of suppression or psychophysical tuning curves.
Abstract: The forward masking of a sinusoidal signal by a sinusoid of the same frequency was investigated for frequencies ranging from 125 to 4000 Hz. Forward masking in dB is proportional to both masker level and log signal delay at each frequency. More forward masking occurs at very low frequencies than at high frequencies, given equal‐sensation‐level maskers, and masked thresholds are greater at low frequencies than at high frequencies given equal‐SPL maskers. The data can be described equally well by assuming that the difference in forward masking as a function of frequency is due to a change in the time course of recovery from masking or to a change in the growth of masking at each signal delay. The frequency effect is not large enough to change the interpretation of forward‐masking data in studies of suppression or psychophysical tuning curves.

Journal ArticleDOI
TL;DR: Early cry and the later vocalizations of cooing and babbling appear to be vocal performances in continuity, from a strictly acoustic perspective.
Abstract: Recordings were obtained of the comfort‐state vocalizations of infants at 3, 6, and 9 months of age during a session of play and vocal interaction with the infant’s mother and the experimenter. Acoustic analysis, primarily spectrography, was used to determine utterance durations, formant frequencies of vocalic utterances, patterns of f0 frequency change during vocalizations, variations in source excitation of the vocal tract, and general properties of the utterances. Most utterances had durations of less than 400 ms although occasional sounds lasted 2 s or more. An increase in the ranges of both the F1 and F2 frequencies was observed across both periods of age increase, but the center of the F1–F2 plot for the group vowels appeared to change very little. Phonatory characteristics were at least generally compatible with published descriptions of infant cry. The f0 frequency averaged 445 Hz for 3‐month‐olds, 450 Hz for 6‐month‐olds, and 415 Hz for 9‐month‐olds. As has been previously reported for infant cry, the vocalizations frequently were associated with tremor (vibrato), harmonic doubling, abrupt f0 shift, vocal fry (or roll), and noise segments. Thus, from a strictly acoustic perspective, early cry and the later vocalizations of cooing and babbling appear to be vocal performances in continuity. Implications of the acoustic analyses are discussed for phonetic development and speech acquisition.

PatentDOI
TL;DR: In this paper, a speech recognizer includes a plurality of stored constrained hidden Markov model reference templates and a set of stored signals representative of prescribed acoustic features of the said plurality of reference patterns.
Abstract: A speech recognizer includes a plurality of stored constrained hidden Markov model reference templates and a set of stored signals representative of prescribed acoustic features of the said plurality of reference patterns. The Markov model template includes a set of N state signals. The number of states is preselected to be independent of the reference pattern acoustic features and preferably substantially smaller than the number of acoustic feature frames of the reference patterns. An input utterance is analyzed to form a sequence of said prescribed feature signals representative of the utterance. The utterance representative prescribed feature signal sequence is combined with the N state constrained hidden Markov model template signals to form a signal representative of the probability of the utterance being each reference pattern. The input speech pattern is identified as one of the reference patterns responsive to the probability representative signals.
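The scoring step this patent describes, combining an utterance's feature sequence with an N-state model to obtain a probability per reference pattern, is conventionally done with the forward algorithm. A hedged sketch of that generic computation (discrete-output HMM in the log domain; the patent's specific constrained-template construction is not reproduced here):

```python
import math

def _logsumexp(xs):
    """Numerically stable log(sum(exp(xs)))."""
    m = max(xs)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(x - m) for x in xs))

def forward_log_likelihood(obs, log_init, log_trans, log_emit):
    """Score a discrete observation sequence against an HMM using the
    forward algorithm. obs: symbol indices; log_init[s]: initial state
    log-probabilities; log_trans[s'][s]: transition log-probabilities;
    log_emit[s][o]: emission log-probabilities."""
    n = len(log_init)
    # Initialize with the first observation.
    alpha = [log_init[s] + log_emit[s][obs[0]] for s in range(n)]
    # Recurse over the remaining observations.
    for o in obs[1:]:
        alpha = [
            _logsumexp([alpha[sp] + log_trans[sp][s] for sp in range(n)])
            + log_emit[s][o]
            for s in range(n)
        ]
    return _logsumexp(alpha)
```

A recognizer of the kind described would run this once per stored reference model and pick the model with the highest score.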

Journal ArticleDOI
TL;DR: In this article, the authors studied the ability of human listeners to locate the origin of a sound in a room in a series of source azimuth identification experiments and found significant biases, as much as 2°; such biases are, of course, invisible in minimum audible angle experiments.
Abstract: We have studied the ability of human listeners to locate the origin of a sound in a room in a series of source azimuth identification experiments. All experiments were done in a small rectangular concert hall with variable geometry and acoustical properties. Subjects localized a 50‐ms, 500‐Hz sine pulse with an rms error of 3.3° (±0.6°) regardless of room reverberation time. Lowering the ceiling from 11.5 to 3.5 m decreased the error to 2.8° (±0.6°). Subjects localized broadband noise without attack transients with an rms error of 2.3° (±0.6°) if the reverberation time was 1 s. The error increased to 3.2° (±0.7°) if the reverberation time was 5 s. For complex tones without attack transients the localization error continuously increased as the intensity of spectral components decreased. Performance was nearly random for a 500‐Hz sine tone, but was significantly better than random for a 5000‐Hz sine tone. Our azimuth identification experiments revealed significant biases, as much as 2°; such biases are, of course, invisible in minimum audible angle experiments.

Journal ArticleDOI
TL;DR: An analysis of variance showed that all the main effects (T, age, and monaural versus binaural listening) were significant, and the scores declined with T for all ages.
Abstract: The Modified Rhyme Test (MRT) was processed through a room (volume 165 m3, reverberation time T = 0.4, 0.8, and 1.2 s). For both binaural and monaural earphone listening the tests were recorded with a manikin (Kemar) and equalization filters to compensate for the ear canal effect. Six groups of subjects, ten subjects each, had mean ages of 10, 27, 42, 54, 64, and 72 years and average hearing threshold levels, HTLs (for 0.5, 1, and 2 kHz), of 2.7, 5.6, 6.0, 10.9, 14.4, and 17.5 dB, respectively. The individual scores for the MRT without reverberation were between 90% and 100% at 70 dB SPL. Children and the elderly required from 10 to 20 dB higher SPLs than young adults to obtain maximum scores. An analysis of variance showed that all the main effects (T, age, and monaural versus binaural listening) were significant. The scores declined with T for all ages. The best scores were obtained by the young adults (27 year olds). The binaural scores were about 5% better than monaural scores. Factors contributing to the results and practical implications for amplification are discussed.

Journal ArticleDOI
TL;DR: In this paper, the authors investigated the ultrasonic properties of unconsolidated (loose) glass beads and of lightly fused (consolidated) glass bead when the pore space is saturated with water.
Abstract: We have investigated the ultrasonic properties of unconsolidated (loose) glass beads and of lightly fused (consolidated) glass beads when the pore space is saturated with water. At a frequency of 500 kHz we have observed a single compressional wave in the former whose speed is 1.79 km/s and two distinct compressional waves with speeds 2.81 km/s and 0.96 km/s in the latter. The Biot theory is shown to give an accurate description of this phenomenon. We also analyze the acoustics of low temperature He II in packed powder superleaks; either the fast wave for unconsolidated systems or the slow wave in a highly consolidated (fused) frame may be considered to be the 4th sound mode. In all such systems, the acoustic properties can be very simply understood by considering the velocities of propagation as continuous functions of the elastic moduli of the solid skeletal frames.

Journal ArticleDOI
TL;DR: In this article, a diffuse field in a solid medium is found to partition its energy between transverse and dilatational waves in a fraction R = 2(cd/ct)^3.
Abstract: A diffuse field in a solid medium is found to partition its energy between transverse and dilatational waves in a fraction R = 2(cd/ct)^3. Energy flow rates from a diffuse field into a surface transducer from dilatational, transverse, and surface waves are compared. The relevance of the diffuse field concept for acoustic emission is discussed.
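The stated partition ratio is easy to evaluate numerically. A quick sketch; the wave speeds in the comment are assumed, order-of-magnitude values for a steel-like solid, not figures from the paper:

```python
def shear_to_dilatational_energy_ratio(c_d, c_t):
    """R = 2 * (c_d / c_t)**3: the fraction in which a diffuse field
    partitions energy between transverse and dilatational waves
    (c_d = dilatational wave speed, c_t = transverse wave speed)."""
    return 2.0 * (c_d / c_t) ** 3

# Assumed, illustrative speeds for a steel-like solid:
# c_d ~ 5.9 km/s, c_t ~ 3.2 km/s gives R ~ 12.5, i.e. the transverse
# waves carry most of the diffuse-field energy.
```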

Journal ArticleDOI
TL;DR: Increased temporal difference limen and longer gap‐detection thresholds were found to correlate significantly with reduced speech intelligibility in noise, even when the effects of the pure‐tone threshold loss were partialed out.
Abstract: Four measures of auditory temporal processing were obtained from 16 normals and 16 individuals with a hearing loss of heterogeneous origin. These measures were: (1) temporal integration—the difference in detection thresholds between signals of 10‐ and 1000‐ms duration (which was determined to provide an estimate of the ability to integrate energy over time), (2) gap detection—the shortest duration of silence between two noise bursts that can be discriminated from an uninterrupted noise, (3) temporal difference limen—the increment in duration necessary to detect a difference in the duration of a noise burst, (4) gap difference limen—the increment in duration necessary to detect a difference in the duration of a silent interval between two noise bursts. Each measure was obtained for stimuli centered both at 500 and at 4000 Hz using a three‐alternative forced‐choice procedure. In addition, measures of identification and discrimination were obtained for two sets of synthetic speech syllables varying chiefly in a temporal parameter, voice‐onset‐time, from /ba/ to /pa/ and from /bi/ to /pi/. Finally, speech identification in noise was measured with the FAAF test. Most of the hearing‐impaired listeners displayed poorer temporal analysis than the normals on all of the psychoacoustical tasks, regardless of whether the two groups were compared at similar sound pressure levels or at similar sensation levels. Although the hearing‐impaired listeners displayed a reduction in the ability to discriminate subphonemic cues for the voiced–voiceless distinction, their identification of that distinction in stop consonants appeared to be normal. The hearing‐impaired group made about twice as many errors as did the normals on each of the consonant features of place, manner, and voicing when identifying speech in noise. 
Increased temporal difference limen and longer gap‐detection thresholds were found to correlate significantly with reduced speech intelligibility in noise, even when the effects of the pure‐tone threshold loss were partialed out.

Journal ArticleDOI
TL;DR: In this article, it was shown that small gas nuclei in a liquid acted on by microsecond ultrasonic pulses may grow into transient cavities that collapse violently, and the maximum pressures and temperatures generated by such collapsing cavities were found in these calculations to be of the order of 1000 to 70 000 bars and 1000° to 20 000 °K.
Abstract: Calculations reported here show that small gas nuclei in a liquid acted on by microsecond ultrasonic pulses may grow into transient cavities that collapse violently. The maximum pressures and temperatures generated by such collapsing cavities are found in these calculations to be of the order of 1000 to 70 000 bars and 1000° to 20 000 °K.

Journal ArticleDOI
TL;DR: Temporal resolution in all listeners showed systematic improvement with an increase in octave-band center frequency, and Resolution in the hearing-impaired subjects was significantly poorer than normal regardless of whether the comparisons were made at equal sound pressure level or at equal sensation level.
Abstract: Temporal resolution, estimated by measuring the minimum detectable gap (Δt msec) separating two successive signals, was assessed in five normal‐hearing and five cochlear‐impaired listeners. The signals were octave‐band noises (400–800 Hz, 800–1600 Hz, and 2000–4000 Hz) presented in a background of continuous, broadband notched noise that was applied to eliminate unwanted spectral cues. Temporal resolution in all listeners showed systematic improvement with an increase in octave‐band center frequency. Resolution in the hearing‐impaired subjects was significantly poorer than normal regardless of whether the comparisons were made at equal sound pressure level or at equal sensation level.

Journal ArticleDOI
TL;DR: Preliminary analyses of contextual influences on durations showed some expected changes, and also indicated that certain traditional predictions may not hold for informal connected speech.
Abstract: The data base, methods for a study of the durations of phonetic units in connected speech, and some preliminary results are described. From readings of two scripts by many talkers, two sets of seven talkers each were selected, based on total reading time, to form a fast group and a slow group of talkers. Using computer graphics and digital playback procedures, the recordings were segmented into breath groups and pauses, and the first four sentences in each script were segmented into phones. The hold and release (that is, plosion and/or frication) portions of stops were identified and measured; less than 50% of the stops included releases. To establish the usefulness of the data base, the first‐order statistics of the phonetic segments were determined, and a variety of durational characteristics were compared to existing reports. Analysis of number of breath groups, phonation time, and pause characterized the difference between so‐called average fast and average slow talkers; however, no script‐independent measure of these variables was found which would accurately predict the classification of individual talkers. The mean durations of various phonetic categories showed essentially the same percentage change when the fast and slow talkers were compared. Preliminary analyses of contextual influences on durations showed some expected changes, and also indicated that certain traditional predictions may not hold for informal connected speech. Gamma functions were fitted to the distributions of durations of various gross categories.

PatentDOI
TL;DR: In this paper, a hand-held surgical instrument for fragmentation and removal of animal tissue such as a cataract has a working tip which in addition to being longitudinally vibrated is laterally oscillated to enhance the operation thereof.
Abstract: A hand-held surgical instrument for fragmentation and removal of animal tissue such as a cataract has a working tip which, in addition to being longitudinally vibrated, is laterally oscillated to enhance its operation. Also disclosed is a method of laterally oscillating the working tip through a range of about 5 degrees to about 60 degrees, longitudinally vibrating the tip ultrasonically, supplying treatment fluid to the region adjacent to the working tip, and withdrawing the suspended particles of the cataract in the fluid.

PatentDOI
TL;DR: A speech recognition system includes speech presence detection which uses a first level threshold of ambient noise/silence above which speech start is decided for a signal as discussed by the authors, unless a predetermined time interval of speech is exceeded after start, causing a corrected second threshold to be calculated.
Abstract: A speech recognition system includes speech presence detection which uses a first level threshold of ambient noise/silence above which speech start is decided for a signal. Speech end is decided when the signal falls to a second threshold equal to the first, unless a predetermined time interval of speech is exceeded after start, causing a corrected second threshold to be calculated and used.
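The dual-threshold endpointing described can be sketched as follows. The correction rule applied after the time limit is an assumption (here, simply a larger margin above the noise floor), since the abstract does not specify how the corrected threshold is calculated:

```python
def find_speech_span(levels, noise_floor, margin=2.0,
                     max_frames=100, corrected_margin=4.0):
    """Dual-threshold speech endpointing sketch.

    Speech start: first frame whose level exceeds noise_floor + margin.
    Speech end: first later frame at or below the same threshold, unless
    speech runs longer than max_frames, in which case a corrected
    threshold (an assumed rule: a larger margin) is used instead.
    Returns (start, end) frame indices, or None if no speech is found."""
    threshold = noise_floor + margin
    start = next((i for i, lv in enumerate(levels) if lv > threshold), None)
    if start is None:
        return None
    end_threshold = threshold
    for i in range(start + 1, len(levels)):
        if i - start > max_frames:
            # Speech ran too long: switch to the corrected threshold.
            end_threshold = noise_floor + corrected_margin
        if levels[i] <= end_threshold:
            return (start, i)
    return (start, len(levels) - 1)
```

In practice `levels` would be per-frame energies in dB and `noise_floor` an estimate from leading silence.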

Journal ArticleDOI
TL;DR: In this article, sound pressure at the input to the cochlea at behavioral threshold is constant between 1 and 8 kHz, but increases as frequency is decreased below 1 kHz.
Abstract: Tones were delivered directly to the stapes in anesthetized cats after removal of the tympanic membrane, malleus, and incus. Measurements were made of the complex amplitudes of the sound pressure on the stapes PS, stapes velocity VS, and sound pressure in the vestibule PV. From these data, the acoustic impedance of the stapes and cochlea, ZSC ≡ PS/US, and of the cochlea alone, ZC ≡ PV/US, were computed (US ≡ volume velocity of the stapes = VS × area of the stapes footplate). Some measurements were made on modified preparations in which (1) holes were drilled into the vestibule and scala tympani, (2) the basal end of the basilar membrane was destroyed, (3) cochlear fluid was removed, or (4) static pressure was applied to the stapes. For frequencies between 0.5 and 5 kHz, ZSC ≈ ZC; this impedance is primarily resistive (|ZC| ≈ 1.2 × 10^6 dyn·s/cm^5) and is determined by the basilar membrane and cochlear fluids. For frequencies below 0.3 kHz, |ZSC| > |ZC|, and ZSC is primarily determined by the stiffness of the annular ligament; drying of the ligament or changes in the static pressure difference across the footplate can produce large changes in |ZSC|. For frequencies below 30 Hz, ZC is apparently controlled by the stiffness of the round-window membrane. All of the results can be represented by a network of eight lumped elements in which some of the elements can be associated with specific anatomical structures. Computations indicate that for the cat the sound pressure at the input to the cochlea at behavioral threshold is constant between 1 and 8 kHz, but increases as frequency is decreased below 1 kHz. Apparently, mechanisms within the cochlea (or more centrally) have an important influence on the frequency dependence of behavioral threshold at low frequencies.

Journal ArticleDOI
TL;DR: The results indicate that both rhythm of the inter-stress intervals and the presence of phrase-final lengthening influence listeners' perception of a phrase boundary, although the stress rhythm appears to be the more powerful perceptual cue.
Abstract: The presence of a phrase boundary is often marked in speech by phrase-final lengthening: a lengthening of the final stressed syllable of the phrase and a pause at the phrase boundary. The present study investigates (a) whether listeners use the feature of phrase-final lengthening to parse syntactically ambiguous sentences such as "Kate or Pat and Tony will come," where the position of a phrase boundary after "Kate" represents one meaning, and after "Pat" another, and (b) whether listeners use phrase-final lengthening directly to parse the sentence or indirectly, via the effect that phrase-final lengthening has on the rhythm of the feet (the onsets of the stressed syllables) of the sentence. Four experiments are reported in which listeners are asked to judge the meaning of sentences which have been temporally manipulated so that the foot which originally did not contain the crucial phrase boundary is lengthened by (i) inserting a pause at the "false" phrase boundary [experiment I], (ii) inserting a pause and lengthening the final stressed syllable at the "false" phrase boundary [experiment II], (iii) lengthening all segments contained in the foot [experiment III], and (iv) lengthening only the conjunction within the foot [experiment IV]. The results indicate that both the rhythm of the inter-stress intervals and the presence of phrase-final lengthening influence listeners' perception of a phrase boundary, although the stress rhythm appears to be the more powerful perceptual cue.

Journal ArticleDOI
TL;DR: A harmonics sieve was introduced to determine whether components are rejected or accepted at a candidate pitch, and a simple criterion, based on the components accepted and rejected, led to the decision on which candidate pitch was to be finally selected.
Abstract: Recent developments in hearing theory have resulted in the rather general acceptance of the idea that the perception of pitch of complex sounds is the result of a psychological pattern recognition process. The pitch is supposedly mediated by the fundamental of the harmonic spectrum which fits the spectrum of the complex sound optimally. The problem of finding the pitch is then equivalent to finding the best harmonic match. Goldstein [J. Acoust. Soc. Am. 54, 1496-1516 (1973)] has described an objective procedure for finding the best fit for stimuli containing relatively few spectral components. He uses a maximum likelihood criterion. Application of this procedure to various data on the pitch of complex sounds yielded good results. This motivated our efforts to apply the pattern recognition theory of pitch to the problem of measuring pitch in speech. Although we were able to follow the main line of Goldstein's procedure, some essential changes had to be made. The most important is that in our implementation not all spectral components of the complex sound have to be classified as belonging to the harmonic pattern. We introduced a harmonics sieve to determine whether components are rejected or accepted at a candidate pitch. A simple criterion, based on the components accepted and rejected, led to the decision on which candidate pitch was to be finally selected. The performance and reliability of this psychoacoustically based pitch meter were tested in an LPC-vocoder system.
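The sieve idea is easy to sketch. The following is a minimal toy version, not the authors' actual implementation: for each candidate f0, a spectral component is "accepted" if it lies within a relative tolerance of some harmonic of f0, and candidates are scored by accepted components minus rejected components minus unfilled sieve slots. The tolerance value and the scoring rule are assumptions made for this sketch.

```python
def harmonic_sieve_pitch(components, candidates, tol=0.02):
    """Toy harmonic sieve: return the candidate f0 with the best
    accepted-minus-rejected-minus-empty-slot score (assumed criterion)."""
    best_f0, best_score = None, None
    for f0 in candidates:
        filled = set()      # harmonic numbers that caught a component
        rejected = 0        # components falling through the sieve
        for f in components:
            n = round(f / f0)                      # nearest harmonic number
            if n >= 1 and abs(f - n * f0) <= tol * n * f0:
                filled.add(n)
            else:
                rejected += 1
        empty = max(filled, default=0) - len(filled)   # unfilled slots below top
        score = len(filled) - rejected - empty
        if best_score is None or score > best_score:
            best_f0, best_score = f0, score
    return best_f0

# Harmonics of a 200-Hz complex with a missing fundamental:
parts = [400.0, 600.0, 800.0, 1000.0]
print(harmonic_sieve_pitch(parts, candidates=[5.0 * i for i in range(20, 81)]))
# -> 200.0
```

The empty-slot penalty is what keeps subharmonic candidates (here 100 Hz, whose sieve would also accept all four components) from winning, which is the essential role of the rejection criterion in the abstract.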

Journal ArticleDOI
TL;DR: In this paper, an analytical description for the field of a focusing source is derived for spherically concave sources with small aperture angle and large radius a, wavenumber k. The solution furnishes easy access to the sound distribution along the axis and in the focal plane.
Abstract: An analytical description for the field of a focusing source is derived. It is valid for spherically concave sources with small aperture angle and large ka (radius a, wavenumber k). The solution furnishes easy access to the sound distribution along the axis and in the focal plane, as well as to parameters such as focusing gain, width of the focal spot, and phase shifts in the focal region. Experiments conducted with an f/2 lens coupled to a planar array are discussed. The results support the utility of the analytical model for describing the distribution of sound along the acoustic axis and across the focal plane.
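A quick numerical check of the focusing behavior described here can be done with the on-axis Rayleigh integral for a planar piston carrying a thin-lens (paraxial) focusing phase. This is a sketch, not the paper's analytical solution; the geometry and frequency are assumptions chosen to echo the f/2 experiment (aperture radius a = 5 mm, focal length F = 4a, 2 MHz in water), and the comparison value is the standard paraxial focusing-gain estimate G = k·a²/(2F).

```python
import numpy as np

c, f = 1500.0, 2.0e6              # sound speed (m/s) and frequency (Hz), assumed
k = 2.0 * np.pi * f / c           # wavenumber (1/m)
a, F = 5.0e-3, 20.0e-3            # aperture radius and focal length (m), f/2 geometry

def axial_pressure(z, n=4000):
    """|p(z)| normalized by rho*c*u0 on the axis of the focused piston,
    via midpoint-rule integration of the Rayleigh integral."""
    dr = a / n
    r = (np.arange(n) + 0.5) * dr            # midpoint radial grid over the aperture
    R = np.sqrt(z**2 + r**2)                 # distance from source ring to axial point
    phase = k * R - k * r**2 / (2.0 * F)     # propagation + thin-lens focusing phase
    return abs(k * np.sum(np.exp(1j * phase) * (r / R) * dr))

gain = axial_pressure(F)
print(f"numerical gain {gain:.2f} vs paraxial estimate {k * a * a / (2 * F):.2f}")
```

For this small-aperture-angle geometry the numerically integrated field at the focus agrees with G = k·a²/(2F) to within a few percent, consistent with the regime of validity stated in the abstract.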

PatentDOI
TL;DR: In this article, the spectral energy of the data signal is smeared by spreading it in a spectrum with energy packed in the lower frequency range under the conventional voice signal frequency in a manner which complements the standard C-message weighting curve.
Abstract: A communications system capable of simultaneously transmitting voice and data information. The spectral energy of the data signal is smeared by spreading it in a spectrum with energy packed in the lower frequency range, under the conventional voice signal frequency, in a manner which complements the standard C-message weighting curve. The use of the spread spectrum technique also eliminates thumping at the data rate, since the harmonics that produce thumping are also spread throughout the bandwidth.

Journal ArticleDOI
TL;DR: An operational definition of backscattering cross section is developed for the wideband reception of finite echoes in this paper, supported by relative measurements on a set of copper spheres by each of four echo sounders operating at frequencies from 38 to 120 kHz.
Abstract: An operational definition of backscattering cross section is developed for the wideband reception of finite echoes. This is supported by relative measurements on a set of copper spheres by each of four echo sounders operating at frequencies from 38 to 120 kHz. Experimental and theoretical arguments are advanced for the superiority of commercial, electrical-grade copper in the application. An optimization problem for determining the sphere size is then formulated, and solved for the case of calibration of a 38 kHz echo sounder by a sphere of the described material. The solution, that the copper sphere diameter be 60.00 mm, is tested through a variety of measurements. These demonstrate an accuracy of 0.1 dB. The further exercise of theory indicates the feasibility of precision calibration of diverse hydroacoustic equipment by copper spheres over most of the kilohertz frequency range.
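The bookkeeping behind a standard-sphere calibration can be sketched with the sonar equation. This is a hedged illustration, not the paper's procedure: given the known target strength TS of the reference sphere, the measured echo level, and the two-way spreading and absorption losses, the combined system constant (source level plus receive gain) follows directly. All numbers in the usage line are illustrative; the -33.6 dB figure is an assumed nominal TS for a 60-mm copper sphere at 38 kHz and should be checked against tabulated values.

```python
import math

def system_constant(echo_level_db, ts_db, range_m, alpha_db_per_m):
    """Combined source level + receive gain (dB), from the sonar equation
    EL = (SL + G) - 40*log10(r) - 2*alpha*r + TS for a point target."""
    two_way_loss = 40.0 * math.log10(range_m) + 2.0 * alpha_db_per_m * range_m
    return echo_level_db - ts_db + two_way_loss

# Illustrative calibration: echo level -10 dB from a sphere of assumed
# TS -33.6 dB at 25 m range, absorption 0.01 dB/m.
print(f"{system_constant(-10.0, -33.6, 25.0, 0.01):.1f} dB")
```

Once the system constant is fixed this way, the same equation is inverted in survey use to recover target strength from measured echo levels.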

Journal ArticleDOI
TL;DR: Results support recent finding that isolated vowels may be readily identified by listeners and show improvement in the fixed speaker context is correlated with improved statistical separation resulting from formant normalization, for the gated vowels.
Abstract: This study investigates conditions under which vowels are well recognized and relates perceptual identification of individual tokens to acoustic characteristics. Results support recent findings that isolated vowels may be readily identified by listeners. Two experiments provided evidence that certain response tasks result in inflated error rates. Subsequent experiments showed improved identification in a fixed speaker context, compared with randomized speakers, for isolated vowels and gated centers. Performance was worse for gated vowels, suggesting that dynamic properties (such as duration and diphthongization) supplement steady-state cues. However, even speaker-randomized gated vowels were well identified (14% errors). Measures of "steady-state information" (formant frequencies and f0), "dynamic information" (formant slopes and duration), and "speaker information" (normalization) were adopted. Discriminant analyses of acoustic measurements indicated relatively little overlap between vowel categories. Using a new technique for relating acoustic measurements of individual tokens with identification by listeners, it is shown that (a) identification performance is clearly related to acoustic characteristics; (b) improvement in the fixed speaker context is correlated with improved statistical separation resulting from formant normalization, for the gated vowels; and (c) "dynamic information" is related to identification differences between full and gated isolated vowels.
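The kind of acoustic-space classification the abstract describes can be illustrated with a toy nearest-centroid classifier in (F1, F2) space. This is a stand-in for the paper's discriminant analysis, not a reproduction of it: the centroid values below are rough textbook-style averages for three point vowels, assumed for illustration, not the study's measurements.

```python
import math

# Assumed (F1, F2) category centroids in Hz, rough male averages
# for /i/, /a/, /u/ (illustrative, not the study's data).
centroids = {
    "i": (270.0, 2290.0),
    "a": (730.0, 1090.0),
    "u": (300.0, 870.0),
}

def classify(f1, f2):
    """Assign a token to the vowel category with the nearest centroid
    (Euclidean distance in the F1-F2 plane)."""
    return min(centroids,
               key=lambda v: math.hypot(f1 - centroids[v][0],
                                        f2 - centroids[v][1]))

print(classify(290.0, 2200.0))  # a token close to the /i/ centroid -> "i"
```

Speaker normalization of the kind credited in finding (b) would amount to rescaling each speaker's formants (e.g., by that speaker's mean log formant values) before computing these distances, which tightens the category clusters.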