
Showing papers in "Journal of the Acoustical Society of America in 1992"


Journal ArticleDOI
TL;DR: In this study, segmental lengthening in the vicinity of prosodic boundaries is examined and found to be restricted to the rhyme of the syllable preceding the boundary.
Abstract: Numerous studies have indicated that prosodic phrase boundaries may be marked by a variety of acoustic phenomena including segmental lengthening. It has not been established, however, whether this lengthening is restricted to the immediate vicinity of the boundary, or if it extends over some larger region. In this study, segmental lengthening in the vicinity of prosodic boundaries is examined and found to be restricted to the rhyme of the syllable preceding the boundary. By using a normalized measure of segmental lengthening, and by compensating for differences in speaking rate, it is also shown that at least four distinct types of boundaries can be distinguished on the basis of this lengthening.
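A minimal sketch of the kind of normalization the study describes, with hypothetical names: each segment duration is divided by a local speaking-rate factor and z-scored against the mean and standard deviation for its segment type, so lengthening is comparable across segments and rates.

```python
import numpy as np

def normalized_lengthening(durations, labels, rate_factor=1.0):
    """Z-score each segment duration against the mean/s.d. for its
    segment type, after dividing out a local speaking-rate factor.
    Assumes several tokens per segment type; all names illustrative."""
    durations = np.asarray(durations, dtype=float) / rate_factor
    scores = np.empty_like(durations)
    for seg in set(labels):
        idx = [i for i, lab in enumerate(labels) if lab == seg]
        mu, sd = durations[idx].mean(), durations[idx].std(ddof=1)
        scores[idx] = (durations[idx] - mu) / sd
    return scores
```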

797 citations


Journal ArticleDOI
TL;DR: Two experiments are described in which listeners judge the apparent directions of virtual sound sources (headphone-presented sounds that are processed in order to simulate free-field sounds); apparent direction almost always follows the interaural time difference cue when the wideband stimuli include low frequencies, and is determined primarily by interaural intensity difference and pinna cues when low frequencies are removed.
Abstract: Two experiments are described in which listeners judge the apparent directions of virtual sound sources–headphone‐presented sounds that are processed in order to simulate free‐field sounds. Previous results suggest that when the cues to sound direction are preserved by the simulation, the apparent directions of virtual sources are nearly the same as the apparent directions of real free‐field sources. In the experiments reported here, the interaural phase relations in the processing algorithms are manipulated in order to produce stimuli in which the interaural time difference cues signal one direction and interaural intensity and pinna cues signal another direction. The apparent directions of these conflicting cue stimuli almost always follow the interaural time cue, as long as the wideband stimuli include low frequencies. With low frequencies removed from the stimuli, the dominance of interaural time difference disappears, and apparent direction is determined primarily by interaural intensity difference and pinna cues.

763 citations


Journal ArticleDOI
TL;DR: In this paper, a laser scattering technique was used to obtain radius-time curves of a single gas bubble pulsating at pressure amplitudes on the order of 150 kPa (1.5 atm) at 21-25 kHz, with sonoluminescence pulses observed to coincide with the bubble collapse.
Abstract: High‐amplitude radial pulsations of a single gas bubble in several glycerine and water mixtures have been observed in an acoustic stationary wave system at acoustic pressure amplitudes on the order of 150 kPa (1.5 atm) at 21–25 kHz. Sonoluminescence (SL), a phenomenon generally attributed to the high temperatures generated during the collapse of cavitation bubbles, was observed as short light pulses occurring once every acoustic period. These emissions can be seen to originate at the geometric center of the bubble when observed through a microscope. It was observed that the light emissions occurred simultaneously with the bubble collapse. Using a laser scattering technique, experimental radius‐time curves have been obtained which confirm the absence of surface waves, which are expected at pressure amplitudes above 100 kPa. [S. Horsburgh, Ph.D. dissertation, University of Mississippi (1990)]. Also from these radius‐time curves, measurements of the pulsation amplitude, the timing of the major bubble collapse, and the number of rebounds were made and compared with several theories. The implications of this research on the current understanding of cavitation related phenomena such as rectified diffusion, surface wave excitation, and sonoluminescence are discussed.
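The radius-time comparisons with theory typically start from Rayleigh-Plesset-type bubble dynamics. A minimal sketch, assuming a polytropic gas and illustrative water-like constants; none of these parameter values are from the paper.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative constants (SI units) -- assumptions, not the paper's values.
rho, mu, sigma = 998.0, 1.0e-3, 0.0725   # density, viscosity, surface tension
p0, pa, f = 101.3e3, 150e3, 22e3         # ambient pressure, drive amplitude, frequency
R0, kappa = 5e-6, 1.4                    # equilibrium radius, polytropic exponent

def rayleigh_plesset(t, y):
    R, Rdot = y
    pg0 = p0 + 2 * sigma / R0                  # equilibrium gas pressure
    pg = pg0 * (R0 / R) ** (3 * kappa)         # polytropic gas law
    p_drive = pa * np.sin(2 * np.pi * f * t)
    Rddot = ((pg - p0 - p_drive - 4 * mu * Rdot / R - 2 * sigma / R) / rho
             - 1.5 * Rdot ** 2) / R
    return [Rdot, Rddot]

# Radius-time curve over three acoustic cycles.
sol = solve_ivp(rayleigh_plesset, (0.0, 3 / f), [R0, 0.0],
                max_step=1e-8, rtol=1e-8)
```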

645 citations


PatentDOI
TL;DR: In this paper, a method of variable-rate coding of frames of digitized speech samples is proposed, comprising the steps of determining a level of speech activity for a frame of digitized speech samples, selecting an encoding rate from a set of rates based upon the determined level of activity within said frame, and coding said frame according to a predetermined coding format for said selected rate, wherein each rate has a corresponding different coding format.
Abstract: A method of speech signal compression, by variable rate coding of frames of digitized speech samples, comprising the steps of: determining a level of speech activity for a frame of digitized speech samples; selecting an encoding rate from a set of rates based upon said determined level of speech activity within said frame; coding said frame according to a predetermined coding format for said selected rate wherein each rate has a corresponding different coding format; providing for said frame a corresponding output data packet at said selected rate.
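A minimal sketch of the rate-selection step, with a hypothetical rate set and simple energy thresholds standing in for the patent's speech-activity measure.

```python
import numpy as np

RATES_BPS = (1000, 2000, 4000, 8000)   # hypothetical rate set (eighth..full rate)

def select_rate(frame, thresholds=(-60.0, -50.0, -40.0)):
    """Pick an encoding rate from frame energy in dBFS (frame holds
    samples scaled to [-1, 1]); thresholds are illustrative stand-ins
    for a real voice-activity decision."""
    energy_db = 10 * np.log10(np.mean(frame.astype(float) ** 2) + 1e-12)
    level = sum(energy_db > t for t in thresholds)   # 0..3
    return RATES_BPS[level]
```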

552 citations


PatentDOI
TL;DR: The improved ultrasonic knife of the type for surgical incision in various types of tissue and/or for the removal of cement within the body has a reduced thermal footprint to minimize thermally induced tissue damage.
Abstract: Disclosed is an improved ultrasonic knife of the type for surgical incision in various types of tissue and/or for the removal of cement within the body. The knife has a reduced thermal footprint to minimize thermally induced tissue damage. Tooth configuration on the knife cooperates with the stroke of the ultrasonic drive to produce efficient cutting, as well as tactile feedback to the surgeon with respect to the rate of cutting, and changes in tissue density. Ultrasonic knife tip extenders are also disclosed for advancing the ultrasonic knife tip through the working channel of an endoscope. Methods utilizing the foregoing apparatus are also disclosed.

546 citations


Journal ArticleDOI
TL;DR: Free-field to eardrum transfer functions (HRTFs) were measured from both ears of 10 subjects with sound sources at 265 different positions and revealed that the HRTFs can be modeled as a linear combination of five basic spectral shapes (basis functions), and that this representation accounts for approximately 90% of the variance in the original HRTF magnitude functions.
Abstract: Free‐field to eardrum transfer functions (HRTFs) were measured from both ears of 10 subjects with sound sources at 265 different positions. A principal components analysis of the resulting 5300 HRTF magnitude functions revealed that the HRTFs can be modeled as a linear combination of five basic spectral shapes (basis functions), and that this representation accounts for approximately 90% of the variance in the original HRTF magnitude functions. HRTF phase was modeled by assuming that HRTFs are minimum‐phase functions and that interaural phase differences can be approximated by a simple time delay. Subjects’ judgments of the apparent directions of headphone‐presented sounds that had been synthesized from the modeled HRTFs were nearly identical to their judgments of sounds synthesized from measured HRTFs. With fewer than five basis functions used in the model, a less faithful reconstruction of the HRTF was produced, and the frequency of large localization errors increased dramatically.
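A minimal sketch of the principal-components step via the SVD, assuming the HRTF magnitudes are arranged as a (measurements x frequencies) matrix of dB values; the layout and names are assumptions.

```python
import numpy as np

def hrtf_pca_model(H, n_basis=5):
    """H: (n_measurements, n_freq) matrix of log-magnitude HRTFs
    (e.g., 5300 x n_freq). Returns the mean spectrum, basis spectral
    shapes B, weights W, and variance explained, with H ~ mean + W @ B."""
    mean = H.mean(axis=0)
    U, s, Vt = np.linalg.svd(H - mean, full_matrices=False)
    B = Vt[:n_basis]                    # basis spectral shapes
    W = U[:, :n_basis] * s[:n_basis]    # per-measurement weights
    var_explained = (s[:n_basis] ** 2).sum() / (s ** 2).sum()
    return mean, B, W, var_explained
```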

528 citations


Journal ArticleDOI
TL;DR: In this paper, new expressions are given that can be used instead of the phenomenological equations of Delany and Bazley; they provide similar predictions within the range of validity of those equations and, in addition, remain valid at low frequencies, where the Delany and Bazley equations give unphysical predictions.
Abstract: New expressions are given that can be used instead of the phenomenological equations of Delany and Bazley. They provide similar predictions in the range of validity of these equations, and in addition are valid at low frequencies where the equations of Delany and Bazley provide unphysical predictions. These new expressions have been worked out by using the general frequency dependence of the viscous forces in porous materials proposed by Johnson et al. [J. Fluid Mech. 176, 379 (1987)], with a transposition carried out to predict the dynamic bulk modulus of air. The model used suggests how sound propagation in fibrous materials can depend both on the diameter of the fibers and on the density of the material.
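For context, a sketch of the Delany and Bazley power-law fit that the new expressions replace, in its standard dimensionless form; the coefficients are quoted from the usual formulation and the sign convention varies with the assumed time dependence, so treat this as illustrative.

```python
import numpy as np

def delany_bazley(f, sigma, rho0=1.2, c0=343.0):
    """Classic Delany-Bazley fit for a fibrous absorber.
    f: frequency (Hz); sigma: flow resistivity (Pa*s/m^2).
    Valid roughly for 0.01 < X < 1; outside that range it can go
    unphysical, which is what the paper's new expressions correct."""
    X = rho0 * f / sigma
    Zc = rho0 * c0 * (1 + 0.0571 * X**-0.754 - 1j * 0.087 * X**-0.732)
    k = (2 * np.pi * f / c0) * (1 + 0.0978 * X**-0.700 - 1j * 0.189 * X**-0.595)
    return Zc, k   # characteristic impedance, wave number
```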

493 citations


PatentDOI
TL;DR: In this paper, a method and apparatus are presented for automatically identifying a program broadcast by a radio station or a television channel, or recorded on a medium, by adding an inaudible encoded message to the sound signal of the program, the message identifying the broadcasting channel or station, the program, and/or the exact date.
Abstract: A method and apparatus for automatically identifying a program broadcast by a radio station or by a television channel, or recorded on a medium, by adding an inaudible encoded message to the sound signal of the program, the message identifying the broadcasting channel or station, the program, and/or the exact date. In one embodiment the sound signal is transmitted via an analog-to-digital converter to a data processor enabling frequency components to be split up, enabling the energy in some of the frequency components to be altered in a predetermined manner to form an encoded identification message, and with the output from the data processor being connected via a digital-to-analog converter to an audio output for broadcasting or recording the sound signal. In another embodiment, an analog bandpass filter is employed to separate a band of frequencies from the sound signal so that energy in the separated band may be thus altered to encode the sound signal. The invention is particularly applicable to measuring the audiences of programs that are broadcast by radio or television, or that are recorded.
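An illustrative sketch of the energy-alteration idea in the digital embodiment: scale chosen FFT bands up or down by a small step to encode message bits. The band edges, step size, and framing are assumptions, not the patent's parameters.

```python
import numpy as np

def embed_bits(frame, bits, bands, delta_db=1.0, fs=48000):
    """Scale the energy of chosen FFT bands up/down by delta_db to
    encode bits -- a crude stand-in for the patent's 'predetermined
    alteration' of frequency-component energy."""
    spec = np.fft.rfft(frame)
    freqs = np.fft.rfftfreq(len(frame), 1.0 / fs)
    for bit, (lo, hi) in zip(bits, bands):
        sel = (freqs >= lo) & (freqs < hi)
        gain = 10 ** ((delta_db if bit else -delta_db) / 20)
        spec[sel] *= gain
    return np.fft.irfft(spec, n=len(frame))
```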

490 citations


Journal ArticleDOI
TL;DR: Evolutionary perspectives; invertebrates; aspects of hearing among vertebrates; anamniotes; non-mammalian amniotes; mammals; epilogue.
Abstract: Evolutionary perspectives; invertebrates; aspects of hearing among vertebrates; anamniotes; non-mammalian amniotes; mammals; epilogue.

465 citations


Journal ArticleDOI
TL;DR: AM responses of single auditory-nerve fibers of the cat are studied while systematically varying modulation depth, frequency, and sound level; changes in average rate due to modulation were small and could be either enhancement or suppression.
Abstract: Sinusoidally amplitude‐modulated (AM) tones are frequently used in psychophysical and physiological studies, yet a comprehensive study on the coding of AM tones in the auditory nerve is lacking. AM responses of single auditory‐nerve fibers of the cat are studied, systematically varying modulation depth, frequency, and sound level. Synchrony‐level functions were nonmonotonic with maximum values that were inversely correlated with spontaneous rate (SR). In most fibers, envelope phase‐locking showed a positive gain. Modulation transfer functions were uniformly low pass. Their corner frequency increased with characteristic frequency (CF), but changed little for CFs above 10 kHz. The highest modulation frequencies to which phase locking occurred were more than 0.8 oct lower than the highest frequencies to which phase locking to pure tones occurs. Cumulative, or unwrapped, phase increased linearly with modulation frequency: The slope was inversely related to CF, and slightly higher than group delays reported for pure tones. High SR, low CF fibers showed the poorest envelope phase locking. In some low CF fibers, phase locking increased at high levels, associated with ‘‘peak‐splitting’’ phenomena. Changes in average rate due to modulation were small, and could be enhancement or suppression.
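Envelope phase locking of the kind reported here is conventionally quantified by the synchronization coefficient (vector strength); a minimal sketch, with illustrative names.

```python
import numpy as np

def vector_strength(spike_times, fm):
    """Synchronization coefficient to the modulation frequency fm (Hz):
    1.0 = perfect envelope phase locking, 0.0 = none."""
    phases = 2 * np.pi * fm * np.asarray(spike_times)
    return np.abs(np.exp(1j * phases).mean())

# A modulation transfer function is then vector strength versus fm, e.g.:
# mtf = [vector_strength(spikes, fm) for fm in fm_values]
```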

453 citations


PatentDOI
Robert D. Strong
TL;DR: A method of speech recognition that determines acoustic features in a sound sample; recognizes words comprising the acoustic features based on a language model, which determines the possible sequences of words that may be recognized; and selects an appropriate response based on the words recognized.
Abstract: A method of speech recognition which determines acoustic features in a sound sample; recognizes words comprising the acoustic features based on a language model, which determines the possible sequences of words that may be recognized; and the selection of an appropriate response based on the words recognized. Information about what words may be recognized, under which conditions those words may be recognized, and what response is appropriate when the words are recognized, is stored, in a preferred embodiment, in a data structure called a speech rule. These speech rules are partitioned according to the context in which they are active. When speech is detected, concurrent with acoustic feature extraction, the current state of the computer system is used to determine which rules are active and how they are to be combined in order to generate a language model for word recognition. A language model is dynamically generated and used to find the best interpretation of the acoustic features as a word sequence. This word sequence is then matched against active rules in order to determine the appropriate response. Rules that match all or part of the word sequence contribute data structures representing the "meaning" of the word sequence, and these data structures are used by the rule actions in order to generate an appropriate response to the spoken utterance.

Journal ArticleDOI
TL;DR: Two electromagnetic midsagittal articulometer systems that were developed for transducing articulatory movements during speech production are described and each one has a specific set of advantages and limitations.
Abstract: This paper describes two electromagnetic midsagittal articulometer (EMMA) systems that were developed for transducing articulatory movements during speech production. Alternating magnetic fields are generated by transmitter coils that are mounted in an assembly that fits on the head of a speaker. The fields induce alternating voltages in a number of small transducer coils that are attached to articulators in the midline plane, inside and outside the vocal tract. The transducers are connected by fine lead wires to receiver electronics whose output voltages are processed to yield measures of transducer locations as a function of time. Measurement error can arise with this method, because as the articulators move and change shape, the transducers can undergo a varying amount of rotational misalignment with respect to the transmitter axes; both systems are designed to correct for transducer misalignment. For this purpose, one system uses two transmitters and biaxial transducers; the other uses three transmitters and single‐axis transducers. The systems have been compared with one another in terms of their performance, human subjects compatibility, and ease of use. Both systems can produce useful midsagittal‐plane data on articulator movement, and each one has a specific set of advantages and limitations. (Two commercially available systems are also described briefly for comparison purposes.) If appropriate experimental controls are used, the three‐transmitter system is preferable for practical reasons.

PatentDOI
TL;DR: In this article, a transducer is moved longitudinally within an artery while in a fixed radial position to display a planar or rectangular field of view image area of the artery.
Abstract: A catheter for ultrasonic imaging has a transducer fixed to a cutter. The transducer is moved longitudinally within an artery while in a fixed radial position. Ultrasonic reflections are received and processed to display a planar or rectangular field of view image area of the artery. Other axial planes of the artery can be imaged by radially turning the transducer to a different angular orientation within the artery and then longitudinally moving the transducer to obtain an image of another planar field of view.

PatentDOI
TL;DR: In this paper, a cellular core and a facing sheet formed with an array of holes are laser-drilled to provide: (i) hole size variation over the facing sheet; (ii) non-circular hole cross section; (iii) polygonal hole cross section; (iv) hole locations not contiguous with walls of the cellular core; and (v) inclined holes passing through the facing sheet in a direction inclined to the normal to the facing sheet.
Abstract: A noise attenuation panel used to attenuate noise in aircraft engines includes a cellular core and a facing sheet formed with an array of holes. The holes are laser drilled to provide: (i) hole size variation over the facing sheet; (ii) non-circular hole cross section; (iii) polygonal hole cross section; (iv) hole locations not contiguous with walls of the cellular core; and (v) inclined holes passing through the facing sheet in a direction inclined to the normal to the facing sheet.

Journal ArticleDOI
TL;DR: The striking individuality of two legendary pianists, Alfred Cortot and Vladimir Horowitz, is objectively demonstrated here, as is the relative eccentricity of several other artists.
Abstract: This study attempts to characterize the temporal commonalities and differences among distinguished pianists’ interpretations of a well‐known piece, Robert Schumann’s ‘‘Traumerei.’’ Intertone onset intervals (IOIs) were measured in 28 recorded performances. These data were subjected to a variety of statistical analyses, including principal components analysis of longer stretches of music and curve fitting to series of IOIs within brief melodic gestures. Global timing patterns reflected the hierarchical grouping structure of the composition, with pronounced ritardandi at the ends of major sections and frequent expressive lengthening of accented tones within melodic gestures. Analysis of local timing patterns, particularly of within‐gesture ritardandi, revealed that they often followed a parabolic timing function. The major variation in these patterns can be modeled by families of parabolas with a single degree of freedom. The grouping structure, which prescribes the location of major tempo changes, and the ...
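A minimal sketch of the curve-fitting step: a least-squares parabola through a gesture's intertone onset intervals, with variance accounted for as a goodness-of-fit measure. Function names and interfaces are assumptions.

```python
import numpy as np

def fit_parabolic_timing(iois):
    """Fit a parabola to a melodic gesture's intertone onset intervals
    (ms); returns the quadratic coefficients and R^2."""
    iois = np.asarray(iois, dtype=float)
    x = np.arange(len(iois))
    coeffs = np.polyfit(x, iois, 2)
    fitted = np.polyval(coeffs, x)
    ss_res = np.sum((iois - fitted) ** 2)
    ss_tot = np.sum((iois - iois.mean()) ** 2)
    return coeffs, 1 - ss_res / ss_tot
```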

PatentDOI
TL;DR: In this paper, a speech bandwidth extension method and apparatus analyzes narrowband speech sampled at 8 kHz using LPC analysis to determine its spectral shape and inverse filtering to extract its excitation signal.
Abstract: A speech bandwidth extension method and apparatus analyzes narrowband speech sampled at 8 kHz using LPC analysis to determine its spectral shape and inverse filtering to extract its excitation signal. The excitation signal is interpolated to a sampling rate of 16 kHz and analyzed for pitch control and power level. A wideband signal generated from white noise is then filtered to provide a synthesized wideband excitation signal. The narrowband shape is determined and compared to templates in respective vector quantizer codebooks, to select respective highband shape and gain. The synthesized wideband excitation signal is then filtered to provide a highband signal which is, in turn, added to the narrowband signal, interpolated to the 16 kHz sample rate, to produce an artificial wideband signal. The apparatus may be implemented on a digital signal processor chip.
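A minimal sketch of the LPC-analysis and inverse-filtering front end, using the autocorrelation method; the order and interfaces are assumptions, not the patent's.

```python
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import lfilter

def lpc_excitation(x, order=10):
    """Autocorrelation-method LPC plus inverse filtering: returns the
    analysis filter A(z) and the residual (excitation) signal."""
    x = np.asarray(x, dtype=float)
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]
    a = solve_toeplitz((r[:-1], r[:-1]), r[1:])  # normal equations
    A = np.concatenate(([1.0], -a))              # A(z) = 1 - sum a_k z^-k
    excitation = lfilter(A, [1.0], x)            # inverse filter
    return A, excitation
```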

PatentDOI
TL;DR: A system for synthesizing a speech signal from strings of words, which are themselves strings of characters, includes a memory in which predetermined syntax tags are stored in association with entered words and phonetic transcriptions are storedIn association with the syntax tags.
Abstract: A system for synthesizing a speech signal from strings of words, which are themselves strings of characters, includes a memory in which predetermined syntax tags are stored in association with entered words and phonetic transcriptions are stored in association with the syntax tags. A parser accesses the memory and groups the syntax tags of the entered words into phrases according to a first set of predetermined grammatical rules relating the syntax tags to one another. The parser also verifies the conformance of sequences of the phrases to a second set of predetermined grammatical rules relating the phrases to one another. The system retrieves the phonetic transcriptions associated with the syntax tags that were grouped into phrases conforming to the second set of rules, and also translates predetermined strings of characters into words. The system generates strings of phonetic transcriptions and prosody markers corresponding to respective strings of the words, and adds markers for rhythm and stress to the strings, which are then converted into data arrays having prosody information on a diphone-by-diphone basis. Predetermined diphone waveforms are retrieved from memory that correspond to the entered words, and these retrieved waveforms are adjusted based on the prosody information in the arrays. The adjusted diphone waveforms, which may also be adjusted for coarticulation, are then concatenated to form the speech signal. Methods in a digital computer are also disclosed.

Journal ArticleDOI
TL;DR: It was found that degree of accent is influenced by range effects, and the larger the proportion of native speakers included in a set of sentences being evaluated, the more strongly accented listeners judged sentences spoken by non-native speakers to be.
Abstract: Four experiments were carried out to examine listener- and talker-related factors that may influence degree of perceived foreign accent. In each, native English listeners rated English sentences for degree of accent. It was found that degree of accent is influenced by range effects. The larger the proportion of native (or near-native) speakers included in a set of sentences being evaluated, the more strongly accented listeners judged sentences spoken by non-native speakers to be. Foreign accent ratings were not stable. Listeners judged a set of non-native-produced sentences to be more strongly accented after, as compared to before, they became familiar with those sentences. One talker-related effect noted in the study was the finding that adults' pronunciation of an L2 may improve over time. Late L2 learners who had lived in the United States for an average of 14.3 years received significantly higher scores than late learners who had resided in the United States for 0.7 years. Another talker-related effect pertained to the age of L2 learning (AOL). Native Spanish subjects with an AOL of five to six years were not found to have an accent (i.e., to receive significantly lower scores than native English speakers), whereas native Chinese subjects with an average AOL of 7.6 years did have a measurable accent. The paper concludes with the presentation of several hypotheses concerning the relationship between AOL and degree of foreign accent.

Journal ArticleDOI
TL;DR: A test of the model, using techniques adapted from signal detection theory, indicated that subjects tend to use interaural level difference and spectral shape cues independently, limited only by a slight spatial correlation of the two cues.
Abstract: Human subjects localized brief 1/6‐oct bandpassed noise bursts that were centered at 6, 8, 10, and 12 kHz. All testing was done under binaural conditions. The horizontal component of subjects’ responses was accurate, comparable to that for broadband localization, but the vertical and front/back components exhibited systematic errors. Specifically, responses tended to cluster within restricted ranges that were specific for each center frequency. The directional transfer functions of the subjects’ external ears were measured for 360 horizontal and vertical locations. The spectra of the sounds that were present in the subjects’ ear canals, the ‘‘proximal stimulus’’ spectra, were computed by combining the spectra of the narrow‐band sound sources with the directional transfer functions for particular stimulus locations. Subjects consistently localized sounds to regions within which the associated directional transfer function correlated most closely with the proximal stimulus spectrum. A quantitative model was constructed that successfully predicted subjects’ responses based on interaural level difference and spectral cues. A test of the model, using techniques adapted from signal detection theory, indicated that subjects tend to use interaural level difference and spectral shape cues independently, limited only by a slight spatial correlation of the two cues. A testing procedure is described that provides a quantitative comparison of various predictive models of sound localization.
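A minimal sketch of the template-matching rule the model embodies: predict the response location whose directional transfer function correlates most closely with the spectrum in the ear canal. The names and dB representation are assumptions.

```python
import numpy as np

def predict_response(proximal_db, dtf_templates_db, locations):
    """proximal_db: spectrum present in the ear canal (dB, length n_freq);
    dtf_templates_db: (n_locations, n_freq) directional transfer functions.
    Returns the location whose template correlates best."""
    corrs = [np.corrcoef(proximal_db, t)[0, 1] for t in dtf_templates_db]
    return locations[int(np.argmax(corrs))]
```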

PatentDOI
TL;DR: Disclosed is a system and method for generating text from a voice input that divides the processing of each speech event into a dictation event and a text event, which includes the ability to distinguish between dictation text and commands.
Abstract: Disclosed is a system and method for generating text from a voice input that divides the processing of each speech event into a dictation event and a text event. Each dictation event handles the processing of data relating to the input into the system, and each text event deals with the generation of text from the inputted voice signals. In order to easily distinguish the dictation events from each other and the text events from each other, the system and method create a data structure for storing certain information relating to each individual event. Such data structures enable the system and method to process both simple spoken words as well as spoken commands and to provide the necessary text generation in response to the spoken words or to execute an appropriate function in response to a command. Speech recognition includes the ability to distinguish between dictation text and commands.

Journal ArticleDOI
TL;DR: It appears that the modulations in the masker act as an important cue for the normal-hearing listeners, who experience up to 5-dB release from masking, while being hardly beneficial for the hearing impaired listeners.
Abstract: Speech‐reception thresholds (SRT) were measured for 17 normal‐hearing and 17 hearing‐impaired listeners in conditions simulating free‐field situations with between one and six interfering talkers. The stimuli, speech and noise with identical long‐term average spectra, were recorded with a KEMAR manikin in an anechoic room and presented to the subjects through headphones. The noise was modulated using the envelope fluctuations of the speech. Several conditions were simulated with the speaker always in front of the listener and the maskers either also in front, or positioned in a symmetrical or asymmetrical configuration around the listener. Results show that the hearing impaired have significantly poorer performance than the normal hearing in all conditions. The mean SRT differences between the groups range from 4.2–10 dB. It appears that the modulations in the masker act as an important cue for the normal‐hearing listeners, who experience up to 5‐dB release from masking, while being hardly beneficial for the hearing impaired listeners. The gain occurring when maskers are moved from the frontal position to positions around the listener varies from 1.5 to 8 dB for the normal hearing, and from 1 to 6.5 dB for the hearing impaired. It depends strongly on the number of maskers and their positions, but less on hearing impairment. The difference between the SRTs for binaural and best‐ear listening (the ‘‘cocktail party effect’’) is approximately 3 dB in all conditions for both the normal‐hearing and the hearing‐impaired listeners.

PatentDOI
Yoshio Satoh, Osamu Ikata, Tsutomu Miyashita, Takashi Matsuda, Mitsuo Takamatsu
TL;DR: In this article, a SAW filter comprising a piezoelectric substrate and at least two filter tracks formed on the substrate, each having at least two IDT electrodes for input and output, is presented.
Abstract: A SAW filter comprising a piezoelectric substrate and at least two filter tracks formed on the substrate, each having at least two IDT electrodes for input and output. The two filter tracks are substantially in phase within the pass band, while they are substantially inverse-phased outside the pass band. To realize these conditions, the input IDT electrode of one filter track is connected in parallel with the input IDT electrode of the other filter track, while the output IDT electrode of one filter track is connected in parallel with the output IDT electrode of the other filter track. Furthermore, the frequency responses of the two filter tracks substantially coincide at the point 3 dB below the peak transfer-function value. Thus the above-configured SAW filter of the present invention is smaller in overall size and offers a broad pass band and a steep attenuation characteristic.

Journal ArticleDOI
TL;DR: In this paper, three single-scattering approximations for coefficients in Biot's equations of poroelasticity are considered: the average T-matrix approximation, the coherent potential approximation, and the differential effective medium (DEM).
Abstract: Three single‐scattering approximations for coefficients in Biot’s equations of poroelasticity are considered: the average T‐matrix approximation (ATA), the coherent potential approximation (CPA), and the differential effective medium (DEM). The scattering coefficients used here are exact results obtained previously for scattering from a spherical inclusion of one Biot material imbedded in another otherwise homogeneous Biot material. The CPA has been shown previously to guarantee that, if the coefficients for the scattering materials satisfy Gassmann’s equation, then the effective coefficients for the composite medium satisfy Brown and Korringa’s generalization of Gassmann’s equation. A collection of similar results is obtained here showing that the coefficients derived from ATA, CPA, or DEM all satisfy the required conditions for consistency. It is also shown that Gassmann’s equation will result from any of these single‐scattering approximations if the collection of scatterers includes only spheres of fluid and of a single type of elastic solid.
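Gassmann's equation, against which the consistency results are checked, has the standard form sketched below; this is the usual rock-physics statement, not code from the paper.

```python
def gassmann_ksat(k_dry, k_solid, k_fluid, phi):
    """Gassmann's equation: saturated bulk modulus from the dry-frame
    modulus k_dry, solid-grain modulus k_solid, pore-fluid modulus
    k_fluid, and porosity phi (all moduli in consistent units)."""
    b = 1.0 - k_dry / k_solid
    return k_dry + b ** 2 / (phi / k_fluid + (1 - phi) / k_solid
                             - k_dry / k_solid ** 2)
```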

Journal ArticleDOI
TL;DR: In this paper, a method employing holograms that conform with arbitrarily shaped sources has been developed for enhancing conventional near-field acoustic holography, which is limited to sources with simple geometries, e.g., planar or cylindrical surfaces.
Abstract: A method employing holograms that conform with arbitrarily shaped sources has been developed for enhancing conventional near‐field acoustic holography, which has been limited to sources with simple geometries, e.g., planar or cylindrical surfaces. Four holography transformation algorithms have been developed, based on acoustic holography theory and the boundary element method (BEM). Singular value decomposition (SVD) has been incorporated into the algorithms in order to alleviate the ill‐posed nature frequently encountered in backward reconstruction of a field. A pulsating sphere, a cylinder with spherical endcaps, and a vibrating piston set in a rigid sphere have been adopted in a numerical simulation for verifying the algorithms. Satisfactory agreement has been achieved between the holographically transformed results and the analytical solutions.
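A minimal sketch of the SVD-regularized backward step: with G a BEM-derived propagation matrix mapping source quantities to hologram pressures, small singular values are truncated before inversion to tame the ill-posedness. The threshold and names are assumptions.

```python
import numpy as np

def backward_reconstruct(G, p_hologram, rel_tol=1e-2):
    """Solve G q = p for source quantities q, discarding singular
    values below rel_tol * s_max to stabilize the inversion."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    s_inv = np.zeros_like(s)
    keep = s > rel_tol * s[0]
    s_inv[keep] = 1.0 / s[keep]
    return Vt.conj().T @ (s_inv * (U.conj().T @ p_hologram))
```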

Journal ArticleDOI
TL;DR: In this article, an efficient method of transforming a discrete Fourier transform (DFT) into a constant Q transform, where Q is the ratio of center frequency to bandwidth, has been devised.
Abstract: An efficient method of transforming a discrete Fourier transform (DFT) into a constant Q transform, where Q is the ratio of center frequency to bandwidth, has been devised. This method involves the calculation of kernels that are then applied to each subsequent DFT. Only a few multiplications are involved in the calculation of each component of the constant Q transform, so this transformation adds only a small amount of computation. In effect, this method makes it possible to take full advantage of the computational efficiency of the fast Fourier transform (FFT). Graphical examples of the application of this calculation to musical signals are given for sounds produced by a clarinet and a violin.
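A minimal sketch of the kernel idea with illustrative parameters: each constant-Q bin gets the conjugate FFT of a windowed complex exponential, thresholded so that only a few nonzero entries (and hence only a few multiplications) remain per bin.

```python
import numpy as np

def make_cq_kernels(fmin, bins_per_octave, n_bins, fs, N, thresh=1e-3):
    """Spectral kernels: the FFT of each windowed constant-Q basis,
    with tiny values zeroed so each row is sparse."""
    Q = 1.0 / (2 ** (1.0 / bins_per_octave) - 1)
    kernels = np.zeros((n_bins, N), dtype=complex)
    for k in range(n_bins):
        fk = fmin * 2 ** (k / bins_per_octave)
        Nk = int(np.ceil(Q * fs / fk))        # window length ~ Q periods of fk
        t = np.arange(Nk)
        temporal = (np.hamming(Nk) / Nk) * np.exp(2j * np.pi * Q * t / Nk)
        spec = np.fft.fft(temporal, N)
        spec[np.abs(spec) < thresh] = 0.0     # sparsify: source of the speedup
        kernels[k] = np.conj(spec) / N
    return kernels

def cqt_frame(x_frame, kernels):
    """Apply precomputed kernels to one length-N DFT frame."""
    return kernels @ np.fft.fft(x_frame)
```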

Journal ArticleDOI
TL;DR: The Springer Handbook of Auditory Research presents a series of comprehensive and synthetic reviews of the fundamental topics in modern auditory research aimed at all individuals with interests in hearing research including advanced graduate students, post-doctoral researchers, and clinical investigators.
Abstract: The Springer Handbook of Auditory Research presents a series of comprehensive and synthetic reviews of the fundamental topics in modern auditory research. It is aimed at all individuals with interests in hearing research including advanced graduate students, post-doctoral researchers, and clinical investigators. The volumes will introduce new investigators to important aspects of hearing science and help established investigators to better understand the fundamental theories and data in fields of hearing that they may not normally follow closely. Each volume is intended to present a particular topic comprehensively, and each chapter will serve as a synthetic overview and guide to the literature. As such, the chapters present neither exhaustive data reviews nor original research that has not appeared in peer-reviewed journals. The series focuses on topics that have developed a solid data and conceptual foundation rather than on those for which a literature is only beginning to develop. New research areas will be covered on a timely basis in the series as they begin to mature. Each volume in the series consists of five to eight substantial chapters on a particular topic. In some cases the topics will be ones of traditional interest for which there is a solid body of data and theory, such as auditory neuroanatomy (Vol. 1) and neurophysiology (Vol. 2).

Journal ArticleDOI
TL;DR: In this article, a computational model of musical dynamics is proposed that complements an earlier model of expressive timing; implemented in the artificial intelligence language LISP, it is based on the observation that a musical phrase is often indicated by a crescendo/decrescendo shape.
Abstract: A computational model of musical dynamics is proposed that complements an earlier model of expressive timing. The model, implemented in the artificial intelligence language LISP, is based on the observation that a musical phrase is often indicated by a crescendo/decrescendo shape. The functional form of this shape is derived by making two main assumptions. First, that musical dynamics and tempo are coupled, that is, ‘‘the faster the louder, the slower the softer.’’ This tempo/dynamics coupling, it is suggested, may be a characteristic of some classical and romantic styles perhaps exemplified by performances of Chopin. Second, that the tempo change is governed by analogy to physical movement. The allusion of musical expression to physical motion is further extended by the introduction of the concepts of energy and mass. The utility of the model, in addition to giving an insight into the nature of musical expression, is that it provides a basis for a method of performance style analysis.
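An illustrative sketch of the two assumptions, not the paper's LISP implementation: a parabolic tempo shape over a phrase, with dynamics coupled to tempo ("the faster the louder").

```python
import numpy as np

def phrase_tempo_and_dynamics(n_events, peak_tempo=1.1, coupling=1.0):
    """Parabolic crescendo/decrescendo over a phrase: tempo rises to a
    peak mid-phrase and falls back, and dynamics follow tempo scaled by
    a coupling constant. All parameter values are illustrative."""
    x = np.linspace(-1.0, 1.0, n_events)
    tempo = peak_tempo - (peak_tempo - 1.0) * x ** 2   # peak at phrase center
    dynamics = 1.0 + coupling * (tempo - 1.0)          # coupled loudness factor
    return tempo, dynamics
```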

Journal ArticleDOI
TL;DR: In this paper, measurements and analysis of a 13-cm-diam thermoacoustic engine are presented, and the authors identify several causes of amplitude-dependent deviation from linear thermoacoustic theory, including resonance-enhanced harmonic content in the acoustic wave and a new, first-order temperature defect in thermoacoustic heat exchangers.
Abstract: Measurements and analysis of a 13‐cm‐diam thermoacoustic engine are presented. At its most powerful operating point, using 13.8‐bar helium, the engine delivered 630 W to an external acoustic load, converting heat to delivered acoustic power with an efficiency of 9%. At low acoustic amplitudes, where (linear) thermoacoustic theory is expected to apply, measurements of temperature difference and frequency agree with the predictions of theory to within 4%, over conditions spanning factors of 4 in mean pressure, 10 in pressure amplitude, 6 in frequency, and 3 in gas sound speeds. But measurements of the square of pressure amplitude versus heater power differ from the predictions of theory by 20%, twice the estimated uncertainty in the results. At higher pressure amplitudes (up to 16% of the mean pressure), even more significant deviation from existing thermoacoustic theory is observed. Several causes of this amplitude‐dependent deviation are identified, including resonance‐enhanced harmonic content in the acoustic wave, and a new, first‐order temperature defect in thermoacoustic heat exchangers. These causes explain some, but not all, of the amplitude‐dependent deviation of the high‐amplitude measurements from existing (linear) theory.

Journal ArticleDOI
TL;DR: The speech identification abilities of four subjects with bilateral symmetric sensorineural hearing impairment were investigated following provision of a single hearing aid, and the findings support the existence of perceptual acclimatization effects, and call into question short-term methods of hearing aid evaluation and selection by comparative speech identification tests.
Abstract: At high presentation levels, normally aided ears yield better performance for speech identification than normally unaided ears, while at low presentation levels the converse is true [S. Gatehouse, J. Acoust. Soc. Am. 86, 2103–2106 (1989)]. To explain this process further, the speech identification abilities of four subjects with bilateral symmetric sensorineural hearing impairment were investigated following provision of a single hearing aid. Results showed significant increases in the benefit from amplifying speech in the aided ear, but not in the control ear. In addition, a headphone simulation of the unaided condition for the fitted ear shows a decrease in speech identification. The benefits from providing a particular frequency spectrum do not emerge immediately, but over a time course of at least 6–12 weeks. The findings support the existence of perceptual acclimatization effects, and call into question short‐term methods of hearing aid evaluation and selection by comparative speech identification tests.

Journal ArticleDOI
TL;DR: In this paper, the effects of context on the perception of voicing contrasts specified by voice-onset-time (VOT) in syllable-initial stop consonants were examined.
Abstract: In this investigation, the effects of context on the perception of voicing contrasts specified by voice‐onset‐time (VOT) in syllable‐initial stop consonants were examined. In an earlier paper [J. L. Miller and L. E. Volaitis, Percept. Psychophys. 46, 505–512 (1989)], it was reported that the listener’s adjustment for one contextual variable, speaking rate, was not confined to the region of the phonetic category boundary, but extended throughout the phonetic category. The current investigation examines whether this type of perceptual remapping also occurs for another contextual variable, the place of articulation of the syllable‐initial consonant. In a preliminary experiment that involved acoustic measurement of natural speech, it was confirmed that as place of articulation moves from labial to velar, VOT increases, and it was established that this occurs across a range of speaking rates (syllable durations). In the main experiments, which focused on the voiceless category, it was found that this acoustic ...