Showing papers in "Journal of the Acoustical Society of America in 1987"

PDF

Open Access

Journal Article•DOI•

Review of text‐to‐speech conversion for English

[...]

01 Sep 1987-Journal of the Acoustical Society of America

TL;DR: This review traces the early work on the development of speech synthesizers, discovery of minimal acoustic cues for phonetic contrasts, evolution of phonemic rule programs, incorporation of prosodic rules, and formulation of techniques for text analysis.

...read moreread less

Abstract: The automatic conversion of English text to synthetic speech is presently being performed, remarkably well, by a number of laboratory systems and commercial devices. Progress in this area has been made possible by advances in linguistic theory, acoustic-phonetic characterization of English sound patterns, perceptual psychology, mathematical modeling of speech production, structured programming, and computer hardware design. This review traces the early work on the development of speech synthesizers, discovery of minimal acoustic cues for phonetic contrasts, evolution of phonemic rule programs, incorporation of prosodic rules, and formulation of techniques for text analysis. Examples of rules are used liberally to illustrate the state of the art. Many of the examples are taken from Klattalk, a text-to-speech system developed by the author. A number of scientific problems are identified that prevent current systems from achieving the goal of completely human-sounding speech. While the emphasis is on rule programs that drive a format synthesizer, alternatives such as articulatory synthesis and waveform concatenation are also reviewed. An extensive bibliography has been assembled to show both the breadth of synthesis activity and the wealth of phenomena covered by rules in the best of these programs. A recording of selected examples of the historical development of synthetic speech, enclosed as a 33 1/3-rpm record, is described in the Appendix.

...read moreread less

843 citations

Journal Article•DOI•

The effect of head‐induced interaural time and level differences on speech intelligibility in noise

[...]

Adelbert W. Bronkhorst, Reinier Plomp

01 May 1987-Journal of the Acoustical Society of America

TL;DR: Speech reception thresholds (SRT) for sentences in noise were determined for normal-hearing subjects and the effect of interaural time delay (ITD) and acoustic headshadow on binaural speech intelligibility in noise was made.

...read moreread less

Abstract: A study was made of the effect of interaural time delay (ITD) and acoustic headshadow on binaural speech intelligibility in noise. A free-field condition was simulated by presenting recordings, made with a KEMAR manikin in an anechoic room, through earphones. Recordings were made of speech, reproduced in front of the manikin, and of noise, emanating from seven angles in the azimuthal plane, ranging from 0 degree (frontal) to 180 degrees in steps of 30 degrees. From this noise, two signals were derived, one containing only ITD, the other containing only interaural level differences (ILD) due to headshadow. Using this material, speech reception thresholds (SRT) for sentences in noise were determined for a group of normal-hearing subjects. Results show that (1) for noise azimuths between 30 degrees and 150 degrees, the gain due to ITD lies between 3.9 and 5.1 dB, while the gain due to ILD ranges from 3.5 to 7.8 dB, and (2) ILD decreases the effectiveness of binaural unmasking due to ITD (on the average, the threshold shift drops from 4.6 to 2.6 dB). In a second experiment, also conducted with normal-hearing subjects, similar stimuli were used, but now presented monaurally or with an overall 20-dB attenuation in one channel, in order to simulate hearing loss. In addition, SRTs were determined for noise with fixed ITDs, for comparison with the results obtained with head-induced (frequency dependent) ITDs.(ABSTRACT TRUNCATED AT 250 WORDS)

...read moreread less

509 citations

Journal Article•DOI•

Gaussian beam tracing for computing ocean acoustic fields

[...]

Michael B. Porter, Homer Bucker

01 Oct 1987-Journal of the Acoustical Society of America

TL;DR: The Gaussian beam method as mentioned in this paper associates with each ray a beam with a Gaussian intensity profile normal to the ray, and the beamwidth and curvature are governed by an additional pair of differential equations, which are integrated along with the usual ray equations to compute the beam field.

...read moreread less

Abstract: The method of Gaussian beam tracing has recently received a great deal of attention in the seismological community. In comparison to standard ray tracing, the method has the advantage of being free of certain ray‐tracing artifacts such as perfect shadows and infinitely high energy at caustics. It also obviates the need for eigenray computations. The technique is especially attractive for high‐frequency, range‐dependent problems where normal mode, FFP, or parabolic models are not practical alternatives. The Gaussian beam method associates with each ray a beam with a Gaussian intensity profile normal to the ray. The beamwidth and curvature are governed by an additional pair of differential equations, which are integrated along with the usual ray equations to compute the beam field in the vicinity of the central ray of the beam. We have adapted the beam‐tracing method to the typical ocean acoustic problem of a point source in a cylindrically symmetric waveguide with depth‐dependent sound speed. We present an overview of the method and a comparison of results obtained by conventional ray‐tracing, beam‐tracing, and full‐wave theories. These results suggest that beam tracing is markedly superior to conventional ray tracing.

...read moreread less

454 citations

Journal Article•DOI•

Fish target strengths for use in echo integrator surveys

[...]

Kenneth G. Foote

01 Sep 1987-Journal of the Acoustical Society of America

TL;DR: In this paper, in situ measurements of fish target strength are selected for use in echo integrator surveys at 38 kHz, and the results are expressed through equations in which the mean target strength TS is regressed on the mean fish length l in centimeters.

...read moreread less

Abstract: In situ measurements of fish target strength are selected for use in echo integrator surveys at 38 kHz. The results are expressed through equations in which the mean target strength TS is regressed on the mean fish length l in centimeters. For physoclists, TS=20 log l−67.4, and for clupeoids, TS=20 log l−71.9. These equations are supported by independent measurements on tethered, caged, and freely aggregating fish and by theoretical computations based on the swimbladder form. Causes of data variability are attributed to differences in species, behavior, and, possibly, swimbladder state.

...read moreread less

419 citations

Patent•DOI•

Ultra-thin acoustic transducer and balloon catheter using same in imaging array subassembly

[...]

James M. Griffith, Mario Maciel, Joseph L. Pope, Walter L. Henry, Paul J. Zalesky - Show less +1 more

26 May 1987-Journal of the Acoustical Society of America

TL;DR: An array of miniature ultrasound crystals mounted on preassembled subassembly which is, in turn, mounted on a small lumen catheter provides dimensional and other quantitative information relating to arterial wall geometry and character at disease or obstruction sites as mentioned in this paper.

...read moreread less

Abstract: An array of miniature ultrasound crystals mounted on preassembled subassembly which is, in turn, mounted on a small lumen catheter provides dimensional and other quantitative information relating to arterial wall geometry and character at disease or obstruction sites Balloons also mounted to the catheter make it possible to use the catheter for the angioplasty (PCTA) procedure while actually imaging, in real time, the artery being dilatated and unblocked by the procedure Efficient, highly miniature transducers are presented along with several different configurations for catheter structure containing fluid lumen, through-lumen, and electrical microcable assemblies for conducting electrical signals to and from the transducers

...read moreread less

385 citations

Patent•DOI•

Vibration wave motor

[...]

Hitoshi Mukohjima¹, Akira Hiramatsu¹, Kazuhiro Izukawa¹, Takuo Okuno¹, Ichiro Okumura¹, Takayuki Tsukimoto¹ - Show less +2 more•Institutions (1)

Canon Inc.¹

06 Jul 1987-Journal of the Acoustical Society of America

TL;DR: A vibration wave device having a resilient member and electro-mechanical energy conversion means for inducing a travelling vibration wave in the resilient member, the traveling vibration wave being used as a drive source, includes means provided on at least one member of the vibrational wave device at a location integer times or approximately integer times 1/2 of the wavelength of a vibration wave which may produce noise and making the dynamic rigidity of the member non-uniform.

...read moreread less

Abstract: A vibration wave device having a resilient member, and electro-mechanical energy conversion means for inducing a travelling vibration wave in the resilient member, the travelling vibration wave being used as a drive source, includes means provided on at least one member of the vibration wave device at a location integer times or approximately integer times 1/2 of the wavelength of a vibration wave which may produce noise and making the dynamic rigidity of the member non-uniform.

...read moreread less

360 citations

Patent•DOI•

Speech recognition system

[...]

Lalit R. Bahl¹, Peter Vincent Desouza¹, Steven Vincent Degennaro¹, Robert Leroy Mercer¹•Institutions (1)

IBM¹

20 Mar 1987-Journal of the Acoustical Society of America

TL;DR: In this paper, the likelihood of a word in a vocabulary of words is evaluated for each word, each total score being the re-sult of combining at least two word scores generated by differing algorithms.

...read moreread less

Abstract: SPEECH RECOGNITION SYSTEM ABSTRACT Apparatus and method for evaluating the likelihood of a word in a vocabulary of words wherein a total score is evaluated for each word, each total score being the re-sult of combining at least two word scores generated by differing algorithms. In one embodiment, a detailed acoustic match word score is combined with an approxi-mate acoustic match word score to provide a total word score for a subject word. In another embodiment, a polling word score is combined with an acoustic match word score to provide a total word score for a subject word. The acoustic models employed in the acoustic matching may correspond, alternatively, to phonetic el-ements or to fenemes. Fenemes represent labels generated by an acoustic processor in response to a spoken input. Apparatus and method for determining word scores ac-cording to approximate acoustic matching and for deter-mining word scores according to a polling methodology are disclosed.

...read moreread less

357 citations

Journal Article•DOI•

Speech waveform envelope cues for consonant recognition

[...]

Dianne J. Van Tasell¹, Sigfrid D. Soli, Virginia M. Kirby, Gregory P. Widin•Institutions (1)

University of Minnesota¹

01 Oct 1987-Journal of the Acoustical Society of America

TL;DR: It is suggested that near-perfect consonant identification performance could be attained by subjects who receive only enveme and viseme information and no spectral information.

...read moreread less

Abstract: This study investigated the cues for consonant recognition that are available in the time‐intensity envelope of speech. Twelve normal‐hearing subjects listened to three sets of spectrally identical noise stimuli created by multiplying noise with the speech envelopes of 19 /aCa/ natural‐speech nonsense syllables. The speech envelope for each of the three noise conditions was derived using a different low‐pass filter cutoff (20, 200, and 2000 Hz). Average consonant identification performance was above chance for the three noise conditions and improved significantly with the increase in envelope bandwidth from 20–200 Hz. SINDSCAL multidimensional scaling analysis of the consonant confusions data identified three speech envelope features that divided the 19 consonants into four envelope feature groups (‘‘envemes’’). The enveme groups in combination with visually distinctive speech feature groupings (‘‘visemes’’) can distinguish most of the 19 consonants. These results suggest that near‐perfect consonant identification performance could be attained by subjects who receive only enveme and viseme information and no spectral information.

...read moreread less

313 citations

Patent•DOI•

Intravascular ultrasonic catheter/probe and method for treating intravascular blockage

[...]

Anthony T. Donmichael, Robert J. Siegel, Eugene A. Decastro

13 Nov 1987-Journal of the Acoustical Society of America

TL;DR: In this article, an approach for treating atherosclerotic plaque and thromboses by the application of ultrasonic energy to a site of intravascular blockage is described.

...read moreread less

Abstract: Apparatus and method are disclosed for treating atherosclerotic plaque and thromboses by the application of ultrasonic energy to a site of intravascular blockage The ultrasonic apparatus includes a solid wire probe having a bulbous tip at one end and coupled to an ultrasonic energy source at the other end, the probe being carried within a hollow catheter The catheter and probe are inserted into a blood vessel and are advanced to the site of a stenosis, where the probe is extended from the catheter and caused to vibrate ultrasonically, resulting in the destruction of the arterial plaque The ultrasonic apparatus includes a fitting for delivering a radiographic contrast solution to the probe tip by flowing the solution into the catheter, the contrast fluid being released into the blood vessel to assist in positioning the apparatus and determining the effectiveness of treatment A physiologic solution may also be carried to the probe tip by flowing the solution through the catheter, thereby controlling the temperature of the probe tip during the procedure

...read moreread less

302 citations

Journal Article•DOI•

Pitch perception by cochlear implant subjects

[...]

Brent Townshend, Neil Cotter, Dirk Van Compernolle, Robert L. White

01 Jul 1987-Journal of the Acoustical Society of America

TL;DR: The effects of the three basic stimulus parameters of level, repetition rate, and stimulation location on subjects' percepts were examined and their impact on speech-processing strategies and their relevance to acoustic pitch perception were discussed.

...read moreread less

Abstract: Direct electrical stimulation of the auditory nerve can be used to restore some degree of hearing to the profoundly deaf. Percepts due to electrical stimulation have characteristics corresponding approximately to the acoustic percepts of loudness, pitch, and timbre. To encode speech as a pattern of electrical stimulation, it is necessary to determine the effects of the stimulus parameters on these percepts. The effects of the three basic stimulus parameters of level, repetition rate, and stimulation location on subjects' percepts were examined. Pitch difference limens arising from changes in rate of stimulation increase as the stimulating rate increases, up to a saturation point of between 200 and 1000 pulses per second. Changes in pitch due to electrode selection depend upon the subject, but generally agree with a tonotopic organization of the human cochlea. Further, the discriminability of such place-pitch percepts seems to be dependent on the degree of current spread in the cochlea. The effect of stimulus level on perceived pitch is significant but is highly dependent on the individual tested. The results of these experiments are discussed in terms of their impact on speech-processing strategies and their relevance to acoustic pitch perception.

...read moreread less

300 citations

Patent•DOI•

Ultrasound enhancement of transdermal drug delivery

[...]

Joseph Kost¹, Drora Levy¹, Robert Langer¹•Institutions (1)

Massachusetts Institute of Technology¹

29 Jun 1987-Journal of the Acoustical Society of America

TL;DR: In this paper, a method using ultrasound for enhancing and controlling transbuccal permeation of a molecule, including drugs, antigens, vitamins, inorganic and organic compounds, through the buccal membranes and into the circulatory system is presented.

...read moreread less

Abstract: A method using ultrasound for enhancing and controlling transbuccal permeation of a molecule, including drugs, antigens, vitamins, inorganic and organic compounds, and various combinations of these substances, through the buccal membranes and into the circulatory system. The frequency and intensity of ultrasonic energy which is applied, and the length of time of exposure are determined according to the location and nature of the buccal membrane and the substance to be infused. Levels of the infused molecules in the blood and urine measured over a period of time are initially used to determine under what conditions optimum transfer occurs. In a variation of the method, whereby ultrasound is applied directly to the compound and site where the compound is to be infused through the buccal membranes, the compound can be placed within a delivery device. In one variation, the ultrasound can control release both by direct interaction with the compound and membrane but also with the delivery device. In another variation, the delivery device helps to modulate release and infusion rate. The compound can also be administered in combination with a chemical agent which alters permeability of the buccal membrane, thereby aiding infusion of the compound into the circulatory system.

...read moreread less

Journal Article•DOI•

Nearfield acoustic holography (NAH) II. Holographic reconstruction algorithms and computer implementation

[...]

W. A. Veronesi, Julian D. Maynard

01 May 1987-Journal of the Acoustical Society of America

TL;DR: In this article, the approximations and assumptions necessary to reduce the infinite and continuous convolution integrals encountered in these problems to a finite and discrete form, suitable for high speed numerical processing, are illuminated theoretically and tested numerically.

...read moreread less

Abstract: The basic theory treating steady‐state acoustic radiation problems in the nearfield has been presented in several articles on nearfield acoustic holography. In this article, the approximations and assumptions necessary to reduce the infinite and continuous convolution integrals encountered in these problems to a finite and discrete form, suitable for high‐speed numerical processing, are illuminated theoretically and tested numerically. To evaluate the convolution integrals two assumptions are made: First, the boundary field may be replaced with a patchwise constant field for reasonably small patches; and, second, the field is negligible outside of a finite region. With these two assumptions, the problem reduces to one of representing the Green’s functions. Six methods of sampling or representing the Green’s functions are developed, and these are compared theoretically and numerically.

...read moreread less

Journal Article•DOI•

The timing of prenuclear high accents in English

[...]

Kim E. A. Silverman, Janet B. Pierrehumbert

01 Nov 1987-Journal of the Acoustical Society of America

TL;DR: In this article, an experimental study of the alignment of prenuclear accent peaks with their associated syllables is described, and it is concluded that rules for generating phonetic details from phonological structure must access information about the upcoming prosodic context.

...read moreread less

Abstract: In English, the alignment of intonation peaks with their syllables exhibits a great deal of contextually governed variation. Understanding this variation is of theoretical interest, and modeling it correctly is important for good quality intonation synthesis. An experimental study of the alignment of prenuclear accent peaks with their associated syllables will be described. Two speakers produced repetitions of names of the form “Ma Lemm,” “Mom LeMann,” “Mamalie Lemonick,” and “Mama Lemonick,” with all combinations of the four first names and three surnames. Segmental durations and the F0 peak location in the first name were measured. Results show that although both speaking rate and prosodic context affect syllable duration, they exert different influences on peak alignment. Specifically, when a syllable is lengthened by a word boundary (e.g., Ma Le Man versus Mama Lemm) or stress clash (e.g., Ma Lemm), the peak falls disproportionately earlier in the vowel. This seems to be related to the syllable‐internal durational patterns. It is concluded that rules for generating phonetic details from phonological structure must access information about the upcoming prosodic context.

...read moreread less

Journal Article•DOI•

The 20‐Hz signals of finback whales (Balaenoptera physalus)

[...]

William A. Watkins¹, Peter L. Tyack, Karen E. Moore, James E Bird•Institutions (1)

Woods Hole Oceanographic Institution¹

01 Dec 1987-Journal of the Acoustical Society of America

TL;DR: Direct association of the bouts with the reproductive season for this species points to the 20-Hz signals as possible reproductive displays by finback whales.

...read moreread less

Abstract: The 20‐Hz signals of finback whales (Balaenoptera physalus) were analyzed from more than 25 years of recordings at a variety of geographic locations on near‐surface hydrophones close to whales and on deep hydrophone systems. These signals were composed of 1‐s pulses of sinusoidal waveform with downward sweeping frequency from approximately 23 to 18 Hz at variable source levels up to 186 dB (re: 1 μPa at 1 m), usually with slightly lower levels for the pulses at the beginning and end of sequences. These ‘‘20‐Hz’’ pulses were produced in signal bouts (separated by more than 2 h) lasting as long as 32.5 h. Bouts were composed of regularly repeated pulses at intervals of 7–26 s (typically), either at one nominal pulse rate or at two alternating (doublet) pulse intervals. Signal bouts were interrupted by rests of 1–20 min at roughly 15‐min intervals and by irregular gaps lasting between 20 and 120 min. The distribution of these signals throughout the year and their temporal sequence were analyzed from the cont...

...read moreread less

Journal Article•DOI•

Potentials evoked by the sinusoidal modulation of the amplitude or frequency of a tone

[...]

Terence W. Picton, Christopher R. Skinner, Sandra C. Champagne, Adrian J. C. Kellett, Anita C. Maiste - Show less +1 more

01 Jul 1987-Journal of the Acoustical Society of America

TL;DR: Steady state responses to the sinusoidal modulation of the amplitude or frequency of a tone were recorded from the human scalp for both amplitude modulation (AM) and frequency modulation (FM), the responses were most consistent at modulation frequencies between 30 and 50 Hz.

...read moreread less

Abstract: Steady state responses to the sinusoidal modulation of the amplitude or frequency of a tone were recorded from the human scalp. For both amplitude modulation (AM) and frequency modulation (FM), the responses were most consistent at modulation frequencies between 30 and 50 Hz. However, reliable responses could also be recorded at lower frequencies, particularly at 2-5 Hz for AM and at 3-7 Hz for FM. With increasing modulation depth at 40 Hz, both the AM and FM response increased in amplitude, but the AM response tended to saturate at large modulation depths. Neither response showed any significant change in phase with changes in modulation depth. Both responses increased in amplitude and decreased in phase delay with increasing intensity of the carrier tone, the FM response showing some saturation of amplitude at high intensities. Both responses could be recorded at modulation depths close to the subjective threshold for detecting the modulation and at intensities close to the subjective threshold for hearing the stimulus. The responses were variable but did not consistently adapt over periods of 10 min. The 40-Hz AM and FM responses appear to originate in the same generator, this generator being activated by separate auditory systems that detect changes in either amplitude or frequency.

...read moreread less

Journal Article•DOI•

Multidimensional tactile displays:identification of vibratory intensity, frequency, and contactor area

[...]

William M. Rabinowitz¹, Ajm Adrian Houtsma, N. I. Durlach, Lorraine A. Delhorne•Institutions (1)

Massachusetts Institute of Technology¹

01 Oct 1987-Journal of the Acoustical Society of America

TL;DR: Of the three dimensions considered, performance on the intensity variable was most affected, and performance on contactor area least affected, by simultaneous variations in the other dimensions.

...read moreread less

Abstract: Experiments were conducted to determine the ability of subjects to identify vibrotactile stimuli presented to the distal pad of the middle finger. The stimulus sets varied along one or more of the following dimensions: intensity of vibration, frequency of vibration, and contactor area. Identification performance was measured by information transfer. One‐dimensional stimulus sets produced values in the range 1–2 bits and, for most subjects, three‐dimensional sets produced values in the range 4–5 bits. Of the three dimensions considered, performance on the intensity variable was most affected, and performance on contactor area least affected, by simultaneous variations in the other dimensions.

...read moreread less

Journal Article•DOI•

Auditory‐perceptual interpretation of the vowel

[...]

James D. Miller¹•Institutions (1)

Central Institute for the Deaf¹

01 May 1987-Journal of the Acoustical Society of America

TL;DR: In this article, an account of the simple vowels of American English is given in terms of the author's auditory-perceptual theory, where the perceptualized spectral patterns associated with the non-retroflex vowels are shown to fall into regions within a slab in a three-dimensional, auditory perception space.

...read moreread less

Abstract: An account of the simple vowels of American English will be given in terms of the author's auditory‐perceptual theory. Data from many sources will be used to illustrate and support this view. In addition to theoretical suggestions concerning an auditory‐perceptual transformation and a segmentation rule, the perceptualized spectral patterns associated with the nonretroflex vowels will be shown to fall into regions within a slab in a three‐dimensional, auditory‐perceptual space. These regions form a “vowel map” that by simple rotations can be related to various “vowel charts” such as those of Jones, Pike, or Fant. While the presentheory is an elaboration of traditional formant‐ratio theory, it will be shown that a concept of a perceptual reference is needed not only for talker normalization but also to disambiguate vowels not distinguished by simple formant‐ratio theory. Additionally, loci for the retroflex and nasalized vowels illustrate the utility of describing vowel spectra by five numbers combined into three dimensions. Simple F1 by F2 descriptions are not adequate except for enigmatophiles. [Work supported by NINCDS and AFOSR.]

...read moreread less

Patent•DOI•

Speech recognition method

[...]

James K. Baker

03 Apr 1987-Journal of the Acoustical Society of America

TL;DR: This paper used smoothed frame labeling to detect acoustic patterns, in the form of known diphone models, in response to each such match, it associates with the speech an evidence score for each vocabulary word in which that pattern is known to occur, and then a combined-displaced evidence method is used to determine which words occur in the speech.

...read moreread less

Abstract: Smoothed frame labeling associates phonetic frame labels with a given speech frame as a function of (a) the closeness with which the given frame compares to each of a plurality of acoustic models, (b) which frame labels correspond with a neighboring frame, and (c) transition probabilities which indicate, for the frame labels associated with the neighboring frame, which frame labels are probably associated with the given frame. The smoothed frame labeling is used to divide the speech into segments of frames having the same class of labels. The invention represents words as a collection of known diphone models, each of which models the sound before and after a boundary between segments derived by the smoothed frame labeling. At recognition time, the speech is divided into segments by smoothed frame labeling; diphone models are derived for each boundary between the resulting segments; and the resulting diphone models are compared against the known diphone models to determine which of the known diphone models match the segment boundaries in the speech. Then a combined-displaced-evidence method is used to determine which words occur in the speech. This method detects which acoustic patterns, in the form of the known diphone models, match various portions of the speech. In response to each such match, it associates with the speech an evidence score for each vocabulary word in which that pattern is known to occur. It displaces each such score from the location of its associated matched pattern by the known distance between that pattern and the beginning of the score's word. Then all the evidence scores for a word located in a given portion of the speech are combined to produce a score which indicates the probability of that word starting in that portion of the speech. This score is combined with a score produced by comparing a histogram from a portion of the speech against a histogram of each word. The resulting combined score determines whether a given word should undergo a more detailed comparison against the speech to be recognized.

...read moreread less

Patent•DOI•

Speech recognition method

[...]

Akihiro Kuroda¹, Masafumi Nishimura¹, Kazuhide Sugawara¹•Institutions (1)

IBM¹

04 Feb 1987-Journal of the Acoustical Society of America

TL;DR: In this paper, confusion coefficients between the labels of the label alphabet for initial training and those for adaptation are determined by alignment of adaption speech with the corresponding initially trained Markov model.

...read moreread less

Abstract: For circumstance adaption, for example, speaker adaption, confusion coefficients between the labels of the label alphabet for initial training and those for adaption are determined by alignment of adaption speech with the corresponding initially trained Markov model. That is, each piece of adaptation speech is aligned with a corresponding initially trained Markov model by the Viterbi algorithm, and each label in the adaption speech is mapped onto one of the states of the Markov models. In respect of each adaptation lable ID, the parameter values for each initial training label of the states which are mapped onto the adaptation label in concern are accumulated and normalized to generate a confusion coefficient between each initial training label and each adaptation label. The parameter table of each Markov model is rewritten in respect of the adaptation label alphabet using the confusion coefficients.

...read moreread less

Patent•DOI•

Constructing Markov model word baseforms from multiple utterances by concatenating model sequences for word segments

[...]

Lalit R. Bahl¹, Peter Vincent Desouza¹, Robert Leroy Mercer¹, Michael Picheny¹•Institutions (1)

IBM¹

16 Dec 1987-Journal of the Acoustical Society of America

TL;DR: In this paper, a method for segmenting multiple utterances of a vocabulary word in a consistent and coherent manner and determining a Markov model sequence for each segment was presented, where fenemic Markov models correspond to each label.

...read moreread less

Abstract: The present invention relates to apparatus and method for segmenting multiple utterances of a vocabulary word in a consistent and coherent manner and determining a Markov model sequence for each segment A fenemic Markov model corresponds to each label

...read moreread less

Patent•DOI•

Collocational grammar system

[...]

Henry Kucera¹, Alwin B. Carus¹, Jeffrey G. Hopkins¹•Institutions (1)

Houghton Mifflin Harcourt¹

07 Oct 1987-Journal of the Acoustical Society of America

TL;DR: In this article, an empirical probability of collocation function defined on pairs of tags is iteratively extended to a selected set of tag sequences of increasing length so as to select a most probable tag for each word of a sequence of ambiguously-tagged words.

...read moreread less

Abstract: A system for the grammatical annotation of natural language receives natural language text and annotates each word with a set of tags indicative of its possible grammatical or syntactic uses. An empirical probability of collocation function defined on pairs of tags is iteratively extended to a selected set of tag sequences of increasing length so as to select a most probable tag for each word of a sequence of ambiguously-tagged words. For listed pairs of commonly confused words a substitute calculation reveals erroneous use of the wrong word. For words with tags having abnormally low frequency of occurrence, a stored table of reduced probability factors corrects the calculation. Once the text words have been annotated with their most probable tags, the tagged text is parsed by a parser which successively applies phrasal, predicate and clausal analysis to build higher structures from the disambiguated tag strings. A voice/text translator including such a tag annotator resolves sound or spelling ambiguity of words by their differing tags. A database retrieval system, such as a spelling checker, includes a tag annotator to identify desired data by syntactic features.

...read moreread less

Journal Article•DOI•

Derivation of primary parameters and procedures for use in speech intelligibility predictions.

[...]

Chaslav V. Pavlovic

01 Aug 1987-Journal of the Acoustical Society of America

TL;DR: The literature on various parameters that appear in the articulation index-type calculations of speech intelligibility is reexamined and the best estimates of these parameters and the most appropriate procedures for their use are suggested.

...read moreread less

Abstract: The literature on various parameters that appear in the articulation index‐type calculations of speech intelligibility is reexamined. Based on the reported data, the best estimates of these parameters and the most appropriate procedures for their use are suggested. These included: (1) the analysis and specification of the importance of various frequency bands to speech intelligibility; (2) the procedures used for measuring threshold and the calculation of threshold‐based parameters used for predicting intelligibility of low‐level speech; and (3) the calculation and measurement of relevant speech parameters. All results are given so that the calculations can be performed either in critical bands, 1/3 octaves, or octaves.

...read moreread less

Patent•DOI•

Programmable hearing aid

[...]

Jan Topholm

22 Jan 1987-Journal of the Acoustical Society of America

TL;DR: In this paper, a programmable hearing aid with an amplifier and transmission section whose transmission characteristics can be controlled, with a control unit (1), with a transmitter for wireless transmission of control signals to the hearing aid (6) and a receiver (7) located therein for receiving and demodulating control signals, was described.

...read moreread less

Abstract: The invention relates to a programmable hearing aid with an amplifier and transmission section whose transmission characteristics can be controlled, with a control unit (1), with a transmitter for wireless transmission of control signals to the hearing aid (6) and a receiver (7) located therein for receiving and demodulating control signals, whereby the external control unit (1) contains an initial memory (20) for some of the parameters which determine the transmission characteristics of the hearing aid, a control panel with entry keypad (5) for recalling such parameters from the memory, a transmitter (4) which can be modulated with these parameters as control signals and a digital control unit (3) and whereby the hearing aid contains a further control unit (8) which can be activated by the control signals after they have been demodulated, for control of the transmission section.

...read moreread less

Patent•DOI•

Method and system for measuring azimuthal anisotropy effects using acoustic multipole transducers

[...]

Becker David F, Sen-Tsuen Chen, Perelberg Azik I, Graham A. Winbow

08 Sep 1987-Journal of the Acoustical Society of America

TL;DR: In this article, an acoustic borehole well logging method and apparatus for measuring azimuthal anisotropy of a formation traversed by a borehole using at least one multipole transducer were presented.

...read moreread less

Abstract: An acoustic borehole well logging method and apparatus for measuring azimuthal anisotropy of a formation traversed by a borehole using at least one multipole transducer. In a preferred embodiment, a dipole wave transmitter and at least one detector sensitive to dipole waves are employed. In an alternative preferred embodiment, a monopole transmitter and at least one multipole detector are employed. In the inventive method, two acoustic wave arrivals are detected, each associated with a different azimuthal orientation relative to the longitudinal axis of the borehole (i.e. each is transmitted by a multipole transmitter oriented at such angle, or is detected by a multipole detector oriented at such angle, or both). The inventive apparatus preferably includes at least one transducer unit including two or more multipole transmitters (or two or more multipole detectors) oriented at different azimuthal angles relative to the tool's longitudinal axis.

...read moreread less

Patent•DOI•

Mass production auditory canal hearing aid.

[...]

Barry Voroba, Dennis A. Oberlander

20 May 1987-Journal of the Acoustical Society of America

TL;DR: In this article, a patient can select a form fitting shell (20) with a malleable covering (30) having a hook (13) and twist which precisely conforms to the patient's ear.

...read moreread less

Abstract: The in-the-canal hearing aid (10) shown has patient selected physical (30, 134) and electronic (90, 110, C2, R1, R2, R3) components. A patient may personally select the best suited hearing aid (10) during the testing process and walk away with the hearing aid personally selected. The patient is allowed to select a form fitting shell (20) with a malleable covering (30) having a hook (13) and twist which precisely conforms to the patient's ear. The patient then listens to sounds with or without background noise and from various directions using electronic components (60, 70, 80, 90, 120) which conform to the specifications of the hearing aid and personally chooses those electronics (90, 110, C2, R1-R3) which best aid or assist his or her hearing loss. These electronics are quickly inserted into the chosen shell. Replacement is easily accomplished by replacing the shell if physical discomfort occurs or by replacing the electronics if an unexpected sound environment exists.

...read moreread less

Journal Article•DOI•

A pulse ribbon model of monaural phase perception.

[...]

Roy D. Patterson

01 Nov 1987-Journal of the Acoustical Society of America

TL;DR: A highly simplified model of the cochlea, consisting of an auditory filter bank and units that record the times of the larger peaks in the filter outputs, is developed to explain the two contrasting sets of results.

...read moreread less

Abstract: This article presents two sets of experiments concerning the ability to discriminate changes in the phase spectra of wideband periodic sounds. In the first set, a series of local phase changes is used to modify the envelopes of the waves appearing at the outputs of a range of auditory filters. The size of the local phase change required for discrimination is shown to be strongly dependent on the repetition rate, intensity, and spectral location of the signal. In the second set of experiments, a global phase change is used to produce a progressive phase shift between the outputs of successive auditory filters, without changing the envelopes of the filtered waves. Contrary to what is often assumed, listeners can discriminate between‐channel phase shifts once the total time delay across the channels containing the signal reaches 4–5 ms. In this case, however, the discrimination is largely independent of signal parameters other than bandwidth. A highly simplified model of the cochlea, consisting of an auditory filter bank and units that record the times of the larger peaks in the filter outputs, is developed to explain the two contrasting sets of results.

...read moreread less

Journal Article•DOI•

Wheel/rail rolling noise, I: Theoretical analysis

[...]

Paul J. Remington

01 Jun 1987-Journal of the Acoustical Society of America

TL;DR: In this article, a comprehensive analytical model of the wayside noise generated by a railroad wheel rolling on straight track is presented, which assumes that the small-scale roughness on the running surfaces of the wheel and rail is the primary mechanism for the noise generation.

...read moreread less

Abstract: A comprehensive analytical model of the wayside noise generated by a railroad wheel rolling on straight track is presented. The model assumes that the small‐scale roughness on the running surfaces of the wheel and rail is the primary mechanism for the noise generation. Included in the model are such effects as the spatial filtering of the roughness due to the finite area of contact between the wheel and rail; the interaction between the wheel and rail, including local contact stiffness; axial response of the wheel; the radiation efficiencies of the wheel and rail; and the influence of sound propagation, including finite ground impedance.

...read moreread less

Journal Article•DOI•

Evidence for mora timing in Japanese

[...]

Robert F. Port, Jonathan M. Dalby, Michael O’Dell

01 May 1987-Journal of the Acoustical Society of America

TL;DR: Four experiments are reported investigating segmental timing in Japanese in order to test several straightforward hypotheses about mora timing, finding that the duration of a word stays very close to a target duration that depends on the number of moras in it.

...read moreread less

Abstract: Japanese has long been described as a "mora-timed" language by linguists. Japanese pedagogy has traditionally claimed that moras are constant in duration. Four experiments are reported investigating segmental timing in Japanese in order to test several straightforward hypotheses about mora timing. First, it is demonstrated that words with an increasing number of moras increase in duration by nearly constant increments. The next two experiments explored the mechanisms by which constant mora durations are achieved given that there are large universal differences in the inherent duration of various segment types (e.g., /u/vs/a/), and given that some syllables are supposed to be two mora long (such as, those with long vowels or final consonants). In each case, it was found that the duration of a word stays very close to a target duration that depends on the number of moras in it. This is achieved by stretching or compressing the duration of neighboring segments and adjacent moras. Thus increasing the number of segments in two-mora syllables results in lengthening, not the expected shortening, of other segments in the heavier syllable.

...read moreread less

Patent•DOI•

Stereo enhancement system

[...]

Arnold I. Klayman

27 Jan 1987-Journal of the Acoustical Society of America

TL;DR: In this article, a stereo enhancement system processes the difference signal component generated from a pair of left and right input signals to create a broadened stereo image reproduced through a pair speakers or through a surround sound system.

...read moreread less

Abstract: A stereo enhancement system processes the difference signal component generated from a pair of left and right input signals to create a broadened stereo image reproduced through a pair of speakers or through a surround sound system. Processing of the difference signal component occurs through equalization characterized by amplification of the low and high range of auditory frequencies. The processed difference signal is combined with a sum signal, generated from the left and right input signals, and the original left and right input signals to create enhanced left and right output signals.

...read moreread less

Patent•DOI•

System of therapeutic ultrasound and real-time ultrasonic scanning

[...]

D. Jackson Coleman¹, F.L. Lizzi¹•Institutions (1)

Cornell University¹

02 Nov 1987-Journal of the Acoustical Society of America

TL;DR: In this paper, a system is described for obtaining real-time cross-sectional and 3D images of a body under study using ultrasonic energy, where a transducer is electronically swept or physically rotated to produce a series of sectored scan planes which are separated by a known angular distance.

...read moreread less

Abstract: A system is described for obtaining in real-time cross-sectional and 3-dimensional images of a body under study using ultrasonic energy. A piezoelectric transducer is positioned to emit ultrasonic energy and receive echo pulses. The transducer is electronically swept or physically rotated to produce a series of sectored scan planes which are separated by a known angular distance. The echo pulses are processed to produce an ultrasonic image in pseudo 3-dimensional display. By using data from one scan plane, processed as a B-scan image, cross-sectional data can be obtained. Such is combined in a display with an overlay to visually portray the object and positioning information or comparative data. The system is combined with a computer for data analysis and a therapeutic transducer for treatment.

...read moreread less

Collapse