
Showing papers in "Journal of the Acoustical Society of America in 2015"


Journal ArticleDOI
TL;DR: The main principles, landmarks in the development, and state-of-the-art for techniques that are based on geometrical acoustics principles are described.
Abstract: Computerized room acoustics modeling has been practiced for almost 50 years to date. These modeling techniques play an important role in room acoustic design nowadays, often including auralization, but can also help in the construction of virtual environments for such applications as computer games, cognitive research, and training. This overview describes the main principles, landmarks in the development, and state-of-the-art for techniques that are based on geometrical acoustics principles. A focus is given to their capabilities to model the different aspects of sound propagation: specular vs diffuse reflections, and diffraction.
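The image-source construction at the core of many geometrical-acoustics room models can be sketched in a few lines. The helper below is illustrative only (the room geometry, positions, and 343 m/s sound speed are example assumptions, not values from the paper): it returns the arrival delays of the direct path and the six first-order wall reflections in a rectangular room.

```python
import math

def image_source_delays(src, rcv, room, c=343.0):
    """Delays (s) of the direct path and the six first-order wall
    reflections in a rectangular room via the image-source method.
    src, rcv: (x, y, z) positions; room: (Lx, Ly, Lz) dimensions."""
    dists = [math.dist(src, rcv)]  # direct path
    for axis in range(3):
        for wall in (0.0, room[axis]):
            img = list(src)
            img[axis] = 2.0 * wall - src[axis]  # mirror the source across the wall
            dists.append(math.dist(img, rcv))
    return sorted(d / c for d in dists)

delays = image_source_delays((1.0, 1.0, 1.5), (3.0, 2.0, 1.5), (4.0, 3.0, 2.5))
```

Higher-order reflections follow by mirroring the image sources recursively, which is where the computational cost of such models comes from.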

226 citations


Journal ArticleDOI
TL;DR: An investigation was conducted to assess the effects that distributions of ABHs embedded in plate-like structures have on both vibration and structure radiated sound, focusing on characterizing and improving low frequency performance.
Abstract: The concept of an Acoustic Black Hole (ABH) has been developed and exploited as an approach for passively attenuating structural vibration. The basic principle of the ABH relies on proper tailoring of the structure geometrical properties in order to produce a gradual reduction of the flexural wave speed, theoretically approaching zero. For practical systems the idealized “zero” wave speed condition cannot be achieved so the structural areas of low wave speed are treated with surface damping layers to allow the ABH to approach the idealized dissipation level. In this work, an investigation was conducted to assess the effects that distributions of ABHs embedded in plate-like structures have on both vibration and structure radiated sound, focusing on characterizing and improving low frequency performance. Finite Element and Boundary Element models were used to assess the vibration response and radiated sound power performance of several plate configurations, comparing baseline uniform plates with embedded periodic ABH designs. The computed modal loss factors showed the importance of the ABH unit cell low order modes in the overall vibration reduction effectiveness of the embedded ABH plates at low frequencies where the free plate bending wavelengths are longer than the scale of the ABH.
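The taper principle behind the ABH is easy to see numerically: in thin-plate theory the flexural phase speed scales as c = sqrt(omega) * (D / (rho*h))^(1/4), so a power-law thickness profile h(x) drives the speed toward zero at the tip. The sketch below assumes generic aluminium properties and an example quadratic taper, neither of which comes from the abstract.

```python
import math

def flexural_speed(h, f, E=70e9, rho=2700.0, nu=0.33):
    """Thin-plate flexural phase speed c = sqrt(omega) * (D/(rho*h))**0.25,
    with bending stiffness D = E*h**3 / (12*(1 - nu**2)).
    Defaults are generic aluminium values (an assumption, not from the paper)."""
    omega = 2.0 * math.pi * f
    D = E * h ** 3 / (12.0 * (1.0 - nu ** 2))
    return math.sqrt(omega) * (D / (rho * h)) ** 0.25

# Example quadratic taper h(x) = 2 mm * (x / 0.1 m)**2: the phase speed
# falls toward zero as the wave approaches the tip at x = 0.
speeds = [flexural_speed(0.002 * (x / 0.1) ** 2, 1000.0) for x in (0.1, 0.05, 0.01)]
```

Since c scales with sqrt(h), the quadratic taper makes the speed fall linearly in x, which is why the damping layers must carry the dissipation near the (never quite zero-speed) tip.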

148 citations


Journal ArticleDOI
TL;DR: In this article, a two-dimensional acoustic cloak that is invisible in a prescribed direction was designed; such a directional cloak has potential military use, since a target object hidden from an enemy in front can still be identified by friendly forces behind it.
Abstract: The concept of acoustic parity-time (PT) symmetry is introduced and used for the study of extraordinary scattering behavior in acoustic PT-symmetric media consisting of loss and gain units. The analytical study of acoustic PT-symmetric media shows that these media can be designed to achieve unidirectional transparency at specific frequencies named exceptional points (EPs). This unidirectional transparency at the EPs is due to the asymmetrical arrangement of the periodic loss and gain units, which results in different Bragg scattering on the two sides of the PT-symmetric media. A close look at the phases of the reflections on both sides reveals a sudden jump of the reflection phase on one side at the EPs. This step-function-like behavior causes an infinite delay time of the reflected wave on that side, and hence the media become reflectionless in that direction. Combining the concept of acoustic PT-symmetry with transformation acoustics, we design a two-dimensional acoustic cloak that is invisible in a prescribed direction. This kind of directional cloak is especially important for military use, since a target object hidden from an enemy in front can still be identified by friendly forces at the back. Other useful applications, such as directional acoustic imaging, noise cancellation, architectural acoustics, and acoustic amplification, can also be developed.

139 citations


Journal ArticleDOI
TL;DR: The frequency range in which these porous materials exhibit high values of the absorption coefficient can be extended by using Helmholtz resonators with a range of carefully tuned neck lengths.
Abstract: This paper studies the acoustical properties of hard-backed porous layers with periodically embedded air filled Helmholtz resonators. It is demonstrated that some enhancements in the acoustic absorption coefficient can be achieved in the viscous and inertial regimes at wavelengths much larger than the layer thickness. This enhancement is attributed to the excitation of two specific modes: Helmholtz resonance in the viscous regime and a trapped mode in the inertial regime. The enhancement in the absorption that is attributed to the Helmholtz resonance can be further improved when a small amount of porous material is removed from the resonator necks. In this way the frequency range in which these porous materials exhibit high values of the absorption coefficient can be extended by using Helmholtz resonators with a range of carefully tuned neck lengths.
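The role of the tuned neck lengths can be illustrated with the textbook Helmholtz resonance formula f = (c / (2*pi)) * sqrt(S / (V * L_eff)). This is a generic sketch; the resonator dimensions and the ~1.7r end correction are illustrative assumptions, not values from the paper.

```python
import math

def helmholtz_freq(neck_len, neck_radius, cavity_vol, c=343.0):
    """Resonance frequency f = (c / (2*pi)) * sqrt(S / (V * L_eff)) of a
    Helmholtz resonator; L_eff adds a standard ~1.7*r end correction."""
    S = math.pi * neck_radius ** 2          # neck cross-section (m^2)
    L_eff = neck_len + 1.7 * neck_radius    # effective neck length (m)
    return c / (2.0 * math.pi) * math.sqrt(S / (cavity_vol * L_eff))

# Carefully varied neck lengths spread the resonances over a band,
# widening the frequency range of high absorption.
freqs = [helmholtz_freq(L, 0.002, 1e-5) for L in (0.005, 0.010, 0.020)]
```

Longer necks lower the resonance frequency, so a set of resonators with graded neck lengths covers a broader low-frequency band than identical ones.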

125 citations


Journal ArticleDOI
TL;DR: Although musicians outperformed non-musicians on a measure of frequency discrimination, they showed no advantage in perceiving masked speech, and non-verbal IQ, rather than musicianship, significantly predicted speech reception thresholds in noise.
Abstract: There is much interest in the idea that musicians perform better than non-musicians in understanding speech in background noise. Research in this area has often used energetic maskers, which have their effects primarily at the auditory periphery. However, masking interference can also occur at more central auditory levels, known as informational masking. This experiment extends existing research by using multiple maskers that vary in their informational content and similarity to speech, in order to examine differences in perception of masked speech between trained musicians (n = 25) and non-musicians (n = 25). Although musicians outperformed non-musicians on a measure of frequency discrimination, they showed no advantage in perceiving masked speech. Further analysis revealed that non-verbal IQ, rather than musicianship, significantly predicted speech reception thresholds in noise. The results strongly suggest that the contribution of general cognitive abilities needs to be taken into account in any investigations of individual variability for perceiving speech in noise.

121 citations


Journal ArticleDOI
TL;DR: In this review, the methods employed by the groups conducting marine mammal TTS experiments are described and the relationships between the experimental conditions, the noise exposure parameters, and the observed TTS are summarized.
Abstract: One of the most widely recognized effects of intense noise exposure is a noise-induced threshold shift—an elevation of hearing thresholds following cessation of the noise. Over the past twenty years, as concerns over the potential effects of human-generated noise on marine mammals have increased, a number of studies have been conducted to investigate noise-induced threshold shift phenomena in marine mammals. The experiments have focused on measuring temporary threshold shift (TTS)—a noise-induced threshold shift that fully recovers over time—in marine mammals exposed to intense tones, band-limited noise, and underwater impulses with various sound pressure levels, frequencies, durations, and temporal patterns. In this review, the methods employed by the groups conducting marine mammal TTS experiments are described and the relationships between the experimental conditions, the noise exposure parameters, and the observed TTS are summarized. An attempt has been made to synthesize the major findings across experiments to provide the current state of knowledge for the effects of noise on marine mammal hearing.

116 citations


Journal ArticleDOI
TL;DR: In this paper, compressive sensing (CS) is used to reconstruct the direction of arrival (DOA) of multiple sources using a sparsity constraint, where the acoustic pressure at each sensor is expressed as a phase-lagged superposition of source amplitudes at all hypothetical DOAs.
Abstract: For a sound field observed on a sensor array, compressive sensing (CS) reconstructs the direction of arrival (DOA) of multiple sources using a sparsity constraint. The DOA estimation is posed as an underdetermined problem by expressing the acoustic pressure at each sensor as a phase-lagged superposition of source amplitudes at all hypothetical DOAs. Regularizing with an l1-norm constraint renders the problem solvable with convex optimization, and promoting sparsity gives high-resolution DOA maps. Here the sparse source distribution is derived using maximum a posteriori estimates for both single and multiple snapshots. CS does not require inversion of the data covariance matrix and thus works well even for a single snapshot where it gives higher resolution than conventional beamforming. For multiple snapshots, CS outperforms conventional high-resolution methods even with coherent arrivals and at low signal-to-noise ratio. The superior resolution of CS is demonstrated with vertical array data from the SWellEx96 experiment for coherent multi-paths.
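The phase-lagged steering-matrix model described above can be sketched as follows. Note that the paper solves an l1-regularized convex program; this illustration substitutes a simple greedy solver (orthogonal matching pursuit) on a hypothetical 16-element half-wavelength array, which recovers on-grid sources from a single noiseless snapshot.

```python
import numpy as np

def steering_matrix(n_sensors, angles_deg, spacing=0.5):
    """Phase-lagged steering vectors for a uniform line array
    (sensor spacing given in wavelengths)."""
    n = np.arange(n_sensors)[:, None]
    s = np.sin(np.deg2rad(np.asarray(angles_deg, dtype=float)))[None, :]
    return np.exp(-2j * np.pi * spacing * n * s)

def omp_doa(y, A, n_sources):
    """Greedy sparse recovery (orthogonal matching pursuit), a simple
    stand-in here for the paper's l1-regularized convex solver."""
    support = []
    residual = y.copy()
    for _ in range(n_sources):
        support.append(int(np.argmax(np.abs(A.conj().T @ residual))))
        sub = A[:, support]
        amps, *_ = np.linalg.lstsq(sub, y, rcond=None)
        residual = y - sub @ amps
    return sorted(support)

grid = np.arange(-90, 91)                 # 1-degree DOA grid
A = steering_matrix(16, grid)
y = 1.0 * A[:, 70] + 0.7 * A[:, 110]      # two sources: -20 and +20 degrees
found = omp_doa(y, A, 2)                  # single noiseless snapshot
```

As in the paper's single-snapshot case, no data covariance matrix is formed: the sparse solver works directly on one array snapshot.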

113 citations


Journal ArticleDOI
TL;DR: A passive beamforming method is presented that provides greatly improved spatial accuracy over the conventionally used time exposure acoustics (TEA) PAM reconstruction algorithm; experimental results from in vitro experiments and in vivo oncolytic viral therapy trials demonstrate the improvement.
Abstract: Passive acoustic mapping (PAM) is a promising imaging method that enables real-time three-dimensional monitoring of ultrasound therapy through the reconstruction of acoustic emissions passively received on an array of ultrasonic sensors. A passive beamforming method is presented that provides greatly improved spatial accuracy over the conventionally used time exposure acoustics (TEA) PAM reconstruction algorithm. Both the Capon beamformer and the robust Capon beamformer (RCB) for PAM are suggested as methods to reduce interference artifacts and improve resolution, which has been one of the experimental issues previously observed with TEA. Simulation results that replicate the experimental artifacts suggest that bubble interactions are their chief cause. Analysis is provided to show that these multiple-bubble artifacts are generally not reduced by TEA, while Capon-based methods are able to reduce them. This is followed by results from in vitro experiments and in vivo oncolytic viral therapy trials, in which RCB localizes the acoustic activity more accurately than TEA.
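As a rough illustration of Capon-type processing, the sketch below forms the minimum-variance spatial spectrum P = 1 / (a^H R^-1 a) with diagonal loading as a simple robustness stand-in. The array geometry, source angle, and noise level are invented for the example and unrelated to the paper's PAM setup.

```python
import numpy as np

def capon_spectrum(R, A, loading=1e-3):
    """Capon (MVDR) spatial spectrum P = 1 / (a^H R^{-1} a) per steering
    vector, with diagonal loading as a simple robustness measure."""
    n = R.shape[0]
    Rl = R + loading * (np.trace(R).real / n) * np.eye(n)
    Ri = np.linalg.inv(Rl)
    return 1.0 / np.real(np.einsum("ij,ik,kj->j", A.conj(), Ri, A))

rng = np.random.default_rng(0)
n_sensors, n_snapshots = 12, 200
pos = np.arange(n_sensors)[:, None]
grid = np.arange(-90, 91)                                  # degrees
A = np.exp(-1j * np.pi * pos * np.sin(np.deg2rad(grid))[None, :])
sig = rng.standard_normal(n_snapshots) + 1j * rng.standard_normal(n_snapshots)
noise = 0.1 * (rng.standard_normal((n_sensors, n_snapshots))
               + 1j * rng.standard_normal((n_sensors, n_snapshots)))
X = A[:, [120]] * sig[None, :] + noise                      # one source at +30 deg
R = X @ X.conj().T / n_snapshots
P = capon_spectrum(R, A)
peak_deg = int(grid[int(np.argmax(P))])
```

The robust Capon beamformer of the paper additionally optimizes over an uncertainty set for the steering vector; the diagonal loading here is only the simplest proxy for that idea.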

109 citations


Journal ArticleDOI
TL;DR: In this article, a continuous formulation of the direction-of-arrival (DOA) estimation problem is employed and an optimization procedure is introduced, which promotes sparsity on a continuous optimization variable.
Abstract: The direction-of-arrival (DOA) estimation problem involves the localization of a few sources from a limited number of observations on an array of sensors, thus it can be formulated as a sparse signal reconstruction problem and solved efficiently with compressive sensing (CS) to achieve high-resolution imaging. On a discrete angular grid, the CS reconstruction degrades due to basis mismatch when the DOAs do not coincide with the angular directions on the grid. To overcome this limitation, a continuous formulation of the DOA problem is employed and an optimization procedure is introduced, which promotes sparsity on a continuous optimization variable. The DOA estimation problem with infinitely many unknowns, i.e., source locations and amplitudes, is solved over a few optimization variables with semidefinite programming. The grid-free CS reconstruction provides high-resolution imaging even with non-uniform arrays, single-snapshot data and under noisy conditions as demonstrated on experimental towed array data.

107 citations


Journal ArticleDOI
TL;DR: Toward a consensus on symbolic notation of harmonics, resonances, and formants in vocalization.
Abstract: Toward a consensus on symbolic notation of harmonics, resonances, and formants in vocalization

105 citations


Journal ArticleDOI
TL;DR: A quantitative acoustic analysis of a large number of vocalizations produced by marmosets in a social environment within a captive colony shows that marmosets have a complex vocal repertoire in captivity that consists of multiple vocalization types, including both simple calls and compound calls composed of sequences of simple calls.
Abstract: The common marmoset (Callithrix jacchus), a highly vocal New World primate species, has emerged in recent years as a promising animal model for studying brain mechanisms underlying perception, vocal production, and cognition. The present study provides a quantitative acoustic analysis of a large number of vocalizations produced by marmosets in a social environment within a captive colony. Previous classifications of the marmoset vocal repertoire were mostly based on qualitative observations. In the present study a variety of vocalizations from individually identified marmosets were sampled and multiple acoustic features of each type of vocalization were measured. Results show that marmosets have a complex vocal repertoire in captivity that consists of multiple vocalization types, including both simple calls and compound calls composed of sequences of simple calls. A detailed quantification of the vocal repertoire of the marmoset can serve as a solid basis for studying the behavioral significance of their vocalizations and is essential for carrying out studies that investigate such properties as perceptual boundaries between call types and among individual callers as well as neural coding mechanisms for vocalizations. It can also serve as the basis for evaluating abnormal vocal behaviors resulting from diseases or genetic manipulations.

Journal ArticleDOI
TL;DR: A training method that uses a real-time analysis of the acoustic properties of vowels produced by non-native speakers to provide them with immediate, trial-by-trial visual feedback about their articulation alongside that of the same vowels produced by native speakers is introduced.
Abstract: Second-language learners often experience major difficulties in producing non-native speech sounds. This paper introduces a training method that uses a real-time analysis of the acoustic properties of vowels produced by non-native speakers to provide them with immediate, trial-by-trial visual feedback about their articulation alongside that of the same vowels produced by native speakers. The Mahalanobis acoustic distance between non-native productions and target native acoustic spaces was used to assess L2 production accuracy. The experiment shows that 1 h of training per vowel improves the production of four non-native Danish vowels: the learners' productions were closer to the corresponding Danish target vowels after training. The production performance of a control group remained unchanged. Comparisons of pre- and post-training vowel discrimination performance in the experimental group showed improvements in perception. Correlational analyses of training-related changes in production and perception revealed no relationship. These results suggest, first, that this training method is effective in improving non-native vowel production. Second, training purely on production improves perception. Finally, it appears that improvements in production and perception do not systematically progress at equal rates within individuals.
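The Mahalanobis distance used to score L2 productions against a native target space can be sketched directly. The F1/F2 values below are invented stand-ins for illustration, not data from the study.

```python
import numpy as np

def mahalanobis(x, cloud):
    """Mahalanobis distance from one production x to the distribution of a
    native target cloud (rows = tokens, cols = acoustic features)."""
    mu = cloud.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(cloud, rowvar=False))
    d = np.asarray(x, dtype=float) - mu
    return float(np.sqrt(d @ cov_inv @ d))

rng = np.random.default_rng(1)
# Invented native F1/F2 cloud (Hz) for an /i/-like vowel
native = rng.normal([350.0, 2100.0], [30.0, 120.0], size=(50, 2))
before = mahalanobis([480.0, 1800.0], native)  # pre-training production
after = mahalanobis([360.0, 2060.0], native)   # post-training, nearer the target
```

Unlike raw Hz differences, this measure normalizes by the natural spread of the native cloud in each dimension, so a 100 Hz error in F2 counts for less than in F1.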

Journal ArticleDOI
TL;DR: Overall, results suggest that holography can be implemented successfully as a metrological tool with small, quantifiable errors.
Abstract: Acoustic holography is a powerful technique for characterizing ultrasound sources and the fields they radiate, with the ability to quantify source vibrations and reduce the number of required measurements. These capabilities are increasingly appealing for meeting measurement standards in medical ultrasound; however, associated uncertainties have not been investigated systematically. Here errors associated with holographic representations of a linear, continuous-wave ultrasound field are studied. To facilitate the analysis, error metrics are defined explicitly, and a detailed description of a holography formulation based on the Rayleigh integral is provided. Errors are evaluated both for simulations of a typical therapeutic ultrasound source and for physical experiments with three different ultrasound sources. Simulated experiments explore sampling errors introduced by the use of a finite number of measurements, geometric uncertainties in the actual positions of acquired measurements, and uncertainties in the properties of the propagation medium. Results demonstrate the theoretical feasibility of keeping errors less than about 1%. Typical errors in physical experiments were somewhat larger, on the order of a few percent; comparison with simulations provides specific guidelines for improving the experimental implementation to reduce these errors. Overall, results suggest that holography can be implemented successfully as a metrological tool with small, quantifiable errors.

Journal ArticleDOI
TL;DR: Results support a top-down model for successful adaptation to, and recognition of, accented speech; they add to recent theories that allocate a prominent role for executive function to effective speech comprehension in adverse listening conditions.
Abstract: The present study investigated the effects of inhibition, vocabulary knowledge, and working memory on perceptual adaptation to accented speech. One hundred young, normal-hearing adults listened to sentences spoken in a constructed, unfamiliar accent presented in speech-shaped background noise. Speech Reception Thresholds (SRTs) corresponding to 50% speech recognition accuracy provided a measurement of adaptation to the accented speech. Stroop, vocabulary knowledge, and working memory tests were performed to measure cognitive ability. Participants adapted to the unfamiliar accent as revealed by a decrease in SRTs over time. Better inhibition (lower Stroop scores) predicted greater and faster adaptation to the unfamiliar accent. Vocabulary knowledge predicted better recognition of the unfamiliar accent, while working memory had a smaller, indirect effect on speech recognition mediated by vocabulary score. Results support a top-down model for successful adaptation to, and recognition of, accented speech; they add to recent theories that allocate a prominent role for executive function to effective speech comprehension in adverse listening conditions.

Journal ArticleDOI
TL;DR: Comparison of unoccupied and occupied data showed that unoccupied acoustic conditions affect the noise levels occurring during lessons, and room height and the amount of glazing need to be controlled to reduce reverberation to suitable levels for teaching and learning.
Abstract: An acoustic survey of secondary schools in England has been undertaken. Room acoustic parameters and background noise levels were measured in 185 unoccupied spaces in 13 schools to provide information on the typical acoustic environment of secondary schools. The unoccupied acoustic and noise data were correlated with various physical characteristics of the spaces. Room height and the amount of glazing were related to the unoccupied reverberation time and therefore need to be controlled to reduce reverberation to suitable levels for teaching and learning. Further analysis of the unoccupied data showed that the introduction of legislation relating to school acoustics in England and Wales in 2003 approximately doubled the number of school spaces complying with current standards. Noise levels were also measured during 274 lessons to examine typical levels generated during teaching activities in secondary schools and to investigate the influence of acoustic design on working noise levels in the classroom. Comparison of unoccupied and occupied data showed that unoccupied acoustic conditions affect the noise levels occurring during lessons. They were also related to the time spent in disruption to the lessons (e.g., students talking or shouting) and so may also have an impact upon student behavior in the classroom.
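The link between room height, glazing (a hard, reflective surface), and reverberation can be illustrated with the standard Sabine formula T = 0.161 * V / A. The classroom dimensions and absorption coefficients below are hypothetical, chosen only to show the trend the survey reports.

```python
def sabine_rt(volume, surfaces):
    """Sabine reverberation time T = 0.161 * V / A, with total absorption
    A = sum of (surface area * absorption coefficient)."""
    absorption = sum(area * alpha for area, alpha in surfaces)
    return 0.161 * volume / absorption

def classroom_rt(height, alpha_wall=0.05, alpha_ceiling=0.6, alpha_floor=0.1):
    """Hypothetical 7 m x 9 m classroom with an absorptive ceiling and
    hard (e.g. glazed) walls; only the ceiling height varies."""
    w, l = 7.0, 9.0
    surfaces = [(w * l, alpha_floor),
                (w * l, alpha_ceiling),
                (2.0 * (w + l) * height, alpha_wall)]
    return sabine_rt(w * l * height, surfaces)

rt_low, rt_high = classroom_rt(2.7), classroom_rt(3.5)  # taller room -> longer RT
```

Raising the ceiling grows the volume faster than the (mostly hard) wall absorption, which is why height must be controlled to keep reverberation at suitable levels.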

Journal ArticleDOI
TL;DR: A misalignment between an expected and an observed speech signal for the face-prime trials is suggested, which indicates that social information about a speaker can trigger linguistic associations that come with processing benefits and costs.
Abstract: Socio-indexical cues and paralinguistic information are often beneficial to speech processing, as this information assists listeners in parsing the speech stream. Associations between particular populations and particular speech styles can, however, give socio-indexical cues a cost. In this study, native speakers of Canadian English who identify as Chinese Canadian and White Canadian read sentences that were presented to listeners in noise. Half of the sentences were presented with a visual prime in the form of a photo of the speaker, and half were presented in control trials with fixation crosses. Sentences produced by Chinese Canadians showed an intelligibility cost in the face-prime condition, whereas sentences produced by White Canadians did not. In an accentedness rating task, listeners rated White Canadians as less accented in the face-prime trials, but Chinese Canadians showed no such change in perceived accentedness. These results suggest a misalignment between an expected and an observed speech signal for the face-prime trials, which indicates that social information about a speaker can trigger linguistic associations that come with processing benefits and costs.

Journal ArticleDOI
Eric W. Healy1, Sarah E. Yoho1, Jitong Chen1, Yuxuan Wang1, DeLiang Wang1 
TL;DR: Substantial sentence-intelligibility benefit was observed for hearing-impaired listeners in both noise types, despite the use of unseen noise segments during the test stage, which highlights the importance of evaluating these algorithms not only in human subjects, but in members of the actual target population.
Abstract: Machine learning algorithms to segregate speech from background noise hold considerable promise for alleviating limitations associated with hearing impairment. One of the most important considerations for implementing these algorithms into devices such as hearing aids and cochlear implants involves their ability to generalize to conditions not employed during the training stage. A major challenge involves the generalization to novel noise segments. In the current study, sentences were segregated from multi-talker babble and from cafeteria noise using an algorithm that employs deep neural networks to estimate the ideal ratio mask. Importantly, the algorithm was trained on segments of noise and tested using entirely novel segments of the same nonstationary noise type. Substantial sentence-intelligibility benefit was observed for hearing-impaired listeners in both noise types, despite the use of unseen noise segments during the test stage. Interestingly, normal-hearing listeners displayed benefit in babble but not in cafeteria noise. This result highlights the importance of evaluating these algorithms not only in human subjects, but in members of the actual target population.
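One common formulation of the ideal ratio mask that such networks are trained to estimate is IRM = sqrt(S / (S + N)) per time-frequency unit. The tiny power grids below are toy values for illustration; the paper's exact mask definition may differ in detail.

```python
import numpy as np

def ideal_ratio_mask(speech_power, noise_power):
    """IRM = sqrt(S / (S + N)) per time-frequency unit: near 1 where
    speech dominates, 0 where only noise is present."""
    return np.sqrt(speech_power / (speech_power + noise_power))

# Toy 2 x 3 time-frequency power grids (arbitrary units)
S = np.array([[4.0, 1.0, 0.0],
              [9.0, 0.25, 1.0]])
N = np.array([[1.0, 1.0, 1.0],
              [1.0, 0.25, 4.0]])
mask = ideal_ratio_mask(S, N)   # applied multiplicatively to the noisy mixture
```

At test time the network only sees the noisy mixture and must predict this mask, which is where generalization to unseen noise segments becomes the challenge the study addresses.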

Journal ArticleDOI
TL;DR: It is shown that a valid CS framework can be derived from ultrasound propagation theory, and that this framework could be used to compute images of scatterers using only one plane wave as a transmit beam.
Abstract: Ultrasound imaging is a widespread technique used in medical imaging as well as in non-destructive testing. The technique offers many advantages such as real-time imaging, good resolution, prompt acquisition, ease of use, and low cost compared to other techniques such as x-ray imaging. However, the maximum frame rate achievable is limited, as several beams must be emitted to compute a single image. For each emitted beam, one must wait for the wave to propagate back and forth, thus imposing a limit on the frame rate. Several attempts have been made to use fewer beams while maintaining image quality. Although efficiently increasing the frame rate, these techniques still use several transmit beams. Compressive Sensing (CS), a universal data completion scheme based on convex optimization, has been successfully applied to a number of imaging modalities over the past few years. Using a priori knowledge of the signal, it can compute an image using less data, allowing for shorter acquisition times. In this paper, it is shown that a valid CS framework can be derived from ultrasound propagation theory, and that this framework can be used to compute images of scatterers using only one plane wave as a transmit beam.

Journal ArticleDOI
TL;DR: The numerical design of a directional invisibility cloak for backward-scattered elastic waves propagating in a thin plate (A0 Lamb waves) shows that the best directional cloaking is obtained when the resonators' length decreases from the central to the outermost ring.
Abstract: This paper deals with the numerical design of a directional invisibility cloak for backward-scattered elastic waves propagating in a thin plate (A0 Lamb waves). The directional cloak is based on a set of resonating beams that are attached perpendicular to the plate and are arranged at a sub-wavelength scale in ten concentric rings. The exotic effective properties of this locally resonant metamaterial ensure coexistence of bandgaps and directional cloaking for certain beam configurations over a large frequency band. The best directional cloaking was obtained when the resonators' length decreases from the central to the outermost ring. In this case, flexural waves experience a vanishing index of refraction when they cross the outer layers, leading to a frequency bandgap that protects the central part of the cloak. Numerical simulation shows that there is no back-scattering in these configurations. These results might have applications in the design of seismic-wave protection devices.

PatentDOI
TL;DR: In this paper, an interactive response system mixes HSR subsystems with ASR subsystems to facilitate overall capability of voice user interfaces, and an ASR proxy is used to implement an IVR system, and the proxy dynamically determines how many ASR and HSR systems are to perform recognition for any particular utterance, based on factors such as confidence thresholds of the ASRs and availability of human resources for HSRs.
Abstract: An interactive response system mixes HSR subsystems with ASR subsystems to facilitate overall capability of voice user interfaces. The system permits imperfect ASR subsystems to nonetheless relieve burden on HSR subsystems. An ASR proxy is used to implement an IVR system, and the proxy dynamically determines how many ASR and HSR subsystems are to perform recognition for any particular utterance, based on factors such as confidence thresholds of the ASRs and availability of human resources for HSRs.
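The patent's dispatch idea, routing an utterance to human recognition only when machine confidence is low and an agent is free, can be sketched as a simple rule. The threshold value and return convention here are invented for illustration.

```python
def route_utterance(asr_results, humans_free, threshold=0.8):
    """Dispatch-rule sketch: accept the best ASR hypothesis when its
    confidence clears the threshold; otherwise escalate to a human (HSR)
    agent if one is free, else fall back to the best machine guess.
    asr_results: list of (hypothesis, confidence) from one or more ASRs."""
    best_text, best_conf = max(asr_results, key=lambda r: r[1])
    if best_conf >= threshold:
        return ("ASR", best_text)
    if humans_free > 0:
        return ("HSR", None)   # transcription to be supplied by the agent
    return ("ASR", best_text)  # no agent available: degrade gracefully

confident = route_utterance([("pay my bill", 0.93), ("play my bill", 0.61)], humans_free=0)
uncertain = route_utterance([("hay my pill", 0.41)], humans_free=2)
```

The actual proxy described in the patent decides dynamically how many ASR and HSR subsystems handle each utterance; this sketch shows only the confidence/availability trade-off at its core.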

Journal ArticleDOI
TL;DR: A general formulation is presented for the optimum controller in an active system for local sound control in a spatially random primary field, where the sound field in a control region is selectively attenuated using secondary sources, driven by reference sensors, all of which are potentially remote from this control region.
Abstract: A general formulation is presented for the optimum controller in an active system for local sound control in a spatially random primary field. The sound field in a control region is selectively attenuated using secondary sources, driven by reference sensors, all of which are potentially remote from this control region. It is shown that the optimal controller is formed of the combination of a least-squares estimation of the primary source signals from the reference signals, and a least-squares controller driven by the primary source signals themselves. The optimum controller is also calculated using the remote microphone technique, in both the frequency and the time domains. The sound field under control is assumed to be stationary and generated by an array of primary sources, whose source strengths are specified using a spectral density matrix. This can easily be used to synthesize a diffuse primary field, if the primary sources are uncorrelated and far from the control region, but can also generate primary fields dominated by contributions from a particular direction, for example, which is shown to significantly affect the shape of the resulting zone of quiet.
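At a single frequency, the least-squares core of such an optimal controller reduces to choosing secondary-source strengths q that minimize |d + Zq|^2 at the control microphones. The sketch below uses a random complex plant purely for illustration and omits the reference-sensor estimation stage described in the abstract.

```python
import numpy as np

def optimal_secondary_strengths(Z, d, reg=1e-6):
    """Least-squares secondary-source strengths q minimizing |d + Z q|^2
    at the control microphones: q = -(Z^H Z + reg I)^{-1} Z^H d."""
    n = Z.shape[1]
    return -np.linalg.solve(Z.conj().T @ Z + reg * np.eye(n), Z.conj().T @ d)

rng = np.random.default_rng(2)
# Invented single-frequency plant: 3 secondary sources -> 8 control mics
Z = rng.standard_normal((8, 3)) + 1j * rng.standard_normal((8, 3))
d = rng.standard_normal(8) + 1j * rng.standard_normal(8)   # primary field
q = optimal_secondary_strengths(Z, d)
norm_before = float(np.linalg.norm(d))
norm_after = float(np.linalg.norm(d + Z @ q))               # controlled field
```

The small regularization term stands in for the effort weighting usually added in practice to keep the secondary-source strengths bounded.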

Journal ArticleDOI
TL;DR: From the perspective of robust ASR, the results showed that spectral and temporal processing can be performed independently and are not required to interact with each other.
Abstract: To test whether simultaneous spectral and temporal processing is required to extract robust features for automatic speech recognition (ASR), the robust spectro-temporal two-dimensional Gabor filter bank (GBFB) front-end from Schädler, Meyer, and Kollmeier [J. Acoust. Soc. Am. 131, 4134–4151 (2012)] was decomposed into a spectral one-dimensional Gabor filter bank and a temporal one-dimensional Gabor filter bank. A feature set that is extracted with these separate spectral and temporal modulation filter banks was introduced, the separate Gabor filter bank (SGBFB) features, and evaluated on the CHiME (Computational Hearing in Multisource Environments) keywords-in-noise recognition task. From the perspective of robust ASR, the results showed that spectral and temporal processing can be performed independently and are not required to interact with each other. Using SGBFB features permitted the signal-to-noise ratio (SNR) to be lowered by 1.2 dB while still performing as well as the GBFB-based reference system, which corresponds to a relative improvement of the word error rate by 12.8%. Additionally, the real-time factor of the spectro-temporal processing could be reduced by more than an order of magnitude. Compared to human listeners, the SNR needed to be 13 dB higher when using Mel-frequency cepstral coefficient features, 11 dB higher when using GBFB features, and 9 dB higher when using SGBFB features to achieve the same recognition performance.

Journal ArticleDOI
TL;DR: It was concluded that between-speaker variability of acoustically measurable speech rhythm is strong and robust against various sources of within-speaker variability.
Abstract: Between-speaker variability of acoustically measurable speech rhythm [%V, ΔV(ln), ΔC(ln), and Δpeak(ln)] was investigated when within-speaker variability of (a) articulation rate and (b) linguistic structural characteristics was introduced. To study (a), 12 speakers of Standard German read seven lexically identical sentences under five different intended tempo conditions (very slow, slow, normal, fast, very fast). To study (b), 16 speakers of Zurich Swiss German produced 16 spontaneous utterances each (256 in total) for which transcripts were made and then read by all speakers (4096 sentences; 16 speaker × 256 sentences). Between-speaker variability was tested using analysis of variance with repeated measures on within-speaker factors. Results revealed strong and consistent between-speaker variability while within-speaker variability as a function of articulation rate and linguistic characteristics was typically not significant. It was concluded that between-speaker variability of acoustically measurable speech rhythm is strong and robust against various sources of within-speaker variability. Idiosyncratic articulatory movements were found to be the most plausible factor explaining between-speaker differences.
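The %V metric (the proportion of total utterance duration that is vocalic) is straightforward to compute from a segmented interval list. The durations below are invented for illustration.

```python
def percent_v(intervals):
    """%V: percentage of total utterance duration that is vocalic.
    intervals: list of (kind, duration_s) with kind 'V' (vocalic) or 'C'
    (consonantal)."""
    total = sum(dur for _, dur in intervals)
    vocalic = sum(dur for kind, dur in intervals if kind == "V")
    return 100.0 * vocalic / total

# Invented segmentation of a short utterance
utterance = [("C", 0.08), ("V", 0.12), ("C", 0.10),
             ("V", 0.15), ("C", 0.06), ("V", 0.09)]
pv = percent_v(utterance)
```

The ΔV(ln) and ΔC(ln) metrics named in the abstract are, analogously, standard deviations of the log-transformed vocalic and consonantal interval durations from the same segmentation.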

Journal ArticleDOI
TL;DR: This study compares two response-time measures of listening effort that can be combined with a clinical speech test for a more comprehensive evaluation of total listening experience; verbal response times to auditory stimuli (RTsaud) and response times to a visual task (RTsvis) in a dual-task paradigm.
Abstract: This study compares two response-time measures of listening effort that can be combined with a clinical speech test for a more comprehensive evaluation of total listening experience; verbal response times to auditory stimuli (RTsaud) and response times to a visual task (RTsvis) in a dual-task paradigm. The listening task was presented in five masker conditions: no noise, and two types of noise at two fixed intelligibility levels. Both the RTsaud and RTsvis showed effects of noise. However, only RTsaud showed an effect of intelligibility. Because of its simplicity in implementation, RTsaud may be a useful effort measure for clinical applications.

Journal ArticleDOI
TL;DR: It is found that 2D models reduce the scattering in the Rayleigh regime; 2D parametric studies illustrate the mesh sampling requirements for two different types of mesh to ensure modelling accuracy and provide useful guidelines for future work.
Abstract: Finite element modelling is a promising tool for further progressing the development of ultrasonic non-destructive evaluation of polycrystalline materials. Yet its widespread adoption has been held back due to a high computational cost, which has restricted current works to relatively small models and to two dimensions. However, the emergence of sufficiently powerful computing, such as highly efficient solutions on graphics processors, is enabling a step improvement in possibilities. This article aims to realise those capabilities to simulate ultrasonic scattering of longitudinal waves in an equiaxed polycrystalline material in both two (2D) and three dimensions (3D). The modelling relies on an established Voronoi approach to randomly generate a representative grain morphology. It is shown that both 2D and 3D numerical data show good agreement across a range of scattering regimes in comparison to well-established theoretical predictions for attenuation and phase velocity. In addition, 2D parametric studies illustrate the mesh sampling requirements for two different types of mesh to ensure modelling accuracy and provide useful guidelines for future work. Modelling limitations are also shown. It is found that 2D models reduce the scattering in the Rayleigh regime.
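The Voronoi-based grain generation mentioned in the abstract can be illustrated by rasterising a tessellation: random nuclei are scattered over the domain and each pixel (or element) is assigned to its nearest nucleus, yielding one equiaxed grain per nucleus. This numpy-only 2D sketch uses arbitrary grid size and grain count and stands in for, rather than reproduces, the meshing used in the study:

```python
import numpy as np

rng = np.random.default_rng(2)
n_grains, nx = 12, 64
seeds = rng.random((n_grains, 2))              # grain nuclei in a unit square

# Rasterise the Voronoi tessellation: each pixel joins its nearest nucleus.
xs = (np.arange(nx) + 0.5) / nx                # pixel centre coordinates
gx, gy = np.meshgrid(xs, xs)
pix = np.stack([gx.ravel(), gy.ravel()], axis=1)
d2 = ((pix[:, None, :] - seeds[None, :, :]) ** 2).sum(-1)
grain_map = d2.argmin(axis=1).reshape(nx, nx)  # grain index per pixel
```

In a finite element workflow each grain index would then be mapped to a randomly oriented anisotropic stiffness tensor, so that wave scattering arises from the elastic mismatch at grain boundaries.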

Journal ArticleDOI
TL;DR: This study focuses on imaging local changes in heterogeneous media; the induced decorrelation of the diffuse waveforms is modeled through a diffuse sensitivity kernel based on the intensity transport in the medium, and the inverse problem is solved with a linear least squares algorithm.
Abstract: This study focuses on imaging local changes in heterogeneous media. The method employed is demonstrated and validated using numerical experiments of acoustic wave propagation in a multiple scattering medium. Changes are simulated by adding new scatterers of different sizes at various positions in the medium, and the induced decorrelation of the diffuse (coda) waveforms is measured for different pairs of sensors. The spatial and temporal dependences of the decorrelation are modeled through a diffuse sensitivity kernel, based on the intensity transport in the medium. The inverse problem is then solved with a linear least squares algorithm, which leads to a map of scattering cross-section density of the changes.
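The final inversion step reduces to a linear system: stacking the measured decorrelations for all sensor pairs and lapse times into a vector d, and the sensitivity kernel values into a matrix G, the scattering cross-section density map σ is estimated from d ≈ Gσ in the least-squares sense. A toy noise-free sketch, where the problem size and the random stand-in for the kernel are placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_cells = 30, 20            # placeholder: pair/lapse-time observations, map cells
G = rng.random((n_obs, n_cells))   # stands in for the diffuse sensitivity kernel
sigma_true = np.zeros(n_cells)
sigma_true[7] = 1e-3               # one added scatterer in cell 7
d = G @ sigma_true                 # noise-free decorrelation data

# Linear least-squares estimate of the scattering cross-section density map
sigma_est, *_ = np.linalg.lstsq(G, d, rcond=None)
```

In this noise-free overdetermined toy case the map is recovered exactly; with real coda measurements the system is noisy and often ill-conditioned, so some regularisation of the least-squares solution would be needed.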

Journal ArticleDOI
TL;DR: An automated birdsong phrase classification algorithm for limited data is developed and achieves the highest classification accuracies of 94% and 89% on manually segmented and automatically segmented phrases, respectively, from unseen Cassin's Vireo individuals, using five training samples per class.
Abstract: Annotation of phrases in birdsongs can be helpful to behavioral and population studies. To reduce the need for manual annotation, an automated birdsong phrase classification algorithm for limited data is developed. Limited data occur because of limited recordings or the existence of rare phrases. In this paper, classification of up to 81 phrase classes of Cassin's Vireo is performed using one to five training samples per class. The algorithm involves dynamic time warping (DTW) and two passes of sparse representation (SR) classification. DTW improves the similarity between training and test phrases from the same class in the presence of individual bird differences and phrase segmentation inconsistencies. The SR classifier works by finding a sparse linear combination of training feature vectors from all classes that best approximates the test feature vector. When the class decisions from DTW and the first pass SR classification are different, SR classification is repeated using training samples from these two conflicting classes. Compared to DTW, support vector machines, and an SR classifier without DTW, the proposed classifier achieves the highest classification accuracies of 94% and 89% on manually segmented and automatically segmented phrases, respectively, from unseen Cassin's Vireo individuals, using five training samples per class.
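The alignment component of the pipeline can be illustrated with a textbook dynamic time warping distance between two feature sequences; this is a generic implementation with Euclidean local cost, not the authors' exact variant, and the step pattern is the simplest symmetric one:

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between feature sequences
    a (n x d) and b (m x d), using Euclidean local cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible predecessor paths
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Because DTW stretches and compresses the time axis, phrases of the same class that differ in tempo or in segmentation boundaries map to small distances, which is what lets the subsequent sparse-representation classifier work with only one to five training samples per class.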

Journal ArticleDOI
TL;DR: The results are consistent with neurophysiological studies indicating that (1) phase locking to lower frequency sounds emerges earlier in life than phaselocking to higher frequency sounds and (2) myelination continues to increase in the first year of life.
Abstract: Previous studies have evaluated representation of the fundamental frequency (F0) in the frequency following response (FFR) of infants, but the development of other aspects of the FFR, such as timing and harmonics, has not yet been examined. Here, FFRs were recorded to a speech syllable in 28 infants, ages three to ten months. The F0 amplitude of the response was variable among individuals but was strongly represented in some infants as young as three months of age. The harmonics, however, showed a systematic increase in amplitude with age. In the time domain, onset, offset, and inter-peak latencies decreased with age. These results are consistent with neurophysiological studies indicating that (1) phase locking to lower frequency sounds emerges earlier in life than phase locking to higher frequency sounds and (2) myelination continues to increase in the first year of life. Early representation of low frequencies may reflect greater exposure to low frequency stimulation in utero. The improvement in temporal precision likely parallels an increase in the efficiency of neural transmission accompanied by exposure to speech during the first year of life.

Journal ArticleDOI
TL;DR: This study investigates a small lightweight loudspeaker using a dielectric elastomer actuator vibrating in the breathing mode (the pulsating mode such as the expansion and contraction of a balloon); acoustic testing with regard to repeatability, sound pressure, vibration mode profiles, and acoustic radiation patterns indicates that dielectric elastomer loudspeakers may be feasible.
Abstract: Although indoor acoustic characteristics should ideally be assessed by measuring the reverberation time using a point sound source, a regular polyhedron loudspeaker, which has multiple loudspeakers on a chassis, is typically used. However, such a configuration is not a point sound source if the size of the loudspeaker is large relative to the target sound field. This study investigates a small lightweight loudspeaker using a dielectric elastomer actuator vibrating in the breathing mode (the pulsating mode such as the expansion and contraction of a balloon). Acoustic testing with regard to repeatability, sound pressure, vibration mode profiles, and acoustic radiation patterns indicates that dielectric elastomer loudspeakers may be feasible.

Journal ArticleDOI
TL;DR: Perceptual weighting of the two cues was modeled with mixed-effects logistic regression, and was found to systematically vary with spectral resolution, and showed moderately good correspondence with word recognition scores.
Abstract: In this study, spectral properties of speech sounds were used to test functional spectral resolution in people who use cochlear implants (CIs). Specifically, perception of the /ba/-/da/ contrast was tested using two spectral cues: Formant transitions (a fine-resolution cue) and spectral tilt (a coarse-resolution cue). Higher weighting of the formant cues was used as an index of better spectral cue perception. Participants included 19 CI listeners and 10 listeners with normal hearing (NH), for whom spectral resolution was explicitly controlled using a noise vocoder with variable carrier filter widths to simulate electrical current spread. Perceptual weighting of the two cues was modeled with mixed-effects logistic regression, and was found to systematically vary with spectral resolution. The use of formant cues was greatest for NH listeners for unprocessed speech, and declined in the two vocoded conditions. Compared to NH listeners, CI listeners relied less on formant transitions, and more on spectral tilt. Cue-weighting results showed moderately good correspondence with word recognition scores. The current approach to testing functional spectral resolution uses auditory cues that are known to be important for speech categorization, and can thus potentially serve as the basis upon which CI processing strategies and innovations are tested.
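The cue-weighting analysis can be sketched with plain logistic regression on synthetic trials: two standardised cues predict a binary /ba/–/da/ response, and the relative magnitudes of the fitted coefficients index how heavily each cue is weighted. This simplified stand-in omits the study's per-listener random effects, and every number below is simulated rather than taken from the data:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
formant = rng.normal(size=n)   # fine-resolution cue (formant transition)
tilt = rng.normal(size=n)      # coarse-resolution cue (spectral tilt)

# Simulated listener who relies mostly on the formant cue
logit = 3.0 * formant + 0.5 * tilt
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(float)

# Plain logistic regression fit by gradient ascent on the log-likelihood
X = np.column_stack([formant, tilt])
w = np.zeros(2)
for _ in range(5000):
    p = 1 / (1 + np.exp(-X @ w))
    w += 0.5 * X.T @ (y - p) / n   # average-gradient step

formant_weight = abs(w[0]) / (abs(w[0]) + abs(w[1]))
```

A normalised formant weight near 1 corresponds to an NH-like listener relying on fine spectral detail, while a weight near 0.5 indicates the cues are used about equally, the direction in which CI listeners shifted in this study.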