Proceedings Article

3D binaural sound reproduction using a virtual ambisonic approach

TL;DR: To increase the computational efficiency of the proposed system, a virtual Ambisonic approach is used, which results in a bank of time-invariant HRTF filters that is independent of the number of sources to encode.
Abstract: Convincing binaural sound reproduction via headphones requires filtering the virtual sound source signals with head-related transfer functions (HRTFs). Furthermore, humans improve their localization through small, unconscious head movements, so it is important to incorporate head tracking. This raises the problem of high-quality, time-varying interpolation between different HRTFs. Localization accuracy can be improved further by adding room simulation, which yields a large number of virtual sound sources. To increase the computational efficiency of the proposed system, a virtual Ambisonic approach is used, which results in a bank of time-invariant HRTF filters that is independent of the number of sources to encode.
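
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes first-order B-format encoding, a cube of eight virtual loudspeakers, a pseudo-inverse decoder, and random placeholder impulse responses in place of measured HRIRs. All function names (encoding_matrix, encode, rotate_yaw, binaural_filters, render) are hypothetical. The point it shows is that head rotation is applied to the Ambisonic signals, while the decoder and the HRIRs fold into one fixed filter bank whose size does not grow with the number of sources.

```python
import numpy as np

def encoding_matrix(az, el):
    """First-order B-format gains (W, X, Y, Z) for directions in radians."""
    az = np.atleast_1d(np.asarray(az, dtype=float))
    el = np.atleast_1d(np.asarray(el, dtype=float))
    return np.stack([
        np.full_like(az, 1.0 / np.sqrt(2.0)),  # W (omnidirectional)
        np.cos(az) * np.cos(el),               # X (front)
        np.sin(az) * np.cos(el),               # Y (left)
        np.sin(el),                            # Z (up)
    ])                                         # shape (4, n_directions)

def encode(signals, az, el):
    """Sum any number of mono sources into one 4-channel B-format stream."""
    return encoding_matrix(az, el) @ np.atleast_2d(signals)

def rotate_yaw(bformat, head_yaw):
    """Counter-rotate the sound field by the tracked head yaw (radians)."""
    c, s = np.cos(head_yaw), np.sin(head_yaw)
    w, x, y, z = bformat
    return np.stack([w, c * x + s * y, -s * x + c * y, z])

def binaural_filters(spk_az, spk_el, hrirs):
    """Fold decoder and HRIRs into a fixed bank of 4 filters per ear.

    hrirs has shape (n_speakers, 2, taps); the result has shape (4, 2, taps).
    """
    decoder = np.linalg.pinv(encoding_matrix(spk_az, spk_el))  # (n_spk, 4)
    return np.einsum('sc,set->cet', decoder, hrirs)

def render(bformat, filters):
    """Convolve each Ambisonic channel with its time-invariant ear filters."""
    n_out = bformat.shape[1] + filters.shape[2] - 1
    out = np.zeros((2, n_out))
    for ch in range(bformat.shape[0]):
        for ear in range(2):
            out[ear] += np.convolve(bformat[ch], filters[ch, ear])
    return out

# Usage with placeholder data (the HRIRs here are random, NOT measured).
rng = np.random.default_rng(0)
spk_az = np.deg2rad([45, 135, 225, 315] * 2)           # cube corners
spk_el = np.deg2rad([35.26] * 4 + [-35.26] * 4)
hrirs = rng.standard_normal((8, 2, 32))
filters = binaural_filters(spk_az, spk_el, hrirs)      # computed once
sources = rng.standard_normal((50, 256))               # e.g. room reflections
b = encode(sources, rng.uniform(-np.pi, np.pi, 50), rng.uniform(-1, 1, 50))
out = render(rotate_yaw(b, np.deg2rad(20)), filters)   # left/right signals
```

In this sketch, adding more sources only adds gain multiplications in encode; the convolution cost in render stays at eight filters regardless of the source count, and head movement changes only the small rotation in rotate_yaw.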


Citations
Journal Article
TL;DR: Is it possible for an individual to learn an architectural environment without being physically present?
Abstract: Navigation within a closed environment requires analysis of a variety of acoustic cues, a task that is well developed in many visually impaired individuals, and for which sighted individuals rely almost entirely on visual information. For blind people, the act of creating cognitive maps for spaces, such as home or office buildings, can be a long process, for which the individual may repeat various paths numerous times. While this action is typically performed by the individual on-site, it is of some interest to investigate to what extent this task can be performed off-site, at the individual's discretion. In short, is it possible for an individual to learn an architectural environment without being physically present? If so, such a system could prove beneficial for navigation preparation in new and unknown environments. The main goal of the present research can therefore be summarized as investigating the possibilities of assisting blind individuals in learning a spatial environment configuration by listening to audio events and interacting with them within a virtual reality experience. A comparison of two types of learning through auditory exploration has been performed: in situ real displacement and active navigation in a virtual architecture. The virtual navigation rendered only acoustic information. Results for two groups of five participants showed that interactive exploration of virtual acoustic room simulations can provide sufficient information for the construction of coherent spatial mental maps, although some variations were found between the two environments tested in the experiments. Furthermore, the mental representation of the virtually navigated environments preserved topological and metric properties, as was found through actual navigation.

120 citations


Cites methods from "3D binaural sound reproduction usin..."

  • ...The resulting 2nd-order Ambisonic 9-channel audio stream, comprising the sum of the static sources, finger snapping, and footstep noise, was rendered binaurally over headphones employing the approach of virtual speakers (see McKeag and McGrath, 1996; Noisternig et al., 1997)....

    [...]

Journal Article
11 Mar 2019-PLOS ONE
TL;DR: The technical details of the 3D Tune-In Toolkit are presented, outlining its architecture and describing the processes implemented in each of its components, followed by a comparison between the features offered by the 3DTI Toolkit and those found in other currently available open- and closed-source binaural renderers.
Abstract: The 3D Tune-In Toolkit (3DTI Toolkit) is an open-source standard C++ library which includes a binaural spatialiser. This paper presents the technical details of this renderer, outlining its architecture and describing the processes implemented in each of its components. In order to put this description into context, the basic concepts behind binaural spatialisation are reviewed through a chronology of research milestones in the field in the last 40 years. The 3DTI Toolkit renders the anechoic signal path by convolving sound sources with Head Related Impulse Responses (HRIRs), obtained by interpolating those extracted from a set that can be loaded from any file in a standard audio format. Interaural time differences are managed separately, in order to be able to customise the rendering according to the head size of the listener, and to reduce comb-filtering when interpolating between different HRIRs. In addition, geometrical and frequency-dependent corrections for simulating near-field sources are included. Reverberation is computed separately using a virtual loudspeakers Ambisonic approach and convolution with Binaural Room Impulse Responses (BRIRs). In all these processes, special care has been taken to avoid audible artefacts produced by changes in gains and audio filters due to the movements of sources and of the listener. The 3DTI Toolkit performance, as well as some other relevant metrics such as non-linear distortion, are assessed and presented, followed by a comparison between the features offered by the 3DTI Toolkit and those found in other currently available open- and closed-source binaural renderers.

36 citations

Journal Article
TL;DR: When both the HRTF and HMD were applied, the influence of sound externalization in the VR environment increased along with the sense of immersion and realism, including the annoyance level of the road traffic noise.

36 citations

Journal Article
TL;DR: A new indoor noise assessment methodology for environmental factors is proposed herein, confirming that test subjects respond to noise more sensitively in a virtual environment where audiovisual information is provided.

27 citations

Proceedings Article
01 Oct 2017
TL;DR: This paper proposes an improved DirAC method that directly synthesises the binaural cues based on the estimated spatial parameters and can accommodate higher-order Ambisonics (HOA) signals and has reduced computational requirements, making it suitable for lightweight processing with fast update rates and head-tracking support.
Abstract: Headphone reproduction of recorded spatial sound scenes is becoming increasingly relevant to immersive audiovisual applications. Popular non-parametric reproduction methods, such as first-order Ambisonics (FOA), can now be surpassed through the use of perceptually-motivated parametric reproduction methods. One such established method is Directional Audio Coding (DirAC). The earlier version of DirAC for headphones was itself limited to FOA input and achieved binaural rendering through a virtual loudspeaker approach, resulting in a high computational overhead. Therefore, this paper proposes an improved DirAC method that directly synthesises the binaural cues based on the estimated spatial parameters. This method can accommodate higher-order Ambisonics (HOA) signals and has reduced computational requirements, thus making it suitable for lightweight processing with fast update rates and head-tracking support. According to listening tests, when utilising only FOA signals, the method results in equivalent or higher spatial accuracy than third-order Ambisonics and significantly outperforms FOA reproduction.

26 citations


Cites methods from "3D binaural sound reproduction usin..."

  • ...With regard to headphone reproduction, B-format signals are commonly decoded using Ambisonics [1], a non-parametric method which aims to reconstruct the appropriate binaural cues by applying an ambisonic decoding matrix that integrates a set of head-related transfer functions (HRTFs) [2, 3, 4, 5, 6, 7]....

    [...]

References
Proceedings Article
21 Oct 2001
TL;DR: A public-domain database of high-spatial-resolution head-related transfer functions measured at the UC Davis CIPIC Interface Laboratory and the methods used to collect the data are described.
Abstract: This paper describes a public-domain database of high-spatial-resolution head-related transfer functions measured at the UC Davis CIPIC Interface Laboratory and the methods used to collect the data. Release 1.0 (see http://interface.cipic.ucdavis.edu) includes head-related impulse responses for 45 subjects at 25 different azimuths and 50 different elevations (1250 directions) at approximately 5° angular increments. In addition, the database contains anthropometric measurements for each subject. Statistics of anthropometric parameters and correlations between anthropometry and some temporal and spectral features of the HRTFs are reported.

1,017 citations


"3D binaural sound reproduction usin..." refers methods in this paper

  • ...In the proposed system generic HRTFs have been incorporated using the KEMAR [2] as well as the CIPIC database [3]....

    [...]

Journal Article
TL;DR: Data suggest that while the interaural cues to horizontal location are robust, the spectral cues considered important for resolving location along a particular cone-of-confusion are distorted by a synthesis process that uses nonindividualized HRTFs.
Abstract: A recent development in human-computer interfaces is the virtual acoustic display, a device that synthesizes three-dimensional, spatial auditory information over headphones using digital filters constructed from head-related transfer functions (HRTFs). The utility of such a display depends on the accuracy with which listeners can localize virtual sound sources. A previous study [F. L. Wightman and D. J. Kistler, J. Acoust. Soc. Am. 85, 868-878 (1989)] observed accurate localization by listeners for free-field sources and for virtual sources generated from the subjects' own HRTFs. In practice, measurement of the HRTFs of each potential user of a spatial auditory display may not be feasible. Thus, a critical research question is whether listeners can obtain adequate localization cues from stimuli based on nonindividualized transforms. Here, inexperienced listeners judged the apparent direction (azimuth and elevation) of wideband noisebursts presented in the free-field or over headphones; headphone stimuli were synthesized using HRTFs from a representative subject of Wightman and Kistler. When confusions were resolved, localization of virtual sources was quite accurate and comparable to the free-field sources for 12 of the 16 subjects. Of the remaining subjects, 2 showed poor elevation accuracy in both stimulus conditions, and 2 showed degraded elevation accuracy with virtual sources. Many of the listeners also showed high rates of front-back and up-down confusions that increased significantly for virtual sources compared to the free-field stimuli. These data suggest that while the interaural cues to horizontal location are robust, the spectral cues considered important for resolving location along a particular cone-of-confusion are distorted by a synthesis process that uses nonindividualized HRTFs.

910 citations


"3D binaural sound reproduction usin..." refers background in this paper

  • ...Wenzel et al. state in [1] that the use of nonindividualized transfer functions yields a degradation of human localization accuracy....

    [...]

Journal Article

583 citations


"3D binaural sound reproduction usin..." refers methods in this paper

  • ...In the proposed system generic HRTFs have been incorporated using the KEMAR [2] as well as the CIPIC database [3]....

    [...]

Journal Article
TL;DR: A study of sound localization performance was conducted using headphone-delivered virtual speech stimuli, rendered via HRTF-based acoustic auralization software and hardware, and blocked-meatus HRTF measurements.
Abstract: A study of sound localization performance was conducted using headphone-delivered virtual speech stimuli, rendered via HRTF-based acoustic auralization software and hardware, and blocked-meatus HRTF measurements. The independent variables were chosen to evaluate commonly held assumptions in the literature regarding improved localization: inclusion of head tracking, individualized HRTFs, and early and diffuse reflections. Significant effects were found for azimuth and elevation error, reversal rates, and externalization.

347 citations


"3D binaural sound reproduction usin..." refers background in this paper

  • ...Begault and Wenzel state in [5] that the incorporation of head tracking and room simulation improves the localization accuracy as well as the perceived externalization....

    [...]

Journal Article

316 citations


"3D binaural sound reproduction usin..." refers methods in this paper

  • ...In section III a binaural sound reproduction system is developed using the Ambisonic approach, incorporating head tracking as well as room simulation....

    [...]