Showing papers presented at the "International Conference on Auditory Display" in 2005


Proceedings Article
01 Jul 2005
TL;DR: It was found that a focused attention task and sound objects' motion characteristics (approaching or receding) play an important role in self-motion perception, and stronger sensations for auditory-induced self-translation than for previously investigated self-rotation suggest a strong ecological validity bias.
Abstract: Creating a sense of illusory self-motion is crucial for many Virtual Reality applications, and the auditory modality is an essential, but often neglected, component of such stimulations. In this paper, perceptual optimization of auditory-induced, translational self-motion (vection) simulation is studied using binaurally synthesized and reproduced sound fields. The results suggest that auditory scene consistency and ecological validity make a minimum set of acoustic cues sufficient for eliciting auditory-induced vection. Specifically, it was found that a focused attention task and sound objects' motion characteristics (approaching or receding) play an important role in self-motion perception. In addition, stronger sensations for auditory-induced self-translation than for previously investigated self-rotation also suggest a strong ecological validity bias, as translation is the most common movement direction.
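For illustration only (not the paper's binaural renderer): a minimal mono sketch of an approaching or receding sound object, assuming a simple inverse-distance gain along a straight-line trajectory. The function name `moving_source` and all parameter values are hypothetical.

```python
import numpy as np

def moving_source(signal, sr, start_dist, end_dist, ref_dist=1.0):
    """Apply a time-varying 1/r gain so a sound appears to approach
    (start_dist > end_dist) or recede (start_dist < end_dist).
    Hypothetical helper, not the binaural synthesis used in the paper."""
    n = len(signal)
    dist = np.linspace(start_dist, end_dist, n)      # straight-line trajectory
    gain = ref_dist / np.maximum(dist, ref_dist)     # simple inverse-distance law
    return signal * gain

sr = 44100
t = np.arange(0, 5.0, 1.0 / sr)
tone = 0.5 * np.sin(2 * np.pi * 440 * t)             # placeholder sound object
approaching = moving_source(tone, sr, start_dist=20.0, end_dist=2.0)
receding = moving_source(tone, sr, start_dist=2.0, end_dist=20.0)
```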

37 citations


Proceedings Article
01 Jul 2005
TL;DR: In this paper, the authors examined loudness asymmetry between increasing and decreasing levels for 1-kHz tones over the range 60-80 dB SPL and over four ramp durations (2, 5, 10 and 20 s) using direct global and continuous loudness ratings made by subjects.
Abstract: Studies of loudness change for tones with linearly varying levels using different loudness rating methods, such as direct estimation or indirect estimation based on the start and end levels, have revealed an asymmetry depending on the direction of change (increasing vs. decreasing). The present study examines loudness asymmetry between increasing and decreasing levels for 1-kHz tones over the range 60-80 dB SPL and over four ramp durations (2, 5, 10 and 20 s), using direct global and continuous loudness ratings made by subjects. Three measures extracted from continuous ratings (loudness duration, loudness change, loudness slope), on the one hand, and the global loudness rating, on the other hand, are examined and analyzed separately. Measures extracted from continuous ratings do not reveal any significant perceptual asymmetry between an increasing and a decreasing ramp. However, direct estimation of the global loudness is higher for an increasing ramp than for a decreasing ramp. This result can be explained by a short-term auditory memory effect called the "recency effect".
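As a rough sketch of the stimuli described above, the following generates a 1-kHz tone whose level ramps linearly in dB between 60 and 80 dB SPL over a chosen duration. The assumption that "level" varies linearly in dB, the mapping of SPL to digital full scale via `ref_db`, and the function name are illustrative choices, not details given in the abstract.

```python
import numpy as np

def level_ramp_tone(f0=1000.0, duration=2.0, start_db=60.0, end_db=80.0,
                    sr=44100, ref_db=80.0):
    """1-kHz tone whose level ramps linearly in dB from start_db to end_db.
    SPL values are mapped to digital level relative to ref_db; the actual
    SPL depends on playback calibration, which the abstract does not give."""
    t = np.arange(0, duration, 1.0 / sr)
    level_db = np.linspace(start_db, end_db, t.size)   # linear level trajectory
    amp = 10 ** ((level_db - ref_db) / 20.0)           # dB -> linear amplitude
    return amp * np.sin(2 * np.pi * f0 * t)

increasing = level_ramp_tone(duration=2.0, start_db=60.0, end_db=80.0)
decreasing = level_ramp_tone(duration=2.0, start_db=80.0, end_db=60.0)
```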

34 citations


Proceedings Article
01 Jan 2005
TL;DR: This study provides an insight into how people perceive and identify product sounds but also supplies preliminary structured information in order to create an exclusive lexicon for product sounds.
Abstract: Listeners use different types of descriptions for domestic product sounds depending on the level of identification. By conducting labeling and identification tasks, we classified these descriptions into 11 semantically different groups. These groups are organized within a perceptual framework that describes the identification process of product sounds. The results of this investigation indicate that product sounds have associated meanings. This study not only provides an insight into how people perceive and identify product sounds but also supplies preliminary structured information in order to create an exclusive lexicon for product sounds.

15 citations


Proceedings Article
06 Jul 2005
TL;DR: The results suggest that A-weighting performs the worst while results obtained with loudness metrics appear to depend on the type of signals, which validates the usability of the selective processing approach for real-time applications.
Abstract: This paper studies various priority metrics that can be used to progressively select sub-parts of a number of audio signals for realtime processing. In particular, five level-related metrics were examined: RMS level, A-weighted level, Zwicker and Moore loudness models and a masking threshold-based model. We conducted a pilot subjective evaluation study aimed at evaluating which metric would perform best at reconstructing mixtures of various types (speech, ambient and music) using only a budget amount of original audio data. Our results suggest that A-weighting performs the worst while results obtained with loudness metrics appear to depend on the type of signals. RMS level offers a good compromise for all cases. Our results also show that significant sub-parts of the original audio data can be omitted in most cases, without noticeable degradation in the generated mixtures, which validates the usability of our selective processing approach for real-time applications. In this context, we successfully implemented a prototype 3D audio rendering pipeline using our selective approach.
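As a simple illustration of the budget-based selection idea (an assumed sketch, not the authors' pipeline), the snippet below ranks fixed-size frames of several signals by RMS level, the metric the study found to be a good compromise, and keeps only the top fraction within a given budget.

```python
import numpy as np

def select_frames_by_rms(signals, frame_len=1024, budget=0.3):
    """Rank frames of all input signals by RMS level and keep the top
    `budget` fraction; a sketch of priority-based selective processing."""
    frames = []
    for sig_idx, sig in enumerate(signals):
        n_frames = len(sig) // frame_len
        for i in range(n_frames):
            frame = sig[i * frame_len:(i + 1) * frame_len]
            rms = np.sqrt(np.mean(frame ** 2))         # priority metric
            frames.append((rms, sig_idx, i))
    frames.sort(reverse=True)                          # highest-priority frames first
    keep = frames[:int(budget * len(frames))]
    return {(sig_idx, i) for _, sig_idx, i in keep}    # frames to process fully

# Example: two noise signals, process only the top 30% of frames by RMS.
rng = np.random.default_rng(0)
signals = [rng.normal(scale=0.1, size=44100), rng.normal(scale=0.3, size=44100)]
selected = select_frames_by_rms(signals, budget=0.3)
```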

13 citations


Proceedings Article
01 Jul 2005
TL;DR: It is argued that the use of sonification for time series analysis is superior in the case where intrinsic non-stationarity of an experiment cannot be ruled out, and that both a good statistic and sonification reveal the presence of "motifs", preferred short firing sequences which are due to the deterministic spiking mechanism.
Abstract: We study cross-correlations in irregularly spiking systems. A single system displays spiking sequences that resemble a stochastic (Poisson) process. Linear coupling between two systems leaves the inter-spike interval distribution qualitatively unchanged but induces cross-correlations between the units. For strong coupling this leads to synchronization as expected but for weak coupling, both a good statistic and sonification reveal the presence of “motifs”, preferred short firing sequences which are due to the deterministic spiking mechanism. We argue that the use of sonification for time series analysis is superior in the case where intrinsic non-stationarity of an experiment cannot be ruled out.
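A small sketch (assumed, not the paper's specific statistic) of one conventional way to look for such coupling-induced correlations: a cross-correlogram, i.e. a histogram of spike-time differences between the two units. The bin width and lag range are arbitrary illustrative values.

```python
import numpy as np

def cross_correlogram(spikes_a, spikes_b, bin_width=0.005, max_lag=0.1):
    """Histogram of time differences (b - a) between spikes of two units;
    an illustrative statistic for detecting weak cross-correlations."""
    diffs = []
    for t_a in spikes_a:
        near = spikes_b[np.abs(spikes_b - t_a) <= max_lag]
        diffs.extend(near - t_a)
    edges = np.arange(-max_lag, max_lag + bin_width, bin_width)
    counts, _ = np.histogram(diffs, bins=edges)
    return counts, edges

# Example with two weakly correlated, roughly Poisson-like spike trains.
rng = np.random.default_rng(1)
spikes_a = np.sort(rng.uniform(0, 10, 200))
spikes_b = np.sort(np.concatenate([rng.uniform(0, 10, 150),
                                   spikes_a[:50] + 0.01]))  # induced correlation
counts, edges = cross_correlogram(spikes_a, spikes_b)
```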

9 citations


Proceedings Article
01 Jul 2005
TL;DR: A new sonification model for the exploration of topographically ordered high-dimensional data (multi-parameter maps, volume data), where each data item consists of a position and a feature vector, and heat can be induced locally.
Abstract: This paper presents a new sonification model for the exploration of topographically ordered high-dimensional data (multi-parameter maps, volume data) where each data item consists of a position and a feature vector. The sonification model implements a common metaphor from thermodynamics: that heat can be interpreted as stochastic motion of 'molecules'. The latter are determined by the data under examination, and 'live' only in the feature space. Heat-induced interactions cause acoustic events that fuse into a granular sound texture which conveys meaningful information about the underlying distribution in feature space. As a second ingredient of the model, data selection is achieved by a separate navigation process in position space using a dynamic aura model, such that heat can be induced locally. Both a visual and an auditory display are driven by the underlying model. We exemplify the sonification by means of interaction examples for different high-dimensional distributions.
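To make the metaphor concrete, here is a minimal sketch under stated assumptions: data items act as 'molecules' in feature space, a temperature parameter scales their random motion, and each step emits one grain-like event whose pitch and amplitude are derived from the current feature values. The feature-to-sound mapping, the 10-ms grain grid, and the function name `heat_sonification_events` are all hypothetical, not the paper's actual model.

```python
import numpy as np

def heat_sonification_events(features, temperature=0.1, n_steps=100, seed=0):
    """Sketch of the thermodynamic metaphor: 'molecules' (data items in
    feature space) perform random motion whose size grows with the induced
    'heat'; each step yields one grain event (onset, pitch, amplitude)."""
    rng = np.random.default_rng(seed)
    pos = features.copy()                               # molecules live in feature space
    events = []
    for step in range(n_steps):
        pos += rng.normal(scale=temperature, size=pos.shape)  # stochastic motion
        idx = rng.integers(len(pos))                    # molecule emitting this grain
        onset = step * 0.01                             # 10-ms grain grid (assumed)
        pitch = 200 + 800 * (pos[idx, 0] % 1.0)         # map first feature dim to Hz
        amp = min(1.0, temperature * 5)                 # hotter aura -> louder grains
        events.append((onset, pitch, amp))
    return events

data = np.random.default_rng(2).normal(size=(50, 3))    # 50 items, 3-D feature vectors
grains = heat_sonification_events(data, temperature=0.2)
```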

6 citations


Proceedings Article
06 Jul 2005
TL;DR: The TrioSon software allows users to map musical patterns to input data variables via a graphical user interface (GUI), and it is hoped that the compact and intuitive nature of the application will make it a straightforward means of investigating the Sonification of data sets.
Abstract: The TrioSon software allows users to map musical patterns to input data variables via a graphical user interface (GUI). The application is a Java routine designed to take input files of standard Comma Separated Values (CSV) format and output Standard Midi Files (SMF) using the internal Java Sound API. TrioSon renders output Sonifications from input data files for up to 3 user-defined parameters, allocated as bass, chord and melody instruments for the purposes of arrangement. In this manner each parameter concerned is distinguished by its individual instrumental timbre, with the option of rendering any combination of 1 to 3 parameters as required. The software parses indexed input data relating to individual variables for each user-defined parameter, and provides the means to allocate musical patterns to each variable for Sonification using drag and drop functionality. Control over the Rhythmic Parsing of the Sonification is provided, alongside individual control of the volume, panning, muting and timbre of each instrument in the trio. Sonifications can be rendered as full output files of the entire data, or can also be auditioned by index as required. This feature is designed to allow the user complete control over the data they are sonifying, either on an individual or collective basis. Context for each output Sonification is provided by Midi events defined by the index of the input data, which are mapped to percussive timbres in the final SMF (via track 10). Java development provides the added advantage of portability, with the final application being small enough (200 kB) to attach to an email. It is hoped that the compact and intuitive nature of the application will make it a straightforward means of investigating the Sonification of data sets.
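The sketch below, in Python rather than Java, illustrates the kind of data-to-pattern mapping TrioSon performs: each of up to three CSV columns is given an instrument role, and each value selects a short note pattern. The column names, the value-to-pattern rule, the tick grid, and the `sonify_csv` helper are all assumptions; writing the result out as a Standard MIDI File would additionally require a MIDI library, whereas TrioSon itself uses the Java Sound API.

```python
import csv

# Hypothetical pattern lookup: each data bucket maps to a short note pattern
# (MIDI note numbers); the real TrioSon lets users drag and drop patterns.
PATTERNS = {
    "low": [36, 43],           # bass figure
    "mid": [48, 52, 55],       # chord tones
    "high": [72, 74, 76, 79],  # melody phrase
}

def bucket(value):
    """Crude value-to-pattern mapping (assumed, not TrioSon's rules)."""
    v = float(value)
    return "low" if v < 0.33 else "mid" if v < 0.66 else "high"

def sonify_csv(path, columns, ticks_per_row=480):
    """Return (track, tick, note) events for up to three CSV columns,
    one track per column, analogous to TrioSon's bass/chord/melody roles."""
    events = []
    with open(path, newline="") as f:
        for row_idx, row in enumerate(csv.DictReader(f)):
            for track, col in enumerate(columns[:3]):
                pattern = PATTERNS[bucket(row[col])]
                step = ticks_per_row // len(pattern)
                for k, note in enumerate(pattern):
                    events.append((track, row_idx * ticks_per_row + k * step, note))
    return events

# Usage with an assumed file and column names:
# events = sonify_csv("data.csv", ["temperature", "humidity", "pressure"])
```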

6 citations


Proceedings Article
01 Jul 2005
TL;DR: The possible advantages of using non-speech audio media such as music are discussed – richness of the representations possible, the aesthetic appeal, and the possibilities of such interfaces being able to handle abstraction and consistency across the interface.
Abstract: A number of experiments, which have been carried out using non-speech auditory interfaces, are reviewed and the advantages and disadvantages of each are discussed. The possible advantages of using non-speech audio media such as music are discussed – richness of the representations possible, the aesthetic appeal, and the possibilities of such interfaces being able to handle abstraction and consistency across the interface.

5 citations


Proceedings Article
01 Jul 2005
TL;DR: In this article, the authors present a multimodal storytelling system, CONFUCIUS, which automatically generates 3D animation, speech, and non-speech audio from natural language sentences.
Abstract: Audio presentation is an important modality in virtual storytelling. In this paper we present our work on audio presentation in our intelligent multimodal storytelling system, CONFUCIUS, which automatically generates 3D animation, speech, and non-speech audio from natural language sentences. We provide an overview of the system and describe speech and non-speech audio in virtual storytelling using linguistic approaches. We discuss several issues in auditory display, such as its relation to verb and adjective ontology, concepts and modalities, and media allocation. Finally, we conclude that introducing linguistic knowledge provides more intelligent virtual storytelling, especially in audio presentation.

1 citation