
Showing papers by "Unto K. Laine" published in 2013


Journal ArticleDOI
TL;DR: The findings suggest that caregivers' feedback can act as an important signal in guiding infants' articulatory learning, and that the speech inversion problem can be effectively approached from the perspective of early speech acquisition.

16 citations


Journal ArticleDOI
TL;DR: The current work proposes that temporal and spectral integration result from a system optimized for pattern detection from ecologically relevant acoustic inputs, and suggests that the observed integration characteristics are learnable from the acoustic inputs of the auditory environment using a Hebbian-like learning rule.
Abstract: Several psychoacoustic phenomena such as loudness perception, absolute thresholds of hearing, and perceptual grouping in time are affected by temporal integration of the signal in the auditory system. Similarly, the frequency resolution of the hearing system, often expressed in terms of critical bands, implies signal integration across neighboring frequencies. Although progress has been made in understanding the neurophysiological mechanisms behind these processes, the underlying reasons for the observed integration characteristics have remained poorly understood. The current work proposes that the temporal and spectral integration are a result of a system optimized for pattern detection from ecologically relevant acoustic inputs. This argument is supported by a simulation where the average time-frequency structure of speech, derived from a large set of speech signals, shows a good match to the time-frequency characteristics of the human auditory system. The results also suggest that the observed integration characteristics are learnable from acoustic inputs of the auditory environment using a Hebbian-like learning rule.

11 citations
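
The Hebbian-like learning mentioned in the abstract above can be illustrated with a minimal sketch, here using Oja's rule (a normalized Hebbian update) applied to random spectro-temporal patches. The toy spectrogram, patch size, and learning rate are illustrative assumptions, not the authors' setup; the point is only that repeated Hebbian updates converge toward the dominant time-frequency correlation pattern of the input.

import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a log-magnitude spectrogram (frequency bins x time frames);
# in practice this would be computed from a large set of speech signals.
spectrogram = rng.standard_normal((64, 1000))

patch_freq, patch_time = 8, 10                 # size of one spectro-temporal patch
w = rng.standard_normal(patch_freq * patch_time)
w /= np.linalg.norm(w)                         # start from a random unit vector
lr = 1e-3                                      # learning rate

for _ in range(5000):
    f = rng.integers(0, 64 - patch_freq)
    t = rng.integers(0, 1000 - patch_time)
    x = spectrogram[f:f + patch_freq, t:t + patch_time].ravel()
    x = x - x.mean()
    y = w @ x                                  # unit activation for this patch
    w += lr * y * (x - y * w)                  # Oja's rule: Hebbian term plus decay

# After training, w approximates the leading principal component of the patch
# distribution, i.e. an average time-frequency integration window.
integration_window = w.reshape(patch_freq, patch_time)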


Proceedings ArticleDOI
26 May 2013
TL;DR: This work proposes a method for data reduction based on theories of human attention: it detects temporally salient events from the context in which they occur and retains only those sections of the input signal.
Abstract: Since modern computational devices are required to store and process increasing amounts of data generated from various sources, efficient algorithms for identification of significant information in the data are becoming essential. Sensory recordings are one example where automatic and continuous storing and processing of large amounts of data is needed. Therefore, algorithms that can alleviate the computational load of the devices and reduce their storage requirements by removing uninformative data are important. In this work we propose a method for data reduction based on theories of human attention. The method detects temporally salient events based on the context in which they occur and retains only those sections of the input signal. The algorithm is tested as a pre-processing stage in a weakly supervised keyword learning experiment where it is shown to significantly improve the quality of the codebooks used in the pattern discovery process.

2 citations
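
A minimal sketch of context-based temporal saliency detection in the spirit of the abstract above: frames whose feature value deviates strongly from the statistics of the recent context are retained, the rest are discarded. The frame-energy feature, context length, and threshold are illustrative assumptions, not the parameters of the published method.

import numpy as np

def salient_frames(signal, frame_len=400, context_frames=50, threshold=3.0):
    # Frame-level feature: log energy of non-overlapping frames.
    n_frames = len(signal) // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    feat = np.log(np.sum(frames ** 2, axis=1) + 1e-12)

    keep = np.zeros(n_frames, dtype=bool)
    for i in range(n_frames):
        ctx = feat[max(0, i - context_frames):i]
        if len(ctx) < 2:
            continue                           # not enough context yet
        # A frame is salient if it lies far from the local context statistics.
        z = (feat[i] - ctx.mean()) / (ctx.std() + 1e-12)
        keep[i] = abs(z) > threshold
    return keep                                # boolean mask; True = retained

# Example: a quiet signal with one loud burst; only frames around the burst
# are marked salient and would be passed on to further processing.
rng = np.random.default_rng(1)
x = rng.standard_normal(16000) * 0.01
x[8000:8400] += rng.standard_normal(400)
mask = salient_frames(x)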


Proceedings ArticleDOI
25 Aug 2013
TL;DR: A methodological framework for automatic discovery of statistical associations between a high bit-rate and noisy sensory signal (speech) and temporally discrete categorical data with different temporal granularity (text) is presented.
Abstract: Discovery of statistically significant patterns from data and learning of associative links between qualitatively different data streams is becoming increasingly important in dealing with the so-called Big Data problem of modern society. In this work, a methodological framework for automatic discovery of statistical associations between a high bit-rate and noisy sensory signal (speech) and temporally discrete categorical data with different temporal granularity (text) is presented. The proposed approach does not utilize any phonetic or linguistic knowledge in the analysis, but simply learns the meaningful units of text and speech and their mutual mappings in an unsupervised manner. The first experiments with a limited vocabulary of child-directed speech show that, after a period of learning, the method is successful in the generation of a textual representation of continuous speech.

1 citation
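
A minimal sketch of the kind of cross-modal association learning described above, reduced to co-occurrence counting between discrete acoustic units and word tokens. It assumes the speech has already been quantized into unit labels per utterance (for example by vector quantization); the toy data and the scoring rule are illustrative, not the paper's algorithm.

from collections import defaultdict

# Training data: (acoustic unit sequence, words occurring in the utterance).
utterances = [
    ([3, 3, 7, 7, 1], {"dog"}),
    ([3, 7, 1, 1], {"dog"}),
    ([5, 5, 2, 8], {"ball"}),
    ([5, 2, 2, 8, 8], {"ball"}),
]

# Count how often each acoustic unit co-occurs with each word.
unit_word = defaultdict(lambda: defaultdict(int))
unit_total = defaultdict(int)
for units, words in utterances:
    for u in units:
        unit_total[u] += len(words)
        for w in words:
            unit_word[u][w] += 1

def transcribe(units):
    # Score each word by its normalized association with the observed units
    # and return the best-matching word as a crude textual representation.
    scores = defaultdict(float)
    for u in units:
        for w, c in unit_word[u].items():
            scores[w] += c / unit_total[u]
    return max(scores, key=scores.get) if scores else None

print(transcribe([3, 7, 7, 1]))                # -> "dog"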