Institution
Nuance Communications
Company • Vienna, Austria
About: Nuance Communications is a company based in Vienna, Austria. It is known for research contributions in the topics of speech processing and voice activity detection. The organization has 1518 authors who have published 1701 publications receiving 54891 citations. The organization is also known as ScanSoft and ScanSoft Inc.
Papers
TL;DR: It is suggested that dialogue participants expect their own word and structure choices to be repeated back to them, and this is general to the task situation rather than specific to their communicative partners.
Abstract: It is now well established that people in conversations repeat each other's words and structures. Does doing so reflect dialogue participants' expectations that their own choices of words or structures will be repeated back to them? In two experiments, subjects and confederates (purportedly) took turns describing pictures to each other. On critical trials, we measured response latencies to choose pictures when labels (e.g., stroller) or syntactic structures (a prepositional dative) that subjects had just produced were repeated back to them, versus when they heard reasonable alternatives (baby carriage or a double-object structure). Experiment 1 showed that repeated words and syntactic structures both elicit faster responses. Experiment 2 showed that the effect happens even when subjects hear descriptions from computers, instead of from their addressees, and that the repeated-word effect was not due to preferences for labels. These observations suggest that dialogue participants expect their own word and structure choices to be repeated back to them, and this is general to the task situation rather than specific to their communicative partners.
19 citations
18 Jun 2014
TL;DR: A method for training a multi-layer artificial neural network for speech recognition: a first processing pipeline determines the network activations for a speech pattern, a selection criterion decides whether the network should be trained on that pattern, and a second processing pipeline updates the network weights only when it should.
Abstract: Methods and apparatus for training a multi-layer artificial neural network for use in speech recognition. The method comprises determining for a first speech pattern of the plurality of speech patterns, using a first processing pipeline, network activations for a plurality of nodes of the artificial neural network in response to providing the first speech pattern as input to the artificial neural network, determining based, at least in part, on the network activations and a selection criterion, whether the artificial neural network should be trained on the first speech pattern, and updating, using a second processing pipeline, network weights between nodes of the artificial neural network based, at least in part, on the network activations when it is determined that the artificial neural network should be trained on the first speech pattern.
19 citations
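The selective-training idea in the abstract above can be illustrated with a minimal sketch. It is not the patented method: the network is reduced to a single softmax layer for brevity, and the selection criterion (skip a pattern when the correct class already has probability at or above `threshold`) is an assumption chosen for illustration. All function names are hypothetical.

```python
import math

def forward(weights, x):
    """First pipeline: compute softmax output activations for one pattern."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def should_train(activations, label, threshold=0.9):
    """Assumed selection criterion: train only on patterns that the
    network does not yet classify confidently."""
    return activations[label] < threshold

def update(weights, x, activations, label, lr=0.5):
    """Second pipeline: one cross-entropy gradient step that reuses the
    activations computed by the first pipeline."""
    for k, row in enumerate(weights):
        grad = activations[k] - (1.0 if k == label else 0.0)
        for i in range(len(row)):
            row[i] -= lr * grad * x[i]

def train_selectively(weights, patterns, threshold=0.9, lr=0.5):
    """Run both pipelines over (features, label) pairs and return how
    many patterns were actually used for a weight update."""
    used = 0
    for x, label in patterns:
        acts = forward(weights, x)
        if should_train(acts, label, threshold):
            update(weights, x, acts, label, lr)
            used += 1
    return used
```

Separating the forward pass from the update means patterns that fail the criterion cost only one forward pass, which is the saving the abstract is aiming at.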
15 Aug 2008
TL;DR: A two-path search, consisting of a speech segment search and a prosody modification value search, is performed to obtain prosody with both high accuracy and high sound quality.
Abstract: Waveform concatenation speech synthesis with high sound quality. Prosody with both high accuracy and high sound quality is achieved by performing a two-path search including a speech segment search and a prosody modification value search. An accurate accent is secured by evaluating the consistency of the prosody by using a statistical model of prosody variations (the slope of fundamental frequency) for both of the two paths, the speech segment selection and the modification value search. In the prosody modification value search, a prosody modification value sequence that minimizes a modified prosody cost is searched for. This allows a search for a modification value sequence that makes the likelihood of the absolute values or variations of the prosody under the statistical model as high as possible with minimum modification values.
19 citations
03 Aug 2009
TL;DR: A system that outputs the phonemes and accents of texts. It has a storage section storing a first corpus in which the spellings, phonemes, and accents of a previously input text are recorded separately for the individual segmentations of the words contained in the text.
Abstract: A system that outputs phonemes and accents of texts. The system has a storage section storing a first corpus in which spellings, phonemes, and accents of a text input beforehand are recorded separately for individual segmentations of the words that are contained in the text. A text for which phonemes and accents are to be output is acquired and the first corpus is searched to retrieve at least one set of spellings that match the spellings in the text from among sets of contiguous spellings. Then, the combination of a phoneme and an accent that has a higher probability of occurrence in the first corpus than a predetermined reference probability is selected as the phonemes and accent of the text.
19 citations
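The final selection step in the abstract above (keep a phoneme/accent combination only if its probability of occurrence in the corpus exceeds a reference probability) can be sketched as follows. This is a simplified illustration under stated assumptions: it looks up a single spelling rather than sets of contiguous spellings, and the record layout, function names, and default threshold are all hypothetical.

```python
from collections import Counter, defaultdict

def build_index(corpus):
    """Count how often each (phonemes, accent) reading occurs per spelling.

    corpus: iterable of (spelling, phonemes, accent) segment records
    (an assumed layout, not the one used by the original system).
    """
    index = defaultdict(Counter)
    for spelling, phonemes, accent in corpus:
        index[spelling][(phonemes, accent)] += 1
    return index

def select_reading(index, spelling, reference_probability=0.5):
    """Return the most frequent reading if its relative frequency exceeds
    the reference probability, otherwise None."""
    readings = index.get(spelling)
    if not readings:
        return None  # spelling not found in the corpus
    total = sum(readings.values())
    reading, count = readings.most_common(1)[0]
    return reading if count / total > reference_probability else None
```

Returning `None` for ambiguous or unseen spellings leaves room for a fallback, which this sketch does not attempt to model.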
05 Nov 2009
TL;DR: A speech recognition processor based on a numeric language, which represents a subset of a vocabulary, receives unconstrained input speech. The subset includes a set of words identified as being for interpreting and understanding number strings.
Abstract: A system and a method are provided. A speech recognition processor receives unconstrained input speech and outputs a string of words. The speech recognition processor is based on a numeric language that represents a subset of a vocabulary. The subset includes a set of words identified as being for interpreting and understanding number strings. A numeric understanding processor contains classes of rules for converting the string of words into a sequence of digits. The speech recognition processor utilizes an acoustic model database. A validation database stores a set of valid sequences of digits. A string validation processor outputs validity information based on a comparison of a sequence of digits output by the numeric understanding processor with valid sequences of digits in the validation database.
19 citations
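The last two stages described in the abstract above, converting a recognized word string into digits and checking the result against a set of valid sequences, can be sketched as below. The word list and the "double"/"triple" rule are assumptions made for illustration; the abstract does not specify the actual rule classes, and every name here is hypothetical.

```python
# Assumed numeric vocabulary; the real system's word set is not given.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}
REPEAT_WORDS = {"double": 2, "triple": 3}

def words_to_digits(words):
    """Convert a recognized word string into a digit sequence.

    Words outside the numeric vocabulary are ignored, since the input
    speech is unconstrained.
    """
    digits = []
    repeat = 1
    for word in words.lower().split():
        if word in REPEAT_WORDS:
            repeat = REPEAT_WORDS[word]
        elif word in DIGIT_WORDS:
            digits.append(DIGIT_WORDS[word] * repeat)
            repeat = 1
        else:
            repeat = 1  # a non-numeric word cancels a pending repeat
    return "".join(digits)

def is_valid(digits, valid_sequences):
    """Check the digit sequence against a set of known valid sequences."""
    return digits in valid_sequences
```

For example, `words_to_digits("my number is double five oh one")` yields `"5501"`, which can then be passed to `is_valid` together with a set of known sequences.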
Authors
Name | H-index | Papers | Citations |
---|---|---|---|
Vinayak P. Dravid | 103 | 817 | 43612 |
Mehryar Mohri | 75 | 320 | 22868 |
Jinsong Wu | 70 | 566 | 16282 |
Horacio D. Espinosa | 67 | 315 | 16270 |
Shumin Zhai | 67 | 200 | 13447 |
Shang-Hua Teng | 66 | 265 | 16647 |
Dimitri Kanevsky | 62 | 362 | 14072 |
Marilyn A. Walker | 62 | 309 | 13429 |
Tara N. Sainath | 61 | 274 | 25183 |
Kenneth Church | 61 | 295 | 21179 |
John B Ketterson | 60 | 814 | 16929 |
Pascal Frossard | 59 | 637 | 22749 |
Michael Picheny | 57 | 244 | 11759 |
G. R. Scott Budinger | 56 | 196 | 12063 |
Jun Wu | 53 | 359 | 12110 |