Institution

Nuance Communications

Company · Vienna, Austria
About: Nuance Communications is a company based in Vienna, Austria. It is known for research contributions in the topics Speech processing and Voice activity detection. The organization has 1518 authors who have published 1701 publications receiving 54891 citations. The organization is also known as ScanSoft and ScanSoft Inc.


Papers
Journal Article
TL;DR: It is suggested that dialogue participants expect their own word and structure choices to be repeated back to them, and this is general to the task situation rather than specific to their communicative partners.
Abstract: It is now well established that people in conversations repeat each other’s words and structures. Does doing so reflect dialogue participants’ expectations that their own choices of words or structures will be repeated back to them? In two experiments, subjects and confederates (purportedly) took turns describing pictures to each other. On critical trials, we measured response latencies to choose pictures when labels (e.g., stroller) or syntactic structures (a prepositional dative) that subjects had just produced were repeated back to them, versus when they heard reasonable alternatives (baby carriage or a double-object structure). Experiment 1 showed that repeated words and syntactic structures both elicit faster responses. Experiment 2 showed that the effect happens even when subjects hear descriptions from computers, instead of from their addressees, and that the repeated-word effect was not due to preferences for labels. These observations suggest that dialogue participants expect their own word and structure choices to be repeated back to them, and this is general to the task situation rather than specific to their communicative partners.

19 citations

Patent
18 Jun 2014
TL;DR: A method for training a multi-layer artificial neural network for speech recognition, in which a first processing pipeline determines network activations for a speech pattern, a selection criterion decides whether the network should be trained on that pattern, and a second processing pipeline updates the network weights.
Abstract: Methods and apparatus for training a multi-layer artificial neural network for use in speech recognition. The method comprises determining for a first speech pattern of the plurality of speech patterns, using a first processing pipeline, network activations for a plurality of nodes of the artificial neural network in response to providing the first speech pattern as input to the artificial neural network, determining based, at least in part, on the network activations and a selection criterion, whether the artificial neural network should be trained on the first speech pattern, and updating, using a second processing pipeline, network weights between nodes of the artificial neural network based, at least in part, on the network activations when it is determined that the artificial neural network should be trained on the first speech pattern.

19 citations
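
As a rough illustration of the scheme in the abstract above, the sketch below separates a forward pass (standing in for the first processing pipeline), a selection test, and a weight update (standing in for the second processing pipeline). The tiny softmax classifier, the confidence threshold, and every function name are assumptions made for illustration; the abstract does not specify the network, the selection criterion, or how the pipelines are implemented.

```python
import math

def forward(weights, pattern):
    """Assumed first pipeline: softmax output activations for one pattern."""
    logits = [sum(w * x for w, x in zip(row, pattern)) for row in weights]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def should_train(activations, label, threshold=0.9):
    """Hypothetical selection criterion: skip patterns that are already
    classified with confidence at or above the threshold."""
    return activations[label] < threshold

def update(weights, pattern, activations, label, lr=0.5):
    """Assumed second pipeline: one gradient step on cross-entropy loss,
    reusing the activations computed by the forward pass."""
    for k, row in enumerate(weights):
        grad = activations[k] - (1.0 if k == label else 0.0)
        for j, x in enumerate(pattern):
            row[j] -= lr * grad * x

def train_selectively(weights, data, epochs=20):
    """Train in place; return the number of patterns used in each epoch."""
    used_per_epoch = []
    for _ in range(epochs):
        used = 0
        for pattern, label in data:
            acts = forward(weights, pattern)
            if should_train(acts, label):
                update(weights, pattern, acts, label)
                used += 1
        used_per_epoch.append(used)
    return used_per_epoch
```

With a criterion of this kind, the number of patterns selected per epoch shrinks as the model becomes confident on them, so later epochs spend less work on weight updates.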

Patent
15 Aug 2008
TL;DR: A two-path search, comprising a speech segment search and a prosody modification value search, is performed to obtain prosody with both high accuracy and high sound quality.
Abstract: Waveform concatenation speech synthesis with high sound quality. Prosody with both high accuracy and high sound quality is achieved by performing a two-path search including a speech segment search and a prosody modification value search. An accurate accent is secured by evaluating the consistency of the prosody by using a statistical model of prosody variations (the slope of fundamental frequency) for both of two paths of the speech segment selection and the modification value search. In the prosody modification value search, a prosody modification value sequence that minimizes a modified prosody cost is searched for. This allows a search for a modification value sequence that makes the likelihood of the absolute values or variations of the prosody under the statistical model as high as possible while keeping the modification values minimal.

19 citations

Patent
03 Aug 2009
TL;DR: A system that outputs phonemes and accents of texts, with a storage section storing a first corpus in which the spellings, phonemes, and accents of a previously input text are recorded separately for the individual segmentations of the words contained in the text.
Abstract: A system that outputs phonemes and accents of texts. The system has a storage section storing a first corpus in which spellings, phonemes, and accents of a text input beforehand are recorded separately for individual segmentations of the words that are contained in the text. A text for which phonemes and accents are to be output is acquired and the first corpus is searched to retrieve at least one set of spellings that match the spellings in the text from among sets of contiguous spellings. Then, the combination of a phoneme and an accent that has a higher probability of occurrence in the first corpus than a predetermined reference probability is selected as the phonemes and accent of the text.

19 citations
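
The selection step described in the abstract above (choose the phoneme and accent combination whose probability of occurrence in the corpus exceeds a reference probability) can be sketched as follows. The corpus layout as `(spelling, phoneme, accent)` tuples, the function name `select_reading`, and the default reference probability are all hypothetical; the sketch also handles a single spelling only, whereas the abstract describes matching sets of contiguous spellings.

```python
from collections import Counter

def select_reading(corpus, spelling, reference_probability=0.5):
    """Return the (phoneme, accent) pair for `spelling` whose relative
    frequency in the corpus exceeds the reference probability, else None.

    `corpus` is an assumed list of (spelling, phoneme, accent) tuples.
    """
    counts = Counter((ph, ac) for sp, ph, ac in corpus if sp == spelling)
    total = sum(counts.values())
    if total == 0:
        return None  # spelling never seen in the corpus
    reading, count = counts.most_common(1)[0]
    return reading if count / total > reference_probability else None
```

Returning `None` when no combination clears the reference probability leaves room for a fallback model, which is one plausible reading of why a threshold is used at all.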

Patent
05 Nov 2009
TL;DR: A speech recognition processor is based on a numeric language representing a subset of a vocabulary; the subset includes the words identified as being for interpreting and understanding number strings.
Abstract: A system and a method are provided. A speech recognition processor receives unconstrained input speech and outputs a string of words. The speech recognition processor is based on a numeric language that represents a subset of a vocabulary. The subset includes a set of words identified as being for interpreting and understanding number strings. A numeric understanding processor contains classes of rules for converting the string of words into a sequence of digits. The speech recognition processor utilizes an acoustic model database. A validation database stores a set of valid sequences of digits. A string validation processor outputs validity information based on a comparison of a sequence of digits output by the numeric understanding processor with valid sequences of digits in the validation database.

19 citations
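
The last two stages in the abstract above, converting a recognized word string into digits and checking the result against a validation database, might look like the minimal sketch below. The word lists, the "double"/"triple" rule, and the names `words_to_digits` and `is_valid` are illustrative assumptions; the abstract does not enumerate its rule classes or describe the database format.

```python
# Assumed rule class 1: spoken digit words, including the colloquial "oh".
DIGITS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}
# Assumed rule class 2: words that repeat the digit that follows them.
MULTIPLIERS = {"double": 2, "triple": 3}

def words_to_digits(words):
    """Convert a recognized word string into a digit sequence.

    Words outside both rule classes are treated as filler and skipped.
    """
    digits = []
    repeat = 1
    for word in words:
        word = word.lower()
        if word in MULTIPLIERS:
            repeat = MULTIPLIERS[word]
        elif word in DIGITS:
            digits.append(DIGITS[word] * repeat)
            repeat = 1
    return "".join(digits)

def is_valid(digit_string, valid_sequences):
    """Compare the digit sequence with a set standing in for the validation database."""
    return digit_string in valid_sequences
```

For example, the word string "my number is four double one oh nine" maps to the digit sequence 41109, which is then accepted only if it appears among the valid sequences.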


Authors

Showing all 1521 results

Name                     H-index   Papers   Citations
Vinayak P. Dravid        103       817      43612
Mehryar Mohri            75        320      22868
Jinsong Wu               70        566      16282
Horacio D. Espinosa      67        315      16270
Shumin Zhai              67        200      13447
Shang-Hua Teng           66        265      16647
Dimitri Kanevsky         62        362      14072
Marilyn A. Walker        62        309      13429
Tara N. Sainath          61        274      25183
Kenneth Church           61        295      21179
John B Ketterson         60        814      16929
Pascal Frossard          59        637      22749
Michael Picheny          57        244      11759
G. R. Scott Budinger     56        196      12063
Jun Wu                   53        359      12110
Network Information
Related Institutions (5)
Google
39.8K papers, 2.1M citations

82% related

Microsoft
86.9K papers, 4.1M citations

82% related

Carnegie Mellon University
104.3K papers, 5.9M citations

80% related

Nokia
28.3K papers, 695.7K citations

79% related

Performance
Metrics
No. of papers from the Institution in previous years
Year   Papers
2022   3
2021   24
2020   42
2019   55
2018   41
2017   53