Institution
Nuance Communications
Company • Vienna, Austria
About: Nuance Communications is a company based in Vienna, Austria. It is known for research contributions in the topics: Speech processing & Voice activity detection. The organization has 1518 authors who have published 1701 publications receiving 54891 citations. The organization is also known as: ScanSoft & ScanSoft Inc.
Papers published on a yearly basis
Papers
31 Mar 2009
TL;DR: In this paper, a method for detecting barge-in in a speech dialogue system is presented. The method determines whether a speech prompt is being output by the dialogue system, and detects whether speech activity is present in an input signal based on a time-varying sensitivity threshold of a speech activity detector and/or on speaker information.
Abstract: A method for detecting barge-in in a speech dialogue system comprising determining whether a speech prompt is output by the speech dialogue system, and detecting whether speech activity is present in an input signal based on a time-varying sensitivity threshold of a speech activity detector and/or based on speaker information, where the sensitivity threshold is increased if output of a speech prompt is determined and decreased if no output of a speech prompt is determined. If speech activity is detected in the input signal, the speech prompt may be interrupted or faded out. A speech dialogue system configured to detect barge-in is also disclosed.
20 citations
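The prompt-dependent sensitivity threshold described in the abstract can be sketched in a few lines. This is a minimal illustration of the idea, not the patented implementation; the class name, threshold values, and the notion of a scalar activity score are all invented for the example.

```python
class BargeInDetector:
    """Sketch of barge-in detection with a time-varying sensitivity
    threshold: the threshold is raised while a prompt is playing and
    lowered otherwise (all constants are illustrative)."""

    def __init__(self, base_threshold=0.3, prompt_offset=0.2):
        self.base_threshold = base_threshold
        self.prompt_offset = prompt_offset

    def threshold(self, prompt_active):
        # Increase sensitivity threshold during prompt output, so that
        # prompt echo is less likely to be mistaken for user speech.
        if prompt_active:
            return self.base_threshold + self.prompt_offset
        return self.base_threshold

    def detect(self, activity_score, prompt_active):
        # Speech activity is flagged when the score exceeds the
        # current (prompt-dependent) threshold.
        return activity_score > self.threshold(prompt_active)
```

With these example constants, a borderline activity score of 0.4 is ignored while the prompt plays but counts as barge-in once the prompt ends, which is exactly the asymmetry the method aims for.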
22 May 2011
TL;DR: The subspace based Gaussian mixture model (SGMM) is shown to provide an 18% reduction in word error rate (WER) for speaker independent ASR relative to the continuous density HMM (CDHMM) in the resource management CSR domain.
Abstract: This paper investigates the impact of subspace based techniques for acoustic modeling in automatic speech recognition (ASR). There are many well known approaches to subspace based speaker adaptation which represent sources of variability as a projection within a low dimensional subspace. A new approach to acoustic modeling in ASR, referred to as the subspace based Gaussian mixture model (SGMM), represents phonetic variability as a set of projections applied at the state level in a hidden Markov model (HMM) based acoustic model. The impact of the SGMM in modeling these intrinsic sources of variability is evaluated for a continuous speech recognition (CSR) task. The SGMM is shown to provide an 18% reduction in word error rate (WER) for speaker independent (SI) ASR relative to the continuous density HMM (CDHMM) in the resource management CSR domain. The SI performance obtained from SGMM also represents a 5% reduction in WER relative to subspace based speaker adaptation in an unsupervised speaker adaptation scenario.
20 citations
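The relative WER reductions quoted in this abstract follow the standard formula, which is worth making explicit since relative and absolute reductions are often confused. The 10.0% baseline figure below is invented purely for illustration and is not taken from the paper.

```python
def relative_wer_reduction(baseline_wer, new_wer):
    """Relative word error rate reduction, as a percentage of the
    baseline WER (not an absolute difference in percentage points)."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer

# An 18% relative reduction means e.g. a (hypothetical) baseline WER
# of 10.0% dropping to 8.2%, not dropping by 18 percentage points.
```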
26 May 2015
TL;DR: In this article, the authors propose a method for reducing latency in speech recognition applications. The method comprises receiving first audio comprising speech from a user of a computing device, detecting an end of speech in the first audio, and generating an ASR result based, at least in part, on a portion of the first audio prior to the detected end of speech.
Abstract: Methods and apparatus for reducing latency in speech recognition applications. The method comprises receiving first audio comprising speech from a user of a computing device, detecting an end of speech in the first audio, generating an ASR result based, at least in part, on a portion of the first audio prior to the detected end of speech, determining whether a valid action can be performed by a speech-enabled application installed on the computing device using the ASR result, and processing second audio when it is determined that a valid action cannot be performed by the speech-enabled application using the ASR result.
20 citations
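The control flow in this abstract (recognize early, act if the result is usable, otherwise keep listening) can be sketched as follows. `recognize` and `can_perform_action` stand in for the ASR engine and the speech-enabled application's validity check; both callables, and the string-based audio stand-in, are hypothetical.

```python
def handle_audio(first_audio, second_audio, recognize, can_perform_action):
    """Sketch of the latency-reduction flow: produce an ASR result
    from audio up to the detected end of speech; if the application
    can act on it, return immediately, otherwise fall back to
    processing the additional audio as well."""
    result = recognize(first_audio)
    if can_perform_action(result):
        # Early result is actionable: no need to wait for more audio.
        return result
    # No valid action yet: process the second audio segment too.
    return recognize(first_audio + second_audio)
```

The latency win comes from the first branch: when the early result already supports a valid action, the system responds without waiting for trailing audio.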
30 Aug 2011
TL;DR: In this article, a user may issue a search query, and the search engine or engines to which that query is provided may be determined dynamically based on any of a variety of factors.
Abstract: Some embodiments relate to techniques for performing a search for content, in which a user may issue a search query, and the search engine or engines to which that query is provided may be determined dynamically based on any of a variety of factors. For example, in some embodiments, the search engine or engines to which the query is provided may be determined based on the content of the search query, and/or auxiliary information such as the user's location, demographics, query history and/or browsing history.
20 citations
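The dynamic engine selection described above amounts to a routing function over the query content and auxiliary signals. A toy sketch, in which the engine names and keyword rules are entirely invented for illustration:

```python
def select_engines(query, location=None):
    """Toy dynamic search-engine selection: route a query to one or
    more engines based on its content; auxiliary signals such as the
    user's location could refine the choice further."""
    q = query.lower()
    engines = []
    if any(word in q for word in ("restaurant", "cafe", "shop")):
        # Content suggests local intent; location (if given) would
        # be passed along to scope the local search.
        engines.append("local_search")
    if any(word in q for word in ("weather", "forecast")):
        engines.append("weather_service")
    if not engines:
        # No specialized match: fall back to general web search.
        engines.append("web_search")
    return engines
```

A fuller version would also weigh demographics, query history, and browsing history, as the abstract lists; the point here is only that the engine set is computed per query rather than fixed.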
24 Mar 2006
TL;DR: In this article, a caption correction system for providing real-time captions to a presentation or the like is proposed, in which the voice recognition result is judged manually, on the basis of the processed voice, when processing time permits.
Abstract: Manual provision of real-time captions for a presentation or the like is too costly to be widely adopted, while an automatic voice recognition apparatus alone cannot achieve a high recognition rate and is prone to incorrect transcription; the aim is therefore an inexpensive correction apparatus. The proposed caption correction apparatus obtains character strings and a confidence score from the voice recognition result. A time monitor tracks elapsed time and judges, from the confidence score and timing status, whether processing is delayed. When processing is not delayed, a human checker is asked for a manual judgment: the voice is processed, and the recognition result is judged against the processed voice. When processing is delayed, an automatic judgment is made from the confidence score alone. If the recognition result is judged valid, the character strings are displayed as confirmed text. If it is judged invalid, the result is corrected automatically by matching against subsequent recognition candidates, the text and attributes of the presentation, the text of a script, and so on, and the automatically corrected character strings are displayed as provisional text.
20 citations
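The routing logic in this abstract (manual judgment when time permits, automatic judgment under delay) reduces to a small decision function. The confidence threshold and the return labels below are illustrative, not taken from the patent.

```python
def route_judgment(confidence, delayed, threshold=0.8):
    """Decide how a recognition result is judged: a human checker
    when processing is on time; otherwise an automatic decision
    based only on the recognizer's confidence score."""
    if not delayed:
        return "manual"        # time permits: ask the human checker
    if confidence >= threshold:
        return "accept"        # confident enough to display as-is
    return "auto_correct"      # match against candidates/script text
```

The design trades accuracy for timeliness: the human checker is only consulted when captions can still be delivered without falling behind the speaker.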
Authors
Showing all 1521 results
Name | H-index | Papers | Citations |
---|---|---|---|
Vinayak P. Dravid | 103 | 817 | 43612 |
Mehryar Mohri | 75 | 320 | 22868 |
Jinsong Wu | 70 | 566 | 16282 |
Horacio D. Espinosa | 67 | 315 | 16270 |
Shumin Zhai | 67 | 200 | 13447 |
Shang-Hua Teng | 66 | 265 | 16647 |
Dimitri Kanevsky | 62 | 362 | 14072 |
Marilyn A. Walker | 62 | 309 | 13429 |
Tara N. Sainath | 61 | 274 | 25183 |
Kenneth Church | 61 | 295 | 21179 |
John B Ketterson | 60 | 814 | 16929 |
Pascal Frossard | 59 | 637 | 22749 |
Michael Picheny | 57 | 244 | 11759 |
G. R. Scott Budinger | 56 | 196 | 12063 |
Jun Wu | 53 | 359 | 12110 |