Institution
Nuance Communications
Company • Vienna, Austria
About: Nuance Communications is a company based in Vienna, Austria. It is known for research contributions in speech processing and voice activity detection. The organization has 1518 authors who have published 1701 publications receiving 54891 citations. It is also known as ScanSoft and ScanSoft Inc.
Papers
06 Jan 2012
TL;DR: In this paper, the results of the local and remote speech recognition engines are combined based, at least in part, on logic stored by one or more components of the client/server architecture.
Abstract: Techniques for combining the results of multiple recognizers in a distributed speech recognition architecture. Speech data input to a client device is encoded and processed both locally and remotely by different recognizers configured to be proficient at different speech recognition tasks. The client/server architecture is configurable to enable network providers to specify a policy directed to a trade-off between reducing recognition latency perceived by a user and usage of network resources. The results of the local and remote speech recognition engines are combined based, at least in part, on logic stored by one or more components of the client/server architecture.
10 citations
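The abstract above describes combining local and remote recognition results under a policy that trades latency against network usage. A minimal sketch of one such combination rule follows. The `Result` type, the `combine` function, and both thresholds are hypothetical names and values chosen for illustration; the source does not specify the actual logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    """A recognition hypothesis (hypothetical structure, not from the source)."""
    text: str
    confidence: float  # assumed to lie in [0.0, 1.0]
    latency_ms: float  # time until the result became available


def combine(local: Result,
            remote: Optional[Result],
            max_wait_ms: float = 500.0,
            min_local_conf: float = 0.8) -> Result:
    """Pick one result under an assumed latency/accuracy policy."""
    # Remote result missing or slower than the policy allows: use local.
    if remote is None or remote.latency_ms > max_wait_ms:
        return local
    # Local result is confident enough: prefer it to limit perceived latency.
    if local.confidence >= min_local_conf:
        return local
    # Otherwise take whichever recognizer is more confident.
    return remote if remote.confidence > local.confidence else local
```

Raising `max_wait_ms` favors the remote recognizer at the cost of latency, while raising `min_local_conf` makes the client defer to the server more often. This is one possible way to expose the trade-off the abstract mentions.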
20 Sep 2017
TL;DR: In this article, a system for automatically processing text comprising information regarding a patient encounter to prioritize medical billing codes derived from the text is presented, which includes at least one storage medium storing processor-executable instructions.
Abstract: Some aspects include a system for automatically processing text comprising information regarding a patient encounter to prioritize medical billing codes derived from the text. The system comprises at least one storage medium storing processor-executable instructions, and at least one processor configured to execute the processor-executable instructions to analyze the text to extract a plurality of facts from the text, assign a plurality of medical billing codes to the text based at least in part on the plurality of facts, using a model trained at least in part on feedback from a user, order the plurality of medical billing codes in a sequence beginning with a primary medical billing code corresponding to a primary diagnosis associated with the text, and present the ordered sequence of medical billing codes to the user for review.
10 citations
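The ordering step in the abstract above (primary billing code first, then the rest) can be sketched as a sort with a two-part key. The `CodeCandidate` type, its `score` field, and the descending-score tie-break are assumptions made for illustration; the source only states that the sequence begins with the primary code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CodeCandidate:
    """A billing code proposed for an encounter (hypothetical structure)."""
    code: str                    # placeholder identifiers, not real billing codes
    score: float                 # assumed model score; higher is more likely
    is_primary_diagnosis: bool = False


def order_codes(candidates: List[CodeCandidate]) -> List[CodeCandidate]:
    """Return codes with the primary diagnosis first, then by descending score."""
    # False sorts before True, so `not is_primary_diagnosis` puts primaries first.
    return sorted(candidates,
                  key=lambda c: (not c.is_primary_diagnosis, -c.score))
```

Because `sorted` is stable, candidates with equal keys keep their input order, which keeps the list presented for review deterministic.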
25 Mar 2012
TL;DR: This paper introduces a segmentation process consisting of two phases: first, forced alignment is performed using an HMM-GMM model; the resulting segmentation is then locally refined using an SVM-based boundary model.
Abstract: Phonetic segmentation is an important step in the development of a concatenative TTS voice. This paper introduces a segmentation process consisting of two phases. First, forced alignment is performed using an HMM-GMM model. The resulting segmentation is then locally refined using an SVM-based boundary model. Both models are derived from multi-speaker data using a speaker adaptive training procedure. Evaluation results are obtained on the TIMIT corpus and on a proprietary single-speaker TTS corpus.
10 citations
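The second phase described above, local refinement of forced-alignment boundaries, can be illustrated with a small search around each initial boundary. This is only a sketch under stated assumptions: `refine_boundaries`, the `window` size, and the generic `score` callable (a stand-in for the paper's SVM-based boundary model) are hypothetical, and the forced-alignment phase is assumed to have already produced the initial frame indices.

```python
from typing import Callable, List


def refine_boundaries(initial: List[int],
                      score: Callable[[int], float],
                      n_frames: int,
                      window: int = 3) -> List[int]:
    """Move each boundary to the best-scoring frame within +/- `window` frames.

    `score(frame)` stands in for a trained boundary model: higher values mean
    the frame looks more like a phone boundary.
    """
    refined = []
    for b in initial:
        lo = max(0, b - window)
        hi = min(n_frames - 1, b + window)
        # max() returns the earliest frame when several frames tie.
        refined.append(max(range(lo, hi + 1), key=score))
    return refined
```

The sketch refines each boundary independently and does not check that refined boundaries stay in order; a real system would need to handle overlapping search windows.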
22 May 2014
TL;DR: In this article, a method to extract a target speaker's speech using a known speaker voiceprint from an audio recording that includes the target speaker's speech and the known speaker's speech is presented.
Abstract: In many scenarios, speaker verification systems can be given a single-channel audio with recordings of multiple speakers. To perform accurate speaker verification, a system can isolate the speech of a speaker. In one embodiment, a method, and corresponding system, of speaker verification includes extracting a target speaker's speech, using a known speaker voiceprint, from an audio recording that includes the target speaker's speech and the known speaker's speech. The known speaker voiceprint can correspond to the known speaker. Extracting the target speaker's speech can include determining portions of the audio recording where the known speaker voiceprint matches the known speaker's speech above a particular threshold, and extracting the target speaker's speech from other portions of the audio recording. In this manner, speaker verification is performed on the target speaker's speech without interference from the known speaker's speech and allows for a more accurate verification.
9 citations
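The extraction step in the abstract above keeps only the portions of the recording where the known speaker's voiceprint does not match above a threshold. A minimal sketch, assuming the audio has already been split into segments and each segment has a precomputed match score against the known speaker voiceprint; the function name, the segment representation, and the default threshold are hypothetical.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[float, float]  # (start_sec, end_sec), an assumed representation


def extract_target_segments(segments: Sequence[Segment],
                            known_speaker_scores: Sequence[float],
                            threshold: float = 0.7) -> List[Segment]:
    """Keep segments whose known-speaker match score is not above `threshold`."""
    if len(segments) != len(known_speaker_scores):
        raise ValueError("one score is required per segment")
    # Segments scoring above the threshold are attributed to the known speaker
    # and dropped; the remainder is treated as the target speaker's speech.
    return [seg for seg, s in zip(segments, known_speaker_scores)
            if s <= threshold]
```

Speaker verification would then run only on the returned segments, so the known speaker's speech does not interfere with the decision.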
30 Sep 2011
TL;DR: In this paper, the authors present a technique for receiving a query from a user of a mobile device, and for conveying to the user not only search results, but also feedback relating to the query.
Abstract: Some embodiments of the invention provide techniques for receiving a query from a user of a mobile device, and for conveying to the user not only search results, but also feedback relating to the query. For example, the user may be prompted to elicit supplemental information relating to the query, or provided other feedback. The feedback may be conveyed in a manner that minimizes how much of the mobile device's display screen is dedicated to presenting the feedback.
9 citations
Authors
Showing all 1521 results
Name | H-index | Papers | Citations |
---|---|---|---|
Vinayak P. Dravid | 103 | 817 | 43612 |
Mehryar Mohri | 75 | 320 | 22868 |
Jinsong Wu | 70 | 566 | 16282 |
Horacio D. Espinosa | 67 | 315 | 16270 |
Shumin Zhai | 67 | 200 | 13447 |
Shang-Hua Teng | 66 | 265 | 16647 |
Dimitri Kanevsky | 62 | 362 | 14072 |
Marilyn A. Walker | 62 | 309 | 13429 |
Tara N. Sainath | 61 | 274 | 25183 |
Kenneth Church | 61 | 295 | 21179 |
John B Ketterson | 60 | 814 | 16929 |
Pascal Frossard | 59 | 637 | 22749 |
Michael Picheny | 57 | 244 | 11759 |
G. R. Scott Budinger | 56 | 196 | 12063 |
Jun Wu | 53 | 359 | 12110 |