Topic

Word error rate

About: Word error rate is a research topic. Over its lifetime, 11,939 publications have been published within this topic, receiving 298,031 citations.


Papers
Journal Article

67 citations

Patent
18 Sep 2002
TL;DR: A method and apparatus for analyzing documents to determine the association between words in a language. The method includes providing a collection of documents (306) and selecting a first word or word string and a second word or word string occurring in the documents.
Abstract: A method and apparatus for analyzing documents and thereby determining the association between words in a language (Fig. 3). The method includes providing a collection of documents (306) and selecting a first word or word string and a second word or word string occurring in the documents. The method further involves associating the first word or word string and the second word or word string with a common word or word string, based on the frequency of occurrence of the common word or word string within the ranges (304).

67 citations

Journal Article
TL;DR: The results suggest that measurement of image-plane similarity to a few (subject-specific) feature patterns is a better model than mental rotation for the mechanism used by the human visual system to recognize objects across changes in their 3-D orientation.

67 citations

Posted Content
TL;DR: This work examines the ability of E2E models to generalize to unseen domains, and proposes two complementary solutions to address this: training on diverse acoustic data, and LSTM state manipulation to simulate long-form audio when training using short utterances.
Abstract: All-neural end-to-end (E2E) automatic speech recognition (ASR) systems that use a single neural network to transduce audio to word sequences have been shown to achieve state-of-the-art results on several tasks. In this work, we examine the ability of E2E models to generalize to unseen domains, where we find that models trained on short utterances fail to generalize to long-form speech. We propose two complementary solutions to address this: training on diverse acoustic data, and LSTM state manipulation to simulate long-form audio when training using short utterances. On a synthesized long-form test set, adding data diversity improves word error rate (WER) by 90% relative, while simulating long-form training improves it by 67% relative, though the combination doesn't improve over data diversity alone. On a real long-form call-center test set, adding data diversity improves WER by 40% relative. Simulating long-form training on top of data diversity improves performance by an additional 27% relative.

67 citations
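
The abstract above reports its gains as relative WER improvements. As a minimal sketch of what those terms mean, the snippet below assumes the standard definition WER = (substitutions + deletions + insertions) / number of reference words, computed with a word-level edit distance. The function names `word_error_rate` and `relative_improvement` are hypothetical, and the example numbers are made up for illustration; they are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer is lower than baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Illustrative values only (not results from the paper).
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
gain = relative_improvement(0.20, 0.12)
```

Here the hypothesis has one substitution and one deletion against a six-word reference, so `wer` is 2/6, and a drop from 20% to 12% WER is a 40% relative improvement. Note that WER can exceed 1.0 when the hypothesis contains many insertions.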

Patent
18 Nov 1998
TL;DR: A method and apparatus for extracting key terms from a data set. The method identifies a first set of word groups, each of one or more words, that occur more than once in the data set, and removes from this first set a second set of word groups that are sub-strings of longer word groups in the first set.
Abstract: A method and apparatus is provided for extracting key terms from a data set, the method including the steps of identifying a first set of one or more word groups of one or more words that occur more than once in the data set, and removing from this first set a second set of word groups that are sub-strings of longer word groups in the first set. The remaining word groups are key terms. Each word group is weighted according to its frequency of occurrence within the data set. The weighting of any word group may be increased by the frequency of any sub-string of words occurring in the second set, and then dividing each weighting by the number of words in the word group. This weighting process operates to determine the order of occurrence of the word groups. Prefixes and suffixes are also removed from each word in the data set. This produces a neutral form of each word so that the weighting values are prefix and suffix independent.

67 citations
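
The first two steps of the key-term abstract above (collect repeated word groups, then drop those contained in longer repeated groups) can be sketched roughly as follows. This is a simplified illustration, not the patented method: the function name `extract_key_terms` and the `max_len` cap on group length are assumptions, the weight is the raw frequency only, and the sub-string weighting adjustment and prefix/suffix removal described in the abstract are left out.

```python
from collections import Counter


def extract_key_terms(text: str, max_len: int = 4) -> list[tuple[str, int]]:
    """Return (word group, frequency) pairs for repeated, maximal word groups."""
    words = text.lower().split()

    # Count every contiguous word group of 1..max_len words.
    counts = Counter(
        tuple(words[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(words) - n + 1)
    )

    # First set: word groups that occur more than once.
    repeated = {group: c for group, c in counts.items() if c > 1}

    def is_contained(short: tuple, long: tuple) -> bool:
        """True if `short` is a strictly shorter contiguous run inside `long`."""
        n = len(short)
        return n < len(long) and any(
            long[i:i + n] == short for i in range(len(long) - n + 1)
        )

    # Second set: repeated groups that sit inside a longer repeated group.
    contained = {
        group
        for group in repeated
        if any(is_contained(group, other) for other in repeated)
    }

    # Key terms are what remains, ordered by frequency, then alphabetically.
    key_terms = [
        (" ".join(group), c)
        for group, c in repeated.items()
        if group not in contained
    ]
    return sorted(key_terms, key=lambda item: (-item[1], item[0]))


terms = extract_key_terms(
    "speech recognition error rate and speech recognition error rate matter"
)
```

In this example every repeated shorter group ("speech", "error rate", and so on) is contained in the repeated four-word group, so only "speech recognition error rate" survives as a key term.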


Network Information
Related Topics (5)
Deep learning
79.8K papers, 2.1M citations
88% related
Feature extraction
111.8K papers, 2.1M citations
86% related
Convolutional neural network
74.7K papers, 2M citations
85% related
Artificial neural network
207K papers, 4.5M citations
84% related
Cluster analysis
146.5K papers, 2.9M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    271
2022    562
2021    640
2020    643
2019    633
2018    528