
Pedro J. Moreno

Researcher at Google

Publications -  128
Citations -  7944

Pedro J. Moreno is an academic researcher at Google. He has contributed to research in topics such as language modeling and word error rate. He has an h-index of 45 and has co-authored 118 publications receiving 7,206 citations. Previous affiliations of Pedro J. Moreno include Carnegie Mellon University and Hewlett-Packard.

Papers
Patent

Computer method and apparatus for uniform representation of genome sequences

TL;DR: A comparison database stores a predefined number of known biological sequences, and a comparison routine compares and scores a subject sequence against each known sequence in that database, as discussed by the authors.
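
A minimal sketch of the comparison idea summarized above: score a subject sequence against every known sequence in a small in-memory database. The scoring function here (fraction of shared k-mers) and the example sequences are placeholder assumptions for illustration, not the method claimed in the patent.

```python
def kmer_set(seq, k=4):
    """Return the set of k-mers occurring in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def score(subject, known, k=4):
    """Fraction of the subject's k-mers that also appear in the known sequence."""
    subject_kmers = kmer_set(subject, k)
    if not subject_kmers:
        return 0.0
    return len(subject_kmers & kmer_set(known, k)) / len(subject_kmers)

# Hypothetical comparison database of known sequences.
database = {
    "seq_A": "ACGTACGTGGTTACGT",
    "seq_B": "TTGACCATGACGTACG",
}

# Compare and score the subject sequence against each known sequence.
subject_sequence = "ACGTACGTTTGACCAT"
scores = {name: score(subject_sequence, known) for name, known in database.items()}
for name, s in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {s:.2f}")
```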
Posted Content

Parrotron: An End-to-End Speech-to-Speech Conversion Model and its Applications to Hearing-Impaired Speech and Speech Separation

TL;DR: It is demonstrated that this model can be trained to normalize speech from any speaker, regardless of accent, prosody, and background noise, into the voice of a single canonical target speaker with a fixed accent and consistent articulation and prosody.
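
A minimal sketch of an end-to-end spectrogram-to-spectrogram conversion model in the spirit of the summary above, assuming PyTorch. The tiny LSTM encoder/decoder, layer sizes, and L1 training objective against canonical-voice targets are illustrative assumptions, not the actual Parrotron architecture.

```python
import torch
import torch.nn as nn

class SpeechConverter(nn.Module):
    """Toy model mapping input spectrograms to canonical-voice spectrograms."""
    def __init__(self, n_mels=80, hidden=256):
        super().__init__()
        self.encoder = nn.LSTM(n_mels, hidden, batch_first=True, bidirectional=True)
        self.decoder = nn.LSTM(2 * hidden, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, n_mels)  # predict target-voice frames

    def forward(self, src_spec):
        enc, _ = self.encoder(src_spec)   # (batch, frames, 2 * hidden)
        dec, _ = self.decoder(enc)        # (batch, frames, hidden)
        return self.proj(dec)             # (batch, frames, n_mels)

model = SpeechConverter()
src = torch.randn(2, 100, 80)   # arbitrary-speaker input spectrograms
tgt = torch.randn(2, 100, 80)   # same utterances rendered in the canonical voice
loss = nn.functional.l1_loss(model(src), tgt)
loss.backward()
```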
Proceedings Article

Discriminative Topic Segmentation of Text and Speech

TL;DR: Two new discriminative topic segmentation algorithms are given which employ a new measure of text similarity based on word co-occurrence. It is also demonstrated that the algorithm's performance can be improved by using a lattice of competing hypotheses, rather than just the one-best hypothesis, as input to the segmentation algorithm.
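
A minimal sketch of the underlying idea: measure lexical similarity between the text on either side of each candidate boundary and hypothesize a topic change where similarity dips. The cosine-over-word-counts similarity and toy sentences below are simple stand-ins, not the paper's discriminative algorithm or its lattice-based variant.

```python
from collections import Counter
import math

def cosine(a, b):
    """Cosine similarity between two bags of words."""
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values())) *
           math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def boundary_scores(sentences):
    """Similarity between the text before and after each candidate boundary."""
    bags = [Counter(s.lower().split()) for s in sentences]
    scores = []
    for i in range(1, len(bags)):
        left = sum(bags[:i], Counter())
        right = sum(bags[i:], Counter())
        scores.append(cosine(left, right))
    return scores

sents = ["the game went to overtime", "the striker scored twice",
         "the election results were close", "voters turned out in record numbers"]
print(boundary_scores(sents))  # the lowest score marks the likely topic change
```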
Journal ArticleDOI

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

TL;DR: The Universal Speech Model (USM), as presented in this paper, pre-trains the encoder of the model on a large unlabeled multilingual dataset of 12 million hours spanning over 300 languages, and fine-tunes it on a smaller labeled dataset.
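
A minimal sketch of the two-stage recipe summarized above, assuming PyTorch: self-supervised pre-training of an encoder on unlabeled audio features, followed by supervised fine-tuning on a smaller labeled set. The tiny LSTM encoder, the masked-frame reconstruction objective, and the CTC fine-tuning head are illustrative assumptions, not the actual USM architecture or losses.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Toy stand-in for a large multilingual speech encoder."""
    def __init__(self, n_feats=80, hidden=256):
        super().__init__()
        self.rnn = nn.LSTM(n_feats, hidden, batch_first=True)

    def forward(self, x):
        out, _ = self.rnn(x)
        return out

encoder = Encoder()

# Stage 1: pre-train on unlabeled audio with a masked-frame reconstruction loss.
recon_head = nn.Linear(256, 80)
pretrain_opt = torch.optim.Adam(list(encoder.parameters()) + list(recon_head.parameters()))
unlabeled = torch.randn(4, 200, 80)     # stand-in for unlabeled multilingual features
masked = unlabeled.clone()
masked[:, ::4, :] = 0.0                 # mask every 4th frame
recon_loss = nn.functional.mse_loss(recon_head(encoder(masked)), unlabeled)
pretrain_opt.zero_grad()
recon_loss.backward()
pretrain_opt.step()

# Stage 2: fine-tune on a smaller labeled set with a CTC head over output tokens.
ctc_head = nn.Linear(256, 100)          # 100 = toy vocabulary size
finetune_opt = torch.optim.Adam(list(encoder.parameters()) + list(ctc_head.parameters()))
labeled_audio = torch.randn(2, 200, 80)
targets = torch.randint(1, 100, (2, 20))
log_probs = ctc_head(encoder(labeled_audio)).log_softmax(-1).transpose(0, 1)
ctc_loss = nn.functional.ctc_loss(log_probs, targets,
                                  input_lengths=torch.full((2,), 200),
                                  target_lengths=torch.full((2,), 20))
finetune_opt.zero_grad()
ctc_loss.backward()
finetune_opt.step()
```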
Proceedings ArticleDOI

Selection and combination of hypotheses for dialectal speech recognition

TL;DR: This paper presents two methods to select and combine the best decoded hypothesis from a pool of dialectal recognizers, following a machine learning approach: it extracts features from the speech recognition output along with word embeddings and uses shallow neural networks for classification.
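
A minimal sketch of hypothesis selection framed as classification, as outlined above: build a feature vector per decoded hypothesis (here, a mean word embedding plus the recognizer confidence) and score it with a shallow neural network. The feature set, toy embedding table, and two-layer MLP are illustrative assumptions, not the paper's exact features or model; in practice the scorer and embeddings would be trained on labeled pools of hypotheses.

```python
import torch
import torch.nn as nn

EMB_DIM = 16
vocab = {"hello": 0, "world": 1, "word": 2, "hallo": 3}
embeddings = nn.Embedding(len(vocab), EMB_DIM)   # stand-in word embeddings

def features(hypothesis, confidence):
    """Mean word embedding concatenated with the recognizer confidence."""
    ids = torch.tensor([vocab.get(w, 0) for w in hypothesis.split()])
    mean_emb = embeddings(ids).mean(dim=0)
    return torch.cat([mean_emb, torch.tensor([confidence])])

# Shallow network that scores how likely a hypothesis is to be the best one.
scorer = nn.Sequential(nn.Linear(EMB_DIM + 1, 32), nn.ReLU(), nn.Linear(32, 1))

# One decoded hypothesis (text, confidence) from each dialectal recognizer.
pool = [("hello world", 0.81), ("hallo word", 0.64)]
scores = torch.stack([scorer(features(h, c)).squeeze() for h, c in pool])
best_hypothesis = pool[int(scores.argmax())][0]
print("selected hypothesis:", best_hypothesis)
```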