Wei-Ning Hsu
Researcher at Facebook
Publications - 94
Citations - 3775
Wei-Ning Hsu is an academic researcher at Facebook. The author has contributed to research in the following topics: Computer science & Engineering. The author has an h-index of 23 and has co-authored 66 publications receiving 1942 citations. Previous affiliations of Wei-Ning Hsu include Massachusetts Institute of Technology & National Taiwan University.
Papers
Proceedings ArticleDOI
An Unsupervised Autoregressive Model for Speech Representation Learning.
TL;DR: The authors proposed an unsupervised autoregressive neural model for learning generic speech representations, which is designed to preserve information for a wide range of downstream tasks, such as phone classification and speaker verification.
Proceedings Article
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
TL;DR: data2vec is a framework that uses the same learning method for speech, NLP, and computer vision: it predicts latent representations of the full input data from a masked view of the input, in a self-distillation setup using a standard Transformer architecture.
Journal ArticleDOI
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units
Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, Abdelrahman Mohamed
TL;DR: HuBERT uses an offline clustering step to provide aligned target labels for a BERT-like prediction loss, which forces the model to learn a combined acoustic and language model over the continuous inputs.
Proceedings Article
Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data
TL;DR: This paper presents a factorized hierarchical variational autoencoder, which learns disentangled and interpretable representations from sequential data without supervision. The model is formulated explicitly within a factorized hierarchical graphical model that imposes sequence-dependent priors and sequence-independent priors on different sets of latent variables.
Posted Content
Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling
Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia Xu Chen, Ye Jia, Anjuli Kannan, Tara N. Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Marie Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel Bacchiani, Thomas B. Jablin, Robert Suderman, Ian Williams, Benjamin N. Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon
TL;DR: This document outlines the underlying design of Lingvo and serves as an introduction to the various pieces of the framework, while also offering examples of advanced features that showcase the capabilities of the framework.