Zhaoheng Ni
Researcher at City University of New York
Publications - 26
Citations - 361
Zhaoheng Ni is an academic researcher from City University of New York. The author has contributed to research in topics: Computer science & Engineering. The author has an h-index of 5 and has co-authored 18 publications receiving 184 citations. Previous affiliations of Zhaoheng Ni include Beihang University & Brooklyn College.
Papers
Posted Content
CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings
Shinji Watanabe,Michael I. Mandel,Jon Barker,Emmanuel Vincent,Ashish Arora,Xuankai Chang,Sanjeev Khudanpur,Vimal Manohar,Daniel Povey,Desh Raj,David Snyder,Aswin Shanmugam Subramanian,Jan Trmal,Bar Ben Yair,Christoph Boeddeker,Zhaoheng Ni,Yusuke Fujita,Shota Horiguchi,Naoyuki Kanda,Takuya Yoshioka,Neville Ryant +20 more
TL;DR: Of note, Track 2 is the first challenge activity in the community to tackle an unsegmented multispeaker speech recognition scenario with a complete set of reproducible open source baselines providing speech enhancement, speaker diarization, and speech recognition modules.
Proceedings ArticleDOI
CHiME-6 Challenge: Tackling multispeaker speech recognition for unsegmented recordings
Shinji Watanabe,Michael I. Mandel,Jon Barker,Emmanuel Vincent,Ashish Arora,Xuankai Chang,Sanjeev Khudanpur,Vimal Manohar,Daniel Povey,Desh Raj,David Snyder,Aswin Shanmugam Subramanian,Jan Trmal,Bar Ben Yair,Christoph Boeddeker,Zhaoheng Ni,Yusuke Fujita,Shota Horiguchi,Naoyuki Kanda,Takuya Yoshioka,Neville Ryant +20 more
TL;DR: The 6th CHiME Speech Separation and Recognition Challenge (CHiME-6) was the first challenge activity in the community to tackle an unsegmented multispeaker speech recognition scenario with a complete set of reproducible open source baselines.
Proceedings ArticleDOI
Confused or not Confused?: Disentangling Brain Activity from EEG Data Using Bidirectional LSTM Recurrent Neural Networks
TL;DR: The Bidirectional LSTM Recurrent Neural Networks model achieves state-of-the-art performance compared with other machine learning approaches, and shows strong robustness as evaluated by cross-validation.
Journal ArticleDOI
Anatomical Entity Recognition with a Hierarchical Framework Augmented by External Resources
TL;DR: The use of the hierarchical framework, which combines the recognition of named entities of various types (diseases, clinical tests, treatments) with information embedded in external knowledge bases, resulted in a 5.08% increment in F1.
Posted Content
TorchAudio: Building Blocks for Audio and Speech Processing
Yao-Yuan Yang,Moto Hira,Zhaoheng Ni,Anjali Chourdia,Artyom Astafurov,Caroline Chen,Ching-Feng Yeh,Christian Puhrsch,David Pollack,Dmitriy Genzel,Donny Greenberg,Edward Z. Yang,Jason Lian,Jay Mahadeokar,Jeff Hwang,Ji Chen,Peter Goldsborough,Prabhat Roy,Sean Narenthiran,Shinji Watanabe,Soumith Chintala,Vincent Quenneville-Bélair,Yangyang Shi +22 more
TL;DR: TorchAudio is a set of building blocks for machine learning applications in the audio and speech processing domain. It can be easily installed from the Python Package Index repository, and the source code is publicly available under a BSD-2-Clause License.