Zhaoheng Ni

Posted Content

CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings

- 20 Apr 2020 -

TL;DR: Track 2 is the first challenge activity in the community to tackle an unsegmented multispeaker speech recognition scenario with a complete set of reproducible open-source baselines providing speech enhancement, speaker diarization, and speech recognition modules.

Proceedings ArticleDOI

CHiME-6 Challenge: Tackling multispeaker speech recognition for unsegmented recordings

Shinji Watanabe, +20 more

TL;DR: The 6th CHiME Speech Separation and Recognition Challenge (CHiME-6) was the first challenge activity in the community to tackle an unsegmented multispeaker speech recognition scenario with a complete set of reproducible open-source baselines.

Proceedings ArticleDOI

Confused or not Confused?: Disentangling Brain Activity from EEG Data Using Bidirectional LSTM Recurrent Neural Networks

Zhaoheng Ni, +4 more

TL;DR: The Bidirectional LSTM Recurrent Neural Network model achieves state-of-the-art performance compared with other machine learning approaches and shows strong robustness under cross-validation.

Journal ArticleDOI

Anatomical Entity Recognition with a Hierarchical Framework Augmented by External Resources

Yan Xu, +7 more

- 24 Oct 2014 -

PLOS ONE

TL;DR: The hierarchical framework, which combines the recognition of named entities of various types (diseases, clinical tests, treatments) with information embedded in external knowledge bases, yielded a 5.08% increase in F1.

Posted Content

TorchAudio: Building Blocks for Audio and Speech Processing

Yao-Yuan Yang, +22 more

- 28 Oct 2021 -

arXiv: Audio and Speech Processing

TL;DR: TorchAudio is a set of building blocks for machine learning applications in the audio and speech processing domain. It can be installed from the Python Package Index, and its source code is publicly available under a BSD-2-Clause License.
