Sunit Sivasankaran

Proceedings Article

Asteroid: The PyTorch-Based Audio Source Separation Toolkit for Researchers.

TL;DR: This paper describes Asteroid, the PyTorch-based audio source separation toolkit. Inspired by the most successful neural source separation systems, it provides all the neural building blocks required to build such a system.

Proceedings Article

Robust ASR using neural network based speech enhancement and feature simulation

Sunit Sivasankaran, +6 more

TL;DR: A deep neural network (DNN) based multichannel speech enhancement technique is presented, in which the speech and noise spectra are estimated using a DNN-based regressor and the spatial parameters are derived in an expectation-maximization (EM) like fashion.

Proceedings Article

Keyword Based Speaker Localization: Localizing a Target Speaker in a Multi-speaker Environment.

Sunit Sivasankaran, +2 more

TL;DR: This work introduces the new task of localizing the speaker who uttered a given keyword, e.g., the wake-up word of a distant-microphone voice command system, in the presence of overlapping speech.

Proceedings Article

A French corpus for distant-microphone speech processing in real homes

Nancy Bertin, +11 more

TL;DR: A new corpus for distant-microphone speech processing in domestic environments is introduced. It includes reverberated, noisy speech signals spoken by native French talkers in a lounge and recorded by an 8-microphone device at various angles and distances and in various noise conditions.

Proceedings Article

Phone Merging For Code-Switched Speech Recognition

Sunit Sivasankaran, +4 more

TL;DR: This work presents evidence that phone sharing between languages improves acoustic model performance for Hindi-English code-switched speech, and investigates multiple data-driven methods to identify phones to be merged across the languages.
