Thilo von Neumann

Researcher at Nippon Telegraph and Telephone

Publications - 16

Citations - 235

Thilo von Neumann is an academic researcher from Nippon Telegraph and Telephone. The author has contributed to research in topics: Source separation & Computer science. The author has an h-index of 6 and has co-authored 11 publications receiving 146 citations. Previous affiliations of Thilo von Neumann include University of Paderborn.

Papers

Proceedings ArticleDOI

All-neural Online Source Separation, Counting, and Diarization for Meeting Analysis

Thilo von Neumann, +5 more

TL;DR: In this paper, an all-neural approach to simultaneous speaker counting, diarization and source separation is presented, where the neural network is recurrent over time as well as over the number of sources.

Proceedings ArticleDOI

End-to-End Training of Time Domain Audio Separation and Recognition

Thilo von Neumann, +6 more

TL;DR: In this article, a Convolutional Time Domain Audio Separation Network (Conv-TasNet) is combined with an end-to-end speech recognizer, and the two are trained jointly, either by distributing the model over multiple GPUs or by approximating truncated back-propagation for the convolutional front-end.

Posted Content

All-neural online source separation, counting, and diarization for meeting analysis.

Thilo von Neumann, +5 more

- 21 Feb 2019 -

arXiv: Audio and Speech Processing

TL;DR: This paper presents for the first time an all-neural approach to simultaneous speaker counting, diarization and source separation, using an NN-based estimator that operates in a block-online fashion and tracks speakers even if they remain silent for a number of time blocks, thus learning a stable output order for the separated sources.

Proceedings ArticleDOI

Deep Attractor Networks for Speaker Re-Identification and Blind Source Separation

Lukas Drude, +2 more

TL;DR: This model structure improves the signal to distortion ratio (SDR) over a DAN baseline and provides up to 61% and up to 34% relative reduction in permutation error rate and re-identification error rate compared to an i-vector baseline, respectively.

Proceedings ArticleDOI

Multi-Talker ASR for an Unknown Number of Sources: Joint Training of Source Counting, Separation and ASR.

Thilo von Neumann, +6 more

TL;DR: In this article, an iterative speech extraction system with mechanisms to count the number of sources is combined with a single-talker speech recognizer to form the first end-to-end multi-talker automatic speech recognition system for an unknown number of active speakers.
