Tamás Gábor Csapó

Proceedings ArticleDOI

DNN-Based Ultrasound-to-Speech Conversion for a Silent Speech Interface.

TL;DR: A representation that combines several neighboring image frames with a feature selection method was preferred both by the subjects in the listening experiments and in terms of the Normalized Mean Squared Error.

Journal ArticleDOI

A comparative study on the contour tracking algorithms in ultrasound tongue images with automatic re-initialization

Kele Xu, +3 more

- 20 May 2016 -

Journal of the Acoustical Society of Ame...

TL;DR: With automatic re-initialization of contour tracking, the average tracking error falls from 5-6 pixels to about 4 pixels; the result was obtained using a large number of hand-labeled frames and similarity measurements to extract the contours.

Journal ArticleDOI

Convolutional neural network-based automatic classification of midsagittal tongue gestural targets using B-mode ultrasound images

Kele Xu, +3 more

- 09 Jun 2017 -

Journal of the Acoustical Society of Ame...

TL;DR: Speaker-dependent and speaker-independent tongue gestural target classification experiments are conducted; the CNN-based method achieves state-of-the-art performance even though the CNN was not pre-trained.

Journal ArticleDOI

Speech-centric Multimodal Interaction for Easy-to-access Online Services – A Personal Life Assistant for the Elderly

Antônio Lúcio Teixeira, +13 more

- 01 Jan 2014 -

Procedia Computer Science

TL;DR: The paper presents the multimodal architecture of the Personal Life Assistant (PLA), the services the PLA provides, and the work on speech input and output modalities, which play a key role in the application.

Proceedings ArticleDOI

F0 Estimation for DNN-Based Ultrasound Silent Speech Interfaces

Tamás Grósz, +4 more

TL;DR: Deep neural networks are used to perform articulatory-to-acoustic conversion from ultrasound images, with an emphasis on estimating the voicing feature and the F0 curve from the ultrasound input; a correlation of 0.74 is reported.
