
Shaojin Ding

Researcher at Texas A&M University

Publications -  26
Citations -  718

Shaojin Ding is an academic researcher from Texas A&M University who has contributed to research in Computer science and Engineering. The author has an h-index of 8 and has co-authored 17 publications receiving 330 citations. Previous affiliations of Shaojin Ding include Google.

Papers
Proceedings ArticleDOI

ABD-Net: Attentive but Diverse Person Re-Identification

TL;DR: An Attentive but Diverse Network (ABD-Net) seamlessly integrates attention modules and diversity regularizations throughout the entire network to learn features that are representative, robust, and more discriminative.
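The "diverse" half of ABD-Net is a regularizer that penalizes correlated feature directions. As a minimal illustration (not the authors' code: the paper uses a spectral-value-based orthogonality variant, while this sketch uses the simpler Frobenius-norm form), an orthogonality penalty on a weight or feature matrix can be written as:

```python
import numpy as np

def orthogonality_penalty(W):
    """Frobenius orthogonality penalty ||W^T W - I||_F^2.

    A generic diversity regularizer of the kind ABD-Net adds to the
    training loss; orthonormal columns (decorrelated features) incur
    near-zero penalty, redundant columns incur a large one.
    """
    gram = W.T @ W
    eye = np.eye(gram.shape[0])
    return float(((gram - eye) ** 2).sum())

# Orthonormal columns -> ~0 penalty; identical columns -> large penalty.
Q, _ = np.linalg.qr(np.random.default_rng(0).standard_normal((128, 16)))
print(orthogonality_penalty(Q))                        # ~0.0
print(orthogonality_penalty(np.ones((128, 16))) > 0)   # True
```

In training, such a term is added (with a small weight) to the re-identification loss so that attention does not collapse all features onto a few dominant directions.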
Proceedings ArticleDOI

Personal VAD: Speaker-Conditioned Voice Activity Detection

TL;DR: Personal VAD, as discussed by the authors, is a system that detects the voice activity of a target speaker at the frame level by training a VAD-like neural network conditioned on the target speaker's embedding or the speaker verification score.
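The conditioning idea can be sketched in a few lines: each acoustic frame is concatenated with a fixed target-speaker embedding, and a per-frame classifier scores three classes (non-speech, target speaker, non-target speaker). This is a simplified stand-in, assuming illustrative dimensions and a single random linear layer in place of the trained network:

```python
import numpy as np

rng = np.random.default_rng(0)

def personal_vad_scores(frames, spk_emb, W, b):
    """Per-frame logits over {non-speech, target speaker, other speaker}.

    frames: (T, F) acoustic features; spk_emb: (D,) target-speaker embedding;
    W: (F + D, 3), b: (3,) stand in for a trained speaker-conditioned network.
    """
    T = frames.shape[0]
    # Broadcast the speaker embedding to every frame, then classify jointly.
    cond = np.concatenate([frames, np.tile(spk_emb, (T, 1))], axis=1)
    return cond @ W + b  # (T, 3)

frames = rng.standard_normal((100, 40))   # 100 frames of 40-dim features
spk_emb = rng.standard_normal(256)        # e.g. a d-vector for the target
W = rng.standard_normal((40 + 256, 3)) * 0.01
b = np.zeros(3)
logits = personal_vad_scores(frames, spk_emb, W, b)
print(logits.shape)  # (100, 3)
```

The key design point is that the same network produces different frame labels for different enrolled speakers, since the speaker embedding is part of its input.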
Proceedings ArticleDOI

Group Latent Embedding for Vector Quantized Variational Autoencoder in Non-Parallel Voice Conversion.

TL;DR: The proposed Group Latent Embedding for Vector Quantized Variational Autoencoders (VQ-VAE), used in non-parallel voice conversion, significantly improves the acoustic quality of the VC syntheses compared to the traditional VQ-VAE while retaining the voice identity of the target speaker.
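The core idea is to organize the VQ codebook into groups of atoms rather than quantizing to a single nearest codeword. The sketch below is a simplified stand-in under that assumption (the paper's exact grouping and weighting scheme may differ): the encoder output is assigned to the nearest group, and the group's atoms jointly represent the latent:

```python
import numpy as np

rng = np.random.default_rng(1)

def group_quantize(z, codebook, num_groups):
    """Assign encoder output z to the nearest codebook group.

    codebook: (num_groups * atoms_per_group, D). Returns the winning
    group index and the mean of its atoms as the quantized latent --
    a simplified illustration of grouped quantization.
    """
    D = codebook.shape[1]
    groups = codebook.reshape(num_groups, -1, D)
    centroids = groups.mean(axis=1)                    # (num_groups, D)
    g = int(np.argmin(((centroids - z) ** 2).sum(axis=1)))
    return g, groups[g].mean(axis=0)

codebook = rng.standard_normal((8 * 16, 64))  # 8 groups x 16 atoms, 64-dim
z = rng.standard_normal(64)                   # one encoder output vector
g, z_q = group_quantize(z, codebook, num_groups=8)
print(g, z_q.shape)
```

Compared with picking one codeword, letting several atoms in a group share the representation gives the decoder a smoother latent, which is one plausible route to the quality gain the TL;DR reports.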
Proceedings ArticleDOI

Foreign Accent Conversion by Synthesizing Speech from Phonetic Posteriorgrams.

TL;DR: This work presents a framework for FAC that eliminates the need for conventional vocoders, and therefore the need to use the native speaker's excitation; it produces speech that sounds clearer, more natural, and more similar to the non-native speaker compared with a baseline system.
Proceedings ArticleDOI

AutoSpeech: Neural Architecture Search for Speaker Recognition.

TL;DR: Results demonstrate that the CNN architectures derived from the proposed approach significantly outperform current speaker recognition systems based on VGG-M, ResNet-18, and ResNet-34 backbones, while enjoying lower model complexity.