
Shigeki Sagayama

Researcher at Meiji University

Publications -  329
Citations -  5542

Shigeki Sagayama is an academic researcher at Meiji University. His research focuses on topics including hidden Markov models and speaker recognition. He has an h-index of 36 and has co-authored 327 publications receiving 5313 citations. Previous affiliations of Shigeki Sagayama include École Normale Supérieure and the National Institute of Informatics.

Papers
Proceedings Article

Dynamic Time-Alignment Kernel in Support Vector Machine

TL;DR: The proposed SVM (DTAK-SVM) is evaluated in speaker-dependent experiments on hand-segmented phoneme recognition; preliminary results show recognition performance comparable to that of hidden Markov models (HMMs).
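As a rough sketch of the idea (not the paper's exact formulation), a dynamic time-alignment kernel can be computed by dynamic programming over frame-wise similarities, accumulating scores along the best alignment path with the classic 1/1/2 step weights and normalizing by the total sequence length. The function name `dtak` and the linear frame-level kernel are illustrative choices:

```python
import numpy as np

def dtak(X, Y):
    """Sketch of a dynamic time-alignment kernel between two sequences
    of frame vectors X (n x d) and Y (m x d). Dynamic programming
    accumulates frame-wise inner products along the best alignment
    path (weights 1 for horizontal/vertical steps, 2 for diagonal),
    normalized by n + m so identical sequences score 1 for
    orthonormal frames."""
    n, m = X.shape[0], Y.shape[0]
    S = X @ Y.T                           # frame-wise similarities
    G = np.full((n + 1, m + 1), -np.inf)  # accumulated alignment score
    G[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = S[i - 1, j - 1]
            G[i, j] = max(G[i - 1, j] + s,      # insertion step
                          G[i, j - 1] + s,      # deletion step
                          G[i - 1, j - 1] + 2 * s)  # diagonal match
    return G[n, m] / (n + m)
```

Unlike a fixed-dimension kernel, this lets an SVM compare variable-length speech patterns directly, which is the point of the DTAK construction.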
Journal ArticleDOI

A Multipitch Analyzer Based on Harmonic Temporal Structured Clustering

TL;DR: A multipitch analyzer called the harmonic temporal structured clustering (HTC) method is proposed; it jointly estimates the pitch, intensity, onset, duration, etc., of each underlying source in a multipitch audio signal.
Proceedings ArticleDOI

Complex NMF: A new sparse representation for acoustic signals

TL;DR: A new sparse representation for acoustic signals is presented, based on a mixing model defined in the complex-spectrum domain (where additivity holds). It extracts recurrent patterns of magnitude spectra that underlie observed complex spectra, together with phase estimates of the constituent signals.
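Complex NMF extends standard nonnegative matrix factorization to the complex-spectrum domain. For orientation, here is the magnitude-domain baseline it builds on, plain NMF with Lee & Seung multiplicative updates for the Euclidean cost; this is not Complex NMF itself, and the function name `nmf` is illustrative:

```python
import numpy as np

def nmf(V, k, n_iter=500, seed=0):
    """Baseline magnitude-domain NMF: factor a nonnegative spectrogram
    V (freq x time) into spectral templates W (freq x k) and
    activations H (k x time) by minimizing ||V - WH||^2 with
    Lee & Seung multiplicative updates, which keep W and H
    nonnegative throughout."""
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, k)) + 1e-3
    H = rng.random((k, T)) + 1e-3
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-12)   # update activations
        W *= (V @ H.T) / (W @ H @ H.T + 1e-12)   # update templates
    return W, H
```

The limitation that motivates the paper: magnitudes of mixed signals are not additive, whereas complex spectra are, so moving the mixing model into the complex domain removes the approximation made above.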
Proceedings ArticleDOI

A successive state splitting algorithm for efficient allophone modeling

TL;DR: The authors propose the successive state splitting (SSS) algorithm, which simultaneously finds an optimal set of phoneme context classes, an optimal topology, and optimal parameters for hidden Markov models (HMMs) under a single maximum likelihood criterion.
Proceedings Article

Separation of a monaural audio signal into harmonic/percussive components by complementary diffusion on spectrogram

TL;DR: A simple and fast method is proposed for separating a monaural audio signal into harmonic and percussive components, which is useful for multipitch analysis, automatic music transcription, drum detection, music modification, and so on.
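The paper performs the separation by complementary diffusion on the spectrogram. A closely related and widely used variant (FitzGerald's median-filtering HPSS, not the paper's method) captures the same intuition: harmonic energy is smooth along time, percussive energy is smooth along frequency. A minimal sketch, with `hpss_masks` and the kernel size as illustrative choices:

```python
import numpy as np
from scipy.ndimage import median_filter

def hpss_masks(power_spec, kernel=17):
    """Split a power spectrogram (freq x time) into harmonic and
    percussive parts. Median-filtering along time preserves horizontal
    ridges (sustained tones); filtering along frequency preserves
    vertical spikes (percussive hits). Soft masks are built from the
    two smoothed estimates."""
    H = median_filter(power_spec, size=(1, kernel))  # smooth across time
    P = median_filter(power_spec, size=(kernel, 1))  # smooth across frequency
    eps = 1e-10
    mask_h = H / (H + P + eps)
    mask_p = P / (H + P + eps)
    return power_spec * mask_h, power_spec * mask_p
```

Applied to a spectrogram with a sustained tone (a horizontal line) and a drum hit (a vertical line), the first output keeps the tone and the second keeps the hit.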