Author

Abolghasem Sayadiyan

Bio: Abolghasem Sayadiyan is an academic researcher from Amirkabir University of Technology. The author has contributed to research in topics including speech coding and speech processing. The author has an h-index of 11 and has co-authored 45 publications receiving 405 citations.

Papers
Journal ArticleDOI
TL;DR: A new technique for separating two speech signals from a single recording is presented; it effectively adds vocal-tract-related filter characteristics as a new cue to CASA models through a new grouping technique based on underdetermined blind source separation.
Abstract: We present a new technique for separating two speech signals from a single recording. The proposed method bridges the gap between underdetermined blind source separation techniques and techniques that model the human auditory system, that is, computational auditory scene analysis (CASA). For this purpose, we decompose the speech signal into the excitation signal and the vocal-tract-related filter and then estimate these components from the mixed speech using a hybrid model. We first express the probability density function (PDF) of the mixed speech's log spectral vectors in terms of the PDFs of the underlying speech signals' vocal-tract-related filters. Then, the mean vectors of the PDFs of the vocal-tract-related filters are obtained using a maximum likelihood estimator given the mixed signal. Finally, the estimated vocal-tract-related filters, along with the extracted fundamental frequencies, are used to reconstruct estimates of the individual speech signals. The proposed technique effectively adds vocal-tract-related filter characteristics as a new cue to CASA models through a new grouping technique based on underdetermined blind source separation. We compare our model with both an underdetermined blind source separation method and a CASA method. The experimental results show that our model outperforms both techniques in terms of SNR improvement and the percentage of crosstalk suppression.
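To make the estimation step concrete, below is a minimal sketch (ours, not the authors' implementation) of a frame-wise maximum-likelihood codebook search: each speaker's vocal-tract log spectrum is assumed to be modelled by a codebook of Gaussian mean vectors, and the mixture is assumed to follow the MIXMAX approximation (elementwise maximum of the source log spectra). All names and sizes are illustrative.

```python
# Minimal sketch of an ML codebook-pair search under the MIXMAX
# approximation; illustrative only, not the paper's implementation.
import numpy as np

def ml_codebook_pair(y, codebook_a, codebook_b):
    """Pick the pair of vocal-tract mean vectors (one per speaker) that
    best explains the mixed log spectral vector y.

    y          : (F,) mixed log spectrum for one frame
    codebook_a : (Ka, F) candidate mean vectors for speaker A
    codebook_b : (Kb, F) candidate mean vectors for speaker B
    """
    best, best_err = None, np.inf
    for i, mu_a in enumerate(codebook_a):
        for j, mu_b in enumerate(codebook_b):
            # MIXMAX: the mixture log spectrum is approximated by the
            # elementwise maximum of the two source log spectra.
            pred = np.maximum(mu_a, mu_b)
            # With unit-variance Gaussians, maximizing likelihood reduces
            # to minimizing squared error against the prediction.
            err = np.sum((y - pred) ** 2)
            if err < best_err:
                best, best_err = (i, j), err
    return best
```

In the full method, the selected vocal-tract filters are then combined with the extracted fundamental frequencies to reconstruct each speaker's signal.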

53 citations

Journal ArticleDOI
TL;DR: It is concluded that the mixture-maximisation (MIXMAX) approximation is a nonlinear minimum mean square error estimator under the assumption of uniformly distributed phases for the underlying speech signals.
Abstract: In many speech separation, enhancement, and recognition techniques, it is necessary to express the log spectrum of a mixed speech signal in terms of the log spectra of the underlying speech signals. For this purpose, the mixture-maximisation (MIXMAX) approximation is commonly used. A proof of this approximation is presented in a statistical framework. It is concluded that the approximation is a nonlinear minimum mean square error estimator under the assumption of uniformly distributed phases for the underlying speech signals.
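For reference, the approximation in question can be written as follows (notation ours: y, x_1, and x_2 denote the log spectra of the mixture and of the two underlying signals):

```latex
% MIXMAX: the log spectrum of the mixture is approximated by the
% elementwise maximum of the source log spectra.
\[
  y \;\approx\; \max(x_1, x_2)
\]
% The letter's result: under uniformly distributed phases for the
% underlying signals, this maximum is the nonlinear MMSE estimator,
% i.e. it approximates E[y | x_1, x_2].
```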

41 citations

Journal ArticleDOI
TL;DR: The results show that although model-based separation delivers the best quality in the speaker-dependent case, the integrated model outperforms the individual approaches in a speaker-independent scenario, supporting the idea that the human auditory system draws on both grouping cues and a priori knowledge to segregate speech signals.

25 citations

Proceedings ArticleDOI
09 Feb 2010
TL;DR: A novel algorithm is proposed to design a set of Speech-Like (SL) symbols, leading to a GSM voice-channel data modem that modulates and demodulates data over the GSM Adaptive Multi-Rate (AMR) voice codec, which supports several bit rates.
Abstract: This paper introduces a new method to transmit digital data through the Global System for Mobile communications (GSM) voice channel. A novel algorithm is proposed to design a set of Speech-Like (SL) symbols, which leads to the design of a GSM voice-channel data modem that modulates and demodulates data over the GSM Adaptive Multi-Rate (AMR) voice codec, which supports several bit rates. Designing the set of time-symbols is an offline procedure with the aim of minimizing symbol detection error. This modem is useful for high-priority real-time data communication. The introduced modem encodes data into SL symbols for transmission over the GSM voice channel, and the received SL symbols are decoded back into data.
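As an illustration of the symbol-based idea only (the paper's offline symbol-design optimization is not reproduced here), a minimal modulator/demodulator over a precomputed SL-symbol codebook might look like the sketch below; the codebook, frame length, and equal-energy assumption are all ours.

```python
# Hypothetical sketch of a codebook-based voice-channel modem; the SL
# symbol waveforms themselves would come from the paper's offline design.
import numpy as np

def modulate(bits, symbols):
    """Map each group of k bits to one of 2**k waveform symbols.

    bits    : list of 0/1 values, length a multiple of k
    symbols : (2**k, N) array of SL waveform symbols
    """
    k = int(np.log2(len(symbols)))
    frames = [bits[i:i + k] for i in range(0, len(bits), k)]
    idx = [int("".join(map(str, f)), 2) for f in frames]
    return np.concatenate([symbols[i] for i in idx])

def demodulate(signal, symbols):
    """Nearest-symbol detection frame by frame, by correlation
    (assumes roughly equal-energy symbols and frame synchronization)."""
    n = symbols.shape[1]
    k = int(np.log2(len(symbols)))
    bits = []
    for start in range(0, len(signal) - n + 1, n):
        seg = signal[start:start + n]
        scores = symbols @ seg            # correlation with each codebook entry
        best = int(np.argmax(scores))
        bits.extend(int(b) for b in format(best, f"0{k}b"))
    return bits
```

In practice the received waveform has passed through the AMR codec, so the symbol set must remain distinguishable after coding, which is exactly what the offline design procedure targets.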

21 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: A textbook covering probability distributions, linear models for regression and classification, neural networks, kernel methods, sparse kernel machines, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal ArticleDOI
01 Oct 1980

1,565 citations

Journal ArticleDOI
TL;DR: A convolutional recurrent neural network (CRNN) is proposed for the polyphonic sound event detection task and compared with CNN, RNN, and other established methods; a considerable improvement is observed on four different datasets of everyday sound events.
Abstract: Sound events often occur in unstructured environments where they exhibit wide variations in their frequency content and temporal structure. Convolutional neural networks (CNNs) are able to extract higher-level features that are invariant to local spectral and temporal variations. Recurrent neural networks (RNNs) are powerful in learning the longer-term temporal context in audio signals. CNNs and RNNs as classifiers have recently shown improved performance over established methods in various sound recognition tasks. We combine these two approaches in a convolutional recurrent neural network (CRNN) and apply it to a polyphonic sound event detection task. We compare the performance of the proposed CRNN method with CNN, RNN, and other established methods, and observe a considerable improvement on four different datasets consisting of everyday sound events.
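A minimal sketch of a CRNN in this spirit is shown below (PyTorch; layer sizes and the pooling scheme are illustrative, not the paper's configuration). Pooling only along frequency preserves the frame rate, so the recurrent layer can emit per-frame, multi-label event probabilities.

```python
# Illustrative CRNN for polyphonic sound event detection; sizes are ours.
import torch
import torch.nn as nn

class CRNN(nn.Module):
    def __init__(self, mel_bins=40, n_classes=6, hidden=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 1)),             # pool frequency only, keep time
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 1)),
        )
        feat = 64 * (mel_bins // 4)           # channels x remaining mel bins
        self.gru = nn.GRU(feat, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                     # x: (B, 1, mel_bins, frames)
        h = self.conv(x)                      # (B, 64, mel_bins // 4, frames)
        h = h.permute(0, 3, 1, 2).flatten(2)  # (B, frames, feat)
        h, _ = self.gru(h)                    # longer-term temporal context
        return torch.sigmoid(self.out(h))     # (B, frames, n_classes)
```

Each output is an independent per-class activity probability per frame, which is what makes the detection polyphonic: several events may be active at once.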

432 citations

Journal ArticleDOI
TL;DR: A tandem algorithm is proposed that performs pitch estimation of a target utterance and segregation of voiced portions of target speech jointly and iteratively; it performs substantially better than previous systems for either pitch extraction or voiced speech segregation.
Abstract: A lot of effort has been made in computational auditory scene analysis (CASA) to segregate speech from monaural mixtures. The performance of current CASA systems on voiced speech segregation is limited by the lack of a robust pitch estimation algorithm. We propose a tandem algorithm that performs pitch estimation of a target utterance and segregation of voiced portions of target speech jointly and iteratively. This algorithm first obtains a rough estimate of the target pitch, and then uses this estimate to segregate target speech using harmonicity and temporal continuity. It then improves both pitch estimation and voiced speech segregation iteratively. Novel methods are proposed for performing segregation with a given pitch estimate and pitch determination with given segregation. Systematic evaluation shows that the tandem algorithm extracts a majority of target speech without including much interference, and it performs substantially better than previous systems for either pitch extraction or voiced speech segregation.
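Schematically, the tandem structure is an alternation between the two stages until the pitch contour settles. The sketch below passes the stage implementations in as callables because they are hypothetical stand-ins here, not the published pitch-tracking and time-frequency masking algorithms.

```python
# Schematic of the iterative pitch/segregation loop; the three callables
# are hypothetical placeholders for the paper's actual stages.
import numpy as np

def tandem_separation(mixture, rough_pitch_estimate, segregate_with_pitch,
                      estimate_pitch_from_mask, n_iters=5, tol=1.0):
    pitch = rough_pitch_estimate(mixture)         # initial target-pitch contour
    mask = None
    for _ in range(n_iters):
        # Segregate voiced target speech given the current pitch, using
        # harmonicity and temporal continuity.
        mask = segregate_with_pitch(mixture, pitch)
        # Re-estimate pitch from only the T-F units assigned to the target.
        new_pitch = estimate_pitch_from_mask(mixture, mask)
        if np.max(np.abs(new_pitch - pitch)) < tol:   # contour has settled
            break
        pitch = new_pitch
    return mask, pitch
```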

263 citations