Showing papers by "Goutam Saha" published in 2011

Open Access

Posted Content•

Improving Performance of Speaker Identification System Using Complementary Information Fusion

Md. Sahidullah, Sandipan Chakroborty, Goutam Saha

13 May 2011-arXiv: Sound

TL;DR: A novel feature set extracted from the residual signal of LP modeling is proposed, where vocal cord based decision score is fused with the vocal tract based score and the experimental results on two public databases show that fused mode system outperforms single spectral features.

Abstract: Feature extraction plays an important role as a front-end processing block in the speaker identification (SI) process. Most SI systems use features such as Mel-Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), and Linear Predictive Cepstral Coefficients (LPCC) to represent the speech signal. Their derivations are based on short-term processing of the speech signal, and they try to capture vocal tract information while ignoring the contribution from the vocal cord. Vocal cord cues are equally important in the SI context, as information such as pitch frequency and the phase of the residual signal could convey important speaker-specific attributes and is complementary to the information contained in spectral feature sets. In this paper we propose a novel feature set extracted from the residual signal of LP modeling. Higher-order statistical moments are used here to capture the nonlinear relationship in the residual signal. To exploit this complementarity, the vocal cord based decision score is fused with the vocal tract based score. The experimental results on two public databases show that the fused-mode system outperforms single spectral features.
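
The pipeline described in this abstract (LP analysis, residual extraction, higher-order moments, score fusion) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which moments are used or how scores are fused, so the choice of skewness and kurtosis, the weighted-sum fusion rule, the weight of 0.5, and all function names below are assumptions made for illustration.

```python
import random


def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of one speech frame."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def lp_coefficients(x, order):
    """Levinson-Durbin recursion; x[n] is predicted as sum_k a[k-1] * x[n-k]."""
    r = autocorrelation(x, order)
    if r[0] == 0.0:  # silent frame: nothing to predict
        return [0.0] * order
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def lp_residual(x, a):
    """Prediction error e[n] = x[n] - sum_k a[k-1] * x[n-k], for n >= order."""
    p = len(a)
    return [x[n] - sum(a[k] * x[n - 1 - k] for k in range(p)) for n in range(p, len(x))]


def higher_order_moments(e):
    """Skewness and kurtosis (assumed choice of higher-order moments)."""
    n = len(e)
    mean = sum(e) / n
    m2 = sum((v - mean) ** 2 for v in e) / n
    m3 = sum((v - mean) ** 3 for v in e) / n
    m4 = sum((v - mean) ** 4 for v in e) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def fuse_and_identify(tract_scores, cord_scores, weight=0.5):
    """Weighted-sum score fusion (assumed rule); returns the best speaker."""
    fused = {s: weight * tract_scores[s] + (1.0 - weight) * cord_scores[s]
             for s in tract_scores}
    return max(fused, key=fused.get)


if __name__ == "__main__":
    # Synthetic AR(1) frame standing in for real speech (hypothetical data).
    random.seed(0)
    frame = [0.0]
    for _ in range(1999):
        frame.append(0.9 * frame[-1] + random.gauss(0.0, 1.0))
    coeffs = lp_coefficients(frame, order=2)
    print(coeffs, higher_order_moments(lp_residual(frame, coeffs)))
    print(fuse_and_identify({"A": 0.6, "B": 0.55}, {"A": 0.2, "B": 0.7}))
```

In a real system the two score dictionaries would come from separately trained speaker models, one on spectral features and one on residual features; the fusion weight would typically be tuned on held-out data.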

5 citations

Proceedings Article•

Prediction of dominant genes responsible for Lung Adenocarcinoma using Rough Set Theory

Abhinandan Khan¹, Goutam Saha, Srirupa Dasgupta, Soumya Kanti Datta²•Institutions (2)

Jadavpur University¹, Institut Eurécom²

01 Nov 2011

TL;DR: An efficient approach of predicting the dominant genes responsible for Lung Adenocarcinoma using Rough Set Theory is presented, taking a microarray dataset containing data of diseased, suspected and healthy patients and characterizing them in terms of objects and attributes.

Abstract: This paper presents an efficient approach for predicting the dominant genes responsible for Lung Adenocarcinoma using Rough Set Theory. The work takes a microarray dataset containing data of diseased, suspected and healthy patients and characterizes them in terms of objects and attributes. Using rough set theory, redundant attributes are then determined and eliminated. The core attributes are worked out by analyzing the relationship among the remaining attributes. Johnson's reduction algorithm is then used to extract the underlying important rules from the remaining dataset. The paper reports three sets of rules, one each for diseased, suspected and healthy persons. The dominant genes can be accurately predicted by investigating the genes appearing in the generated Rule Sets. Microarray data obtained from a patient is analyzed in accordance with the generated Rule Sets. If a match is found with any one of the three cases, the patient is diagnosed accordingly.
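
The rough-set steps named in this abstract (dependency on the decision, core attributes, Johnson-style reduction) can be sketched on a toy decision table. The table, the gene names `gene_a` to `gene_c`, the discretised expression levels, and all function names are hypothetical; the paper's actual dataset and rule-generation details are not given in the abstract, and the reduct routine below is only the usual greedy covering heuristic associated with Johnson's algorithm.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical discretised microarray table: one row per patient.
TOY_TABLE = [
    {"gene_a": "high", "gene_b": "low",  "gene_c": "high", "status": "diseased"},
    {"gene_a": "high", "gene_b": "high", "gene_c": "high", "status": "diseased"},
    {"gene_a": "mid",  "gene_b": "low",  "gene_c": "high", "status": "suspected"},
    {"gene_a": "low",  "gene_b": "low",  "gene_c": "high", "status": "healthy"},
    {"gene_a": "low",  "gene_b": "high", "gene_c": "high", "status": "healthy"},
]


def dependency(rows, attrs, decision="status"):
    """Fraction of objects whose equivalence class under attrs has one decision."""
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[a] for a in attrs)].append(row[decision])
    consistent = sum(len(d) for d in classes.values() if len(set(d)) == 1)
    return consistent / len(rows)


def core(rows, attrs, decision="status"):
    """Attributes whose removal lowers the dependency (indispensable ones)."""
    full = dependency(rows, attrs, decision)
    return [a for a in attrs
            if dependency(rows, [b for b in attrs if b != a], decision) < full]


def johnson_reduct(rows, attrs, decision="status"):
    """Greedy cover: repeatedly pick the attribute that separates most pairs."""
    entries = []
    for r1, r2 in combinations(rows, 2):
        if r1[decision] != r2[decision]:
            diff = {a for a in attrs if r1[a] != r2[a]}
            if diff:  # skip inconsistent pairs that no attribute can separate
                entries.append(diff)
    reduct = []
    while entries:
        best = max(sorted(attrs), key=lambda a: sum(a in e for e in entries))
        reduct.append(best)
        entries = [e for e in entries if best not in e]
    return reduct


if __name__ == "__main__":
    genes = ["gene_a", "gene_b", "gene_c"]
    print("dependency:", dependency(TOY_TABLE, genes))
    print("core:", core(TOY_TABLE, genes))
    print("reduct:", johnson_reduct(TOY_TABLE, genes))
```

In this toy table `gene_c` is constant and `gene_b` does not help separate the classes, so both are redundant and only `gene_a` survives; decision rules would then be read off the reduced table per class.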

3 citations

Proceedings Article•

A distance metric based outliers detection for robust Automatic Speaker Recognition applications

Israj Ali¹, Goutam Saha¹•Institutions (1)

Indian Institute of Technology Kharagpur¹

01 Dec 2011

TL;DR: This paper investigates outliers in the testing phase using three different distance measures on two databases, YOHO (microphone speech) and POLYCOST (telephone speech); the results show that distance metric based outlier removal can remove a maximum of 29.43% of outliers in YOHO and 22.86% in POLYCOST.

Abstract: Outlier detection in the Automatic Speaker Recognition (ASR) context is the task of detecting those points in feature space which are less representative of a speaker. The existence of outliers is related to the handset, noise, or the speaker's non-intrinsic characteristics, so detecting and removing outliers is useful for robust speaker recognition. The detection can be done in the training phase, in the testing phase, or in both. In this paper, we investigate outliers in the testing phase using three different distance measures on two databases: YOHO (microphone speech) and POLYCOST (telephone speech). The experiment is conducted on Mel-Frequency Cepstral Coefficients (MFCC) features with a Gaussian Mixture Model (GMM) based speaker model. The results show that distance metric based outlier removal can remove a maximum of 29.43% of outliers in YOHO and 22.86% in POLYCOST, while the accuracy improves or remains the same as the baseline depending on the distance metric used.
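
A minimal sketch of distance-based outlier removal on test-phase feature frames follows. The abstract does not name the three distance measures, the reference point, or the rejection rule, so the Euclidean, city-block and Chebyshev metrics, the use of the frame centroid as reference, the `mean + alpha * std` threshold, and all names below are assumptions for illustration only.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def city_block(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


def chebyshev(u, v):
    return max(abs(a - b) for a, b in zip(u, v))


def remove_outliers(frames, metric=euclidean, alpha=1.0):
    """Drop frames whose distance to the centroid exceeds mean + alpha * std.

    Returns the kept frames and the percentage of frames removed.
    """
    dim = len(frames[0])
    centroid = [sum(f[d] for f in frames) / len(frames) for d in range(dim)]
    dists = [metric(f, centroid) for f in frames]
    mean = sum(dists) / len(dists)
    std = math.sqrt(sum((d - mean) ** 2 for d in dists) / len(dists))
    threshold = mean + alpha * std
    kept = [f for f, d in zip(frames, dists) if d <= threshold]
    removed_pct = 100.0 * (len(frames) - len(kept)) / len(frames)
    return kept, removed_pct


if __name__ == "__main__":
    # Hypothetical 2-D "feature frames"; real MFCC frames have more dimensions.
    frames = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [10.0, 10.0]]
    for metric in (euclidean, city_block, chebyshev):
        kept, pct = remove_outliers(frames, metric)
        print(metric.__name__, len(kept), pct)
```

The kept frames would then be scored against each speaker's GMM as usual; comparing identification accuracy with and without the removal step, per metric, mirrors the kind of comparison the abstract reports.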

2 citations

Posted Content•

In Search of Autocorrelation Based Vocal Cord Cues for Speaker Identification

Md. Sahidullah¹, Goutam Saha•Institutions (1)

Indian Institute of Technology Kharagpur¹

11 May 2011-arXiv: Human-Computer Interaction

TL;DR: This paper uses Gaussian mixture model (GMM) based speaker modeling, and results on two public databases validate the proposition that fusing these two sources of information in representing speaker characteristics yields better speaker identification accuracy.

Abstract: In this paper we investigate a technique for deriving vocal source based features from the LP residual of the speech signal for automatic speaker identification. Autocorrelation at some specific lags is computed for the residual signal to derive these features. Compared to traditional features like MFCC and PLPCC, which represent vocal tract information, these features represent complementary vocal cord information. Our experiment in fusing these two sources of information in representing speaker characteristics yields better speaker identification accuracy. We have used Gaussian mixture model (GMM) based speaker modeling, and results are shown on two public databases to validate our proposition.
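
The idea of taking autocorrelation values of the residual at specific lags as features can be sketched as below. The abstract does not state which lags are used or how the values are normalised, so the normalisation by the zero-lag energy, the example lags, the function name, and the synthetic impulse train standing in for an LP residual are all assumptions.

```python
def autocorr_features(residual, lags):
    """Normalised autocorrelation r[k] / r[0] of a residual frame at given lags."""
    n = len(residual)
    energy = sum(v * v for v in residual)
    if energy == 0:  # silent frame: no periodicity information
        return [0.0 for _ in lags]
    return [sum(residual[i] * residual[i + k] for i in range(n - k)) / energy
            for k in lags]


if __name__ == "__main__":
    # Synthetic impulse train with a period of 8 samples (hypothetical data),
    # mimicking the pitch-synchronous peaks seen in a voiced LP residual.
    residual = [1 if i % 8 == 0 else 0 for i in range(64)]
    print(autocorr_features(residual, lags=[4, 8]))
```

For this periodic input the value at the pitch lag is large while the value at half that lag is zero, which is the kind of source-related cue such features are meant to capture; in practice the residual would come from inverse filtering each speech frame with its LP coefficients.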

1 citation