
Showing papers by "Goutam Saha" published in 2015


Journal ArticleDOI
01 May 2015
TL;DR: All speaker identification (SID) and speaker verification (SV) systems are slightly more robust to voices converted through cross-gender conversion than through intra-gender conversion, and the results suggest a way to quantify an objective voice conversion score that can be related to the ability to spoof an SV system.
Highlights: Evaluation of the robustness of SID and SV systems against VC spoofing attacks. The VC techniques, in decreasing order of the vulnerability they cause, are GMM, WFW and WFW-. In SV systems, GMM-SVM is more resilient than GMM-UBM to VC impostor attacks. All systems are more robust to cross-gender than to intra-gender converted voices. An approach for relating the VC score to SV performance is proposed.
Abstract: Voice conversion (VC), which morphs the voice of a source speaker so that it is perceived as spoken by a specified target speaker, can be used intentionally to deceive speaker identification (SID) and speaker verification (SV) systems that rely on speech biometrics. Voice conversion spoofing attacks that imitate a particular speaker pose a potential threat to such systems. In this paper, we first present an experimental study to evaluate the robustness of these systems against voice conversion disguise. We use Gaussian mixture model (GMM) based SID systems, GMM with universal background model (GMM-UBM) based SV systems and GMM supervector with support vector machine (GMM-SVM) based SV systems. Voice conversion is performed with three different techniques: a GMM based VC technique, a weighted frequency warping (WFW) based conversion method, and a variation of it in which energy correction is disabled (WFW-). Evaluation uses intra-gender and cross-gender voice conversions between fifty male and fifty female speakers taken from the TIMIT database. The result is reported as the degradation in the percentage of correct identification (POC) score for SID systems and the degradation in equal error rate (EER) for all SV systems. Experimental results show that the GMM-SVM SV systems are more resilient against voice conversion spoofing attacks than the GMM-UBM SV systems, and that all SID and SV systems are more vulnerable to GMM based conversion than to WFW or WFW- based conversion.
From the results, it can also be said that, in general terms, all SID and SV systems are slightly more robust to voices converted through cross-gender conversion than through intra-gender conversion. The work also extends the study to examine the relationship between the VC objective score and SV system performance on the CMU ARCTIC database, which is a parallel corpus. The results of this experiment suggest an approach for quantifying an objective voice conversion score that can be related to the ability to spoof an SV system.
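
The abstract reports SV degradation as a change in equal error rate (EER). As a rough illustration of that metric only, the sketch below computes an EER from two lists of verification scores. The function name `equal_error_rate` and all score values are hypothetical and are not taken from the paper; the threshold sweep is one simple way to estimate EER, not necessarily the procedure the authors used.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER and its threshold from verification scores.

    A higher score means "more likely the claimed speaker". The sweep tries
    every observed score as a threshold and keeps the one where the false
    acceptance rate (FAR) and false rejection rate (FRR) are closest.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # impostors accepted
        frr = sum(s < t for s in genuine) / len(genuine)     # targets rejected
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Made-up scores: converted (spoofed) voices are assumed to score higher
# than ordinary impostors, which pushes the EER up.
genuine_scores = [0.9, 0.8, 0.7, 0.6]
baseline_impostors = [0.65, 0.4, 0.3, 0.2]
converted_impostors = [0.85, 0.75, 0.3, 0.2]

baseline_eer, _ = equal_error_rate(genuine_scores, baseline_impostors)
spoofed_eer, _ = equal_error_rate(genuine_scores, converted_impostors)
print(baseline_eer, spoofed_eer)
```

A larger gap between the baseline and spoofed EER corresponds to a system that is more vulnerable to the conversion technique being tested.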

31 citations


Journal ArticleDOI
TL;DR: A technique to measure similarity among Indian languages using a language verification framework, with the expectation that languages belonging to the same family manifest their similarity in the experimental results.
Abstract: The majority of Indian languages originate from two language families, namely Indo-European and Dravidian. Some similarity among languages of a particular family can therefore be expected. Languages spoken in neighboring regions also show some similarity, since the populations of neighboring regions intermingle considerably. This paper develops a technique to measure similarity among Indian languages in a novel way, using a language verification framework. Four verification systems are designed for each language. Acceptance of one language as another, which corresponds to false acceptance in the language verification framework, is used as a measure of similarity. If language A shows false acceptance above a predefined threshold with language B in at least three of the four systems, then languages A and B are considered similar in this work. Languages belonging to the same family are expected to manifest their similarity in the experimental results, and similarity between neighboring languages should also be detected. Any deviation from these expectations should be due to specific linguistic or historical reasons, and this work analyzes such cases.
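
The decision rule described in the abstract (false acceptance above a threshold in at least three of four systems) can be sketched as follows. The function name `are_similar`, the threshold value, and the rates are hypothetical; the abstract does not state the actual threshold.

```python
def are_similar(false_accept_rates, threshold, min_systems=3):
    """Apply the "at least three of four systems" rule.

    false_accept_rates: one false-acceptance rate per verification system,
    each measuring how often language B is accepted as language A.
    """
    votes = sum(rate > threshold for rate in false_accept_rates)
    return votes >= min_systems


# Illustrative numbers only: three of four systems exceed the threshold.
print(are_similar([0.30, 0.25, 0.10, 0.22], threshold=0.2))
# Only two systems exceed the threshold, so the pair is not called similar.
print(are_similar([0.30, 0.05, 0.10, 0.22], threshold=0.2))
```

Requiring agreement across several systems makes the similarity label less sensitive to the quirks of any single verification system.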

18 citations


Journal ArticleDOI
TL;DR: Auscultation is an important part of the clinical examination of different lung diseases, and objective analysis of lung sounds with subsequent automatic interpretation may help clinical practice.
Abstract: Background and objective: Auscultation is an important part of the clinical examination of different lung diseases. Objective analysis of lung sounds based on their underlying characteristics, followed by automatic interpretation, may help clinical practice. Methods: We collected breath sounds from 8 normal subjects and 20 diffuse parenchymal lung disease (DPLD) patients using a newly developed instrument and then filtered out the heart sounds using a novel technology. The collected sounds were then analysed digitally on several characteristics, such as dynamical complexity, texture information and regularity index, to find and define unique digital signatures for differentiating normality from abnormality. For convenience of testing, these characteristic signatures of normal and DPLD lung sounds were transformed into coloured visual representations. The predictive power of these images was validated by six independent observers, including three physicians. Results: The proposed method gives a classification accuracy of 100% with composite features for both normal lung sound signals and those from DPLD patients. When tested by independent observers on the visually transformed images, the positive predictive value for diagnosing normality and DPLD remained 100%. Conclusions: The lung sounds from normal and DPLD subjects could be differentiated and expressed according to their digital signatures. On visual transformation to coloured images, they retain 100% predictive power. This technique may assist physicians in diagnosing DPLD from visual images bearing the digital signature of the condition.

7 citations


Journal ArticleDOI
21 Apr 2015
TL;DR: A framework is developed that selects good-quality, artifact-free heart sound subsequences and reuses the features involved in this processing for segmentation, to assist physicians in the objective analysis of heart sounds recorded on a computer.
Abstract: Purpose: Digital recording of heart sounds, commonly known as the phonocardiogram (PCG) signal, is a convenient primary diagnostic tool for analyzing the condition of the heart. The phonocardiogram helps physicians visualize the acoustic energies that result from the mechanical aspect of cardiac activity. PCG signal cycle segmentation is an essential processing step towards heart sound signal analysis. Sound artifacts due to inappropriate placement of the stethoscope, body movement, cough, etc. make segmentation difficult. Artifact-free segmented heart sound cycles are convenient for physicians to interpret and are also useful for computerized automated classification of abnormality. Methods: We have developed a framework that selects good-quality, artifact-free heart sound subsequences and reuses the features involved in this processing for segmentation. In this work, we use the information contained in frequency subbands by decomposing the signal with the Discrete Wavelet Packet Transform (DWPT). The algorithm identifies the parts of the signal where artifacts are prominent and also detects major events in heart sound cycles. Results: The algorithm shows good results when tested on normal and five commonly occurring pathological heart sound signals. An average accuracy of 93.71% is registered for the artifact-free subsequence selection process. The cycle segmentation algorithm gives accuracies of 98.36%, 98.18% and 93.97%, respectively, for the three databases used in the experiment. Conclusions: The work provides a solution for artifact-free segmentation of heart sound cycles to assist physicians in the objective analysis of heart sounds recorded on a computer. It is also useful for the development of an automated decision support system for heart sound abnormality.
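
To make the idea of "information contained in frequency subbands" concrete, here is a minimal sketch of a wavelet packet decomposition that returns the energy in each leaf subband. It assumes the Haar wavelet and a signal length divisible by 2**levels; the abstract does not say which wavelet, depth, or features the authors used, and the function names are hypothetical.

```python
import math


def haar_step(x):
    """One orthonormal Haar analysis step: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet(x, levels):
    """Full packet tree: unlike the plain DWT, every node is split again.

    Leaves are returned in natural (filter-bank) order, which is not the
    same as increasing frequency order.
    """
    nodes = [list(x)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    return nodes


def subband_energies(x, levels):
    """Sum of squared coefficients in each leaf subband."""
    return [sum(c * c for c in band) for band in wavelet_packet(x, levels)]


# A rapidly alternating toy signal concentrates its energy in one subband.
print(subband_energies([1, -1] * 4, levels=2))
```

Because the Haar step is orthonormal, the subband energies add up to the energy of the input, so an unusually large share in one band is a candidate cue for an artifact or a heart sound event.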

6 citations