
Showing papers by "Santanu Phadikar" published in 2017


Journal ArticleDOI
TL;DR: An auction game model is proposed that analyzes the complex decision making process and efficiently allocates an idle channel to a pair of RT and NRT secondary users from a pool of users.

25 citations



Book ChapterDOI
01 Jan 2017
TL;DR: In this paper, a Bangla phoneme recognition system was proposed for the development of a fully functional automated speech recognition system (ASR) for Bangla, which achieved an accuracy of 98.35% using Mel Scale Cepstral Coefficient (MFCC).
Abstract: Speech Recognition is a challenging task especially for a multilingual country like India as the speakers are habituated in using mixed language and accent. Bangla is a very popular language in East Asia and a fully functional Automated Speech Recognition System (ASR) for it is yet to be developed. Every language embodies a set of sounds called phoneme set, which is the building block for the words of that language. READ (Record Extract Approximate Distinguish) is a Bangla phoneme recognition system, proposed toward the development of a Bangla ASR. To start with, Mel Scale Cepstral Coefficient (MFCC) features have been used for testing on a database of 1400 Bangla vowel phonemes and an accuracy of 98.35% has been obtained.

12 citations


Proceedings ArticleDOI
01 Jul 2017
TL;DR: RECAL (Record Extract Classify According to Language) is a system, aimed towards identification of languages from multilingual voice signals, which has an accuracy of 98.39% considering the similarity between Bangla and Hindi numerals and avoidance of noise gating to simulate real world environment.
Abstract: Since the inception of IT, one of the primary concerns has been to build devices with easy interactivity. Speech can be considered as one of the most preferred and easiest modes of interaction. Speech Recognition is the technique of automatically identifying spoken words from voice signals. Due to the multilingual nature of our country, we are habituated in using a mixture of languages in the course of verbal interaction and so, prior to recognizing speech, it is essential to determine the respective languages to which the spoken words belong. RECAL (Record Extract Classify According to Language) is a system, aimed towards identification of languages from multilingual voice signals. To start with, Mel Scale Cepstral Coefficient (MFCC) based features have been used to model languages using 9300 uttered numerals amidst 3 languages (English, Bangla and Hindi). An accuracy of 98.39% has been obtained considering the similarity between Bangla and Hindi numerals and avoidance of noise gating to simulate real world environment.

11 citations


Book ChapterDOI
24 Mar 2017
TL;DR: SMIL (Segregate Musical Instrument by Listening) is a system aimed towards identification of isolated instruments from stereophonic audio that combines Mel Scale Cepstral Coefficient based features coupled with a Multi Layer Perceptron based classifier.
Abstract: The music industry has made remarkable progress over the last few decades from vinyls to digital audio. The field of Music Information Retrieval (MIR) has received attention from researchers across the globe because of its diverse applications, one of them being Automatic Music Transcription (AMT). A music piece generally consists of an array of instruments played simultaneously and prior to transcription it is essential to identify the active regions of the instruments. Identifying instruments in isolation prior to identifying their active regions in a piece is essential. SMIL (Segregate Musical Instrument by Listening) is a system aimed towards identification of isolated instruments from stereophonic audio. Mel Scale Cepstral Coefficient (MFCC) based features coupled with a Multi Layer Perceptron (MLP) based classifier has been used to characterize 2716 clips from 7 different instruments and an average accuracy of 98.38% has been obtained.

7 citations


Book ChapterDOI
01 Jan 2017
TL;DR: This work uses radio frequency identification (RFID) tags for an efficient handoff mechanism that both reduces handoff delay and performs the handoff with minimal information.
Abstract: With the introduction of VANET, inter-vehicular communication and communication between vehicles and network infrastructure have become very convenient. This advancement has brought challenges of its own. The vehicles are mobile, moving at very high speed, whereas the infrastructure is stationary. Handoff therefore plays a critical role in VANET. In our approach, we have used radio frequency identification (RFID) tags for an efficient handoff mechanism with reduced delay. RFID tags are deployed in the chassis of every vehicle, each containing the unique MAC address of that vehicle. RFID scanners deployed on the road scan the RFID tags and send the MAC address to the handoff gateway, which communicates the information to the nearest access point. This makes the access point aware of the incoming vehicle, thereby reducing the delay. With this novel approach, not only is the handoff delay reduced, but the handoff is also performed efficiently with minimal information.
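The message flow described in this abstract (road scanner reads tag, gateway forwards the MAC address, nearest access point learns about the incoming vehicle) can be illustrated with a minimal sketch. All class and method names below are hypothetical, and positions are assumed to be one-dimensional coordinates along the road; the abstract does not specify how "nearest" is determined or what the access point does with the MAC address.

```python
from dataclasses import dataclass, field


@dataclass
class AccessPoint:
    """Hypothetical access point that remembers MAC addresses announced in advance."""
    name: str
    position: float  # assumed 1-D coordinate along the road
    expected: set = field(default_factory=set)

    def pre_register(self, mac):
        self.expected.add(mac)

    def is_expected(self, mac):
        # A known MAC means the vehicle was announced before it arrived.
        return mac in self.expected


@dataclass
class HandoffGateway:
    """Hypothetical gateway that forwards a scanned MAC to the nearest access point."""
    access_points: list

    def on_scan(self, mac, scanner_position):
        nearest = min(
            self.access_points,
            key=lambda ap: abs(ap.position - scanner_position),
        )
        nearest.pre_register(mac)
        return nearest


@dataclass
class RoadScanner:
    """Hypothetical roadside RFID scanner at a fixed position."""
    position: float
    gateway: HandoffGateway

    def read_tag(self, mac):
        # Only the MAC address is sent, matching the "minimal information" claim.
        return self.gateway.on_scan(mac, self.position)
```

For example, with access points at positions 0.0 and 500.0, a scanner at 420.0 that reads a tag would cause only the second access point to pre-register that vehicle.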

4 citations


Book ChapterDOI
01 Jan 2017
TL;DR: The rand index value indicates that Gini entropy-based thresholding is the best choice for hypo and hyperpigmented skin diseases segmentation.
Abstract: Image segmentation is a crucial part of medical imaging technology. Threshold-based image segmentation is very effective for medical images. A good segmentation helps in correct diagnosis. In this paper, entropy-based thresholding is used for automatic segmentation of hypo and hyperpigmented skin disease. Here threshold values are selected based on Shannon and Gini entropy. A comparison study with Otsu and Fuzzy C-Means (FCM) method is carried out based on rand index (RI) to prove efficiency of entropy-based thresholding. The rand index value indicates that Gini entropy-based thresholding is the best choice for hypo and hyperpigmented skin diseases segmentation.
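As a rough illustration of entropy-based threshold selection, the sketch below picks the gray level that maximizes the summed entropy of the two classes it separates, using either Shannon or Gini entropy. The abstract does not state the exact selection criterion, so this maximum-entropy-sum rule (in the style of Kapur's method) is an assumption, as are the function names.

```python
import math


def shannon(probs):
    """Shannon entropy (natural log) of a normalized distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def gini(probs):
    """Gini entropy (impurity) of a normalized distribution."""
    return 1.0 - sum(p * p for p in probs)


def entropy_threshold(hist, entropy):
    """Return the gray level t maximizing entropy(low class) + entropy(high class).

    Levels <= t form one class and levels > t the other. This criterion is an
    assumption for illustration, not necessarily the one used in the paper.
    Ties are resolved in favor of the lowest level.
    """
    total = sum(hist)
    best_t, best_score = None, -math.inf
    for t in range(len(hist) - 1):
        w0 = sum(hist[: t + 1])
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class would be empty
        p0 = [h / w0 for h in hist[: t + 1]]
        p1 = [h / w1 for h in hist[t + 1:]]
        score = entropy(p0) + entropy(p1)
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```

Calling `entropy_threshold(hist, shannon)` or `entropy_threshold(hist, gini)` on a gray-level histogram returns a candidate threshold; on a clearly bimodal histogram both criteria place it in the gap between the two modes.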

2 citations


Book ChapterDOI
24 Mar 2017
TL;DR: The objective of this work is to develop a method to identify isolated Bengali letters (Swarabarna and Banjanbarna) from uttered sound, using a feature termed Mel Frequency Wavelet Transform Coefficient (MFWTC).
Abstract: With the advancement of voice signal processing, speech-to-text recognition has become an important area of research. Though some efforts are found for the English language, for regional languages like Bengali, Hindi, Gujarati, etc., such work is very rare or has not started yet. Thus the objective of this work is to develop a method to identify isolated Bengali letters (Swarabarna and Banjanbarna) from uttered sound. In speech processing, identifying a particular uttered letter consists of two major steps, Speech Feature Extraction and Feature Classification. Studies show that Mel Frequency Cepstral Coefficients (MFCC) give a better representation of the human auditory system, but the performance of MFCC degrades with increased noise, an effect that may be reduced by the Discrete Wavelet Transform (DWT). Thus MFCC combined with DWT is used as a feature termed Mel Frequency Wavelet Transform Coefficient (MFWTC) for this work. For the experiment, a sound database is developed from utterances of 43 Bengali alphabets {11 Swarabarna and 32 Banjanbarna} by ten speakers, 20 times for each letter. These signals are then pre-processed to remove the silent portions at both end points, followed by a pre-emphasis filter. Next, MFCC features are extracted from the preprocessed signals. These features are then fine-tuned by applying DWT to compute MFWTC features. In addition to these features, Zero Crossing Count (ZCC) is used independently for comparison with this method. Finally, these features are used to recognize the Bengali Barnas using different classifiers (BayesNet, NaiveBayes, IB1, LWL, Classification Via Clustering, Dagging, Multi Scheme, VFI, Conjunctive Rule, ZeroR, BFTree and Simple Cart) available in the Weka tools. The classification accuracy is measured using the 10-fold cross validation method, which shows average accuracies of 47.61% and 62.19% for Swarabarna and Banjanbarna respectively.
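Two of the building blocks named in this abstract, Zero Crossing Count and a wavelet transform applied to a feature vector, can be sketched as follows. The abstract does not specify the wavelet family or decomposition level, so the one-level Haar transform here is an assumption chosen for brevity, and the function names are hypothetical. MFCC extraction itself is not shown.

```python
import math


def zero_crossing_count(signal):
    """Count sign changes between consecutive samples (zero counts as non-negative)."""
    return sum(1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0))


def haar_dwt(values):
    """One-level Haar DWT of a sequence; returns (approximation, detail).

    Odd-length input is padded by repeating the last value. The Haar wavelet
    is an assumption; the paper's wavelet choice is not given in the abstract.
    """
    vals = list(values)
    if len(vals) % 2:
        vals.append(vals[-1])
    s = math.sqrt(2.0)
    approx = [(vals[i] + vals[i + 1]) / s for i in range(0, len(vals), 2)]
    detail = [(vals[i] - vals[i + 1]) / s for i in range(0, len(vals), 2)]
    return approx, detail
```

In a pipeline like the one described, `haar_dwt` would be applied to each frame's MFCC vector, with the approximation coefficients kept as the smoothed feature; whether the paper keeps approximation, detail, or both is not stated.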

1 citation