scispace - formally typeset
Author

V. Ramalingam

Bio: V. Ramalingam is an academic researcher from Annamalai University. The author has contributed to research in the topics of feature extraction and support vector machines. The author has an h-index of 16, and has co-authored 42 publications receiving 984 citations.

Papers
Journal ArticleDOI
TL;DR: This paper proposes effective algorithms to automatically classify audio clips into one of six classes (music, news, sports, advertisement, cartoon and movie), applying support vector machines and a radial basis function neural network to the classification of audio.
Abstract: In the age of digital information, audio data has become an important part of many modern computer applications. Audio classification has become a focus in the research of audio processing and pattern recognition. Automatic audio classification is very useful for audio indexing, content-based audio retrieval and on-line audio distribution, but it is a challenge to extract the most common and salient themes from unstructured raw audio data. In this paper, we propose effective algorithms to automatically classify audio clips into one of six classes: music, news, sports, advertisement, cartoon and movie. For these categories a number of acoustic features, including linear predictive coefficients, linear predictive cepstral coefficients and mel-frequency cepstral coefficients, are extracted to characterize the audio content. Support vector machines are applied to classify audio into the respective classes by learning from training data. The proposed method then extends to a radial basis function neural network (RBFNN) for the classification of audio. RBFNN applies a nonlinear transformation followed by a linear transformation to achieve a higher-dimensional hidden space. Experiments on different genres across the various categories illustrate that the classification results are significant and effective.
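The SVM stage of a pipeline like this can be sketched with scikit-learn. The synthetic 13-dimensional "MFCC-like" features below are an illustrative assumption, not the authors' corpus or feature extraction; a real system would compute the cepstral features from audio frames first.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
classes = ["music", "news", "sports", "advertisement", "cartoon", "movie"]

# Synthetic 13-dim feature vectors: one well-separated cluster per class,
# standing in for MFCC/LPC/LPCC vectors extracted from labeled clips.
X = np.vstack([rng.normal(loc=i, scale=0.3, size=(40, 13))
               for i in range(len(classes))])
y = np.repeat(np.arange(len(classes)), 40)

# Train a kernel SVM to separate the six classes in feature space.
clf = SVC(kernel="rbf").fit(X, y)

# Classify a new feature vector by mapping the predicted index to a label.
print(classes[clf.predict(X[:1])[0]])
```

In practice the clip-level decision is usually made by pooling frame-level predictions (e.g. majority vote over all frames of a clip).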

160 citations

Journal ArticleDOI
TL;DR: It is demonstrated that RBFNN outperformed the polynomial kernel of SVM in correctly classifying the tumors; several performance measures were evaluated to convey and compare the qualities of the classifiers.
Abstract: Correct diagnosis is one of the major problems in the medical field, including the limitation of human expertise in diagnosing disease manually. From the literature it has been found that pattern classification techniques such as support vector machines (SVM) and radial basis function neural networks (RBFNN) can help improve this domain. RBFNN and SVM, with their remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. This paper compares the use of the polynomial kernel of SVM and RBFNN in ascertaining the diagnostic accuracy of cytological data obtained from the Wisconsin breast cancer database. The data set includes nine different attributes and two categories of tumors, namely benign and malignant. Known sets of cytologically proven tumor data were used to train the models to categorize cancer patients according to their diagnosis. Performance measures such as accuracy, specificity, sensitivity and F-score, and other metrics used in medical diagnosis such as Youden's index and discriminant power, were evaluated to convey and compare the qualities of the classifiers. This research demonstrates that RBFNN outperformed the polynomial kernel of SVM in correctly classifying the tumors.
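A comparison along these lines can be sketched with scikit-learn. Two assumptions to note: sklearn's bundled breast cancer dataset has 30 features rather than the nine attributes of the original Wisconsin database, and sklearn has no RBF neural network, so an RBF-kernel SVM serves here as a rough stand-in for the paper's RBFNN. The diagnostic metrics match those named in the abstract.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

# In this dataset, label 0 = malignant, 1 = benign.
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

for kernel in ("poly", "rbf"):
    clf = SVC(kernel=kernel).fit(X_tr, y_tr)
    tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    youden = sensitivity + specificity - 1  # Youden's index J
    print(f"{kernel}: sens={sensitivity:.3f} "
          f"spec={specificity:.3f} J={youden:.3f}")
```

Youden's index summarizes sensitivity and specificity in a single number (J = 1 is a perfect test, J = 0 is useless), which is why it is popular for comparing medical classifiers.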

121 citations

Journal ArticleDOI
TL;DR: The development of an automatic breast tissue classification methodology is described, which can be summarized in a number of distinct steps: (1) preprocessing, (2) feature extraction, and (3) classification.

120 citations

Journal ArticleDOI
01 Jan 2011
TL;DR: Effective algorithms to automatically classify audio clips into one of six classes: music, news, sports, advertisement, cartoon and movie are proposed.
Abstract: Today, digital audio applications are part of our everyday lives. Audio classification can provide powerful tools for content management: if an audio clip can be classified automatically, it can be stored in an organised database, which can dramatically improve the management of audio. In this paper, we propose effective algorithms to automatically classify audio clips into one of six classes: music, news, sports, advertisement, cartoon and movie. For these categories a number of acoustic features, including linear predictive coefficients, linear predictive cepstral coefficients and mel-frequency cepstral coefficients, are extracted to characterize the audio content. The autoassociative neural network (AANN) model is used to capture the distribution of the acoustic feature vectors. The AANN model captures the distribution of the acoustic features of a class, and the backpropagation learning algorithm is used to adjust the weights of the network to minimize the mean square error for each feature vector. The proposed method also compares the performance of AANN with a Gaussian mixture model (GMM), wherein the feature vectors from each class are used to train a GMM for that class. During testing, the likelihood of a test sample belonging to each model is computed and the sample is assigned to the class whose model produces the highest likelihood.
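The GMM baseline described at the end of the abstract — one generative model per class, with classification by maximum likelihood — can be sketched as follows. The synthetic three-class feature data is an illustrative assumption standing in for the cepstral feature vectors of the six audio categories.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
classes = ["music", "news", "sports"]

# Synthetic 13-dim "acoustic feature" vectors, one cluster per class.
train = {c: rng.normal(loc=3 * i, scale=0.5, size=(100, 13))
         for i, c in enumerate(classes)}

# Fit one GMM per class to capture that class's feature distribution.
models = {c: GaussianMixture(n_components=2, random_state=0).fit(X)
          for c, X in train.items()}

def classify(x):
    # Assign the sample to the class whose model yields the
    # highest average log-likelihood.
    return max(classes, key=lambda c: models[c].score(x.reshape(1, -1)))

sample = rng.normal(loc=3.0, scale=0.5, size=13)  # drawn near the "news" cluster
print(classify(sample))
```

The AANN approach in the paper follows the same decision rule, except each class model is an autoencoder and the score is the (negated) reconstruction error rather than a mixture likelihood.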

76 citations

Journal ArticleDOI
K. Rajan1, V. Ramalingam1, M. Ganesan1, S. Palanivel1, B. Palaniappan1 
TL;DR: In this paper, the authors propose the application of the vector space model (VSM) and ANN for the classification of Tamil-language documents; experimental results show that the ANN model achieves 93.33% accuracy, better than the 90.33% of the VSM.
Abstract: Automatic text classification based on the vector space model (VSM), artificial neural networks (ANN), K-nearest neighbor (KNN), Naive Bayes (NB) and support vector machines (SVM) has been applied to English-language documents, and has gained popularity among text mining and information retrieval (IR) researchers. This paper proposes the application of VSM and ANN for the classification of Tamil-language documents. Tamil is a morphologically rich classical Dravidian language. The development of the internet has led to an exponential increase in the number of electronic documents, not only in English but also in other regional languages. The automatic classification of Tamil documents has not been explored in detail so far. In this paper, a corpus is used to construct and test the VSM and ANN models. Methods of document representation and of assigning weights that reflect the importance of each term are discussed. In a traditional word-matching based categorization system, the most popular document representation is the VSM; this method needs a high-dimensional space to represent the documents, whereas the ANN classifier requires a smaller number of features. The experimental results show that the ANN model achieves 93.33% accuracy, which is better than the 90.33% of the VSM, on Tamil document classification.

69 citations


Cited by
Journal ArticleDOI
01 Oct 1980

1,565 citations

Journal ArticleDOI
TL;DR: Presents an analysis of speaker diarization performance as reported through the NIST Rich Transcription evaluations on meeting data, and identifies important areas for future research.
Abstract: Speaker diarization is the task of determining “who spoke when?” in an audio or video recording that contains an unknown amount of speech and also an unknown number of speakers. Initially, it was proposed as a research topic related to automatic speech recognition, where speaker diarization serves as an upstream processing step. Over recent years, however, speaker diarization has become an important key technology for many tasks, such as navigation, retrieval, or higher level inference on audio data. Accordingly, many important improvements in accuracy and robustness have been reported in journals and conferences in the area. The application domains, from broadcast news, to lectures and meetings, vary greatly and pose different problems, such as having access to multiple microphones and multimodal information or overlapping speech. The most recent review of existing technology dates back to 2006 and focuses on the broadcast news domain. In this paper, we review the current state-of-the-art, focusing on research developed since 2006 that relates predominantly to speaker diarization for conference meetings. Finally, we present an analysis of speaker diarization performance as reported through the NIST Rich Transcription evaluations on meeting data and identify important areas for future research.

706 citations

Journal ArticleDOI
TL;DR: A new taxonomy of automatic RGB, 3D, thermal and multimodal facial expression analysis is defined, encompassing all steps from face detection to facial expression recognition, and state-of-the-art methods are described and classified accordingly.
Abstract: Facial expressions are an important way through which humans interact socially. Building a system capable of automatically recognizing facial expressions from images and video has been an intense field of study in recent years. Interpreting such expressions remains challenging, and much research is needed about the way they relate to human affect. This paper presents a general overview of automatic RGB, 3D, thermal and multimodal facial expression analysis. We define a new taxonomy for the field, encompassing all steps from face detection to facial expression recognition, and describe and classify the state-of-the-art methods accordingly. We also present the important datasets and the benchmarking of the most influential methods. We conclude with a general discussion about trends, important questions and future lines of research.

357 citations

Posted Content
TL;DR: A preprint version of the facial expression analysis survey above: facial expressions are an important way through which humans interact socially, much research is still needed on the way they relate to human affect, and a new taxonomy of facial expression analysis methods is defined.

340 citations