scispace - formally typeset
Author

Vipul Arora

Bio: Vipul Arora is an academic researcher from Indian Institute of Technology Kanpur. The author has contributed to research in topics: Computer science & Engineering. The author has an h-index of 9, co-authored 30 publications receiving 220 citations. Previous affiliations of Vipul Arora include University of Oxford & Indian Institutes of Technology.

Papers
Journal ArticleDOI
TL;DR: A novel framework which estimates predominant vocal melody in real-time by tracking various sources with the help of harmonic clusters (combs) and then determining the predominant vocal source by using the harmonic strength of the source.
Abstract: Extraction of the predominant melody from musical performances containing various instruments is one of the most challenging tasks in music information retrieval and computational musicology. This paper presents a novel framework that estimates the predominant vocal melody in real time by tracking various sources with the help of harmonic clusters (combs) and then determining the predominant vocal source from the harmonic strength of each source. The novel on-line harmonic-comb tracking approach satisfies both structural and temporal constraints simultaneously. It relies on the strong higher harmonics for robustness against distortion of the first harmonic by low-frequency accompaniments, in contrast to existing methods, which track the pitch values. Predominant vocal-source identification depends on the novel idea of source-dependent filtering of the recognition score, which allows the algorithm to be implemented on-line. The proposed method, although on-line, is shown to significantly outperform our implementation of a state-of-the-art offline method for vocal melody extraction. Evaluations also show a reduction in octave error and the effectiveness of the novel score-filtering technique in enhancing performance.
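The comb idea in the abstract can be illustrated with a minimal sketch. This is a hypothetical simplification, not the paper's algorithm: each candidate f0 is scored by the summed spectral magnitude at its integer harmonic positions, so a candidate whose higher harmonics are strong wins even if its first harmonic is masked by low-frequency accompaniment.

```python
def comb_salience(spectrum, bin_hz, f0, n_harmonics=8):
    """Sum spectral magnitude at integer multiples of f0 (the harmonic comb)."""
    score = 0.0
    for h in range(1, n_harmonics + 1):
        k = round(h * f0 / bin_hz)          # nearest FFT bin for harmonic h
        if k < len(spectrum):
            score += spectrum[k]
    return score

def pick_f0(spectrum, bin_hz, candidates):
    """Return the candidate f0 whose harmonic comb is strongest."""
    return max(candidates, key=lambda f0: comb_salience(spectrum, bin_hz, f0))

# Toy magnitude spectrum with bin width 10 Hz: peaks at 200 Hz and multiples.
spec = [0.0] * 200
for h in (1, 2, 3, 4):
    spec[20 * h] = 1.0
best = pick_f0(spec, 10.0, [150.0, 200.0, 310.0])   # 200.0 wins
```

The paper additionally tracks combs over time under structural and temporal constraints; this sketch shows only the per-frame salience step.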

32 citations

Journal ArticleDOI
23 Sep 2019
TL;DR: In this paper, the authors proposed a method which utilizes the fuzzy common spatial pattern optimized differential phase synchrony representations to inspect electroencephalogram (EEG) synchronization changes from the alert state to the drowsy state.
Abstract: Driver drowsiness is receiving a lot of attention as a major cause of traffic accidents. This paper proposes a method that uses fuzzy common spatial pattern optimized differential phase synchrony representations to inspect electroencephalogram (EEG) synchronization changes from the alert state to the drowsy state. EEG-based reaction time prediction and drowsiness detection are formulated as primary and ancillary problems in a multi-task learning setting. Statistical analysis suggests that our method can distinguish between the alert and drowsy states of mind. The proposed Multi-Task DeepNet (MTDNN) outperforms baseline regression schemes such as support vector regression (SVR), least absolute shrinkage and selection operator (LASSO), ridge regression, K-nearest neighbors, and the adaptive neuro fuzzy inference scheme (ANFIS) in terms of root mean squared error (RMSE), mean absolute percentage error (MAPE), and correlation coefficient (CC). In particular, the best performing multi-task network MTDNN_5 recorded a 15.49% smaller RMSE, a 27.15% smaller MAPE, and a 10.13% larger CC value than SVR.

32 citations

Journal ArticleDOI
01 Apr 2020
TL;DR: Evaluation results substantiate the improvements brought about by the proposed scheme regarding faster convergence and better accuracy, and performance is validated on many small to large scale, synthetic datasets (UCI, LIBSVM datasets).
Abstract: This paper proposes a novel method for training neural networks (NNs). It uses an approach from optimal control theory, namely, Hamilton–Jacobi–Bellman equation, which optimizes system performance along the trajectory. This formulation leads to a closed-form solution for an optimal weight update rule, which has been combined with per-parameter adaptive scheme AdaGrad to further enhance its performance. To evaluate the proposed method, the NNs are trained and tested on two problems related to EEG classification, namely, mental imagery classification (multiclass) and eye state recognition (binary class). In addition, a novel dataset with the name EEG eye state, for benchmarking learning methods, is presented. The convergence proof for the proposed approach is also included, and performance is validated on many small to large scale, synthetic datasets (UCI, LIBSVM datasets). The performance of NNs trained with the proposed scheme is compared with other state-of-the-art approaches. Evaluation results substantiate the improvements brought about by the proposed scheme regarding faster convergence and better accuracy.
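The paper combines its HJB-derived update rule with AdaGrad's per-parameter adaptation. The HJB term is paper-specific, so this sketch shows only the standard AdaGrad component: each weight's step size is shrunk by the running sum of its own squared gradients.

```python
import math

def adagrad_step(w, grad, accum, lr=0.1, eps=1e-8):
    """One AdaGrad update: accum[i] += g^2; w[i] -= lr * g / sqrt(accum[i])."""
    for i, g in enumerate(grad):
        accum[i] += g * g
        w[i] -= lr * g / (math.sqrt(accum[i]) + eps)
    return w, accum

# Minimize f(w) = w0^2 + w1^2 from (1, -1); the gradient is (2*w0, 2*w1).
w, accum = [1.0, -1.0], [0.0, 0.0]
for _ in range(200):
    grad = [2 * w[0], 2 * w[1]]
    w, accum = adagrad_step(w, grad, accum)
# both weights decay toward the minimum at (0, 0)
```

In the paper this per-parameter scaling is applied on top of the closed-form optimal update derived from the Hamilton-Jacobi-Bellman equation, not on the raw gradient as here.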

27 citations

Journal ArticleDOI
TL;DR: The authors' implementation of a phonological feature-based ASR system using deep neural networks as an acoustic model is presented, along with its use for detecting mispronunciations, analysing errors, and rendering corrective feedback.
Abstract: The authors address the question of whether phonological features can be used effectively in an automatic speech recognition (ASR) system for pronunciation training in non-native language (L2) learning. Computer-aided pronunciation training consists of two essential tasks—detecting mispronunciations and providing corrective feedback, usually either on the basis of full words or phonemes. Phonemes, however, can be further disassembled into phonological features, which in turn define groups of phonemes. A phonological feature-based ASR system allows the authors to perform a sub-phonemic analysis at feature level, providing more effective feedback to reach the acoustic goal and perceptual constancy. Furthermore, phonological features provide a structured way of analysing the types of errors a learner makes, and can readily convey which pronunciations need improvement. This paper presents the authors' implementation of such an ASR system using deep neural networks as an acoustic model, and its use for detecting mispronunciations, analysing errors, and rendering corrective feedback.
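The sub-phonemic analysis the abstract describes can be pictured with a toy example (the paper's feature set and DNN acoustic model are far richer than this illustrative table): comparing the feature bundles of the target and the recognized phoneme localizes the error below the phoneme level, which is what makes feature-level feedback possible.

```python
# Toy phoneme-to-feature table; real feature inventories are much larger.
FEATURES = {
    "p": {"voicing": "voiceless", "place": "bilabial", "manner": "stop"},
    "b": {"voicing": "voiced",    "place": "bilabial", "manner": "stop"},
    "t": {"voicing": "voiceless", "place": "alveolar", "manner": "stop"},
}

def feature_diff(target, produced):
    """Return the phonological features on which two phonemes disagree."""
    t, p = FEATURES[target], FEATURES[produced]
    return {k: (t[k], p[k]) for k in t if t[k] != p[k]}

# A learner produces /b/ where /p/ was expected: only voicing differs,
# so feedback can target voicing instead of the whole phoneme.
diff = feature_diff("p", "b")   # {'voicing': ('voiceless', 'voiced')}
```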

27 citations

Book ChapterDOI
18 Feb 2009
TL;DR: The proposed Sanskrit parser is able to create semantic nets for many classes of Sanskrit paragraphs and handles both external and internal sandhi in Sanskrit words.
Abstract: In this paper, we present our work towards building a dependency parser for the Sanskrit language that uses deterministic finite automata (DFA) for morphological analysis and the 'utsarga apavaada' approach for relation analysis. A computational grammar based on the framework of Panini is being developed. A linguistic generalization for the verbal and nominal databases has been made, and declensions are given the form of a DFA. The verbal database for all classes of verbs has been completed for this part. Given a Sanskrit text, the parser identifies the root words and gives the dependency relations based on semantic constraints. The proposed Sanskrit parser is able to create semantic nets for many classes of Sanskrit paragraphs. The parser handles both external and internal sandhi in Sanskrit words.
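Giving declensions "the form of a DFA" can be sketched in miniature. This is a hypothetical toy, not the paper's grammar: a DFA as a transition table accepts stem + case-ending strings for one invented a-stem paradigm, rejecting any string that falls off the table.

```python
def run_dfa(transitions, accept, word):
    """Simulate a DFA given as {(state, char): state}; reject on a missing edge."""
    state = 0
    for ch in word:
        if (state, ch) not in transitions:
            return False
        state = transitions[(state, ch)]
    return state in accept

# Toy paradigm: stem "nara" followed by ending "h" (nom. sg.) or "m" (acc. sg.).
T = {}
for i, ch in enumerate("nara"):
    T[(i, ch)] = i + 1          # states 0..4 spell out the stem
T[(4, "h")] = 5                 # narah -> accept
T[(4, "m")] = 5                 # naram -> accept

ok_nom = run_dfa(T, {5}, "narah")   # True
ok_acc = run_dfa(T, {5}, "naram")   # True
bad    = run_dfa(T, {5}, "narau")   # False: 'u' has no edge from state 4
```

The paper's DFAs cover full Paninian verbal and nominal paradigms and feed a relation analyser; this sketch shows only the accept/reject mechanics.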

27 citations


Cited by
Proceedings Article
22 Aug 1999
TL;DR: The accessibility, usability, and, ultimately, acceptability of Information Society Technologies by anyone, anywhere, at any time, and through any media and device is addressed.
Abstract: ▶ Addresses the accessibility, usability, and, ultimately, acceptability of Information Society Technologies by anyone, anywhere, at any time, and through any media and device. ▶ Focuses on theoretical, methodological, and empirical research, of both technological and non-technological nature. ▶ Features papers that report on theories, methods, tools, empirical results, reviews, case studies, and best-practice examples.

752 citations

01 Jan 1952

748 citations

Journal ArticleDOI
TL;DR: The results show that deep learning methods provide better classification performance than other state-of-the-art approaches and can be applied successfully to BCI systems where the amount of data is large due to daily recording.
Abstract: Objective. Signal classification is an important issue in brain computer interface (BCI) systems. Deep learning approaches have been used successfully in many recent studies to learn features and classify different types of data. However, the number of studies that employ these approaches on BCI applications is very limited. In this study we aim to use deep learning methods to improve the classification performance of EEG motor imagery signals. Approach. In this study we investigate convolutional neural networks (CNN) and stacked autoencoders (SAE) to classify EEG motor imagery signals. A new form of input is introduced to combine time, frequency and location information extracted from the EEG signal, and it is used in a CNN with one 1D convolutional layer and one max-pooling layer. We also propose a new deep network combining CNN and SAE: the features extracted by the CNN are classified through the deep network SAE. Main results. The classification performance obtained by the proposed method on BCI competition IV dataset 2b in terms of kappa value is 0.547. Our approach yields a 9% improvement over the winner algorithm of the competition. Significance. Our results show that deep learning methods provide better classification performance compared to other state-of-the-art approaches. These methods can be applied successfully to BCI systems where the amount of data is large due to daily recording.
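The two layer operations the abstract names can be sketched in plain Python. This is illustrative only: the paper's CNN/SAE architecture and its combined time-frequency-location EEG input are much larger than this one-channel toy.

```python
def conv1d(x, kernel):
    """Valid-mode 1D convolution (unflipped kernel, i.e. cross-correlation as in CNNs)."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]

def max_pool(x, size=2):
    """Non-overlapping max-pooling over windows of `size` samples."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]

# A short "signal" through a difference kernel, then pooling:
# conv1d -> [-1.0, 2.0, -1.0, -2.0, 2.0], max_pool -> [2.0, -1.0]
feat = max_pool(conv1d([1.0, 2.0, 0.0, 1.0, 3.0, 1.0], [1.0, -1.0]))
```

In the proposed network these pooled features would then be passed to the stacked autoencoder for classification.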

659 citations

Journal Article
TL;DR: In this paper, the authors present a short bibliography on AI and the arts, which is presented in four sections: General Arguments, Proposals, and Approaches (31 references), Artificial Intelligence in Music (124 references); Artificial AI in Literature and the Performing Arts (13 references), and Artificial Intelligence and Visual Art (57 references).
Abstract: The title of this technical report says almost everything: this is indeed "a short bibliography on AI and the arts". It is presented in four sections: General Arguments, Proposals, and Approaches (31 references); Artificial Intelligence in Music (124 references); Artificial Intelligence in Literature and the Performing Arts (13 references); and Artificial Intelligence and Visual Art (57 references). About a quarter of these have short abstracts. Creating a bibliography can be a monumental task, and this bibliography should be viewed as a good and useful start, though it is by no means complete. For comparison, consider the 4,585-entry bibliography Computer Applications in Music by Deta Davis (A-R Editions). No direct comparison is intended (or possible), but my point is that many more papers are likely to exist. As a rough check, I looked for several pre-1990 AI and Music articles and books (including my own, of course) in the bibliography. Out of five papers from well-known sources, only one was listed. On the other hand, I discovered a number of papers in this report that were unknown to me, so I am grateful to have a new source of references. In their introduction, the authors acknowledge the need for more references and even offer a cup of coffee in reward for each new one. I will be sending a number of contributions, so the next time anyone is in Vienna, the coffee is on me. I hope the authors will continue to collect abstracts and publish an updated report in the future.

356 citations

Journal ArticleDOI
TL;DR: This article provides a comprehensive review of the state-of-the-art of a complete BCI system and a considerable number of popular BCI applications are reviewed in terms of electrophysiological control signals, feature extraction, classification algorithms, and performance evaluation metrics.
Abstract: Brain-Computer Interface (BCI), in essence, aims at controlling different assistive devices through the utilization of brain waves. It is worth noting that the application of BCI is not limited to medical applications, and hence, research in this field has gained due attention. Moreover, the significant number of related publications over the past two decades further indicates the consistent improvements and breakthroughs that have been made in this particular field. Nonetheless, it is also worth mentioning that with these improvements, new challenges are constantly discovered. This article provides a comprehensive review of the state-of-the-art of a complete BCI system. First, a brief overview of electroencephalogram (EEG)-based BCI systems is given. Second, a considerable number of popular BCI applications are reviewed in terms of electrophysiological control signals, feature extraction, classification algorithms, and performance evaluation metrics. Finally, the challenges to recent BCI systems are discussed, and possible solutions to mitigate the issues are recommended.

207 citations