Author

Vinayak Abrol

Other affiliations: Idiap Research Institute, University Institute of Engineering and Technology, Panjab University, Indian Institute of Technology Mandi

Bio: Vinayak Abrol is an academic researcher from the University of Oxford. The author has contributed to research on the topics of sparse approximation and speech processing. The author has an h-index of 10 and has co-authored 43 publications receiving 296 citations. Previous affiliations of Vinayak Abrol include Idiap Research Institute and University Institute of Engineering and Technology, Panjab University.

Papers

Journal Article•DOI•

Deep-Sparse-Representation-Based Features for Speech Recognition

Pulkit Sharma¹, Vinayak Abrol¹, Anil Kumar Sao¹•Institutions (1)

Indian Institute of Technology Mandi¹

01 Nov 2017-IEEE Transactions on Audio, Speech, and Language Processing

TL;DR: This paper proposes to use a multilevel decomposition (having multiple layers), also known as the deep sparse representation (DSR), to derive a feature representation for speech recognition, and reveals that the representations obtained at different sparse layers of the proposed DSR model have complementary information.

Abstract: Features derived using sparse representation (SR)-based approaches have been shown to yield promising results for speech recognition tasks. In most of the approaches, the SR corresponding to speech signal is estimated using a dictionary, which could be either exemplar based or learned. However, a single-level decomposition may not be suitable for the speech signal, as it contains complex hierarchical information about various hidden attributes. In this paper, we propose to use a multilevel decomposition (having multiple layers), also known as the deep sparse representation (DSR), to derive a feature representation for speech recognition. Instead of having a series of sparse layers, the proposed framework employs a dense layer between two sparse layers, which helps in efficient implementation. Our studies reveal that the representations obtained at different sparse layers of the proposed DSR model have complementary information. Thus, the final feature representation is derived after concatenating the representations obtained at the sparse layers. This results in a more discriminative representation, and improves the speech recognition performance. Since the concatenation results in a high-dimensional feature, principal component analysis is used to reduce the dimension of the obtained feature. Experimental studies demonstrate that the proposed feature outperforms existing features for various speech recognition tasks.
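
The last two steps described in the abstract (concatenating the representations from the sparse layers, then reducing the dimension with principal component analysis) can be sketched roughly as follows. This is only an illustration and not the authors' implementation: the function name `concat_and_reduce`, the layer sizes, the number of retained components, and the random stand-in codes are all assumptions made for the example.

```python
import numpy as np

def concat_and_reduce(layer_codes, n_components):
    """Concatenate per-layer codes (frames x dims) and project them with PCA."""
    # Stack the per-layer representations along the feature axis.
    features = np.concatenate(layer_codes, axis=1)
    # Remove the mean; PCA is defined on centered data.
    centered = features - features.mean(axis=0, keepdims=True)
    # The rows of vt are the principal directions, ordered by variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
# Stand-in sparse codes for two hypothetical sparse layers (not real speech data).
codes_layer1 = rng.standard_normal((200, 64)) * (rng.random((200, 64)) < 0.1)
codes_layer2 = rng.standard_normal((200, 32)) * (rng.random((200, 32)) < 0.1)
reduced = concat_and_reduce([codes_layer1, codes_layer2], n_components=20)
print(reduced.shape)
```

Here the 96-dimensional concatenated feature is projected onto its 20 leading principal directions; in practice the number of components would be tuned for the recognition task.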

34 citations

Proceedings Article•DOI•

Understanding and Visualizing Raw Waveform-based CNNs

Hannah Muckenhirn¹, Vinayak Abrol², Mathew Magimai-Doss², Sébastien Marcel²•Institutions (2)

Google¹, Idiap Research Institute²

15 Sep 2019

TL;DR: This paper develops a gradient based approach to estimate the relevance of each speech sample input on the output score, and shows that analysis of the resulting “relevance signal” through conventional speech signal processing techniques can reveal the information modeled by the whole network.

Abstract: Modeling directly raw waveforms through neural networks for speech processing is gaining more and more attention. Despite its varied success, a question that remains is: what kind of information are such neural networks capturing or learning for different tasks from the speech signal? Such an insight is not only interesting for advancing those techniques but also for understanding better speech signal characteristics. This paper takes a step in that direction, where we develop a gradient based approach to estimate the relevance of each speech sample input on the output score. We show that analysis of the resulting “relevance signal” through conventional speech signal processing techniques can reveal the information modeled by the whole network. We demonstrate the potential of the proposed approach by analyzing raw waveform CNN-based phone recognition and speaker identification systems.

25 citations

Journal Article•DOI•

Voiced/nonvoiced detection in compressively sensed speech signals

Vinayak Abrol¹, Pulkit Sharma¹, Anil Kumar Sao¹•Institutions (1)

Indian Institute of Technology Mandi¹

01 Sep 2015-Speech Communication

TL;DR: The proposed novel unsupervised voiced/nonvoiced (V/NV) detection method attempts to exploit the fact that there is significant glottal activity during production of voiced speech while the same is not true for nonvoiced speech, and provides compelling evidence of the effectiveness of sparse feature vector for V/NV detection.

20 citations

Journal Article•DOI•

Greedy dictionary learning for kernel sparse representation based classifier

Vinayak Abrol¹, Pulkit Sharma¹, Anil Kumar Sao¹•Institutions (1)

Indian Institute of Technology Mandi¹

15 Jul 2016-Pattern Recognition Letters

TL;DR: Compared to the existing state-of-the-art methods, the proposed method has much lower computational complexity but performs similarly on various pattern classification tasks.

19 citations

Proceedings Article•DOI•

Evaluating performance of Compressed Sensing for speech signals

Vinayak Abrol¹, Pulkit Sharma², Sumit Budhiraja³•Institutions (3)

Panjab University, Chandigarh¹, Indian Institute of Technology Mandi², University Institute of Engineering and Technology, Panjab University³

13 May 2013

TL;DR: This work shows a comparative analysis of different sparse bases and measurement matrices which can be used in speech/audio processing and gives a detailed analysis of the performance bounds, compression ratios, reconstruction errors, etc., which should be taken into account when designing CS-based speech applications.

Abstract: Reconstruction of a signal based on the Compressed Sensing (CS) framework relies on knowledge of the sparse basis and measurement matrix used for sensing. While most studies so far focus on the application of CS in fields such as images, radar, and astronomy, we present our work on the application of CS in the field of speech/audio processing. This work shows a comparative analysis of different sparse bases and measurement matrices which can be used in speech/audio processing. Our work gives a detailed analysis of the performance bounds, compression ratios, reconstruction errors, etc., which should be taken into account when designing CS-based speech applications.

17 citations

Cited by

Journal Article•DOI•

A Systematic Review of Compressive Sensing: Concepts, Implementations and Applications

Meenu Rani¹, Sanjay B. Dhok¹, Raghavendra B. Deshmukh¹•Institutions (1)

Visvesvaraya National Institute of Technology¹

17 Jan 2018-IEEE Access

TL;DR: To bridge the gap between theory and practicality of CS, different CS acquisition strategies and reconstruction approaches are elaborated systematically in this paper.

Abstract: Compressive Sensing (CS) is a new sensing modality, which compresses the signal being acquired at the time of sensing. Signals can have sparse or compressible representation either in original domain or in some transform domain. Relying on the sparsity of the signals, CS allows us to sample the signal at a rate much below the Nyquist sampling rate. Also, the varied reconstruction algorithms of CS can faithfully reconstruct the original signal back from fewer compressive measurements. This fact has stimulated research interest toward the use of CS in several fields, such as magnetic resonance imaging, high-speed video acquisition, and ultrawideband communication. This paper reviews the basic theoretical concepts underlying CS. To bridge the gap between theory and practicality of CS, different CS acquisition strategies and reconstruction approaches are elaborated systematically in this paper. The major application areas where CS is currently being used are reviewed here. This paper also highlights some of the challenges and research directions in this field.
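
As a rough illustration of the idea in this abstract (taking far fewer measurements than the signal length, then reconstructing by exploiting sparsity), the sketch below senses a synthetic sparse vector with a random Gaussian matrix and recovers it with orthogonal matching pursuit (OMP). OMP is just one greedy reconstruction algorithm chosen here for brevity; the signal length, number of measurements, sparsity level, and the helper name `omp` are assumptions for the example and are not taken from the paper.

```python
import numpy as np

def omp(A, y, sparsity):
    """Orthogonal matching pursuit: greedily select columns of A to explain y."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(sparsity):
        # Pick the column most correlated with the current residual.
        idx = int(np.argmax(np.abs(A.T @ residual)))
        support.append(idx)
        # Least-squares fit on the selected columns, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(0)
n, m, k = 256, 96, 5   # signal length, measurements, nonzeros (illustrative values)
x_true = np.zeros(n)
# Nonzero amplitudes in [1, 2) with random signs, placed at random positions.
amplitudes = rng.choice([-1.0, 1.0], size=k) * (1.0 + rng.random(k))
x_true[rng.choice(n, size=k, replace=False)] = amplitudes
A = rng.standard_normal((m, n)) / np.sqrt(m)   # random Gaussian sensing matrix
y = A @ x_true                                  # m << n compressive measurements
x_hat = omp(A, y, k)
print(np.linalg.norm(x_hat - x_true))
```

The example is noiseless and the signal is sparse in the original domain; for signals that are sparse only in a transform domain, the sensing matrix would be combined with that transform.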

334 citations

An Archetypal Analysis on

Chong Won Ji

01 Jan 2005

331 citations

Journal Article•DOI•

Automatic acoustic detection of birds through deep learning: the first bird audio detection challenge

Dan Stowell¹, Michael Wood², Hanna Pamuła³, Yannis Stylianou⁴, Hervé Glotin⁵•Institutions (5)

Queen Mary University of London¹, University of Salford², AGH University of Science and Technology³, University of Crete⁴, Aix-Marseille University⁵

11 Mar 2019-Methods in Ecology and Evolution

TL;DR: General‐purpose acoustic bird detection can achieve very high retrieval rates in remote monitoring data with no manual recalibration, and no pre‐training of the detector for the target species or the acoustic conditions in the target environment.

Abstract: Assessing the presence and abundance of birds is important for monitoring specific species as well as overall ecosystem health. Many birds are most readily detected by their sounds, and thus passive acoustic monitoring is highly appropriate. Yet acoustic monitoring is often held back by practical limitations such as the need for manual configuration, reliance on example sound libraries, low accuracy, low robustness, and limited ability to generalise to novel acoustic conditions. Here we report outcomes from a collaborative data challenge. We present new acoustic monitoring datasets, summarise the machine learning techniques proposed by challenge teams, conduct detailed performance evaluation, and discuss how such approaches to detection can be integrated into remote monitoring projects. Multiple methods were able to attain performance of around 88% AUC (area under the ROC curve), much higher performance than previous general‐purpose methods. With modern machine learning including deep learning, general‐purpose acoustic bird detection can achieve very high retrieval rates in remote monitoring data with no manual recalibration, and no pre‐training of the detector for the target species or the acoustic conditions in the target environment.

220 citations

Dissertation•

Sparse and redundant representations for inverse problems and recognition

Vishal M. Patel

01 Jan 2010

TL;DR: This research investigates the combination of domain adaptation, dictionary learning, object recognition, activity recognition, and shape representation in machine learning to solve the challenge of sparse representation in signal/Image processing.

Abstract: Research Interests Security and privacy: Active authentication, biometrics template protection, biometrics recognition. Computer vision: Domain adaptation, dictionary learning, object recognition, activity recognition, shape representation. Machine learning: Dimensionality reduction, clustering, kernel methods, weakly-supervised learning. Signal/Image processing: Sparse representation, compressive sampling, synthetic aperture radar imaging, millimeter wave imaging.

160 citations

Journal Article•DOI•

An Intelligent Parkinson’s Disease Diagnostic System Based on a Chaotic Bacterial Foraging Optimization Enhanced Fuzzy KNN Approach

Zhennao Cai¹, Jianhua Gu¹, Caiyun Wen², Dong Zhao³, Chunyu Huang⁴, Hui Huang⁵, Changfei Tong⁵, Jun Li⁵, Huiling Chen⁵•Institutions (5)

Northwestern Polytechnical University¹, First Affiliated Hospital of Wenzhou Medical University², Changchun Normal University³, Changchun University⁴, Wenzhou University⁵

21 Jun 2018-Computational and Mathematical Methods in Medicine

TL;DR: An enhanced fuzzy k-nearest neighbor (FKNN) method for the early detection of PD based upon vocal measurements was developed, and simulation results indicated the proposed approach outperformed the other five FKNN models based on BFO, particle swarm optimization, Genetic algorithms, fruit fly optimization, and firefly algorithm.

Abstract: Parkinson's disease (PD) is a common neurodegenerative disease, which has attracted more and more attention. Many artificial intelligence methods have been used for the diagnosis of PD. In this study, an enhanced fuzzy k-nearest neighbor (FKNN) method for the early detection of PD based upon vocal measurements was developed. The proposed method, an evolutionary instance-based learning approach termed CBFO-FKNN, was developed by coupling the chaotic bacterial foraging optimization with Gauss mutation (CBFO) approach with FKNN. The integration of the CBFO technique efficiently resolved the parameter tuning issues of the FKNN. The effectiveness of the proposed CBFO-FKNN was rigorously compared to those of the PD datasets in terms of classification accuracy, sensitivity, specificity, and AUC (area under the receiver operating characteristic curve). The simulation results indicated the proposed approach outperformed the other five FKNN models based on BFO, particle swarm optimization, Genetic algorithms, fruit fly optimization, and firefly algorithm, as well as three advanced machine learning methods including support vector machine (SVM), SVM with local learning-based feature selection, and kernel extreme learning machine in a 10-fold cross-validation scheme. The method presented in this paper has a very good prospect, which will bring great convenience to the clinicians to make a better decision in the clinical diagnosis.

97 citations
