Author

Vivek Bhardwaj

Bio: Vivek Bhardwaj is an academic researcher from University Institute of Engineering and Technology, Panjab University. The author has contributed to research in topics: Feature extraction & Mel-frequency cepstrum. The author has an h-index of 2 and has co-authored 8 publications receiving 15 citations.

Papers
Journal ArticleDOI
TL;DR: After enhancing the pitch using cepstral analysis in the feature extraction process, the recognition rate of the children's speech recognition system on datasets from different age groups increases compared with the normal acoustic features extracted using the Mel Frequency Cepstral Coefficient (MFCC) feature extraction process.

14 citations

Proceedings ArticleDOI
15 Jul 2020
TL;DR: A remarkable reduction in the word error rate (WER) was observed using SGMM by varying the feature dimensions, and SGMM achieved a large performance margin in Punjabi children's speech recognition.
Abstract: In this paper, a Punjabi children's speech recognition system is developed using the Subspace Gaussian mixture model (SGMM) acoustic modeling technique. Initially, the system depends on the Mel-frequency cepstral coefficients (MFCC) approach for controlling the temporal variations in the input speech signals. Here, SGMM is integrated with HMM to measure the efficiency of each state, which carries the information of a short-windowed frame. To handle the acoustic variations of child speakers, speaker adaptive training (SAT) based on vocal-tract length normalization and feature-space maximum likelihood linear regression is adopted. Kaldi, an open-source speech recognition toolkit, is used to develop the robust Automatic Speech Recognition (ASR) system for Punjabi children's speech. SGMM accumulates the frame coefficients and their posterior probabilities and passes these probabilities to HMM, which systematically fits the frames, and the output results from the HMM states. As a result, SGMM achieves a large performance margin in Punjabi children's speech recognition. A remarkable reduction in the word error rate (WER) was observed using SGMM when varying the feature dimensions. The developed children's ASR system obtained a recognition accuracy of 83.66% when tested with the feature dimension set to 12.
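The word error rate (WER) referred to in this abstract is conventionally the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring pipeline used in the paper; the function name `word_error_rate` is hypothetical, and whitespace tokenization is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return word-level edit distance divided by the reference length.

    Assumption: words are separated by whitespace; no text normalization
    is applied (illustrative only, not the paper's scoring setup).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions are counted, WER computed this way can exceed 1.0 when the hypothesis is much longer than the reference.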

8 citations

Proceedings ArticleDOI
30 Oct 2020
TL;DR: A Punjabi children's speech corpus has been collected and several experiments have been performed using the DNN modeling technique; experimental results have revealed that the system has attained 87% accuracy.
Abstract: Despite the number of Automatic Speech Recognition (ASR) systems developed for different languages, no work has yet been done on children's speech in the Punjabi language. Because no children's speech corpus is available for Punjabi, collecting speech data is a challenging task. In our current work, efforts have been made to collect a Punjabi children's speech corpus and build a children's ASR system for the Indian regional language Punjabi. The recognition rate of ASR systems has been observed to improve drastically with the emergence of Deep Neural Networks (DNN). In our work, the DNN acoustic model has been implemented with a varying number of hidden layers. Approximately four hours of Punjabi children's speech have been collected and several experiments have been performed using the DNN modeling technique. Experimental results have revealed that the system has attained 87% accuracy.

8 citations

Book ChapterDOI
01 Jan 2021
TL;DR: Experimental results demonstrate that the feature-space discriminative approaches have achieved a significant reduction in the Word Error Rate (WER), and it is shown that fbMMI achieves better performance than bMMI and fMMI.
Abstract: It is very difficult to recognize children’s speech with Automatic Speech Recognition (ASR) systems built using adult speech. In such ASR tasks, significantly degraded recognition performance is observed, as noted by several earlier studies. This is primarily related to the significant mismatch between the two groups of speakers in auditory and linguistic attributes. One of the numerous causes of this mismatch is that the vocal organs of adult and child speakers have substantially different dimensions. Discriminative approaches are noted for dealing extensively with the effects emerging from these differences. The boosting parameters and iteration counts have been varied to find the optimum settings for the boosted maximum mutual information (bMMI) and feature-space bMMI (fbMMI) acoustic models. Experimental results demonstrate that the feature-space discriminative approaches have achieved a significant reduction in the Word Error Rate (WER). It is also shown that fbMMI achieves better performance than bMMI and fMMI. Recognition of the speech of children and the elderly will need even more study if the features of these age groups are to be examined in existing and future speech recognition systems.

5 citations


Cited by
Proceedings ArticleDOI
13 May 2021
TL;DR: In this article, a rice disease detection (RDD) system for rice hispa disease was implemented using real-time rice plant images collected from rice fields of Punjab and a CNN-based deep learning model trained on them.
Abstract: The current work focuses on implementing a rice disease detection (RDD) system for rice hispa disease using real-time rice plant images collected from rice fields of Punjab, trained on a CNN-based deep learning model. The dataset is first preprocessed using a Matlab tool and then split in a 70:30 ratio, after which the proposed CNN model is trained and validated on it, resulting in an accuracy of 94%. The work is motivated by the unavailability of an RDD system for hispa disease, which gave rise to the need for an efficient, trained system that is useful for detecting rice hispa disease.
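The 70/30 train-validation split described in this abstract can be illustrated with a short sketch. This is not the authors' Matlab-based workflow; the helper name `split_dataset`, the fixed seed, and the shuffle-then-cut strategy are assumptions made for illustration only.

```python
import random


def split_dataset(items, train_fraction=0.7, seed=0):
    """Shuffle items reproducibly and cut them into (train, validation).

    Assumption: a simple random split with no class stratification.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for repeatable splits
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for the collected images.
image_paths = [f"img_{i:03d}.jpg" for i in range(100)]
train_set, validation_set = split_dataset(image_paths)
```

A stratified split, which keeps class proportions equal in both parts, is a common alternative when the healthy and diseased classes are imbalanced.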

21 citations

Proceedings ArticleDOI
26 Aug 2021
TL;DR: In this paper, a CNN-based deep learning (DL) multi-classification model was used to classify images of healthy and potato blight (PB)-diseased potato crop plants by PB disease severity level; in addition, binary classification was performed to distinguish healthy from diseased crop leaves.
Abstract: Detection of plant crop diseases has become an increasingly active field of research because of the growing demand for such systems and techniques, as crop diseases have become a common part of agriculture. In response to this need, we have developed a Convolutional neural network (CNN)-based Deep learning (DL) multi-classification model that classifies a total of 900 images of potato crop plants, collected in real time and covering healthy and potato blight (PB)-diseased plants, by PB disease severity level. In addition, binary classification has been performed to distinguish healthy from diseased crop leaves. Four disease severity levels were taken into account; the model achieved a binary classification accuracy of 90.77% and a best multi-classification accuracy of 94.77%. This work contributes to the field of potato disease recognition and detection using DL approaches.

20 citations

Proceedings ArticleDOI
29 Apr 2022
TL;DR: A landscape analysis of children’s AI systems is conducted, via a systematic literature review including 188 papers, which reveals a wide assortment of applications, and that most systems’ designs addressed only a small subset of principles among those identified.
Abstract: AI systems are becoming increasingly pervasive within children’s devices, apps, and services. However, it is not yet well-understood how risks and ethical considerations of AI relate to children. This paper makes three contributions to this area: first, it identifies ten areas of alignment between general AI frameworks and codes for age-appropriate design for children. Then, to understand how such principles relate to real application contexts, we conducted a landscape analysis of children’s AI systems, via a systematic literature review including 188 papers. This analysis revealed a wide assortment of applications, and that most systems’ designs addressed only a small subset of principles among those we identified. Finally, we synthesised our findings in a framework to inform a new “Code for Age-Appropriate AI”, which aims to provide timely input to emerging policies and standards, and inspire increased interactions between the AI and child-computer interaction communities.

13 citations

Journal ArticleDOI
TL;DR: This paper presents efforts towards developing a children’s ASR system in Punjabi, which is a low-resource language, and studies the role of prosody-modification-based out-of-domain data augmentation in dealing with training data scarcity.

9 citations

Journal ArticleDOI
TL;DR: It is demonstrated that including pitch features with the test-normalized children's dataset significantly enhances system performance under different environmental conditions, i.e., clean or noisy.

8 citations