Author

Ismail Shahin

Bio: Ismail Shahin is an academic researcher from the University of Sharjah. The author has contributed to research in topics: Hidden Markov model & Speaker recognition. The author has an h-index of 18 and has co-authored 89 publications receiving 1068 citations. Previous affiliations of Ismail Shahin include Southern Illinois University Carbondale.
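As a rough illustration of how an h-index such as the one quoted in the bio is computed, the sketch below uses the standard definition: the largest h such that h papers have at least h citations each. The function name `h_index` is hypothetical, and the input is only the five citation counts shown on this page, not the author's full record of 89 publications, so it cannot reproduce the reported value of 18.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Citation counts of the five papers listed on this page (an assumption for
# illustration only; the full publication list is not available here).
listed = [701, 92, 90, 51, 41]
print(h_index(listed))
```

With only five papers in the input, the result is capped at 5 regardless of how highly each paper is cited, which is why the full publication list is needed to arrive at the profile's figure.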


Papers
Journal Article
TL;DR: A thorough examination of the different studies that have been conducted since 2006, when deep learning first arose as a new area of machine learning, for speech applications is provided.
Abstract: Over the past decades, a tremendous amount of research has been done on the use of machine learning for speech processing applications, especially speech recognition. However, in the past few years, research has focused on utilizing deep learning for speech-related applications. This new area of machine learning has yielded far better results when compared to others in a variety of applications including speech, and thus became a very attractive area of research. This paper provides a thorough examination of the different studies that have been conducted since 2006, when deep learning first arose as a new area of machine learning, for speech applications. A thorough statistical analysis is provided in this review which was conducted by extracting specific information from 174 papers published between the years 2006 and 2018. The results provided in this paper shed light on the trends of research in this area as well as bring focus to new research topics.

701 citations

Journal Article
TL;DR: The dominant signal mask provided by the hybrid classifier offers better system performance in the presence of noisy signals, and gives higher emotion recognition accuracy than SVMs and MLP classifiers.
Abstract: This paper aims at recognizing emotions for a text-independent and speaker-independent emotion recognition system based on a novel classifier, which is a hybrid of a cascaded Gaussian mixture model and deep neural network (GMM-DNN). This hybrid classifier has been assessed for emotion recognition on the “Emirati speech database (Arabic United Arab Emirates Database)” with six different emotions. The sequential GMM-DNN classifier has been compared with support vector machines (SVMs) and multilayer perceptron (MLP) classifiers; it achieves an accuracy of 83.97%, while SVMs and MLP achieve 80.33% and 69.78%, respectively. These results demonstrate that the hybrid classifier gives significantly higher emotion recognition accuracy than SVMs and MLP classifiers. Our GMM-DNN model yields results similar to those obtained by human judges in a subjective assessment context. Also, the performance of the classifier has been tested using two distinct emotional databases and in normal and noisy talking conditions. The dominant signal mask provided by the hybrid classifier offers better system performance in the presence of noisy signals.

92 citations

Proceedings Article
03 Nov 2020
TL;DR: This study highlights the importance of speech signal processing in the process of early screening and diagnosing the COVID-19 virus by utilizing the Recurrent Neural Network (RNN) and specifically its significant well-known architecture, the Long Short-Term Memory (LSTM) for analyzing the acoustic features of cough, breathing, and voice of the patients.
Abstract: Lately, an immense amount of work has been done by people working on the frontlines, such as in hospitals, clinics, and laboratories, alongside researchers and scientists who are also making considerable efforts in the fight against the COVID-19 epidemic. Due to the unconscionable dissemination of the disease, the implementation of Artificial Intelligence (AI) has made a significant contribution to the digital health domain by applying the basics of Automatic Speech Recognition (ASR) and deep learning algorithms. In this study, we highlight the importance of speech signal processing in the early screening and diagnosis of the COVID-19 virus by utilizing the Recurrent Neural Network (RNN), and specifically its well-known architecture, the Long Short-Term Memory (LSTM), for analyzing the acoustic features of the cough, breathing, and voice of the patients. Our results show a lower accuracy in the voice test than in both the coughing and breathing sound samples. Moreover, our results are preliminary, and there is a possibility to enhance the accuracy of the voice tests by expanding the data set and targeting a larger group of healthy and infected people.

90 citations

Journal Article
TL;DR: There is a need for more efforts to implement modernized deep learning methods for Arabic sentiment analysis systems, and it is observed that CNN and RNN (LSTM) models were the most common methods used for ASSA.

51 citations

01 Jan 2009
TL;DR: The results show that the three models perform extremely well for speaker identification in the neutral environment, and that the results of this work are better than those obtained in subjective evaluation by human judges.
Abstract: The performance of speaker identification is almost perfect in the neutral environment. However, the performance deteriorates significantly in emotional environments. In this work, three different and separate models have been used, tested and compared to identify speakers in each of the neutral and emotional environments (two completely separate environments). Our emotional environments in this work consist of five emotions. These emotions are: angry, sad, happy, disgust and fear. The three models are: Hidden Markov Models (HMMs), Second-Order Circular Hidden Markov Models (CHMM2s) and Suprasegmental Hidden Markov Models (SPHMMs). Our results show that the three models perform extremely well for speaker identification in the neutral environment. In emotional environments, the average speaker identification performance based on HMMs, CHMM2s and SPHMMs is 61.4%, 66.4% and 69.1%, respectively. Our results in this work are better than those obtained in subjective evaluation by human judges.

41 citations


Cited by
Journal Article
08 Dec 2001-BMJ
TL;DR: There is, I think, something ethereal about i, the square root of minus one, which seemed an odd beast at first: an intruder hovering on the edge of reality.
Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

Journal Article
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations
