Author

Yassine BenAyed

Bio: Yassine BenAyed is an academic researcher from the University of Sfax. The author has contributed to research on topics including support vector machines and hidden Markov models. The author has an h-index of 5 and has co-authored 17 publications receiving 95 citations.

Papers

Journal Article

Deep multilayer multiple kernel learning

Ilyes Rebai¹, Yassine BenAyed¹, Walid Mahdi²

University of Sfax¹, Taif University²

01 Nov 2016-Neural Computing and Applications

TL;DR: This paper proposes a backpropagation MLMKL framework that optimizes the network with an adaptive backpropagation algorithm, using gradient ascent instead of a dual objective function or an estimate of the leave-one-out error, and reports high performance compared with traditional MKL and existing MLMKL methods.

Abstract: The multiple kernel learning (MKL) approach has been proposed for kernel methods and has shown high performance on some real-world applications. It consists of learning the optimal kernel from one layer of multiple predefined kernels. Unfortunately, this approach is not rich enough to solve relatively complex problems. With the emergence and success of deep learning, multilayer multiple kernel learning (MLMKL) methods were inspired by the idea of deep architectures and introduced to improve on conventional MKL methods. Such architectures aim to learn deep kernel machines by exploring combinations of multiple kernels in a multilayer structure. However, existing MLMKL methods often have trouble optimizing networks with two or more layers. Additionally, they do not always outperform the simplest method of combining multiple kernels (i.e., MKL). To improve the effectiveness of MKL approaches, we introduce in this paper a novel backpropagation MLMKL framework. Specifically, we propose to optimize the network with an adaptive backpropagation algorithm. We use the gradient ascent method instead of the dual objective function or the estimation of the leave-one-out error. We test the proposed method through a large set of experiments on a variety of benchmark data sets, and we have successfully optimized the system over many layers. Empirical results show that our algorithm achieves high performance compared with the traditional MKL approach and existing MLMKL methods.
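
As a rough illustration of the single-layer MKL idea mentioned in the abstract (one combined kernel built from several predefined kernels), the sketch below forms a weighted sum of two base kernels. It is only a minimal sketch: the function names, the choice of a linear and an RBF kernel, and the fixed weights are assumptions for illustration. The abstract does not say which base kernels are used, and the weights here are not learned, so this does not reproduce the paper's backpropagation procedure.

```python
import math

def linear_kernel(x, y):
    # Dot product of two equal-length feature vectors.
    return sum(xi * yi for xi, yi in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    # Gaussian (RBF) kernel; gamma is an illustrative value, not from the paper.
    squared_distance = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * squared_distance)

def combined_kernel(x, y, weights, kernels):
    # One "layer" of MKL: a weighted sum of predefined base kernels.
    # In a real MKL method the weights would be learned; here they are fixed.
    return sum(w * k(x, y) for w, k in zip(weights, kernels))

# Hypothetical usage with equal weights on the two base kernels.
value = combined_kernel([1.0, 0.0], [1.0, 0.0], [0.5, 0.5],
                        [linear_kernel, rbf_kernel])
```

A multilayer variant would feed the output of one such combination into further kernel functions; how those layers are composed and trained is specific to the paper and is not sketched here.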

36 citations

Journal Article

Text-to-speech synthesis system with Arabic diacritic recognition system

Ilyes Rebai, Yassine BenAyed

01 Nov 2015-Computer Speech & Language

TL;DR: Describes a text-to-speech (TTS) synthesis system for Modern Standard Arabic based on a statistical parametric approach and Mel-cepstral coefficients; the proposed system can generate intelligible and natural speech.

26 citations

Journal Article

Arabic speech synthesis and diacritic recognition

Ilyes Rebai¹, Yassine BenAyed¹

University of Sfax¹

18 May 2016-International Journal of Speech Technology

TL;DR: Presents an efficient Arabic TTS system based on a statistical parametric approach and non-uniform-unit speech synthesis, together with a new, simple stacked neural network approach to improve the accuracy of the acoustic models.

Abstract: The text-to-speech (TTS) system, also known as a speech synthesizer, has become one of the important technologies of recent years due to its expanding field of applications. Several works on speech synthesis have addressed English and French, whereas many other languages, including Arabic, have only recently been taken into consideration. Arabic speech synthesis has not made sufficient progress and is still at an early stage, with low speech quality. In fact, speech synthesis systems face several problems (e.g. speech quality, articulatory effect, etc.). Different methods have been proposed to solve these issues, such as the use of large and varied unit sizes. This method is mainly implemented with the concatenative approach to improve speech quality, and several works have proved its effectiveness. This paper presents an efficient Arabic TTS system based on a statistical parametric approach and non-uniform-unit speech synthesis. Our system includes a diacritization engine. Modern Arabic text is written without the vowels, also called diacritic marks. These marks are very important for determining the right pronunciation of the text, which explains the incorporation of the diacritization engine into our system. In this work, we propose a simple approach based on deep neural networks, which are trained to directly predict the diacritic marks and to predict the spectral and prosodic parameters. Furthermore, we propose a new, simple stacked neural network approach to improve the accuracy of the acoustic models. Experimental results show that our diacritization system generates fully diacritized text with high precision and that our synthesis system produces high-quality speech.

11 citations

Journal Article

Hybrid SVM/HMM model for the arab phonemes recognition.

Elyes Zarrouk, Yassine BenAyed

01 Jan 2016-The International Arab Journal of Information Technology

TL;DR: Incorporating SVM into HMM yields a new ASR system; the proposed SVM/HMM system performs best, achieving a recognition rate of 75.8%.

Abstract: Hidden Markov Models (HMM) are currently widely used in Automatic Speech Recognition (ASR) as the most effective models. Yet they sometimes pose problems of discrimination. The hybridization of Artificial Neural Networks (ANN), in particular Multi-Layer Perceptrons (MLP), with HMM is a promising technique to overcome these limitations. In order to improve the results of the recognition system, we use Support Vector Machines (SVM), which are characterized by high predictive and discriminative power. The incorporation of SVM into HMM yields a new ASR system. Using 2800 occurrences of Arabic phonemes, this work presents a comparative study of our recognition systems, as follows: standard HMMs alone lead to a recognition rate of 66.98%; the hybrid MLP/HMM system achieves 73.78%; and our proposed SVM/HMM system performs best, with a recognition rate of 75.8%.

11 citations

Proceedings Article

Arabic text to speech synthesis based on neural networks for MFCC estimation

Ilyes Rebai, Yassine BenAyed

22 Jun 2013

TL;DR: Proposes an Arabic text-to-speech synthesis system based on statistical parametric synthesis, and gives the MFCC neural network architecture and an objective evaluation using the MFCC distortion measure.

Abstract: With the increasing number of users of text-to-speech applications, high-quality speech synthesis is required. However, only a few studies concern Arabic text-to-speech applications. Compared with other languages such as English and French, the quality of synthesized Arabic speech is still poor. For these reasons, we propose in this paper an Arabic text-to-speech synthesis system based on statistical parametric synthesis. Mel Frequency Cepstral Coefficients (MFCC), energy and pitch are predicted using backpropagation artificial neural networks and then transformed into speech using the Mel Log Spectrum Approximation filter. In written Arabic text, the short vowels, called diacritic marks, are often omitted, so a diacritization system is proposed to resolve this problem. Different unit sizes are considered in the speech database: phoneme, diphone and triphone. The MFCC neural network architecture and an objective evaluation with the MFCC distortion measure are given in this paper.

9 citations

Cited by

Journal Article

Mapping Rice Cropping Systems in Vietnam Using an NDVI-Based Time-Series Similarity Measurement Based on DTW Distance

Xudong Guan, Chong Huang, Gaohuan Liu, Xuelian Meng, Qingsheng Liu, +1 more

08 Jan 2016-Remote Sensing

TL;DR: The results demonstrate that the DTW-based similarity measure of the NDVI time series can be effectively used to map large-area rice cropping systems with diverse cultivation processes.

Abstract: Normalized Difference Vegetation Index (NDVI) derived from Moderate Resolution Imaging Spectroradiometer (MODIS) time-series data has been widely used in the fields of crop and rice classification. The cloudy and rainy weather characteristics of the monsoon season greatly reduce the likelihood of obtaining high-quality optical remote sensing images. In addition, the diverse crop-planting system in Vietnam also hinders the comparison of NDVI among different crop stages. To address these problems, we apply a Dynamic Time Warping (DTW) distance-based similarity measure approach and use the entire yearly NDVI time series to reduce the inaccuracy of classification using a single image. We first de-noise the NDVI time series using S-G filtering based on the TIMESAT software. Then, a standard NDVI time-series base for rice growth is established based on field survey data and Google Earth sample data. NDVI time-series data for each pixel are constructed and the DTW distance with the standard rice growth NDVI time series is calculated. Then, we apply thresholds to extract rice growth areas. A qualitative assessment using statistical data and a spatial assessment using sampled data from the rice-cropping map reveal a high mapping accuracy at the national scale between the statistical data, with the corresponding R² being as high as 0.809; however, the mapped rice accuracy decreased at the provincial scale due to the reduced number of rice planting areas per province. An analysis of the results indicates that the 500-m resolution MODIS data are limited in terms of mapping scattered rice parcels. The results demonstrate that the DTW-based similarity measure of the NDVI time series can be effectively used to map large-area rice cropping systems with diverse cultivation processes.
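
The per-pixel step described in this abstract (compute the DTW distance between a pixel's NDVI series and a reference rice-growth series, then apply a threshold) can be sketched roughly as follows. This is a minimal sketch using the textbook dynamic-programming form of DTW with an absolute-difference cost; the function names, the cost choice, and any threshold value are assumptions for illustration, and the S-G de-noising step is omitted. It is not the authors' implementation.

```python
import math

def dtw_distance(series_a, series_b):
    # Classic O(n*m) dynamic-programming DTW with absolute-difference local cost.
    n, m = len(series_a), len(series_b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(series_a[i - 1] - series_b[j - 1])
            # Extend the cheapest of the three allowed warping moves.
            cost[i][j] = local + min(cost[i - 1][j],      # step in series_a only
                                     cost[i][j - 1],      # step in series_b only
                                     cost[i - 1][j - 1])  # step in both
    return cost[n][m]

def is_rice_pixel(pixel_ndvi, reference_ndvi, threshold):
    # A pixel is labeled as rice when its series is close enough to the
    # reference; the threshold is a hypothetical, user-chosen value.
    return dtw_distance(pixel_ndvi, reference_ndvi) <= threshold
```

Because DTW allows one series to be stretched or compressed in time relative to the other, two series with the same shape but shifted growth stages can still have a small distance, which is the property the abstract relies on for diverse cropping calendars.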

123 citations

The International Arab Journal of Information Technology

Nazean Binti Jomhari

01 May 2011

67 citations

Journal Article

Bridging deep and multiple kernel learning: A review

Tinghua Wang, Lin Zhang, Wenyu Hu

01 Mar 2021-Information Fusion

TL;DR: This article presents a comprehensive overview of the state-of-the-art approaches that bridge the MKL and deep learning techniques, systematically reviewing the typical hybrid models, training techniques, and their theoretical and practical benefits.

37 citations

Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing

Erdem Motuk¹, Stefan Bilbao¹, Roger Woods

Edinburgh College of Art¹

01 Jan 1996

32 citations

Journal Article

Hybrid continuous speech recognition systems by HMM, MLP and SVM: a comparative study

Elyes Zarrouk, Yassine Ben Ayed, Faiez Gargouri

01 Sep 2014-International Journal of Speech Technology

TL;DR: It is deduced that the SVM/HMM hybrid model is more efficient than standard HMMs and the hybrid Multi-Layer Perceptron (MLP)/HMM system.

Abstract: This paper presents a new hybrid method for continuous Arabic speech recognition based on triphone modelling. To do this, we apply Support Vector Machines (SVM) as an estimator of posterior probabilities within standard Hidden Markov Models (HMM). In this work, we describe a new approach that categorises Arabic vowels into long and short vowels, applied in the labeling phase of speech signals. Using this new labeling method, we deduce that the SVM/HMM hybrid model is more efficient than standard HMMs and the hybrid Multi-Layer Perceptron (MLP)/HMM system. The results obtained for the triphone-based Arabic speech recognition system are 64.68% with HMMs, 72.39% with MLP/HMM and 74.01% with the SVM/HMM hybrid model. The WER obtained for continuous speech recognition by the three systems confirms the performance of SVM/HMM, which obtains the lowest average over 4 tested speakers, 11.42%.

31 citations
