scispace - formally typeset
Topic

Statistical learning theory

About: Statistical learning theory is a research topic. Over its lifetime, 1,618 publications have been published within this topic, receiving 158,033 citations.


Papers
Proceedings ArticleDOI
01 Jun 2011
TL;DR: The Support Vector Machine (SVM), which is based on statistical learning theory, is a universal machine learning method; this paper proposes its application to the classification of several power quality disturbances.
Abstract: The Support Vector Machine (SVM), which is based on statistical learning theory, is a universal machine learning method. This paper proposes the application of SVM to the classification of several power quality disturbances. For this purpose, a feature-extraction process based on higher-order statistics (HOS) is carried out to aid classification. In this stage, a geometrical pattern is obtained via higher-order statistical measurements; this pattern is a function of the amplitudes and frequencies of the power quality disturbances associated with the 50 Hz power line. Once extracted, the features are segmented into training and test sets and then fed to the statistical method that performs the automatic classification of PQ disturbances. Results are reported in terms of correlation and error rates.
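The HOS feature-extraction step described above can be sketched as follows. This is a minimal illustration: the sampling rate, test signal, and choice of features (variance, skewness, kurtosis) are assumptions for the example, not the paper's actual configuration.

```python
import math

def hos_features(signal):
    """Compute simple higher-order statistics (variance, skewness,
    kurtosis) of a sampled waveform, usable as classifier features."""
    n = len(signal)
    mean = sum(signal) / n
    central = [x - mean for x in signal]
    m2 = sum(c ** 2 for c in central) / n   # 2nd central moment
    m3 = sum(c ** 3 for c in central) / n   # 3rd central moment
    m4 = sum(c ** 4 for c in central) / n   # 4th central moment
    std = math.sqrt(m2)
    return {
        "variance": m2,
        "skewness": m3 / std ** 3,
        "kurtosis": m4 / m2 ** 2,
    }

# Example: a clean 50 Hz sine sampled at 3.2 kHz (10 full cycles).
# A disturbance (sag, swell, transient) would shift these statistics
# away from the clean-signal values.
sine = [math.sin(2 * math.pi * 50 * t / 3200) for t in range(640)]
feats = hos_features(sine)
```

For an undistorted sinusoid the skewness is zero and the (Pearson) kurtosis is 1.5, so deviations from those values flag a disturbance; the feature vectors would then be split into training and test sets for the SVM.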

4 citations

Journal Article
Niu Zhi-guang
TL;DR: A new mathematical model based on support vector machine (SVM) theory was developed, drawing on an analysis of the characteristics of hourly urban water consumption data, statistical learning theory (SLT), and the empirical risk minimization (ERM) principle; it proved effective in short-term prediction of urban water consumption.
Abstract: In order to overcome the over-fitting problem and the local-minima problem of the artificial neural network (ANN) method in short-term prediction of urban water consumption, a new mathematical model based on support vector machine (SVM) theory was developed. The model drew on an analysis of the characteristics of hourly urban water consumption data, statistical learning theory (SLT), and the empirical risk minimization (ERM) principle. The experimental results indicated that the average prediction precision increased by 2 percent compared to the back-propagation (BP) neural network method, and that this model was faster in computation and had better generalization performance, proving it effective in short-term prediction of urban water consumption.
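The ERM principle underlying this kind of model can be illustrated with a minimal sketch: fit a linear predictor of the next hourly value from the two previous ones by batch gradient descent on the mean squared (empirical) risk. The synthetic data and the least-squares loss are assumptions for illustration only; the paper's actual model is an SVM, not this toy.

```python
def fit_linear_erm(pairs, lr=0.5, epochs=2000):
    """Minimize the empirical risk (mean squared error) of a linear
    predictor y_hat = w1*x1 + w2*x2 + b by batch gradient descent."""
    w1 = w2 = b = 0.0
    n = len(pairs)
    for _ in range(epochs):
        g1 = g2 = gb = 0.0
        for x1, x2, y in pairs:
            err = (w1 * x1 + w2 * x2 + b) - y
            g1 += err * x1
            g2 += err * x2
            gb += err
        # gradient step on the averaged (empirical) risk
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        b -= lr * gb / n
    return w1, w2, b

# Synthetic "two previous hours -> next hour" samples generated by a
# known, noise-free linear rule (hypothetical coefficients).
pairs = [(x1 / 10, x2 / 10, 0.5 * x1 / 10 + 0.4 * x2 / 10 + 0.3)
         for x1 in range(10) for x2 in range(10)]
w1, w2, b = fit_linear_erm(pairs)
```

On this noise-free data ERM recovers the generating coefficients exactly; SLT is concerned with what such a minimizer guarantees when the data are noisy and finite.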

4 citations

Proceedings ArticleDOI
12 Jul 2008
TL;DR: The definitions of complex rough variable and primary norm are introduced; the complex empirical risk functional, the complex expected risk functional, and the complex empirical risk minimization principle for samples corrupted by noise are then defined, and the corresponding key theorem is proposed and proved.
Abstract: The key theorem plays an important role in statistical learning theory. However, current research on it mainly focuses on real random variables and on samples that are assumed to be noise-free. In this paper, the definitions of complex rough variable and primary norm are introduced. Then, the complex empirical risk functional, the complex expected risk functional, and the complex empirical risk minimization principle for samples corrupted by noise are defined. Finally, the key theorem of learning theory based on complex rough samples corrupted by noise is proposed and proved. These investigations help lay essential theoretical foundations for the systematic and comprehensive development of the statistical learning theory of complex rough samples.
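For context, the classical real-valued, noise-free setting that this paper generalizes can be stated as follows; these are the standard definitions from Vapnik's statistical learning theory, not notation taken from the paper itself:

```latex
% Expected risk: average loss of the function indexed by \alpha
% under the (unknown) sample distribution F(z)
R(\alpha) = \int Q(z, \alpha)\, dF(z)

% Empirical risk: average loss over the l observed samples
R_{\mathrm{emp}}(\alpha) = \frac{1}{l} \sum_{i=1}^{l} Q(z_i, \alpha)

% Key theorem (informal statement): empirical risk minimization is
% consistent if and only if the one-sided uniform convergence
%   \sup_{\alpha} \bigl( R(\alpha) - R_{\mathrm{emp}}(\alpha) \bigr)
% tends to zero in probability as l \to \infty.
```

The paper's contribution is to extend these functionals and the key theorem from real random variables to complex rough variables with noisy samples.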

4 citations

DissertationDOI
08 Aug 2018
TL;DR: This dissertation investigates how more robust acoustic generalizations can be made, even with little data and noisy accented-speech data, and takes advantage of raw feature extraction provided by deep learning techniques to produce robust results for acoustic modeling without the dependency of big data.
Abstract: SHULBY, C. D. RAMBLE: robust acoustic modeling for Brazilian learners of English. 2018. 160 p. Tese (Doutorado em Ciências – Ciências de Computação e Matemática Computacional) – Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo, São Carlos – SP, 2018. The gains made by current deep-learning techniques have often come with the price tag of big data, and where that data is not available, a new solution must be found. Such is the case for accented and noisy speech, where large databases do not exist and data augmentation techniques, which are less than perfect, present an even larger obstacle. Another problem is that state-of-the-art results are rarely reproducible because they use proprietary datasets, pretrained networks and/or weight initializations from other, larger networks. A low-resource scenario exists even in the fifth-largest country in the world, home to most of the speakers of the seventh most spoken language on earth. Brazil is the leader in the Latin-American economy and, as a BRIC country, aspires to become an ever-stronger player in the global marketplace. Still, English proficiency is low, even for professionals in businesses and universities. Low intelligibility and strong accents can damage professional credibility. The foreign-language-teaching literature has established that it is important for adult learners to be made aware of their errors, as outlined by the "Noticing Theory": a learner is more successful when able to learn from his own mistakes. An essential objective of this dissertation is to classify phonemes in the acoustic model, which is needed to identify phonemic errors automatically. A common belief in the community is that deep learning requires large datasets to be effective.
This happens because brute-force methods create a highly complex hypothesis space, which requires large and complex networks, which in turn demand a great number of data samples to produce useful networks. Moreover, the loss functions used in neural learning do not provide statistical learning guarantees; they only guarantee that the network can memorize the training space well. In the case of accented or noisy speech, where a new sample can differ greatly from the training samples, the generalization of such models suffers. The main objective of this dissertation is to investigate how more robust acoustic generalizations can be made, even with little data and noisy accented-speech data. The approach here is to take advantage of the raw feature extraction provided by deep learning techniques and instead focus on how learning guarantees can be provided for small datasets, producing robust results for acoustic modeling without a dependency on big data. This has been done by careful and intelligent parameter and architecture selection within the framework of statistical learning theory. Here, an intelligently defined CNN architecture, together with context windows and a knowledge-driven hierarchical tree of SVM classifiers, achieves nearly state-of-the-art frame-wise phoneme recognition results with no pretraining or external weight initialization whatsoever. A goal of this thesis is to produce transparent and reproducible architectures with high frame-level accuracy, comparable to the state of the art. Additionally, a convergence analysis based on the learning guarantees of statistical learning theory is performed to demonstrate the generalization capacity of the model. The model achieves a 39.7% error rate in frame-wise classification and a 43.5% phone error rate using deep feature extraction and SVM classification, even with little data (less than 7 hours).
These results are comparable to studies which use well over ten times that amount of data. Beyond the intrinsic evaluation, the model also achieves an accuracy of 88% in the identification of epenthesis, the error which is most difficult for Brazilian speakers of English. This is a 69% relative percentage gain over the previous values in the literature. The results are significant because they show how deep feature extraction can be applied to little-data scenarios, contrary to popular belief. The extrinsic, task-based results also show how this approach could be useful in tasks like automatic error diagnosis. Another contribution is the publication of a number of freely available resources which previously did not exist, meant to aid future research in dataset creation.
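The "knowledge-driven hierarchical tree of SVM classifiers" idea can be sketched as a tree of decision nodes: a root node first assigns a frame to a broad phonetic class, and a class-specific node then picks the phone. In this illustrative sketch the real SVMs are replaced by stub threshold rules on a single toy "energy" feature; the phone labels and thresholds are invented for the example and are not from the dissertation.

```python
def classify_frame(energy):
    """Route one acoustic frame through a two-level classifier tree.
    Each 'if' stands in for a trained SVM node; the thresholds and
    phone labels here are purely hypothetical."""
    # Root node: broad phonetic class (vowel-like vs consonant-like)
    if energy > 0.5:
        # Vowel node: pick a specific vowel
        return "a" if energy > 0.8 else "i"
    else:
        # Consonant node: pick a specific consonant
        return "s" if energy > 0.2 else "p"

# Frame-wise classification of a short sequence of toy frames
labels = [classify_frame(e) for e in (0.9, 0.6, 0.3, 0.1)]
```

The benefit of the hierarchy is that each node solves a smaller, linguistically motivated subproblem, so each SVM needs less data than one flat multi-class classifier over all phones.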

4 citations


Network Information
Related Topics (5)

Artificial neural network: 207K papers, 4.5M citations, 86% related
Cluster analysis: 146.5K papers, 2.9M citations, 82% related
Feature extraction: 111.8K papers, 2.1M citations, 81% related
Optimization problem: 96.4K papers, 2.1M citations, 80% related
Fuzzy logic: 151.2K papers, 2.3M citations, 79% related
Performance
No. of papers in the topic in previous years:

Year  Papers
2023  9
2022  19
2021  59
2020  69
2019  72
2018  47