
Showing papers by "Sandra Maria Aluísio" published in 2019


Journal ArticleDOI
TL;DR: It is shown that gross errors present even in state-of-the-art systems can be avoided, that an accurate acoustic model can be built in a hierarchical fashion, and that accurate, robust recognition rates can be obtained even with a small amount of data.
Abstract: In low-resource scenarios, for example, small datasets or a lack of available computational resources, state-of-the-art deep learning methods for speech recognition have been known to fail. It is possible to achieve more robust models if care is taken to ensure the learning guarantees provided by statistical learning theory. This work presents a shallow, hybrid approach using a convolutional neural network feature extractor fed into a hierarchical tree of support vector machines for classification. Here, we show that gross errors present even in state-of-the-art systems can be avoided and that an accurate acoustic model can be built in a hierarchical fashion. Furthermore, we prove that our algorithm adheres to the learning guarantees provided by statistical learning theory. The acoustic model produced in this work outperforms traditional hidden Markov models, and the hierarchical support vector machine tree outperforms a multi-class multilayer perceptron classifier using the same features. More importantly, we isolate the performance of the acoustic model and provide results at both the frame and phoneme level, assessing the true robustness of the model. We show that accurate and robust recognition rates can be obtained even with a small amount of data.
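The pipeline described lends itself to a compact sketch: a small CNN acts purely as a feature extractor, and its output vectors feed a two-level tree of SVMs (a root classifier picking a broad phoneme group, then a per-group classifier picking the phoneme). This is a minimal illustration with a hypothetical phoneme grouping, toy input shapes, and default hyperparameters; it is not the authors' architecture or configuration.

```python
# Sketch: CNN feature extractor feeding a hierarchical tree of SVMs.
# Groupings, shapes, and hyperparameters below are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn
from sklearn.svm import SVC

class FeatureCNN(nn.Module):
    """Left untrained here for brevity; in practice the CNN is trained first."""
    def __init__(self, n_features=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global pooling -> fixed-size vector
        )
        self.proj = nn.Linear(32, n_features)

    def forward(self, x):
        return self.proj(self.net(x).flatten(1))

# Hypothetical phoneme groups: the root SVM picks a broad class,
# then a per-group SVM picks the phoneme within that class.
GROUPS = {"vowel": [0, 1, 2], "fricative": [3, 4], "plosive": [5, 6, 7]}

def train_svm_tree(feats, labels):
    group_of = {ph: g for g, phs in GROUPS.items() for ph in phs}
    root = SVC(kernel="rbf").fit(feats, [group_of[y] for y in labels])
    leaves = {}
    for g, phs in GROUPS.items():
        mask = np.isin(labels, phs)
        leaves[g] = SVC(kernel="rbf").fit(feats[mask], labels[mask])
    return root, leaves

def predict(root, leaves, feats):
    groups = root.predict(feats)  # stage 1: broad class
    return np.array([leaves[g].predict(f[None])[0]  # stage 2: phoneme
                     for g, f in zip(groups, feats)])

# Toy run on random "spectrogram" frames: (batch, 1 channel, 40 mels, 11 frames).
cnn = FeatureCNN()
with torch.no_grad():
    X = cnn(torch.randn(120, 1, 40, 11)).numpy()
y = np.tile(np.arange(8), 15)  # 15 toy frames per phoneme class
root, leaves = train_svm_tree(X, y)
print("predicted phonemes:", predict(root, leaves, X[:5]))
```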

9 citations


Proceedings ArticleDOI
01 Jan 2019
TL;DR: This work presents a two-fold novelty: a carefully designed CNN architecture, together with a knowledge-driven classifier, achieves nearly state-of-the-art phoneme recognition results with absolutely no pretraining or external weight initialization.
Abstract: A common belief in the community is that deep learning requires large datasets to be effective. We show that, with careful parameter selection, deep feature extraction can be applied even to small datasets. We also explore exactly how much data is necessary to guarantee learning, via convergence analysis and by calculating the shattering coefficient of the algorithms used. Another problem is that state-of-the-art results are rarely reproducible because they use proprietary datasets, pretrained networks, and/or weight initializations from other, larger networks. We present a two-fold novelty for this situation: a carefully designed CNN architecture, together with a knowledge-driven classifier, achieves nearly state-of-the-art phoneme recognition results with absolutely no pretraining or external weight initialization. We also beat the best replication study of the state of the art with a 28% frame error rate (FER). More importantly, we achieve transparent, reproducible frame-level accuracy and, additionally, perform a convergence analysis to show the generalization capacity of the model, providing statistical evidence that our results are not obtained by chance. Furthermore, we show how algorithms with strong learning guarantees can not only benefit from raw data extraction but also contribute more robust results.
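The sample-size question above can be made concrete with one classical form of the deviation bound, P(sup_f |R_emp(f) − R(f)| > ε) ≤ 2·N(F, 2n)·exp(−nε²/4), where N(F, m) is the shattering coefficient (the exact constants vary across textbook statements). A minimal sketch, assuming a shattering coefficient bounded via Sauer's lemma with a placeholder VC dimension rather than the coefficient derived in the paper:

```python
# Sketch: how many samples n are needed before the deviation bound drops
# below a target confidence delta? VC dimension and bound constants here
# are placeholder assumptions, not the paper's derived values.
import math

def log_shatter(m, vc_dim):
    """log N(F, m): 2^m while m <= d (everything can be shattered),
    then Sauer's lemma bound (e*m/d)^d."""
    if m <= vc_dim:
        return m * math.log(2.0)
    return vc_dim * math.log(math.e * m / vc_dim)

def log_bound(n, eps, vc_dim):
    """log of 2 * N(F, 2n) * exp(-n * eps^2 / 4), computed in log space
    to avoid overflow for large n."""
    return math.log(2.0) + log_shatter(2 * n, vc_dim) - n * eps * eps / 4.0

def samples_needed(eps=0.05, delta=0.05, vc_dim=50):
    """Smallest n (doubling, then bisection) with bound <= delta."""
    target = math.log(delta)
    hi = 2
    while log_bound(hi, eps, vc_dim) > target:
        hi *= 2
    lo = hi // 2
    while lo + 1 < hi:
        mid = (lo + hi) // 2
        if log_bound(mid, eps, vc_dim) <= target:
            hi = mid
        else:
            lo = mid
    return hi

print(samples_needed())          # n guaranteeing eps=0.05 deviation w.p. >= 95%
print(samples_needed(eps=0.10))  # looser tolerance -> far fewer frames needed
```

Plugging in the shattering coefficient actually derived for an algorithm (in place of the Sauer placeholder) turns this into the kind of statistical evidence of generalization the abstract describes.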

2 citations


Journal ArticleDOI
01 Sep 2019
TL;DR: A lexicon is presented that will be used to support the task of automatically detecting and correcting discourse marker errors; the evaluated methods identify some types of errors and can potentially identify many others, provided new lexical inputs are incorporated.
Abstract: Discourse markers are words and expressions (such as: firstly, then, for example, because, as a result, likewise, in comparison, in contrast) that explicitly state the relational structure of the information in the text, i.e., signalling a sequential relationship between the current message and the previous discourse. Using these markers improves the cohesion and coherence of texts, facilitating reading comprehension. Although often included in tools that support the rhetorical structuring of texts, discourse markers have hardly been explored in writing support tools for learners of a second language. However, learners of a second language, including those at advanced levels, have trouble producing these lexical items, frequently replacing them with items from their native language or with literal translations of them, which often do not yield proper lexical items in the second language. In addition, students learn a single marker per function and use it repeatedly, producing monotonous texts. With the aim of helping to reduce these difficulties, this paper presents a lexicon that will be used to support the task of automatically detecting and correcting discourse marker errors. Several heuristics have been evaluated to generate different types of errors. Automatic translation methods were used to semi-automatically compile the lexicon used in these heuristics. Similarity measures were also combined with these heuristics to correct discourse marker errors. The evaluated methods proved to be suitable for the task of identifying some types of discourse marker errors and can potentially identify many others, as long as new lexical inputs are incorporated into them.
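A minimal sketch of the detection-and-correction step described above: a candidate marker is checked against the lexicon and, if absent, replacements are ranked by string similarity. The tiny lexicon, the use of difflib's ratio as the similarity measure, and the literal-translation error below are all illustrative assumptions; the paper's lexicon is compiled semi-automatically via machine translation and its similarity measures may differ.

```python
# Sketch: flag a candidate discourse marker missing from the lexicon and
# rank replacements by string similarity. Lexicon and example are hypothetical.
import difflib

# Hypothetical lexicon fragment: marker -> rhetorical function.
LEXICON = {
    "on the other hand": "contrast",
    "in contrast": "contrast",
    "as a result": "result",
    "for example": "exemplification",
    "firstly": "sequence",
}

def correct_marker(candidate, n=3, cutoff=0.6):
    """Return the candidate if it is a known marker; otherwise return None
    plus the closest lexicon entries by difflib's similarity ratio
    (a stand-in for the paper's similarity measures)."""
    if candidate.lower() in LEXICON:
        return candidate, []
    suggestions = difflib.get_close_matches(
        candidate.lower(), LEXICON.keys(), n=n, cutoff=cutoff)
    return None, suggestions

# A learner error of the "literal translation" type the abstract mentions.
marker, fixes = correct_marker("in the other hand")
print(marker, fixes)  # None ['on the other hand']
```

New markers added to LEXICON immediately become detectable and suggestible, which mirrors the abstract's point that the methods can cover further error types as new lexical inputs are incorporated.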

1 citation