Showing papers in "Computer Speech & Language in 2009"


Journal ArticleDOI
TL;DR: RavenClaw isolates the domain-specific aspects of the dialog control logic from domain-independent conversational skills, and in the process facilitates rapid development of mixed-initiative systems operating in complex, task-oriented domains.

284 citations


Journal ArticleDOI
TL;DR: This work proposes a statistical approach to improving content selection in automatic text summarization, which takes into account several features, including sentence position, positive keywords, negative keywords, sentence centrality, sentence resemblance to the title, sentence inclusion of named entities, sentence inclusion of numerical data, sentence relative length and aggregated similarity.

235 citations


Journal ArticleDOI
TL;DR: This paper uses support vector machines to combine features from n-gram language models, parses, and traditional reading level measures to produce a better method of assessing reading level, and explores ways that multiple human annotations can be used in comparative assessments of system performance.

205 citations


Journal ArticleDOI
Jinyu Li, Li Deng, Dong Yu, Yifan Gong, Alex Acero
TL;DR: A model-domain environment robust adaptation algorithm, which demonstrates high performance in the standard Aurora 2 speech recognition task without discriminative training of the HMM system, using the clean-trained complex HMM backend as the baseline system for the unsupervised model adaptation.

104 citations


Journal ArticleDOI
TL;DR: It is observed that agglutinative languages are somewhat more amenable to morphosyntax-based natural language watermarking and the free word order property of a language, like Turkish, is an extra bonus.

102 citations


Journal ArticleDOI
TL;DR: Experimental results show that the reliability of automatic sentence-level scoring by the system is almost as high as that of the average human evaluator, and that the system gives the highest pronunciation quality scores to 90% of native speakers' utterances.

83 citations


Journal ArticleDOI
TL;DR: A novel integrated dialog simulation technique for evaluating spoken dialog systems using a linear-chain conditional random field, and a two-phase data-driven domain-specific user utterance simulation method and a linguistic knowledge-based ASR channel simulation method are presented.

82 citations


Journal ArticleDOI
TL;DR: It is shown that modeling the sequence of acoustic-prosodic values as n-gram features with a maximum entropy model for dialog act (DA) tagging can perform better than conventional approaches that use a coarse representation of the prosodic contour through summative statistics.

68 citations


Journal ArticleDOI
TL;DR: The proposed intonation models using neural networks are compared with Classification and Regression Tree (CART) models, and it is found that 88% of the F0 values (pitch) of the syllables could be predicted from the models within 15% of the actual F0.

65 citations


Journal ArticleDOI
Imed Zitouni, Ruhi Sarikaya
TL;DR: The approach is compared with previously published techniques, and its effectiveness in restoring diacritics in different kinds of data, such as dialectal Iraqi Arabic scripts, is demonstrated.

59 citations


Journal ArticleDOI
TL;DR: This paper evaluates the GPU technique on a large vocabulary spontaneous speech recognition task using a set of acoustic models with varying complexity, and the results consistently show that using the GPU makes it possible to reduce the recognition time.

Journal ArticleDOI
TL;DR: The performance of the new LT was comparable to that of regular VTLN implemented by warping the Mel filterbank, when the MLS criterion was used for FW estimation, and it is shown that the approximations involved do not lead to any performance degradation.

Journal ArticleDOI
TL;DR: This article shows how sparse codes can be used to do continuous speech recognition by using an iterative subset selection algorithm with quadratic programming to find a sparse code for a spectrogram.

Journal ArticleDOI
TL;DR: A fast likelihood computation approach called dynamic Gaussian selection (DGS) is proposed: a one-pass search technique that generates a dynamic shortlist of Gaussians for each state during likelihood computation.

Journal ArticleDOI
TL;DR: An evaluation of the classification performances of the learning models for pronoun resolution in Turkish suggests that non-linear models properly tuned to avoid overfitting outperform linear ones when applied to the data used in the authors' experiments.

Journal ArticleDOI
TL;DR: RavenClaw isolates the domain-specific aspects of the dialog control logic from domain-independent logic in a plan-based, task-independent dialog management framework.

Journal ArticleDOI
TL;DR: Results show that data-driven methods can also outperform rule-based methods on the syllabification of Italian, a language of low syllabic complexity.

Journal ArticleDOI
TL;DR: A discriminative feedback adaptation (DFA) framework is proposed that reinforces the discriminability between the target speaker model and the anti-model, while preserving the generalization ability of the GMM-UBM approach.

Journal ArticleDOI
TL;DR: Evaluation on the ACE RDC corpora shows that the semi-supervised learning method proposed can integrate the advantages of both SVM bootstrapping and label propagation and significantly outperforms the normal LP algorithm via all the available data without SVM bootstrapping.

Journal ArticleDOI
TL;DR: In this paper, generalized decision trees are used to predict places in the word where substitution, deletion and insertion of phonemes may occur, and appropriate statistical contextual rules are applied to the permitted places, in order to specifically determine word variants.

Journal ArticleDOI
TL;DR: This paper presents a data-driven Korean grapheme-to-phoneme conversion method including alignment, rule extraction, and rule pruning procedures that effectively handle the exceptional pronunciation of speech databases.

Journal ArticleDOI
TL;DR: The proposed algorithm builds a lexicon enriched with topic information in three steps: transcription of an audio stream into phone sequences with a speaker- and task-independent phone recogniser, automatic lexical acquisition based on approximate string matching, and hierarchical topic clustering of the lexical entries based on a knowledge-poor co-occurrence approach.

Journal ArticleDOI
TL;DR: This Ngram-based reordering (NbR) approach uses the powerful techniques of SMT systems to generate a weighted reordering graph, and this allows an extension to the SMT decoding search.

Journal ArticleDOI
TL;DR: This work proposes to take into account frequency and temporal dependencies in order to improve the masks' estimation accuracy, and develops Bayesian models of the masks, which leads to a new architecture of a missing data mask estimator.

Journal ArticleDOI
TL;DR: An algorithm is described that produces a linear combination of MLLR transformations from cluster-specific trees using weights estimated by maximizing the likelihood of a speaker's adaptation data to realize gains in unsupervised adaptation.

Journal ArticleDOI
TL;DR: It is shown that utterance mood can be predicted from intonational information, and that this mood information can then be used to recognize the dialogue act.

Journal ArticleDOI
TL;DR: It is concluded that accent identification is more successful for Xhosa and Zulu utterances because accents are more pronounced for English embedded in mother-tongue speech than for English spoken as part of a monolingual dialogue by non-native speakers.

Journal ArticleDOI
TL;DR: The method jointly optimizes the generated clusters and the required number of clusters by estimating and minimizing the Rand index and uses a genetic algorithm to determine the cluster in which each utterance should be located.

Journal ArticleDOI
TL;DR: This work presents an approach that focuses on the sentence extraction phase of the distillation process, and selects document sentences with respect to their relevance to a query via statistical classification with support vector machines.