Showing papers in "Computer Speech & Language in 2009"
••
TL;DR: RavenClaw isolates the domain-specific aspects of the dialog control logic from domain-independent conversational skills, and in the process facilitates rapid development of mixed-initiative systems operating in complex, task-oriented domains.
284 citations
••
TL;DR: This work proposes a statistical approach to improving content selection in automatic text summarization, which takes into account several features, including sentence position, positive keywords, negative keywords, sentence centrality, sentence resemblance to the title, sentence inclusion of named entities, sentence inclusion of numerical data, relative sentence length and aggregated similarity.
235 citations
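The feature-combination idea above can be illustrated with a minimal Python sketch. The feature names and weights here are hypothetical stand-ins, not the paper's actual features or learned values:

```python
# Illustrative sketch of feature-weighted sentence scoring for
# extractive summarization. Feature names and weights are invented
# for illustration; they are not the paper's actual parameters.

def sentence_score(features, weights):
    """Combine per-sentence feature values into one relevance score."""
    return sum(weights[name] * value for name, value in features.items())

features = {
    "position": 1.0,           # e.g. first sentence of its paragraph
    "positive_keyword": 0.6,   # overlap with frequent document terms
    "title_resemblance": 0.4,  # word overlap with the title
    "numerical_data": 0.0,     # no numbers in this sentence
}
weights = {"position": 0.3, "positive_keyword": 0.3,
           "title_resemblance": 0.3, "numerical_data": 0.1}

print(sentence_score(features, weights))
```

Sentences are then ranked by this score and the top ones extracted into the summary.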
••
TL;DR: This paper uses support vector machines to combine features from n-gram language models, parses, and traditional reading level measures to produce a better method of assessing reading level, and explores ways that multiple human annotations can be used in comparative assessments of system performance.
205 citations
••
TL;DR: A model-domain environment robust adaptation algorithm, which demonstrates high performance in the standard Aurora 2 speech recognition task without discriminative training of the HMM system, using the clean-trained complex HMM backend as the baseline system for the unsupervised model adaptation.
104 citations
••
TL;DR: It is observed that agglutinative languages are somewhat more amenable to morphosyntax-based natural language watermarking and the free word order property of a language, like Turkish, is an extra bonus.
102 citations
••
TL;DR: Experimental results show that the reliability of automatic sentence-level scoring by the system is almost as high as that of the average human evaluator, and that the system gives the highest pronunciation quality scores to 90% of native speakers' utterances.
83 citations
••
TL;DR: A novel integrated dialog simulation technique for evaluating spoken dialog systems using a linear-chain conditional random field, and a two-phase data-driven domain-specific user utterance simulation method and a linguistic knowledge-based ASR channel simulation method are presented.
82 citations
••
TL;DR: It is shown that modeling the sequence of acoustic-prosodic values as n-gram features with a maximum entropy model for dialog act (DA) tagging can perform better than conventional approaches that use a coarse representation of the prosodic contour through summative statistics of the prosodic contour.
68 citations
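The n-gram feature extraction step can be sketched in a few lines of Python. The quantized symbol labels ("rise", "fall", etc.) are illustrative assumptions, not the paper's actual symbol inventory:

```python
# Sketch: turn a sequence of quantized acoustic-prosodic symbols into
# n-gram count features, as could feed a maximum-entropy DA tagger.
# The symbol vocabulary here is hypothetical.

def ngram_features(symbols, n=2):
    """Count n-grams over a symbol sequence, returned as a feature dict."""
    feats = {}
    for i in range(len(symbols) - n + 1):
        key = " ".join(symbols[i:i + n])
        feats[key] = feats.get(key, 0) + 1
    return feats

print(ngram_features(["rise", "fall", "rise", "fall"]))
```

The contrast drawn in the TL;DR is that such sequence features preserve contour shape, whereas summative statistics (mean, range, slope of F0) collapse it.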
••
TL;DR: The proposed intonation models using neural networks are compared with Classification and Regression Tree (CART) models and it is found that 88% of the F0 (pitch) values of the syllables could be predicted from the models within 15% of the actual F0.
65 citations
••
TL;DR: A comparison of the approach to previously published techniques is shown and the effectiveness of this technique in restoring diacritics in different kind of data such as the dialectal Iraqi Arabic scripts is demonstrated.
59 citations
••
TL;DR: This paper evaluates the GPU technique on a large vocabulary spontaneous speech recognition task using a set of acoustic models of varying complexity, and the results consistently show that using the GPU reduces recognition time.
••
TL;DR: The performance of the new LT was comparable to that of regular VTLN implemented by warping the Mel filterbank, when the MLS criterion was used for FW estimation, and it is shown that the approximations involved do not lead to any performance degradation.
••
TL;DR: This article shows how sparse codes can be used to do continuous speech recognition by using an iterative subset selection algorithm with quadratic programming to find a sparse code for a spectrogram.
••
TL;DR: A fast likelihood computation approach called dynamic Gaussian selection (DGS) is proposed, which is a one-pass search technique which generates a dynamic shortlist of Gaussians for each state during the procedure of likelihood computation.
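The shortlist idea behind dynamic Gaussian selection can be sketched as follows. This is a simplified 1-D illustration under assumed component parameters, not the paper's algorithm; real systems use diagonal-covariance Gaussians over feature vectors and a cheaper pre-selection criterion:

```python
import math

def gmm_loglik_dgs(x, gaussians, shortlist_size=2):
    """Approximate a GMM log-likelihood by evaluating only a dynamic
    shortlist of Gaussians, ranked here by squared distance to the mean.
    `gaussians` is a list of (weight, mean, variance) for 1-D components."""
    # Cheap pre-selection: rank components by distance from the observation.
    ranked = sorted(gaussians, key=lambda g: (x - g[1]) ** 2)
    total = 0.0
    for w, mu, var in ranked[:shortlist_size]:
        total += w * math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)
    return math.log(total)

components = [(0.5, 0.0, 1.0), (0.3, 5.0, 1.0), (0.2, 10.0, 1.0)]
print(gmm_loglik_dgs(0.1, components, shortlist_size=2))
```

Because distant components contribute almost nothing to the mixture sum, the shortlist likelihood is typically very close to the full computation at a fraction of the cost.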
••
TL;DR: An evaluation of the classification performances of the learning models for pronoun resolution in Turkish suggests that non-linear models properly tuned to avoid overfitting outperform linear ones when applied to the data used in the authors' experiments.
••
TL;DR: RavenClaw isolates the domain-specific aspects of the dialog control logic from domain-independent logic in a plan-based, task-independent dialog management framework.
••
TL;DR: Results show that data-driven methods can also outperform rule-based methods on Italian syllabification, a language of low syllabic complexity.
••
TL;DR: A discriminative feedback adaptation (DFA) framework is proposed that reinforces the discriminability between the target speaker model and the anti-model, while preserving the generalization ability of the GMM-UBM approach.
••
TL;DR: Evaluation on the ACE RDC corpora shows that the proposed semi-supervised learning method can integrate the advantages of both SVM bootstrapping and label propagation, and significantly outperforms the normal LP algorithm applied to all the available data without SVM bootstrapping.
••
TL;DR: In this paper, generalized decision trees are used to predict places in the word where substitution, deletion and insertion of phonemes may occur, and appropriate statistical contextual rules are applied to the permitted places, in order to specifically determine word variants.
••
TL;DR: This paper presents a data-driven Korean grapheme-to-phoneme conversion method including alignment, rule extraction, and rule pruning procedures that effectively handle the exceptional pronunciation of speech databases.
••
TL;DR: The proposed algorithm builds a lexicon enriched with topic information in three steps: transcription of an audio stream into phone sequences with a speaker- and task-independent phone recogniser, automatic lexical acquisition based on approximate string matching, and hierarchical topic clustering of the lexical entries based on a knowledge-poor co-occurrence approach.
••
TL;DR: This Ngram-based reordering (NbR) approach uses the powerful techniques of SMT systems to generate a weighted reordering graph, and this allows an extension to the SMT decoding search.
••
TL;DR: This work proposes to take into account frequency and temporal dependencies in order to improve the masks' estimation accuracy, and develops Bayesian models of the masks, which leads to a new architecture of a missing data mask estimator.
••
TL;DR: An algorithm is described that produces a linear combination of MLLR transformations from cluster-specific trees using weights estimated by maximizing the likelihood of a speaker's adaptation data to realize gains in unsupervised adaptation.
••
TL;DR: It is shown that utterance mood can be predicted from intonational information, and that this mood information can then be used to recognize the dialogue act.
••
TL;DR: It is concluded that accent identification is more successful for Xhosa and Zulu utterances because accents are more pronounced for English embedded in mother-tongue speech than for English spoken as part of a monolingual dialogue by non-native speakers.
••
TL;DR: The method jointly optimizes the generated clusters and the required number of clusters by estimating and minimizing the Rand index and uses a genetic algorithm to determine the cluster in which each utterance should be located.
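The Rand index optimized above measures pairwise agreement between two clusterings. A minimal Python sketch of the metric itself (not of the paper's genetic-algorithm search):

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Rand index between two clusterings of the same items: the fraction
    of item pairs on which the clusterings agree (both place the pair in
    the same cluster, or both place it in different clusters)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)

print(rand_index([0, 0, 1, 1], [0, 0, 1, 1]))
```

Note the index depends only on the partition structure, not the label names, so relabeled but identical partitions score 1.0.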
••
TL;DR: This work presents an approach that focuses on the sentence extraction phase of the distillation process, and selects document sentences with respect to their relevance to a query via statistical classification with support vector machines.