Showing papers by "Kevin Duh" published in 2006

Journal Article

Morphology-based language modeling for conversational Arabic speech recognition

Katrin Kirchhoff¹, Dimitra Vergyri², Jeff A. Bilmes¹, Kevin Duh¹, Andreas Stolcke²

University of Washington¹, SRI International²

01 Oct 2006, Computer Speech & Language

TL;DR: Four different approaches to morphology-based language modeling are presented, including a novel technique called factored language models, and results are presented for both rescoring and first-pass recognition experiments.

120 citations

Proceedings Article

Multilingual Dependency Parsing using Bayes Point Machines

Simon Corston-Oliver¹, Anthony Aue¹, Kevin Duh², Eric K. Ringger³

Microsoft¹, University of Washington², Brigham Young University³

04 Jun 2006

TL;DR: This work develops dependency parsers for Arabic, English, Chinese, and Czech using Bayes Point Machines, a training algorithm which is as easy to implement as the perceptron yet competitive with large margin methods.

Abstract: We develop dependency parsers for Arabic, English, Chinese, and Czech using Bayes Point Machines, a training algorithm which is as easy to implement as the perceptron yet competitive with large margin methods. We achieve results comparable to state-of-the-art in English and Czech, and report the first directed dependency parsing accuracies for Arabic and Chinese. Given the multilingual nature of our experiments, we discuss some issues regarding the comparison of dependency parsers for different languages.

26 citations

Proceedings Article

Lexicon Acquisition for Dialectal Arabic Using Transductive Learning

Kevin Duh¹, Katrin Kirchhoff¹

University of Washington¹

22 Jul 2006

TL;DR: It is demonstrated that lexicon learning is an important task in resource-poor domains and leads to significant improvements in tagging accuracy for dialectal Arabic.

Abstract: We investigate the problem of learning a part-of-speech (POS) lexicon for a resource-poor language, dialectal Arabic. Developing a high-quality lexicon is often the first step towards building a POS tagger, which is in turn the front-end to many NLP systems. We frame the lexicon acquisition problem as a transductive learning problem, and perform comparisons on three transductive algorithms: Transductive SVMs, Spectral Graph Transducers, and a novel Transductive Clustering method. We demonstrate that lexicon learning is an important task in resource-poor domains and leads to significant improvements in tagging accuracy for dialectal Arabic.

17 citations

The University of Washington machine translation system for IWSLT 2006.

Katrin Kirchhoff, Kevin Duh, Chris Lim

01 Jan 2006

TL;DR: This article presents a multi-pass statistical phrase-based machine translation system for the Italian-English open-data track that uses heterogeneous data sources for training translation and language models, several novel rescoring features in the second pass, and N-best information for translation in the ASR-output condition.

Abstract: This paper describes the University of Washington’s submission to the IWSLT 2006 evaluation campaign. We present a multi-pass statistical phrase-based machine translation system for the Italian-English open-data track. The focus of our work was on the use of heterogeneous data sources for training translation and language models, the use of several novel rescoring features in the second pass, and exploiting N-best information for translation in the ASR-output condition. Results show mixed benefits of adding out-of-domain data and using N-best information and demonstrate improvements for some of the novel rescoring features.

3 citations

Lexicon Acquisition for Resource-Poor Languages Using Transductive Learning

Kevin Duh, Katrin Kirchhoff

01 Jan 2006

TL;DR: The problem of learning a part-of-speech (POS) lexicon for resource-poor languages is investigated, and it is demonstrated that lexicon learning is an important task and leads to significant improvements in tagging accuracy.

Abstract: We investigate the problem of learning a part-of-speech (POS) lexicon for resource-poor languages. Developing a high-quality lexicon is often the first step towards building a POS tagger, which is in turn the front-end to many NLP systems. We frame the lexicon acquisition problem as a transductive learning problem, and perform comparisons on three transductive algorithms: Transductive SVMs, Spectral Graph Transducers, and a novel Transductive Clustering method. We test on two datasets: dialectal Arabic (a resource-poor language) and Wall Street Journal with artificially limited training data. For dialectal Arabic, we demonstrate that lexicon learning is an important task and leads to significant improvements in tagging accuracy. For Wall Street Journal, we observe that transductive learning does not necessarily lead to improvements in lexicon accuracy and present some preliminary analyses of results.