Book Chapter
Robust Bilingual Word Alignment for Machine Aided Translation
Ido Dagan, Kenneth Church, William Gale
- pp 209-224
TLDR
Because word_align and char_align were designed to work robustly on texts that are smaller and more noisy than the Hansards, it has been possible to successfully deploy the programs at AT&T Language Line Services, a commercial translation service, to help them with difficult terminology.
Abstract:
We have developed a new program called word_align for aligning parallel text, text such as the Canadian Hansards that are available in two or more languages. The program takes the output of char_align (Church, 1993), a robust alternative to sentence-based alignment programs, and applies word-level constraints using a version of Brown et al.’s Model 2 (Brown et al., 1993), modified and extended to deal with robustness issues. Word_align was tested on a subset of Canadian Hansards supplied by Simard (Simard et al., 1992). The combination of word_align plus char_align reduces the variance (average square error) by a factor of 5 over char_align alone. More importantly, because word_align and char_align were designed to work robustly on texts that are smaller and more noisy than the Hansards, it has been possible to successfully deploy the programs at AT&T Language Line Services, a commercial translation service, to help them with difficult terminology.
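The abstract reports its improvement as a reduction in average square error. As a rough illustration only, the sketch below shows how such a figure could be computed from predicted and reference alignment positions. The function name `mean_square_error` and all position values are assumptions made up for this example; they are not taken from the paper, and the resulting ratio says nothing about the reported factor of 5.

```python
def mean_square_error(predicted, reference):
    """Average of squared differences between predicted and reference positions."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(predicted)


# Made-up target-text positions for five source words (illustration only).
reference = [10, 20, 30, 40, 50]
coarse = [15, 15, 35, 45, 45]    # hypothetical character-level alignment alone
refined = [12, 18, 32, 42, 49]   # hypothetical positions after word-level refinement

coarse_mse = mean_square_error(coarse, reference)
refined_mse = mean_square_error(refined, reference)

# A ratio above 1 means the refined alignment has lower average square error.
reduction_factor = coarse_mse / refined_mse
print(coarse_mse, refined_mse, reduction_factor)
```

A ratio of this kind is what a statement such as "reduces the error by a factor of N" summarizes; the paper's own evaluation setup may differ in how positions and reference alignments are defined.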
Citations
Journal Article
A systematic comparison of various statistical alignment models
Franz Josef Och, Hermann Ney
TL;DR: An important result is that refined alignment models with a first-order dependence and a fertility model yield significantly better results than simple heuristic models.
Journal Article
Stochastic inversion transduction grammars and bilingual parsing of parallel corpora
TL;DR: A novel stochastic inversion transduction grammar formalism for bilingual language modeling of sentence-pairs, and the concept of bilingual parsing with a variety of parallel corpus analysis applications are introduced.
Proceedings Article
HMM-based word alignment in statistical translation
TL;DR: A new model for word alignment in statistical translation that applies a first-order hidden Markov model to the word alignment problem, in the way such models are used successfully in speech recognition for the time alignment problem.
Proceedings Article
Automatic Identification of Word Translations from Unrelated English and German Corpora
TL;DR: The current study, based on the assumption that there is a correlation between the patterns of word co-occurrences in corpora of different languages, achieves a significant improvement, with about 72% of word translations identified correctly.
Journal Article
Translating collocations for bilingual lexicons: a statistical approach
TL;DR: A program named Champollion is described which, given a pair of parallel corpora in two different languages and a list of collocations in one of them, automatically produces their translations, to provide a tool for compiling bilingual lexical information above the word level in multiple languages, for different domains.
References
Journal Article
Maximum likelihood from incomplete data via the EM algorithm
Journal Article
The mathematics of statistical machine translation: parameter estimation
TL;DR: The authors describe a series of five statistical models of the translation process and give algorithms for estimating the parameters of these models given a set of pairs of sentences that are translations of one another.
Journal Article
A statistical approach to machine translation
Peter Fitzhugh Brown, John Cocke, Stephen A. Della Pietra, Vincent J. Della Pietra, F. Jelinek, John Lafferty, Robert Leroy Mercer, Paul S. Roossin
TL;DR: The application of the statistical approach to translation from French to English is described, and preliminary results are given.