Showing papers by "Taro Watanabe" published in 2006


Proceedings Article
17 Jul 2006
TL;DR: A hierarchical phrase-based statistical machine translation model in which the target sentence is efficiently generated in left-to-right order, enabling straightforward integration with n-gram language models.
Abstract: We present a hierarchical phrase-based statistical machine translation model in which a target sentence is efficiently generated in left-to-right order. The model is a class of synchronous-CFG with a Greibach Normal Form-like structure for the projected production rule: the paired target side of a production rule takes a phrase-prefixed form. The decoder for the target-normalized form is based on an Earley-style top-down parser on the source side. The target-normalized form, coupled with our top-down parser, implies a left-to-right generation of translations, which enables straightforward integration with n-gram language models. Our model was evaluated on a Japanese-to-English newswire translation task and showed statistically significant performance improvements over a phrase-based translation system.

62 citations
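
The phrase-prefixed target side described in the abstract above can be illustrated with a small check. This is a minimal sketch, not the authors' implementation: the `NonTerminal` class, the `is_phrase_prefixed` function, and the list-of-symbols rule representation are all hypothetical, and requiring at least one terminal in the prefix is an assumption borrowed from ordinary Greibach Normal Form.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class NonTerminal:
    """Hypothetical target-side nonterminal, co-indexed with the source side."""
    index: int


# A target-side symbol is either a terminal word (str) or a NonTerminal.
Symbol = Union[str, NonTerminal]


def is_phrase_prefixed(target: List[Symbol]) -> bool:
    """Return True if the target side is a phrase followed only by nonterminals.

    Assumption: the leading phrase must contain at least one terminal.
    """
    i = 0
    # Consume the leading run of terminals (the phrase prefix).
    while i < len(target) and isinstance(target[i], str):
        i += 1
    has_phrase = i > 0
    # Everything after the prefix must be a nonterminal.
    rest_is_nonterminals = all(isinstance(s, NonTerminal) for s in target[i:])
    return has_phrase and rest_is_nonterminals
```

With a target side in this shape, the words of the phrase prefix can be emitted before any nonterminal is expanded, which is what allows an n-gram language model to score the output as it grows from left to right.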


01 Jan 2006
TL;DR: Experiments showed that the hierarchical phrase-based model outperformed a conventional phrase-based model and that the reranking algorithm further boosted the performance.
Abstract: We present the NTT translation system that was evaluated in the evaluation campaign of the “International Workshop on Spoken Language Translation (IWSLT).” The system consists of two primary components: a hierarchical phrase-based statistical machine translation system and a reranking system. The former is conceptualized as a synchronous-CFG in which phrases are hierarchically combined using nonterminals. The latter uses a modified voted perceptron approach with a large number of features. Experiments showed that our hierarchical phrase-based model outperformed a conventional phrase-based model. In addition, our reranking algorithm further boosted the performance.

20 citations


Proceedings Article
08 Jun 2006
TL;DR: Presents two translation systems evaluated on the shared task of the "Workshop on Statistical Machine Translation," a phrase-based model and a hierarchical phrase-based model, and reports a phrase/rule extraction technique that differentiates the tokenization of corpora.
Abstract: We present two translation systems evaluated on the shared task of the "Workshop on Statistical Machine Translation": a phrase-based model and a hierarchical phrase-based model. The former uses a phrasal unit for translation, whereas the latter is conceptualized as a synchronous-CFG in which phrases are hierarchically combined using nonterminals. Experiments showed that the hierarchical phrase-based model performed comparably to the phrase-based model. We also report a phrase/rule extraction technique that differentiates the tokenization of corpora.

9 citations


Patent
23 Feb 2006
TL;DR: A speech recognition and machine translation system that translates speech in a first language into correct text in a second language more reliably, using a rescoring module that scores each translation candidate by combining features obtained from the ASR and SMT modules.
Abstract: PROBLEM TO BE SOLVED: To provide a machine translation system that translates a speech in a first language into a correct text in a second language with more reliability. SOLUTION: A speech recognition and machine translation apparatus 20 includes: an automatic speech recognition (ASR) module 80 for outputting N-best hypotheses; a statistical machine translation (SMT) module 84 for deriving K translation candidates from each of the N-best hypotheses; and a rescoring module 56 for assigning a score to each of the translation candidates by combining features obtained in the ASR module 80 and SMT module 84. COPYRIGHT: (C)2006,JPO&NCIPI

3 citations
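
The rescoring step in the patent abstract above can be sketched as follows. The abstract says only that the score is obtained "by combining features" from the ASR and SMT modules, so the weighted linear combination used here is an assumption, and the function name `rescore`, the feature names, and the weights are hypothetical.

```python
from typing import Dict, List, Tuple

# A candidate is a translation string plus its named feature values,
# gathered from both the ASR hypothesis and the SMT derivation.
Candidate = Tuple[str, Dict[str, float]]


def rescore(candidates: List[Candidate], weights: Dict[str, float]) -> str:
    """Return the translation with the highest weighted feature sum.

    Assumption: features are combined linearly; features without a
    weight contribute nothing. Raises ValueError on an empty list.
    """
    def score(features: Dict[str, float]) -> float:
        return sum(weights.get(name, 0.0) * value
                   for name, value in features.items())

    best_translation, _ = max(candidates, key=lambda c: score(c[1]))
    return best_translation


# Usage: candidates pooled over all N-best ASR hypotheses and their
# K translations each (illustrative values only).
pool = [
    ("translation a", {"asr_logprob": -1.0, "smt_logprob": -5.0}),
    ("translation b", {"asr_logprob": -2.0, "smt_logprob": -1.0}),
]
best = rescore(pool, {"asr_logprob": 1.0, "smt_logprob": 1.0})
```

Pooling candidates across all ASR hypotheses before scoring lets a translation of a lower-ranked recognition hypothesis win when its combined features are stronger.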