
Showing papers by "Taro Watanabe published in 2012"


Proceedings Article
12 Jul 2012
TL;DR: A novel method for lexicon extraction that extracts translation pairs from comparable corpora using graph-based label propagation, achieving improved performance by clustering synonyms into the same translation.
Abstract: This paper proposes a novel method for lexicon extraction that extracts translation pairs from comparable corpora by using graph-based label propagation. In previous work, it was established that performance drastically decreases when the coverage of a seed lexicon is small. We resolve this problem by utilizing indirect relations with the bilingual seeds together with direct relations, in which each word is represented by a distribution of translated seeds. The seed distributions are propagated over a graph representing relations among words, and translation pairs are extracted by identifying word pairs with a high similarity in the seed distributions. We propose two types of graphs: a co-occurrence graph, representing co-occurrence relations between words, and a similarity graph, representing context similarities between words. Evaluations using English and Japanese patent comparable corpora show that our proposed graph propagation method outperforms conventional methods. Further, the similarity graph achieved improved performance by clustering synonyms into the same translation.
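The propagation step described above can be sketched in a few lines. The function names, the mixing parameter `alpha`, and the cosine threshold below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def propagate_seed_distributions(adj, seed_dists, alpha=0.5, iters=20):
    """Propagate seed-translation distributions over a word graph.

    adj        : (n, n) nonnegative edge weights between words
    seed_dists : (n, k) initial distributions over k seed translations
                 (zero rows for words with no seed information)
    alpha      : mixing weight between neighbour mass and the initial labels
    """
    # Row-normalize the adjacency matrix into a transition matrix.
    row_sums = adj.sum(axis=1, keepdims=True)
    trans = np.divide(adj, row_sums, out=np.zeros_like(adj), where=row_sums > 0)
    dists = seed_dists.copy()
    for _ in range(iters):
        dists = alpha * trans @ dists + (1 - alpha) * seed_dists
    return dists

def extract_pairs(src_dists, tgt_dists, threshold=0.9):
    """Pair source/target words whose propagated seed distributions have
    cosine similarity above a threshold."""
    pairs = []
    for i, s in enumerate(src_dists):
        for j, t in enumerate(tgt_dists):
            denom = np.linalg.norm(s) * np.linalg.norm(t)
            if denom > 0 and s @ t / denom >= threshold:
                pairs.append((i, j))
    return pairs
```

A word with no seed entry (zero row) still acquires a distribution through its graph neighbours, which is how the sketch captures the paper's use of indirect relations.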

73 citations


Proceedings Article
12 Jul 2012
TL;DR: This paper proposes a method for learning a discriminative parser for machine translation reordering using only aligned parallel text by treating the parser's derivation tree as a latent variable in a model that is trained to maximize reordering accuracy.
Abstract: This paper proposes a method for learning a discriminative parser for machine translation reordering using only aligned parallel text. This is done by treating the parser's derivation tree as a latent variable in a model that is trained to maximize reordering accuracy. We demonstrate that efficient large-margin training is possible by showing that two measures of reordering accuracy can be factored over the parse tree. Using this model in the pre-ordering framework results in significant gains in translation accuracy over standard phrase-based SMT and previously proposed unsupervised syntax induction methods.
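Why the factoring claim makes large-margin training tractable: each source-word pair is ordered by exactly one node of a binary bracketing (the pair's lowest common ancestor), so a pairwise ordering measure decomposes into per-node contributions. A minimal sketch of that decomposition (the tree encoding and names are ours, not the paper's):

```python
from itertools import product

def concordant_pairs(order):
    """Directly count source-word pairs (i < j) that keep the same relative
    order on the target side; order[i] is word i's target position."""
    n = len(order)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if order[i] < order[j])

def tree_score(tree, order):
    """Score a binary bracketing by summing, at each node, the concordant
    pairs that cross between its two children.  A tree is a nested tuple:
    leaf = word index, internal node = (left_subtree, right_subtree)."""
    def walk(t):
        if isinstance(t, int):
            return [t], 0
        left, right = t
        ls, lscore = walk(left)
        rs, rscore = walk(right)
        cross = sum(1 for i, j in product(ls, rs) if order[i] < order[j])
        return ls + rs, lscore + rscore + cross
    _, score = walk(tree)
    return score
```

Because every pair crosses exactly one node, `tree_score` agrees with the direct pairwise count for any binary bracketing, which is the property that lets the loss be computed inside dynamic programming over parses.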

73 citations


Proceedings Article
08 Jul 2012
TL;DR: This paper demonstrates that accurate machine translation is possible without the concept of "words," treating MT as a problem of transformation between character strings, and proposes a look-ahead parsing algorithm and substring-informed prior probabilities to achieve more effective and efficient alignment.
Abstract: In this paper, we demonstrate that accurate machine translation is possible without the concept of "words," treating MT as a problem of transformation between character strings. We achieve this result by applying phrasal inversion transduction grammar alignment techniques to character strings to train a character-based translation model, and using this in the phrase-based MT framework. We also propose a look-ahead parsing algorithm and substring-informed prior probabilities to achieve more effective and efficient alignment. In an evaluation, we demonstrate that character-based translation can achieve results that compare to word-based systems while effectively translating unknown and uncommon words over several language pairs.

44 citations


Proceedings Article
12 Jul 2012
TL;DR: This paper proposes a novel local training method that significantly outperforms MERT, with maximal improvements of up to 2.0 BLEU points, while its efficiency is comparable to that of the global method.
Abstract: In statistical machine translation, minimum error rate training (MERT) is a standard method for tuning a single weight vector on given development data. However, due to the diversity and uneven distribution of source sentences, this method suffers from two problems. First, its performance is highly dependent on the choice of the development set, which may lead to unstable test performance. Second, translations become inconsistent at the sentence level, since tuning is performed globally at the document level. In this paper, we propose a novel local training method to address these two problems. Unlike a global training method such as MERT, in which a single weight vector is learned and used for all input sentences, we perform training and testing in one step by learning a sentence-wise weight for each input sentence. We propose efficient incremental training methods to put local training into practice. In NIST Chinese-to-English translation tasks, our local training method significantly outperforms MERT, with maximal improvements of up to 2.0 BLEU points, while its efficiency is comparable to that of the global method.
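As a rough illustration of a sentence-wise weight, tuning can be localized by borrowing weights from similar development sentences. This nearest-neighbour sketch is our simplification of the idea, not the paper's incremental algorithm; the overlap measure and `k` are assumptions:

```python
def local_weight(test_sent, dev_pool, k=3):
    """Return a weight vector localized to one test sentence.

    test_sent : list of tokens
    dev_pool  : list of (sentence_tokens, weight_vector) pairs, where each
                weight vector was tuned with that development sentence in focus
    """
    def overlap(a, b):
        # Jaccard word overlap as a crude similarity between sentences.
        a, b = set(a), set(b)
        return len(a & b) / max(1, len(a | b))

    # Average the weights of the k most similar development sentences.
    nearest = sorted(dev_pool, key=lambda p: -overlap(test_sent, p[0]))[:k]
    dim = len(nearest[0][1])
    return [sum(w[i] for _, w in nearest) / len(nearest) for i in range(dim)]
```

The point of the sketch is only that different test sentences end up decoded with different weight vectors, in contrast to MERT's single global vector.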

29 citations


Proceedings Article
03 Jun 2012
TL;DR: This work proposes a variant of SGD with a larger batch size, in which the parameter update in each iteration is further optimized by a passive-aggressive algorithm, and reports significantly better translation results.
Abstract: We present an online learning algorithm for statistical machine translation (SMT) based on stochastic gradient descent (SGD). Under the online setting of rank learning, a corpus-wise loss has to be approximated by a batch local loss when optimizing for evaluation measures that cannot be linearly decomposed into a sentence-wise loss, such as BLEU. We propose a variant of SGD with a larger batch size in which the parameter update in each iteration is further optimized by a passive-aggressive algorithm. Learning is efficiently parallelized and line search is performed in each round when merging parameters across parallel jobs. Experiments on the NIST Chinese-to-English Open MT task indicate significantly better translation results.
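A passive-aggressive refinement of a parameter update can be sketched as a standard PA-I step; the function signature and the aggressiveness cap `C` below are assumptions for illustration, not the paper's exact update:

```python
import numpy as np

def pa_update(w, feats_oracle, feats_pred, loss, C=1.0):
    """PA-I update: move w just enough that the oracle translation outscores
    the current prediction by a margin equal to the observed loss, with the
    step size capped at C."""
    delta = feats_oracle - feats_pred
    margin = w @ delta                 # current score difference
    hinge = max(0.0, loss - margin)    # margin violation
    sq = delta @ delta
    if sq == 0 or hinge == 0:
        return w                       # constraint satisfied: stay passive
    tau = min(C, hinge / sq)           # aggressive but bounded step
    return w + tau * delta
```

After one update the margin constraint holds, so re-applying the same example is a no-op; that conservatism is what makes the correction to each SGD batch step stable.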

24 citations


Proceedings Article
08 Jul 2012
TL;DR: This paper presents a novel top-down head-driven parsing algorithm for data-driven projective dependency analysis that handles global structures, such as clause and coordination, better than shift-reduce or other bottom-up algorithms.
Abstract: This paper presents a novel top-down head-driven parsing algorithm for data-driven projective dependency analysis. This algorithm handles global structures, such as clause and coordination, better than shift-reduce or other bottom-up algorithms. Experiments on the English Penn Treebank data and the Chinese CoNLL-06 data show that the proposed algorithm achieves comparable results with other data-driven dependency parsing algorithms.

7 citations


Journal Article (DOI)
TL;DR: This work presents a method to directly learn this phrase table from a parallel corpus of sentences that are not aligned at the word level, through the use of non-parametric Bayesian methods and inversion transduction grammars.
Abstract: The phrase table, a scored list of bilingual phrases, lies at the center of phrase-based machine translation systems. We present a method to directly learn this phrase table from a parallel corpus of sentences that are not aligned at the word level. The key contribution of this work is that while previous methods have generally only modeled phrases at one level of granularity, in the proposed method phrases of many granularities are included directly in the model. This allows for the direct learning of a phrase table that achieves competitive accuracy without the complicated multi-step process of word alignment and phrase extraction that is used in previous research. The model is achieved through the use of non-parametric Bayesian methods and inversion transduction grammars (ITGs), a variety of synchronous context-free grammars (SCFGs). Experiments on several language pairs demonstrate that the proposed model matches the accuracy of the more traditional two-step word alignment/phrase extraction approach while reducing its phrase table to a fraction of its original size.
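The two ITG concatenation rules at the heart of such a model, straight (target order preserved) and inverted (target order swapped), can be sketched directly on phrase pairs; the helper name is ours:

```python
def itg_compose(pair_a, pair_b, inverted=False):
    """Combine two (source, target) phrase pairs under an ITG rule.

    Straight concatenation keeps the target sides in the same order as the
    source sides; inverted concatenation swaps them, which is how ITGs model
    reordering between languages.
    """
    (src_a, tgt_a), (src_b, tgt_b) = pair_a, pair_b
    if inverted:
        return (src_a + src_b, tgt_b + tgt_a)
    return (src_a + src_b, tgt_a + tgt_b)
```

Because composed pairs are themselves phrase pairs, the same rule applies at every granularity, which is the property the multi-granularity model exploits.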

4 citations


Proceedings Article
01 Dec 2012
TL;DR: This work proposes an alternative tuning method based on an ultraconservative update, in which a combination of an expected task loss and the distance from the parameters of the previous round is minimized with a variant of gradient descent.
Abstract: Minimum error rate training (MERT) is a popular method for parameter tuning in statistical machine translation (SMT). However, its optimization objective function may change drastically at each optimization step, which may induce instability. We propose an alternative tuning method based on an ultraconservative update, in which a combination of an expected task loss and the distance from the parameters of the previous round is minimized with a variant of gradient descent. Experiments on test datasets of both Chinese-to-English and Spanish-to-English translation show that our method achieves improvements over MERT under the Moses system.
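The ultraconservative objective, an expected loss plus a squared distance to the previous round's parameters, can be sketched with plain gradient descent. The regularization weight `lam`, step size, and function names are illustrative assumptions:

```python
import numpy as np

def ultraconservative_step(w_prev, grad_expected_loss, lam=1.0, lr=0.1, steps=100):
    """One tuning round: minimize  L(w) + lam * ||w - w_prev||^2  by gradient
    descent, where grad_expected_loss(w) returns the gradient of the expected
    task loss L at w.  The quadratic term keeps the new weights close to the
    previous round's, damping the step-to-step swings seen with MERT."""
    w = w_prev.copy()
    for _ in range(steps):
        g = grad_expected_loss(w) + 2 * lam * (w - w_prev)
        w -= lr * g
    return w
```

With a quadratic loss L(w) = ||w - t||^2 and lam = 1 the minimizer is the midpoint (t + w_prev) / 2, which makes the "ultraconservative" compromise between fitting the loss and staying near the old parameters easy to see.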

2 citations


Journal Article (DOI)
TL;DR: This paper proposes to reorder the arguments, but not the predicates, of Japanese sentences using a dependency structure, and applies a discriminative re-ranking approach with Ranking Support Vector Machines to re-score the multiple reordered phrase translations.
Abstract: While phrase-based statistical machine translation systems prefer to translate with longer phrases, this may cause errors in a free word order language such as Japanese, in which the order of the arguments of a predicate is not solely determined by the predicate, and the arguments can be placed quite freely in the text. In this paper, we propose to reorder the arguments, but not the predicates, of Japanese sentences using a dependency structure. Instead of a single deterministically given permutation, we generate multiple reordered phrases for each sentence and translate them independently. We then apply a discriminative re-ranking method based on Ranking Support Vector Machines (SVMs) to re-score the multiple reordered phrase translations. In our experiments on the travel domain corpus BTEC, we gain a 1.22% BLEU score improvement when only the 1-best is used for re-ranking, and a 4.12% BLEU score improvement when the n-best is used, for Japanese-English translation.
Key words: predicate-argument structure, reordering, paraphrasing, re-ranking, statistical machine translation