scispace - formally typeset
Search or ask a question
Author

Katsuhito Sudoh

Bio: Katsuhito Sudoh is an academic researcher from Nara Institute of Science and Technology. The author has contributed to research in topics: Machine translation & Example-based machine translation. The author has an hindex of 19, co-authored 99 publications receiving 1386 citations. Previous affiliations of Katsuhito Sudoh include Nippon Telegraph and Telephone & Kyoto University.


Papers
More filters
Proceedings Article
09 Oct 2010
TL;DR: An automatic evaluation metric based on rank correlation coefficients modified with precision is proposed and meta-evaluation of the NTCIR-7 PATMT JE task data shows that this metric outperforms conventional metrics.
Abstract: Automatic evaluation of Machine Translation (MT) quality is essential to developing high-quality MT systems. Various evaluation metrics have been proposed, and BLEU is now used as the de facto standard metric. However, when we consider translation between distant language pairs such as Japanese and English, most popular metrics (e.g., BLEU, NIST, PER, and TER) do not work well. It is well known that Japanese and English have completely different word orders, and special care must be paid to word order in translation. Otherwise, translations with wrong word order often lead to misunderstanding and incomprehensibility. For instance, SMT-based Japanese-to-English translators tend to translate 'A because B' as 'B because A.' Thus, word order is the most important problem for distant language translation. However, conventional evaluation metrics do not significantly penalize such word order mistakes. Therefore, locally optimizing these metrics leads to inadequate translations. In this paper, we propose an automatic evaluation metric based on rank correlation coefficients modified with precision. Our meta-evaluation of the NTCIR-7 PATMT JE task data shows that this metric outperforms conventional metrics.

335 citations

Proceedings Article
01 Aug 2013
TL;DR: It is found that neural language models are indeed viable tools for data selection: while the improvements are varied, they are fast to train on small in-domain data and can sometimes substantially outperform conventional n-grams.
Abstract: Data selection is an effective approach to domain adaptation in statistical machine translation. The idea is to use language models trained on small in-domain text to select similar sentences from large general-domain corpora, which are then incorporated into the training data. Substantial gains have been demonstrated in previous works, which employ standard ngram language models. Here, we explore the use of neural language models for data selection. We hypothesize that the continuous vector representation of words in neural language models makes them more effective than n-grams for modeling unknown word contexts, which are prevalent in general-domain text. In a comprehensive evaluation of 4 language pairs (English to German, French, Russian, Spanish), we found that neural language models are indeed viable tools for data selection: while the improvements are varied (i.e. 0.1 to 1.7 gains in BLEU), they are fast to train on small in-domain data and can sometimes substantially outperform conventional n-grams.

129 citations

Proceedings Article
15 Jul 2010
TL;DR: This paper proposes an alternative single reordering rule: Head Finalization, a syntax-based preprocessing approach that offers the advantage of simplicity and shows that its result, Head Final English (HFE), follows almost the same order as Japanese.
Abstract: English is a typical SVO (Subject-Verb-Object) language, while Japanese is a typical SOV language. Conventional Statistical Machine Translation (SMT) systems work well within each of these language families. However, SMT-based translation from an SVO language to an SOV language does not work well because their word orders are completely different. Recently, a few groups have proposed rule-based preprocessing methods to mitigate this problem (Xu et al., 2009; Hong et al., 2009). These methods rewrite SVO sentences to derive more SOV-like sentences by using a set of handcrafted rules. In this paper, we propose an alternative single reordering rule: Head Finalization. This is a syntax-based preprocessing approach that offers the advantage of simplicity. We do not have to be concerned about part-of-speech tags or rule weights because the powerful Enju parser allows us to implement the rule at a general level. Our experiments show that its result, Head Final English (HFE), follows almost the same order as Japanese. We also show that this rule improves automatic evaluation scores.

95 citations

Journal ArticleDOI
TL;DR: Experimental results show that incorporating discourse features significantly improves the confidence scoring of intention recognition results, and conventional methods may be insufficient since the intention recognition result is a result of discourse processing.

47 citations

Proceedings Article
01 Jan 2018
TL;DR: The results of the shared tasks from the 4th workshop on Asian translation (WAT2017) including J↔E, J→C scientific paper translation subtasks, C→J, K→J patent translation subtask, H→E mixed domain subtasks and J→E newswire subtasks are presented.
Abstract: This paper presents the results of the shared tasks from the 4th workshop on Asian translation (WAT2017) including J↔E, J↔C scientific paper translation subtasks, C↔J, K↔J, E↔J patent translation subtasks, H↔E mixed domain subtasks, J↔E newswire subtasks and J↔E recipe subtasks. For the WAT2017, 12 institutions participated in the shared tasks. About 300 translation results have been submitted to the automatic evaluation server, and selected submissions were manually evaluated.

47 citations


Cited by
More filters
Posted Content
TL;DR: fairseq as discussed by the authors is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks, and supports distributed training across multiple GPUs and machines.
Abstract: fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks. The toolkit is based on PyTorch and supports distributed training across multiple GPUs and machines. We also support fast mixed-precision training and inference on modern GPUs. A demo video can be found at this https URL

1,650 citations

Proceedings ArticleDOI
01 Apr 2019
TL;DR: Fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks and supports distributed training across multiple GPUs and machines.
Abstract: fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks. The toolkit is based on PyTorch and supports distributed training across multiple GPUs and machines. We also support fast mixed-precision training and inference on modern GPUs. A demo video can be found at https://www.youtube.com/watch?v=OtgDdWtHvto

1,535 citations

Posted Content
TL;DR: This work proposes BERTScore, an automatic evaluation metric for text generation that correlates better with human judgments and provides stronger model selection performance than existing metrics.
Abstract: We propose BERTScore, an automatic evaluation metric for text generation. Analogously to common metrics, BERTScore computes a similarity score for each token in the candidate sentence with each token in the reference sentence. However, instead of exact matches, we compute token similarity using contextual embeddings. We evaluate using the outputs of 363 machine translation and image captioning systems. BERTScore correlates better with human judgments and provides stronger model selection performance than existing metrics. Finally, we use an adversarial paraphrase detection task to show that BERTScore is more robust to challenging examples when compared to existing metrics.

1,456 citations

Proceedings Article
30 Apr 2020
TL;DR: This article proposed BERTScore, an automatic evaluation metric for text generation, which computes a similarity score for each token in the candidate sentence with each token from the reference sentence. But instead of exact matches, they compute token similarity using contextual embeddings.
Abstract: We propose BERTScore, an automatic evaluation metric for text generation. Analogously to common metrics, BERTScore computes a similarity score for each token in the candidate sentence with each token in the reference sentence. However, instead of exact matches, we compute token similarity using contextual embeddings. We evaluate using the outputs of 363 machine translation and image captioning systems. BERTScore correlates better with human judgments and provides stronger model selection performance than existing metrics. Finally, we use an adversarial paraphrase detection task and show that BERTScore is more robust to challenging examples compared to existing metrics.

819 citations

Journal ArticleDOI
TL;DR: This tutorial surveys neural network models from the perspective of natural language processing research, in an attempt to bring natural-language researchers up to speed with the neural techniques.
Abstract: Over the past few years, neural networks have re-emerged as powerful machine-learning models, yielding state-of-the-art results in fields such as image recognition and speech processing. More recently, neural network models started to be applied also to textual natural language signals, again with very promising results. This tutorial surveys neural network models from the perspective of natural language processing research, in an attempt to bring natural-language researchers up to speed with the neural techniques. The tutorial covers input encoding for natural language tasks, feed-forward networks, convolutional networks, recurrent networks and recursive networks, as well as the computation graph abstraction for automatic gradient computation.

760 citations