
Showing papers on "Rule-based machine translation published in 2016"


Posted Content
TL;DR: GNMT, Google's Neural Machine Translation system, is presented, which attempts to address many of the weaknesses of conventional phrase-based translation systems and provides a good balance between the flexibility of "character"-delimited models and the efficiency of "word"-delimited models.
Abstract: Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation, with the potential to overcome many of the weaknesses of conventional phrase-based translation systems. Unfortunately, NMT systems are known to be computationally expensive both in training and in translation inference. Also, most NMT systems have difficulty with rare words. These issues have hindered NMT's use in practical deployments and services, where both accuracy and speed are essential. In this work, we present GNMT, Google's Neural Machine Translation system, which attempts to address many of these issues. Our model consists of a deep LSTM network with 8 encoder and 8 decoder layers using attention and residual connections. To improve parallelism and therefore decrease training time, our attention mechanism connects the bottom layer of the decoder to the top layer of the encoder. To accelerate the final translation speed, we employ low-precision arithmetic during inference computations. To improve handling of rare words, we divide words into a limited set of common sub-word units ("wordpieces") for both input and output. This method provides a good balance between the flexibility of "character"-delimited models and the efficiency of "word"-delimited models, naturally handles translation of rare words, and ultimately improves the overall accuracy of the system. Our beam search technique employs a length-normalization procedure and uses a coverage penalty, which encourages generation of an output sentence that is most likely to cover all the words in the source sentence. On the WMT'14 English-to-French and English-to-German benchmarks, GNMT achieves results competitive with the state of the art. Using a human side-by-side evaluation on a set of isolated simple sentences, it reduces translation errors by an average of 60% compared to Google's phrase-based production system.
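To make the beam-search rescoring concrete: the paper gives explicit formulas for the length normalization lp(Y) = (5 + |Y|)^α / 6^α and the coverage penalty cp(X;Y) = β · Σ_i log(min(Σ_j p_ij, 1.0)). The minimal Python sketch below reproduces them; the α and β defaults here are illustrative, not the paper's tuned values.

```python
import math

def gnmt_score(log_prob, target_len, attn, alpha=0.6, beta=0.2):
    """Rescore a beam hypothesis with GNMT-style length
    normalization and coverage penalty.

    log_prob  : total log P(Y|X) of the hypothesis
    target_len: |Y|, length of the candidate translation
    attn      : attn[i][j] = attention paid to source word i when
                emitting target word j (assumed strictly positive)
    """
    # Length normalization: longer outputs are not unfairly
    # penalized by their summed log-probabilities.
    lp = ((5.0 + target_len) ** alpha) / (6.0 ** alpha)
    # Coverage penalty: reward hypotheses whose attention has
    # touched every source word with total mass close to 1.
    cp = beta * sum(math.log(min(sum(row), 1.0)) for row in attn)
    return log_prob / lp + cp
```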

5,737 citations


Proceedings Article
05 Dec 2016
TL;DR: Experiments show that dual-NMT works very well on English↔French translation; in particular, by learning from monolingual data, it achieves accuracy comparable to NMT trained on the full bilingual data for the French-to-English translation task.
Abstract: While neural machine translation (NMT) has made good progress over the past two years, tens of millions of bilingual sentence pairs are needed for its training. However, human labeling is very costly. To tackle this training data bottleneck, we develop a dual-learning mechanism, which can enable an NMT system to automatically learn from unlabeled data through a dual-learning game. This mechanism is inspired by the following observation: any machine translation task has a dual task, e.g., English-to-French translation (primal) versus French-to-English translation (dual); the primal and dual tasks can form a closed loop, and generate informative feedback signals to train the translation models, even without the involvement of a human labeler. In the dual-learning mechanism, we use one agent to represent the model for the primal task and the other agent to represent the model for the dual task, then ask them to teach each other through a reinforcement learning process. Based on the feedback signals generated during this process (e.g., the language-model likelihood of the output of a model, and the reconstruction error of the original sentence after the primal and dual translations), we can iteratively update the two models until convergence (e.g., using policy gradient methods). We call the corresponding approach to neural machine translation dual-NMT. Experiments show that dual-NMT works very well on English↔French translation; in particular, by learning from monolingual data (with 10% bilingual data for warm start), it achieves accuracy comparable to NMT trained on the full bilingual data for the French-to-English translation task.
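The closed-loop feedback signal can be sketched in a few lines. In the Python fragment below, translate_ab, score_ba and lm_b are hypothetical callables standing in for the primal model, the dual model's reconstruction scorer, and the target-side language model; the reward they produce is what the policy-gradient updates would use.

```python
def dual_learning_reward(s, translate_ab, score_ba, lm_b, alpha=0.5):
    """One feedback round of the dual-learning game (sketch).

    s           : a monolingual sentence in language A
    translate_ab: primal model A->B, returns a candidate translation
    score_ba    : dual model B->A, returns log P(s | mid), i.e. how
                  well the original sentence is reconstructed
    lm_b        : language model of B, returns a log-likelihood
    """
    mid = translate_ab(s)          # primal step: A -> B
    r_lm = lm_b(mid)               # is the intermediate output fluent B?
    r_rec = score_ba(mid, s)       # can the dual model recover s?
    # The total reward mixes fluency and reconstruction; both models
    # are then updated by policy gradient using this signal.
    return alpha * r_lm + (1.0 - alpha) * r_rec
```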

559 citations


Posted Content
TL;DR: This paper presents the first attempts at building a multilingual Neural Machine Translation framework under a unified approach, in which the information shared among languages can be helpful in the translation of individual language pairs, and points out a novel way to make use of monolingual data with Neural Machine Translation.
Abstract: In this paper, we present our first attempts at building a multilingual Neural Machine Translation framework under a unified approach. We are then able to employ attention-based NMT for many-to-many multilingual translation tasks. Our approach does not require any special treatment of the network architecture and allows us to learn a minimal number of free parameters in a standard way of training. Our approach has shown its effectiveness in an under-resourced translation scenario with considerable improvements of up to 2.6 BLEU points. In addition, the approach has achieved interesting and promising results when applied to translation tasks where there is no direct parallel corpus between the source and target languages.

314 citations


Proceedings ArticleDOI
01 Nov 2016
TL;DR: Two approaches are proposed to make full use of the source-side monolingual data in NMT: using a self-learning algorithm to generate synthetic large-scale parallel data for NMT training, and a multi-task learning framework using two NMTs to predict the translation and the reordered source-side monolingual sentences simultaneously.
Abstract: Neural Machine Translation (NMT) based on the encoder-decoder architecture has recently become a new paradigm. Researchers have proven that the target-side monolingual data can greatly enhance the decoder model of NMT. However, the source-side monolingual data is not fully explored, although it should be useful for strengthening the encoder model of NMT, especially when the parallel corpus is far from sufficient. In this paper, we propose two approaches to make full use of the source-side monolingual data in NMT. The first approach employs a self-learning algorithm to generate synthetic large-scale parallel data for NMT training. The second approach applies a multi-task learning framework, using two NMTs to predict the translation and the reordered source-side monolingual sentences simultaneously. Extensive experiments demonstrate that the proposed methods obtain significant improvements over a strong attention-based NMT baseline.
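The first of the two approaches (self-learning) amounts to a short data-construction loop. Here is a hedged Python sketch; baseline_translate stands in for an NMT system trained on the available bitext, and the mixing policy is illustrative.

```python
def build_synthetic_corpus(mono_src, baseline_translate, real_pairs):
    """Self-learning sketch: translate source-side monolingual
    sentences with an existing model and mix the synthetic pairs
    with the real bitext for retraining the encoder.

    mono_src          : iterable of source-language sentences
    baseline_translate: assumed callable, an NMT system trained on
                        the existing parallel corpus
    real_pairs        : list of (src, tgt) human-translated pairs
    """
    synthetic = [(s, baseline_translate(s)) for s in mono_src]
    # Train on the union; in practice the synthetic portion may be
    # sub-sampled or down-weighted relative to the real bitext.
    return real_pairs + synthetic
```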

304 citations


Proceedings ArticleDOI
12 Aug 2016
TL;DR: This paper generalizes the embedding layer of the encoder in the attentional encoder-decoder architecture to support the inclusion of arbitrary features, in addition to the baseline word feature, and finds that linguistic input features improve model quality according to three metrics: perplexity, BLEU and CHRF3.
Abstract: Neural machine translation has recently achieved impressive results, while using little in the way of external linguistic information. In this paper we show that the strong learning capability of neural MT models does not make linguistic features redundant; they can be easily incorporated to provide further improvements in performance. We generalize the embedding layer of the encoder in the attentional encoder-decoder architecture to support the inclusion of arbitrary features, in addition to the baseline word feature. We add morphological features, part-of-speech tags, and syntactic dependency labels as input features to English↔German and English→Romanian neural machine translation systems. In experiments on WMT16 training and test sets, we find that linguistic input features improve model quality according to three metrics: perplexity, BLEU and CHRF3. An open-source implementation of our neural MT system is available, as are sample files and configurations.
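The generalized embedding layer is simple to express: one lookup table per feature, concatenated. The PyTorch sketch below is a plausible reading of that design; the vocabulary sizes and per-feature dimensions are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class FeatureEmbedding(nn.Module):
    """Encoder embedding layer generalized to arbitrary input
    features (word, POS tag, dependency label, morphology): each
    feature gets its own table and the vectors are concatenated."""

    def __init__(self, sizes=(50000, 50, 46, 200),
                 dims=(480, 12, 10, 10)):
        super().__init__()
        self.tables = nn.ModuleList(
            [nn.Embedding(v, d) for v, d in zip(sizes, dims)])

    def forward(self, features):
        # features: LongTensor of shape [batch, seq_len, n_features]
        parts = [table(features[..., i])
                 for i, table in enumerate(self.tables)]
        return torch.cat(parts, dim=-1)   # [batch, seq, sum(dims)]
```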

301 citations


Book ChapterDOI
08 Oct 2016
TL;DR: An end-to-end trainable recurrent and convolutional network model that jointly learns to process visual and linguistic information is proposed; it can produce quality segmentation output from a natural language expression and outperforms baseline methods by a large margin.
Abstract: In this paper we approach the novel problem of segmenting an image based on a natural language expression. This is different from traditional semantic segmentation over a predefined set of semantic classes, as, e.g., the phrase “two men sitting on the right bench” requires segmenting only the two people on the right bench and no one standing or sitting on another bench. Previous approaches suitable for this task were limited to a fixed set of categories and/or rectangular regions. To produce pixelwise segmentation for the language expression, we propose an end-to-end trainable recurrent and convolutional network model that jointly learns to process visual and linguistic information. In our model, a recurrent neural network is used to encode the referential expression into a vector representation, and a fully convolutional network is used to extract a spatial feature map from the image and output a spatial response map for the target object. We demonstrate on a benchmark dataset that our model can produce quality segmentation output from the natural language expression, and outperforms baseline methods by a large margin.
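The joint visual-linguistic step can be pictured as tiling the expression vector over the image feature map and classifying each location. The PyTorch sketch below is one plausible form of that fusion, not the paper's exact architecture; `classifier` is assumed to be a 1x1 convolution over the concatenated channels.

```python
import torch

def fuse_language_and_image(lang_vec, feat_map, classifier):
    """Sketch: tile the LSTM-encoded expression over the FCN
    feature map, concatenate along channels, and score every
    spatial location to get a response map for the target object.

    lang_vec  : [batch, d_lang] sentence embedding from the RNN
    feat_map  : [batch, d_img, H, W] spatial features from the FCN
    classifier: assumed module, e.g. nn.Conv2d(d_img + d_lang, 1, 1)
    """
    b, d, h, w = feat_map.shape
    tiled = lang_vec[:, :, None, None].expand(b, lang_vec.size(1), h, w)
    fused = torch.cat([feat_map, tiled], dim=1)
    return classifier(fused)          # [batch, 1, H, W], pre-sigmoid
```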

276 citations


Proceedings ArticleDOI
01 Aug 2016
TL;DR: This work develops hybrid models that process the text using both convolutional and recurrent neural networks, combining the merits of both structures in extracting linguistic information, to address passage answer selection.
Abstract: Passage-level question answer matching is a challenging task since it requires effective representations that capture the complex semantic relations between questions and answers. In this work, we propose a series of deep learning models to address passage answer selection. To match passage answers to questions while accommodating their complex semantic relations, unlike most previous work that utilizes a single deep learning structure, we develop hybrid models that process the text using both convolutional and recurrent neural networks, combining the merits of both structures in extracting linguistic information. Additionally, we develop a simple but effective attention mechanism for constructing better answer representations according to the input question, which is imperative for modeling long answer sequences. The results on two public benchmark datasets, InsuranceQA and TREC-QA, show that our proposed models outperform a variety of strong baselines.
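The "simple but effective attention mechanism" can be sketched as question-conditioned pooling over the answer's token states. The fragment below uses a generic bilinear form chosen for illustration; the paper's exact parameterization may differ.

```python
import torch
import torch.nn.functional as F

def attentive_answer_pooling(q_vec, a_hidden, W):
    """Weight each answer token's hidden state by its relevance to
    the question before pooling, so long answers keep the parts
    that matter for the question.

    q_vec   : [batch, d] pooled question representation
    a_hidden: [batch, len, d] per-token answer states (CNN/RNN output)
    W       : [d, d] learned bilinear parameter (assumption)
    """
    scores = torch.einsum('bld,de,be->bl', a_hidden, W, q_vec)
    weights = F.softmax(scores, dim=1)                    # [batch, len]
    return (weights.unsqueeze(-1) * a_hidden).sum(dim=1)  # [batch, d]
```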

266 citations


Proceedings ArticleDOI
12 Aug 2016
TL;DR: This paper introduces and summarises the findings of a new shared task at the intersection of Natural Language Processing and Computer Vision: the generation of image descriptions in a target language, given an image and/or one or more descriptions in a different (source) language.
Abstract: This paper introduces and summarises the findings of a new shared task at the intersection of Natural Language Processing and Computer Vision: the generation of image descriptions in a target language, given an image and/or one or more descriptions in a different (source) language. This challenge was organised along with the Conference on Machine Translation (WMT16), and called for system submissions for two task variants: (i) a translation task, in which a source language image description needs to be translated to a target language, (optionally) with additional cues from the corresponding image, and (ii) a description generation task, in which a target language description needs to be generated for an image, (optionally) with additional cues from source language descriptions of the same image. In this first edition of the shared task, 16 systems were submitted for the translation task and seven for the image description task, from a total of 10 teams.

263 citations


Proceedings Article
10 Dec 2016
TL;DR: This paper proposes a first attempt to build an end-to-end speech-to-text translation system, which does not use source language text during learning or decoding; relaxing the need for source language transcription would drastically change the data collection methodology in speech translation.
Abstract: This paper proposes a first attempt to build an end-to-end speech-to-text translation system, which does not use source language text during learning or decoding. Relaxing the need for source language transcription would drastically change the data collection methodology in speech translation, especially in under-resourced scenarios.

256 citations


Journal ArticleDOI
TL;DR: This study provides a connection between two different models based on linguistic 2-tuples, proves the equivalence of the linguistic computational models for handling ULTSs, and proposes a novel CW methodology in which hesitant fuzzy linguistic term sets (HFLTSs) can be constructed based on ULTSs using a numerical scale.

208 citations


Posted Content
TL;DR: The proposed multi-way, multilingual neural machine translation approach enables a single neural translation model to translate between multiple languages, with a number of parameters that grows only linearly with the number of languages.
Abstract: We propose multi-way, multilingual neural machine translation. The proposed approach enables a single neural translation model to translate between multiple languages, with a number of parameters that grows only linearly with the number of languages. This is made possible by having a single attention mechanism that is shared across all language pairs. We train the proposed multi-way, multilingual model on ten language pairs from WMT'15 simultaneously and observe clear performance improvements over models trained on only one language pair. In particular, we observe that the proposed model significantly improves the translation quality of low-resource language pairs.

Posted Content
TL;DR: A novel finetuning algorithm for the recently introduced multi-way, multilingual neural machine translation model that enables zero-resource machine translation and performs better than a pivot-based translation strategy while keeping only one additional copy of attention-related parameters.
Abstract: In this paper, we propose a novel finetuning algorithm for the recently introduced multi-way, multilingual neural machine translation model that enables zero-resource machine translation. When used together with novel many-to-one translation strategies, we empirically show that this finetuning algorithm allows the multi-way, multilingual model to translate a zero-resource language pair (1) as well as a single-pair neural translation model trained with up to 1M direct parallel sentences of the same language pair and (2) better than a pivot-based translation strategy, while keeping only one additional copy of attention-related parameters.

Posted Content
TL;DR: The authors introduce recurrent neural network grammars, probabilistic models of sentences with explicit phrase structure, which allow application to both parsing and language modeling, and demonstrate that they provide better parsing in English than any single previously published supervised generative model and better language modeling than state-of-the-art sequential RNNs in English and Chinese.
Abstract: We introduce recurrent neural network grammars, probabilistic models of sentences with explicit phrase structure. We explain efficient inference procedures that allow application to both parsing and language modeling. Experiments show that they provide better parsing in English than any single previously published supervised generative model and better language modeling than state-of-the-art sequential RNNs in English and Chinese.

Proceedings Article
Wei He1, Zhongjun He1, Hua Wu1, Haifeng Wang1
12 Feb 2016
TL;DR: The proposed method significantly improves the translation quality of the state-of-the-art NMT system on Chinese-to-English translation tasks and incorporates statistical machine translation (SMT) features, such as a translation model and an n-gram language model, with the NMT model under the log-linear framework.
Abstract: Neural machine translation (NMT) conducts end-to-end translation with a source language encoder and a target language decoder, achieving promising translation performance. However, as a newly emerged approach, the method has some limitations. An NMT system usually has to restrict its vocabulary to a certain size to avoid time-consuming training and decoding, which causes a serious out-of-vocabulary problem. Furthermore, the decoder lacks a mechanism to guarantee that all the source words are translated, and it usually favors short translations, resulting in fluent but inadequate output. In order to solve the above problems, we incorporate statistical machine translation (SMT) features, such as a translation model and an n-gram language model, with the NMT model under the log-linear framework. Our experiments show that the proposed method significantly improves the translation quality of the state-of-the-art NMT system on Chinese-to-English translation tasks. Our method produces a gain of up to 2.33 BLEU points on NIST open test sets.
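The log-linear framework itself is just a weighted sum of feature scores, with the NMT log-probability treated as one feature alongside the SMT features. A minimal sketch, with illustrative feature names and weights:

```python
def log_linear_score(features, weights):
    """Score one candidate translation as a weighted sum of
    feature values, as in the log-linear framework the paper
    uses to combine NMT with SMT features."""
    return sum(weights[name] * value for name, value in features.items())

# Hypothetical feature values for one candidate translation.
candidate = {
    "nmt_log_prob": -12.3,   # from the NMT decoder
    "tm_log_prob": -8.1,     # SMT translation model
    "lm_log_prob": -20.4,    # n-gram language model
    "word_penalty": 9.0,     # length feature
}
weights = {"nmt_log_prob": 1.0, "tm_log_prob": 0.4,
           "lm_log_prob": 0.3, "word_penalty": -0.1}
print(log_linear_score(candidate, weights))
```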

Posted Content
TL;DR: A novel decoding algorithm is introduced that allows an existing neural machine translation model to begin translating before a full source sentence is received; it differs from previous work on simultaneous translation in that segmentation and translation are done jointly to maximize the translation quality.
Abstract: We investigate the potential of attention-based neural machine translation in simultaneous translation. We introduce a novel decoding algorithm, called simultaneous greedy decoding, that allows an existing neural machine translation model to begin translating before a full source sentence is received. This approach differs from previous work on simultaneous translation in that segmentation and translation are done jointly to maximize the translation quality, and that translating each segment is strongly conditioned on all the previous segments. This paper presents a first step toward building a full simultaneous translation system based on neural machine translation.
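The decoding loop alternates between reading source words and greedily committing target words. The sketch below captures that control flow; `step` (propose the next target word with a confidence flag, given the current source prefix and output so far) is a hypothetical stand-in for the paper's model and wait/commit criterion.

```python
def simultaneous_greedy_decode(source_stream, step, eos="</s>", max_len=200):
    """Simultaneous greedy decoding (sketch): interleave reading
    source words with emitting target words instead of waiting
    for the full sentence."""
    src, out = [], []
    for word in source_stream:
        src.append(word)
        # Commit target words greedily while the model is confident
        # that more source context would not change the next word.
        while len(out) < max_len:
            token, confident = step(src, out)
            if not confident or token == eos:
                break
            out.append(token)
    # Source exhausted: finish the translation unconditionally.
    while len(out) < max_len:
        token, _ = step(src, out)
        if token == eos:
            break
        out.append(token)
    return out
```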

Proceedings ArticleDOI
25 Aug 2016
TL;DR: The AUTOGRAM prototype automatically produced readable and structurally accurate grammars for inputs like URLs, spreadsheets or configuration files; the resulting grammars not only allow simple reverse engineering of input formats, but can also directly serve as input for test generators.
Abstract: Knowing which part of a program processes which parts of an input can reveal the structure of the input as well as the structure of the program. In a URL http://www.example.com/path/, for instance, the protocol http, the host www.example.com, and the path path would be handled by different functions and stored in different variables. Given a set of sample inputs, we use dynamic tainting to trace the data flow of each input character, and aggregate those input fragments that would be handled by the same function into lexical and syntactical entities. The result is a context-free grammar that reflects valid input structure. In its evaluation, our AUTOGRAM prototype automatically produced readable and structurally accurate grammars for inputs like URLs, spreadsheets or configuration files. The resulting grammars not only allow simple reverse engineering of input formats, but can also directly serve as input for test generators.
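The aggregation step, mapping tainted character runs to the functions that handled them, can be sketched directly. In the Python fragment below, the character-to-function taint map and the parser function names are hypothetical stand-ins for what the dynamic-tainting phase would produce.

```python
from collections import defaultdict

def aggregate_fragments(char_to_function, input_text):
    """Group maximal runs of consecutive input characters handled
    by the same function into candidate grammar entities, given a
    per-character taint map from the tainting phase (assumed)."""
    fragments = defaultdict(list)
    start = 0
    for i in range(1, len(input_text) + 1):
        if (i == len(input_text)
                or char_to_function[i] != char_to_function[start]):
            fragments[char_to_function[start]].append(input_text[start:i])
            start = i
    return fragments  # function name -> list of lexical fragments

# Example with the paper's URL: protocol/host/path handled by
# different (hypothetical) parser functions.
url = "http://www.example.com/path/"
taint = (["parse_protocol"] * 7 + ["parse_host"] * 15
         + ["parse_path"] * 6)
print(dict(aggregate_fragments(taint, url)))
```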

01 Jan 2016
Book: Grammatical Framework: Programming with Multilingual Grammars (no abstract available).

Posted Content
TL;DR: Aspects of translation speed are investigated, introducing AmuNMT, the authors' efficient neural machine translation decoder, and it is demonstrated that current neural machine translation could already be used for in-production systems when comparing words-per-second ratios.
Abstract: In this paper we provide the largest published comparison of translation quality for phrase-based SMT and neural machine translation across 30 translation directions. For ten directions we also include hierarchical phrase-based MT. Experiments are performed for the recently published United Nations Parallel Corpus v1.0 and its large six-way sentence-aligned subcorpus. In the second part of the paper we investigate aspects of translation speed, introducing AmuNMT, our efficient neural machine translation decoder. We demonstrate that current neural machine translation could already be used for in-production systems when comparing words-per-second ratios.

Journal ArticleDOI
01 May 2016
TL;DR: A multi-criteria group decision making (MCGDM) technique based on the fuzzy VIKOR method is developed to solve a CNC machine tool selection problem and a general MCGDM framework is proposed.
Abstract: Highlights: (i) Two algorithms for the VIKOR method based on the fuzzy linguistic approach are developed; (ii) a general MCGDM framework for a machine tool selection problem is presented; (iii) the proposed framework is verified by an example. Computer numerical control (CNC) machines are used for repetitive, difficult and unsafe manufacturing tasks that require a high degree of accuracy. However, when selecting an appropriate CNC machine, multiple criteria need to be considered by multiple decision makers. In this study, a multi-criteria group decision making (MCGDM) technique based on the fuzzy VIKOR method is developed to solve a CNC machine tool selection problem. Linguistic variables represented by triangular fuzzy numbers are used to reflect decision maker preferences for the criteria importance weights and the performance ratings. After the individual preferences are aggregated, or after the separation values are computed, they are then defuzzified. In this paper, two algorithms based on a fuzzy linguistic approach are developed. Based on these two algorithms and the VIKOR method, a general MCGDM framework is proposed. A CNC machine tool selection example illustrates the application of the proposed approach. A comparative study of the two algorithms using the above case study information highlighted the need to combine the ranking results, as both algorithms have distinct characteristics.
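After the fuzzy linguistic ratings are defuzzified, the VIKOR core reduces to the standard S/R/Q computation. A minimal Python sketch with illustrative numbers (three CNC machines, three benefit criteria):

```python
def vikor(matrix, weights, v=0.5):
    """Crisp VIKOR core (sketch), applied after defuzzification as
    the paper describes. `matrix` holds alternatives x criteria
    benefit scores; `weights` are criteria importance weights; v
    trades group utility against individual regret."""
    m = len(matrix[0])
    best = [max(row[j] for row in matrix) for j in range(m)]
    worst = [min(row[j] for row in matrix) for j in range(m)]
    S, R = [], []
    for row in matrix:
        d = [weights[j] * (best[j] - row[j]) / (best[j] - worst[j])
             for j in range(m)]
        S.append(sum(d))   # group utility (weighted sum of gaps)
        R.append(max(d))   # individual regret (worst single gap)
    s_min, s_max = min(S), max(S)
    r_min, r_max = min(R), max(R)
    Q = [v * (S[i] - s_min) / (s_max - s_min)
         + (1 - v) * (R[i] - r_min) / (r_max - r_min)
         for i in range(len(matrix))]
    return S, R, Q  # rank alternatives by ascending Q

# Three machines rated on three criteria (illustrative numbers).
S, R, Q = vikor([[7, 5, 8], [6, 8, 6], [9, 6, 5]], [0.5, 0.3, 0.2])
print(Q)
```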

Posted Content
TL;DR: A multi-source machine translation model is built and trained to maximize the probability of a target English string given French and German sources, reporting increases of up to +4.8 BLEU on top of a very strong attention-based neural translation model.
Abstract: We build a multi-source machine translation model and train it to maximize the probability of a target English string given French and German sources. Using the neural encoder-decoder framework, we explore several combination methods and report increases of up to +4.8 BLEU on top of a very strong attention-based neural translation model.

Posted Content
TL;DR: This paper presents the first weakly supervised, end-to-end neural network model to induce such programs on a real-world dataset, and enhances the objective function of Neural Programmer, a neural network with built-in discrete operations, and applies it on WikiTableQuestions, a natural language question-answering dataset.
Abstract: Learning a natural language interface for database tables is a challenging task that involves deep language understanding and multi-step reasoning. The task is often approached by mapping natural language queries to logical forms or programs that provide the desired response when executed on the database. To our knowledge, this paper presents the first weakly supervised, end-to-end neural network model to induce such programs on a real-world dataset. We enhance the objective function of Neural Programmer, a neural network with built-in discrete operations, and apply it on WikiTableQuestions, a natural language question-answering dataset. The model is trained end-to-end with weak supervision of question-answer pairs, and does not require domain-specific grammars, rules, or annotations that are key elements in previous approaches to program induction. The main experimental result in this paper is that a single Neural Programmer model achieves 34.2% accuracy using only 10,000 examples with weak supervision. An ensemble of 15 models, with a trivial combination technique, achieves 37.7% accuracy, which is competitive to the current state-of-the-art accuracy of 37.1% obtained by a traditional natural language semantic parser.

Journal ArticleDOI
01 Dec 2016
TL;DR: This paper proposes the probabilistic linguistic vector-term sets (PLVTSs) to promote the application of multi-granular linguistic information and develops a novel algorithm to tackle multi-attribute group decision making (MAGDM) problems with multiple LESs.
Abstract: Highlights: (i) The concept of the probabilistic linguistic vector-term set (PLVTS) is proposed to consider the score of a linguistic term and its associated change rate simultaneously; (ii) a novel algorithm is developed to aid MAGDM with multiple linguistic evaluation scales, handling large-group decision making with linguistic terms from the patients' side; (iii) the practical guiding significance for the product provider (such as a hospital) is demonstrated. With the rapid information explosion and sharing, recommender systems (RS) play an auxiliary role in assisting Internet users to make decisions, especially on e-service platforms. Normally, the information in this process is related to opinions and preferences, which are usually expressed in a qualitative way, such as linguistic evaluation terms (LETs). However, the LETs may come from different sources such as experts, users, etc., which means the linguistic evaluation scales (LESs) used in this process may differ due to the sources' different backgrounds and levels of knowledge. The diversity and flexibility of these LESs determine the quality of information, and further affect the effectiveness of an RS. In this paper, we focus on improving the accuracy of the multi-granular linguistic recommender system by supporting customers in finding the most eligible items according to their own preferences. We first propose the probabilistic linguistic vector-term sets (PLVTSs) to promote the application of multi-granular linguistic information. Based on the PLVTSs, we then develop a novel algorithm to tackle multi-attribute group decision making (MAGDM) problems with multiple LESs. Furthermore, the effectiveness of the PLVTSs is validated by an illustration of a personalized hospital selection-recommender problem. Finally, we point out some possible research directions regarding the PLVTSs.

Journal ArticleDOI
TL;DR: It is argued that, beyond these random factors, linguistic differences, from sounds to grammars, may also reflect adaptations to different environments in which the languages are learned and used.

Proceedings ArticleDOI
01 Aug 2016
TL;DR: An algorithmic approach is developed that combines the strengths of both machine learning classification and machine translation, the latter of which learns from parallel data and is better at correcting complex mistakes.
Abstract: We focus on two leading state-of-the-art approaches to grammatical error correction – machine learning classification and machine translation. Based on the comparative study of the two learning frameworks and through error analysis of the output of the state-of-the-art systems, we identify key strengths and weaknesses of each of these approaches and demonstrate their complementarity. In particular, the machine translation method learns from parallel data without requiring further linguistic input and is better at correcting complex mistakes. The classification approach possesses other desirable characteristics, such as the ability to easily generalize beyond what was seen in training, the ability to train without human-annotated data, and the flexibility to adjust knowledge sources for individual error types. Based on this analysis, we develop an algorithmic approach that combines the strengths of both methods. We present several systems based on resources used in previous work with a relative improvement of over 20% (and 7.4 F score points) over the previous state-of-the-art.

Posted Content
TL;DR: This paper proposes to contextualize the word embedding vectors using a nonlinear bag-of-words representation of the source sentence, and to represent special tokens with typed symbols to facilitate translating words that are not well suited to translation via continuous vectors.
Abstract: We first observe a potential weakness of continuous vector representations of symbols in neural machine translation. That is, the continuous vector representation, or word embedding vector, of a symbol encodes multiple dimensions of similarity, equivalent to encoding more than one meaning of the word. This has the consequence that the encoder and decoder recurrent networks in neural machine translation need to spend a substantial amount of their capacity disambiguating source and target words based on the context defined by a source sentence. Based on this observation, in this paper we propose to contextualize the word embedding vectors using a nonlinear bag-of-words representation of the source sentence. Additionally, we propose to represent special tokens (such as numbers, proper nouns and acronyms) with typed symbols to facilitate translating those words that are not well suited to translation via continuous vectors. Experiments on En-Fr and En-De reveal that the proposed approaches of contextualization and symbolization significantly improve the translation quality of neural machine translation systems.
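One plausible reading of the contextualization idea: compute a nonlinear bag-of-words summary of the sentence once and use it to gate every word embedding. The PyTorch sketch below illustrates that reading; it is not the paper's exact parameterization.

```python
import torch
import torch.nn as nn

class ContextualizedEmbedding(nn.Module):
    """Sketch: a nonlinear bag-of-words summary of the whole source
    sentence modulates each word embedding, helping the encoder
    disambiguate multi-sense vectors early."""

    def __init__(self, vocab, dim):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.ctx = nn.Linear(dim, dim)

    def forward(self, tokens):                      # [batch, seq]
        e = self.emb(tokens)                        # [batch, seq, dim]
        bow = torch.tanh(self.ctx(e.mean(dim=1)))   # nonlinear BoW summary
        return e * torch.sigmoid(bow).unsqueeze(1)  # gate each word vector
```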

Proceedings ArticleDOI
08 Aug 2016
TL;DR: This paper builds and formulate a semantic space to connect the source and target languages, and applies it to the sequence-to-sequence framework to propose a Knowledge-Based Semantic Embedding (KBSE) method.
Abstract: In this paper, with the help of a knowledge base, we build and formulate a semantic space to connect the source and target languages, and apply it to the sequence-to-sequence framework to propose a Knowledge-Based Semantic Embedding (KBSE) method. In our KBSE method, the source sentence is first mapped into a knowledge-based semantic space, and the target sentence is generated using a recurrent neural network with the internal meaning preserved. Experiments are conducted on two translation tasks, e-commerce data and movie data, and the results show that our proposed method achieves outstanding performance compared with both traditional SMT methods and existing encoder-decoder models.

01 Jan 2016
TL;DR: The use of text simplification as a pre-processing step for statistical machine translation of grammatically complex under-resourced languages can improve grammaticality (fluency) of the translation output and reduce technical post-editing effort.
Abstract: This article explores the use of text simplification as a pre-processing step for statistical machine translation of grammatically complex under-resourced languages. Our experiments on English-to-Serbian translation show that this approach can improve grammaticality (fluency) of the translation output and reduce technical post-editing effort (number of post-edit operations). Furthermore, the use of more aggressive text simplification methods (which do not only simplify the given sentence but also discard irrelevant information thus producing syntactically very simple sentences) also improves meaning preservation (adequacy) of the translation output.

Posted Content
TL;DR: This paper builds a neural posterior approximator conditioned on both the source and the target sides, and equips it with a reparameterization technique to estimate the variational lower bound, showing that the proposed variational neural machine translation achieves significant improvements over vanilla neural machine translation baselines.
Abstract: Models of neural machine translation are often from a discriminative family of encoder-decoders that learn a conditional distribution of a target sentence given a source sentence. In this paper, we propose a variational model to learn this conditional distribution for neural machine translation: a variational encoder-decoder model that can be trained end-to-end. Different from the vanilla encoder-decoder model that generates target translations from hidden representations of source sentences alone, the variational model introduces a continuous latent variable to explicitly model the underlying semantics of source sentences and to guide the generation of target translations. In order to perform efficient posterior inference and large-scale training, we build a neural posterior approximator conditioned on both the source and the target sides, and equip it with a reparameterization technique to estimate the variational lower bound. Experiments on both Chinese-English and English-German translation tasks show that the proposed variational neural machine translation achieves significant improvements over the vanilla neural machine translation baselines.
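The training objective is the usual variational lower bound with a reparameterized latent variable. The sketch below writes out the ELBO for a diagonal Gaussian posterior q(z|x,y) and prior p(z|x); the networks producing the means and log-variances, and the decoder term `log_p_y`, are assumed given.

```python
import torch

def vnmt_elbo(log_p_y, mu_q, logvar_q, mu_p, logvar_p):
    """Variational lower bound (sketch):
        ELBO = E_q(z|x,y)[log p(y|x,z)] - KL(q(z|x,y) || p(z|x))
    with diagonal Gaussians q and p; `log_p_y` is the decoder's
    reconstruction term under a reparameterized sample of z."""
    # Closed-form KL between two diagonal Gaussians.
    kl = 0.5 * torch.sum(
        logvar_p - logvar_q
        + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp()
        - 1.0)
    return log_p_y - kl

def reparameterize(mu, logvar):
    # Sampling trick that keeps gradients flowing to mu/logvar.
    eps = torch.randn_like(mu)
    return mu + (0.5 * logvar).exp() * eps
```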

Posted Content
TL;DR: The authors generalize the embedding layer of the encoder in the attentional encoder-decoder architecture to support the inclusion of arbitrary features, in addition to the baseline word feature, and add morphological features, part-of-speech tags, and syntactic dependency labels as input features to English↔German and English→Romanian NMT systems.
Abstract: Neural machine translation has recently achieved impressive results, while using little in the way of external linguistic information. In this paper we show that the strong learning capability of neural MT models does not make linguistic features redundant; they can be easily incorporated to provide further improvements in performance. We generalize the embedding layer of the encoder in the attentional encoder-decoder architecture to support the inclusion of arbitrary features, in addition to the baseline word feature. We add morphological features, part-of-speech tags, and syntactic dependency labels as input features to English↔German and English→Romanian neural machine translation systems. In experiments on WMT16 training and test sets, we find that linguistic input features improve model quality according to three metrics: perplexity, BLEU and CHRF3. An open-source implementation of our neural MT system is available, as are sample files and configurations.

Posted Content
TL;DR: It is shown that decoding time on CPUs can be reduced by up to 90% and training time by 25% on the WMT15 English-German and WMT16 English-Romanian tasks, with the same or only a negligible change in accuracy.
Abstract: Classical translation models constrain the space of possible outputs by selecting a subset of translation rules based on the input sentence. Recent work on improving the efficiency of neural translation models adopted a similar strategy by restricting the output vocabulary to a subset of likely candidates given the source. In this paper we experiment with context- and embedding-based selection methods and extend previous work by examining speed and accuracy trade-offs in more detail. We show that decoding time on CPUs can be reduced by up to 90% and training time by 25% on the WMT15 English-German and WMT16 English-Romanian tasks, with the same or only a negligible change in accuracy. This brings the time to decode with a state-of-the-art neural translation system to just over 140 msec per sentence on a single CPU core for English-German.
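The selection step itself is cheap: build a per-sentence shortlist of target words and restrict the softmax to it. The sketch below uses a dictionary-based selector for illustration (the paper also examines embedding-based variants); `translation_dict` and `top_frequent` are assumed precomputed resources.

```python
def select_vocabulary(source_tokens, translation_dict,
                      top_frequent, k_per_word=10):
    """Sketch of vocabulary selection: restrict the output softmax
    to target words that are likely given the source, plus a fixed
    shortlist of frequent words.

    translation_dict: source word -> ranked candidate target words
                      (e.g. from word alignments; assumed given)
    top_frequent    : fixed shortlist of frequent target words
    """
    candidates = set(top_frequent)
    for w in source_tokens:
        candidates.update(translation_dict.get(w, [])[:k_per_word])
    # The decoder then computes its softmax only over `candidates`,
    # which is the source of the reported CPU decoding speed-ups.
    return candidates
```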