
Showing papers on "Phrase" published in 2016


Posted Content
TL;DR: GNMT, Google's Neural Machine Translation system, is presented, which attempts to address many of the weaknesses of conventional phrase-based translation systems and provides a good balance between the flexibility of "character"-delimited models and the efficiency of "word"-delimited models.
Abstract: Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation, with the potential to overcome many of the weaknesses of conventional phrase-based translation systems. Unfortunately, NMT systems are known to be computationally expensive both in training and in translation inference. Also, most NMT systems have difficulty with rare words. These issues have hindered NMT's use in practical deployments and services, where both accuracy and speed are essential. In this work, we present GNMT, Google's Neural Machine Translation system, which attempts to address many of these issues. Our model consists of a deep LSTM network with 8 encoder and 8 decoder layers using attention and residual connections. To improve parallelism and therefore decrease training time, our attention mechanism connects the bottom layer of the decoder to the top layer of the encoder. To accelerate the final translation speed, we employ low-precision arithmetic during inference computations. To improve handling of rare words, we divide words into a limited set of common sub-word units ("wordpieces") for both input and output. This method provides a good balance between the flexibility of "character"-delimited models and the efficiency of "word"-delimited models, naturally handles translation of rare words, and ultimately improves the overall accuracy of the system. Our beam search technique employs a length-normalization procedure and uses a coverage penalty, which encourages generation of an output sentence that is most likely to cover all the words in the source sentence. On the WMT'14 English-to-French and English-to-German benchmarks, GNMT achieves competitive results to state-of-the-art. Using a human side-by-side evaluation on a set of isolated simple sentences, it reduces translation errors by an average of 60% compared to Google's phrase-based production system.

5,737 citations
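The beam-search details in this abstract are concrete enough to sketch. Below is a minimal Python rendering of the length-normalized, coverage-penalized hypothesis score described above, following the formulation in the GNMT paper; alpha and beta are tunable hyperparameters, and the attention matrix is assumed to come from the decoder.

```python
import math

def length_penalty(target_len, alpha=0.6):
    # lp(Y) = (5 + |Y|)^alpha / (5 + 1)^alpha, as in the GNMT paper.
    return ((5.0 + target_len) ** alpha) / (6.0 ** alpha)

def coverage_penalty(attention, beta=0.2):
    # attention[i][j]: attention weight from target position j to source word i.
    # Penalizes hypotheses that leave source words under-covered.
    total = 0.0
    for source_word_attn in attention:
        total += math.log(max(min(sum(source_word_attn), 1.0), 1e-12))
    return beta * total

def hypothesis_score(log_prob, target_len, attention, alpha=0.6, beta=0.2):
    # score(Y, X) = log P(Y|X) / lp(Y) + cp(X; Y); higher is better.
    return log_prob / length_penalty(target_len, alpha) + coverage_penalty(attention, beta)
```

Dividing by the length penalty keeps beam search from favoring short outputs, while the coverage term discourages hypotheses that ignore parts of the source sentence.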


Proceedings ArticleDOI
10 Feb 2016
TL;DR: In this article, a systematic comparison of models that learn distributed representations of words from unlabeled data is presented, and it is shown that shallow log-linear models work best for building representation spaces that can be decoded with simple spatial distance metrics.
Abstract: Unsupervised methods for learning distributed representations of words are ubiquitous in today's NLP research, but far less is known about the best ways to learn distributed phrase or sentence representations from unlabelled data. This paper is a systematic comparison of models that learn such representations. We find that the optimal approach depends critically on the intended application. Deeper, more complex models are preferable for representations to be used in supervised systems, but shallow log-linear models work best for building representation spaces that can be decoded with simple spatial distance metrics. We also propose two new unsupervised representation-learning objectives designed to optimise the trade-off between training time, domain portability and performance.

531 citations
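The "simple spatial distance metrics" the TL;DR refers to are typically cosine similarity over the learned vectors. A minimal, hypothetical sketch of such decoding:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest_phrases(query_vec, phrase_vecs, k=5):
    # phrase_vecs: dict mapping phrase -> embedding produced by the
    # representation model; rank stored phrases by cosine similarity.
    ranked = sorted(phrase_vecs, key=lambda p: cosine(query_vec, phrase_vecs[p]),
                    reverse=True)
    return ranked[:k]
```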


Book ChapterDOI
08 Oct 2016
TL;DR: A novel approach which learns grounding by reconstructing a given phrase using an attention mechanism, which can be either latent or optimized directly, and demonstrates its effectiveness on the Flickr30k Entities and ReferItGame datasets.
Abstract: Grounding (i.e. localizing) arbitrary, free-form textual phrases in visual content is a challenging problem with many applications for human-computer interaction and image-text reference resolution. Few datasets provide the ground truth spatial localization of phrases, thus it is desirable to learn from data with no or little grounding supervision. We propose a novel approach which learns grounding by reconstructing a given phrase using an attention mechanism, which can be either latent or optimized directly. During training our approach encodes the phrase using a recurrent network language model and then learns to attend to the relevant image region in order to reconstruct the input phrase. At test time, the correct attention, i.e., the grounding, is evaluated. If grounding supervision is available it can be directly applied via a loss over the attention mechanism. We demonstrate the effectiveness of our approach on the Flickr30k Entities and ReferItGame datasets with different levels of supervision, ranging from no supervision over partial supervision to full supervision. Our supervised variant improves by a large margin over the state-of-the-art on both datasets.

441 citations
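The core mechanism (attend to a region, then reconstruct the phrase; the attention itself is the grounding) can be sketched as follows. This is a schematic with a hypothetical bilinear scoring matrix W, not the paper's exact architecture, which uses recurrent encoders and decoders:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def ground_phrase(phrase_vec, region_feats, W):
    # region_feats: (num_regions, d_img); phrase_vec: (d_txt,); W: (d_img, d_txt).
    # Score each candidate region against the encoded phrase; the attention
    # weights are the (latent) grounding, and the attended feature would be
    # fed to a decoder that reconstructs the input phrase during training.
    scores = region_feats @ (W @ phrase_vec)      # (num_regions,)
    attention = softmax(scores)
    attended = attention @ region_feats           # pooled visual feature
    predicted_region = int(np.argmax(attention))  # test-time grounding
    return attention, attended, predicted_region
```

With grounding supervision, a loss can be placed directly on `attention`, matching the paper's supervised variant.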


Book ChapterDOI
08 Oct 2016
TL;DR: An end-to-end trainable recurrent and convolutional network model that jointly learns to process visual and linguistic information is proposed; it can produce quality segmentation output from the natural language expression and outperforms baseline methods by a large margin.
Abstract: In this paper we approach the novel problem of segmenting an image based on a natural language expression. This is different from traditional semantic segmentation over a predefined set of semantic classes, as e.g., the phrase “two men sitting on the right bench” requires segmenting only the two people on the right bench and no one standing or sitting on another bench. Previous approaches suitable for this task were limited to a fixed set of categories and/or rectangular regions. To produce pixelwise segmentation for the language expression, we propose an end-to-end trainable recurrent and convolutional network model that jointly learns to process visual and linguistic information. In our model, a recurrent neural network is used to encode the referential expression into a vector representation, and a fully convolutional network is used to extract a spatial feature map from the image and output a spatial response map for the target object. We demonstrate on a benchmark dataset that our model can produce quality segmentation output from the natural language expression, and outperforms baseline methods by a large margin.

276 citations
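The fusion step described in the abstract (an RNN sentence vector combined with an FCN feature map to yield a spatial response map) can be sketched with NumPy. Tiling the sentence vector over space and applying a per-pixel linear classifier is one common reading of this design, offered here as an assumption rather than the paper's exact layer layout:

```python
import numpy as np

def response_map(expr_vec, spatial_feats, w, b):
    # spatial_feats: (H, W, D_img) FCN feature map; expr_vec: (D_txt,) from an RNN.
    # Tile the sentence vector over every spatial location, concatenate with the
    # visual features, and apply a per-pixel linear classifier (a 1x1 convolution).
    H, W, _ = spatial_feats.shape
    tiled = np.broadcast_to(expr_vec, (H, W, expr_vec.shape[0]))
    fused = np.concatenate([spatial_feats, tiled], axis=-1)  # (H, W, D_img + D_txt)
    logits = fused @ w + b                                   # (H, W)
    return 1.0 / (1.0 + np.exp(-logits))                     # per-pixel foreground prob
```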


Journal ArticleDOI
TL;DR: This system provides an extensible, chemistry-aware, natural language processing pipeline for tokenization, part-of-speech tagging, named entity recognition, and phrase parsing, and makes novel use of multiple rule-based grammars that are tailored for interpreting specific document domains such as textual paragraphs, captions, and tables.
Abstract: The emergence of “big data” initiatives has led to the need for tools that can automatically extract valuable chemical information from large volumes of unstructured data, such as the scientific literature. Since chemical information can be present in figures, tables, and textual paragraphs, successful information extraction often depends on the ability to interpret all of these domains simultaneously. We present a complete toolkit for the automated extraction of chemical entities and their associated properties, measurements, and relationships from scientific documents that can be used to populate structured chemical databases. Our system provides an extensible, chemistry-aware, natural language processing pipeline for tokenization, part-of-speech tagging, named entity recognition, and phrase parsing. Within this scope, we report improved performance for chemical named entity recognition through the use of unsupervised word clustering based on a massive corpus of chemistry articles. For phrase parsing an...

257 citations
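To make the pipeline stages concrete, here is a deliberately toy Python sketch of the tokenization and named-entity stages; the real system uses trained, chemistry-aware components, whereas this stand-in uses a regex tokenizer and a dictionary lookup:

```python
import re

def tokenize(sentence):
    # Chemistry-aware tokenizers must keep entities like "1,2-dichloroethane"
    # intact; this toy version keeps runs of letters, digits, commas,
    # hyphens, and parentheses together.
    return re.findall(r"[A-Za-z0-9,\-()]+|\S", sentence)

def tag_chemical_entities(tokens, lexicon):
    # Stand-in for the trained NER component: dictionary lookup only.
    return [(tok, "CHEM" if tok.lower() in lexicon else "O") for tok in tokens]

lexicon = {"ethanol", "benzene", "1,2-dichloroethane"}
tokens = tokenize("The mixture of ethanol and benzene boiled at 68 C.")
print(tag_chemical_entities(tokens, lexicon))
```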


Proceedings ArticleDOI
16 Aug 2016
TL;DR: This article performed a detailed analysis of neural vs. phrase-based SMT outputs, leveraging high quality post-edits performed by professional translators on the IWSLT data.
Abstract: Within the field of Statistical Machine Translation (SMT), the neural approach (NMT) has recently emerged as the first technology able to challenge the long-standing dominance of phrase-based approaches (PBMT). In particular, at the IWSLT 2015 evaluation campaign, NMT outperformed well established state-of-the-art PBMT systems on English-German, a language pair known to be particularly hard because of morphology and syntactic differences. To understand in what respects NMT provides better translation quality than PBMT, we perform a detailed analysis of neural vs. phrase-based SMT outputs, leveraging high quality post-edits performed by professional translators on the IWSLT data. For the first time, our analysis provides useful insights on what linguistic phenomena are best modeled by neural models - such as the reordering of verbs - while pointing out other aspects that remain to be improved.

220 citations


Posted Content
TL;DR: In this paper, a systematic comparison of models that learn distributed representations of words from unlabeled data is presented, and it is shown that shallow log-linear models work best for building representation spaces that can be decoded with simple spatial distance metrics.
Abstract: Unsupervised methods for learning distributed representations of words are ubiquitous in today's NLP research, but far less is known about the best ways to learn distributed phrase or sentence representations from unlabelled data. This paper is a systematic comparison of models that learn such representations. We find that the optimal approach depends critically on the intended application. Deeper, more complex models are preferable for representations to be used in supervised systems, but shallow log-linear models work best for building representation spaces that can be decoded with simple spatial distance metrics. We also propose two new unsupervised representation-learning objectives designed to optimise the trade-off between training time, domain portability and performance.

168 citations


14 Dec 2016
TL;DR: The authors provide the largest published comparison of translation quality for phrase-based SMT and NMT across 30 translation directions, also including hierarchical phrase-based MT for ten of those directions.
Abstract: In this paper we provide the largest published comparison of translation quality for phrase-based SMT and neural machine translation across 30 translation directions. For ten directions we also include hierarchical phrase-based MT. Experiments are performed for the recently published United Nations Parallel Corpus v1.0 and its large six-way sentence-aligned subcorpus. In the second part of the paper we investigate aspects of translation speed, introducing AmuNMT, our efficient neural machine translation decoder. We demonstrate that current neural machine translation could already be used for in-production systems when comparing words-per-second ratios.

142 citations


Proceedings ArticleDOI
01 Jun 2016
TL;DR: DSCNN hierarchically builds textual representations by processing pretrained word embeddings via Long Short-Term Memory networks and subsequently extracting features with convolution operators, and does not rely on parsers and expensive phrase labeling, and thus is not restricted to sentence-level tasks.
Abstract: The goal of sentence and document modeling is to accurately represent the meaning of sentences and documents for various Natural Language Processing tasks. In this work, we present Dependency Sensitive Convolutional Neural Networks (DSCNN) as a general-purpose classification system for both sentences and documents. DSCNN hierarchically builds textual representations by processing pretrained word embeddings via Long Short-Term Memory networks and subsequently extracting features with convolution operators. Compared with existing recursive neural models with tree structures, DSCNN does not rely on parsers and expensive phrase labeling, and thus is not restricted to sentence-level tasks. Moreover, unlike other CNN-based models that analyze sentences locally by sliding windows, our system captures both the dependency information within each sentence and relationships across sentences in the same document. Experiment results demonstrate that our approach achieves state-of-the-art performance on several tasks, including sentiment analysis, question type classification, and subjectivity classification.

126 citations


Patent
30 Sep 2016
TL;DR: In this article, a method is described for placing a first processor in a sleep operating mode and running a second processor that is operative to wake the first processor from the sleep operating mode in response to a speech command phrase.
Abstract: A method includes placing a first processor in a sleep operating mode and running a second processor that is operative to wake the first processor from the sleep operating mode in response to a speech command phrase. The method includes identifying, by the second processor, a speech command phrase segment and performing a control operation in response to detecting the segment in detected speech. The control operation is performed while the first processor is maintained in the sleep operating mode.

124 citations
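A schematic of the claimed control flow, in Python for readability: the wake phrase and command segments below are hypothetical examples, and a real implementation runs the listener on a low-power DSP rather than in application code.

```python
import enum

class PowerState(enum.Enum):
    SLEEP = 0
    AWAKE = 1

class WakeController:
    def __init__(self, segment_handlers, wake_phrase="ok wake up"):
        # segment_handlers maps command-phrase segments (hypothetical examples)
        # to control operations the low-power processor performs itself.
        self.main_state = PowerState.SLEEP
        self.segment_handlers = segment_handlers
        self.wake_phrase = wake_phrase

    def on_detected_speech(self, text):
        # Runs on the second (always-on) processor.
        for segment, handler in self.segment_handlers.items():
            if segment in text:
                handler()   # control operation; first processor stays asleep
                return
        if self.wake_phrase in text:
            self.main_state = PowerState.AWAKE  # wake the first processor

controller = WakeController({"volume up": lambda: print("volume +")})
controller.on_detected_speech("please volume up a bit")  # no full wake needed
```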


Journal ArticleDOI
TL;DR: An internal linguistic bias for grouping words into phrases can modulate the interpretational impact of speech prosody; this bias should surface in delta-band oscillatory phase when the grouping patterns chosen by comprehenders differ from those indicated by prosody.
Abstract: Language comprehension requires that single words be grouped into syntactic phrases, as words in sentences are too many to memorize individually. In speech, acoustic and syntactic grouping patterns mostly align. However, when ambiguous sentences allow for alternative grouping patterns, comprehenders may form phrases that contradict speech prosody. While delta-band oscillations are known to track prosody, we hypothesized that linguistic grouping bias can modulate the interpretational impact of speech prosody in ambiguous situations, which should surface in delta-band oscillations when grouping patterns chosen by comprehenders differ from those indicated by prosody. In our auditory electroencephalography study, the interpretation of ambiguous sentences depended on whether an identical word was either followed by a prosodic boundary or not, thereby signaling the ending or continuation of the current phrase. Delta-band oscillatory phase at the critical word should reflect whether participants terminate a phrase despite a lack of acoustic boundary cues. Crossing speech prosody with participants' grouping choice, we observed a main effect of grouping choice-independent of prosody. An internal linguistic bias for grouping words into phrases can thus modulate the interpretational impact of speech prosody via delta-band oscillatory phase.

Proceedings ArticleDOI
01 Jun 2016
TL;DR: A shared task on automatically determining the sentiment intensity of words and phrases, including phrases formed by words with opposing polarities, with data drawn from general English, English Twitter, and Arabic Twitter.
Abstract: We present a shared task on automatically determining sentiment intensity of a word or a phrase. The words and phrases are taken from three domains: general English, English Twitter, and Arabic Twitter. The phrases include those composed of negators, modals, and degree adverbs as well as phrases formed by words with opposing polarities. For each of the three domains, we assembled the datasets that include multi-word phrases and their constituent words, both manually annotated for real-valued sentiment intensity scores. The three datasets were presented as the test sets for three separate tasks (each focusing on a specific domain). Five teams submitted nine system outputs for the three tasks. All datasets created for this shared task are freely available to the research community.

Posted Content
TL;DR: It is demonstrated that current neural machine translation could already be used for in-production systems when comparing words-per-second ratios, and aspects of translation speed are investigated, introducing AmuNMT, the authors' efficient neural machine translation decoder.
Abstract: In this paper we provide the largest published comparison of translation quality for phrase-based SMT and neural machine translation across 30 translation directions. For ten directions we also include hierarchical phrase-based MT. Experiments are performed for the recently published United Nations Parallel Corpus v1.0 and its large six-way sentence-aligned subcorpus. In the second part of the paper we investigate aspects of translation speed, introducing AmuNMT, our efficient neural machine translation decoder. We demonstrate that current neural machine translation could already be used for in-production systems when comparing words-per-second ratios.

Book ChapterDOI
Mingzhe Wang1, Mahmoud Azab1, Noriyuki Kojima1, Rada Mihalcea1, Jia Deng1 
08 Oct 2016
TL;DR: A structured matching of phrases and regions that encourages the semantic relations between phrases to agree with the visual relations between regions that is formulated as a discrete optimization problem and relaxed to a linear program.
Abstract: In this paper we introduce a new approach to phrase localization: grounding phrases in sentences to image regions. We propose a structured matching of phrases and regions that encourages the semantic relations between phrases to agree with the visual relations between regions. We formulate structured matching as a discrete optimization problem and relax it to a linear program. We use neural networks to embed regions and phrases into vectors, which then define the similarities (matching weights) between regions and phrases. We integrate structured matching with neural networks to enable end-to-end training. Experiments on Flickr30K Entities demonstrate the empirical effectiveness of our approach.
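The "relax it to a linear program" step can be illustrated with SciPy. The sketch below solves only the relaxed unary matching (each phrase matched to exactly one region, each region used at most once); the paper's additional terms that enforce agreement between phrase relations and region relations are omitted, and at least as many regions as phrases are assumed.

```python
import numpy as np
from scipy.optimize import linprog

def relaxed_phrase_region_matching(sim):
    # sim[i, j]: similarity (matching weight) between phrase i and region j,
    # e.g. from neural embeddings. Maximize sum_ij sim[i,j] * x[i,j] subject to
    # sum_j x[i,j] = 1 per phrase, sum_i x[i,j] <= 1 per region, 0 <= x <= 1.
    P, R = sim.shape
    c = -sim.flatten()                    # linprog minimizes, so negate
    A_eq = np.zeros((P, P * R))
    for i in range(P):
        A_eq[i, i * R:(i + 1) * R] = 1.0  # each phrase picks one region
    A_ub = np.zeros((R, P * R))
    for j in range(R):
        A_ub[j, j::R] = 1.0               # each region used at most once
    res = linprog(c, A_ub=A_ub, b_ub=np.ones(R), A_eq=A_eq, b_eq=np.ones(P),
                  bounds=(0.0, 1.0))
    return res.x.reshape(P, R)            # relaxed (fractional) assignment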

Posted Content
TL;DR: This article used a large collection of linguistic and visual cues, such as appearance, size, and position of entity bounding boxes, adjectives that contain attribute information, and spatial relationships between pairs of entities connected by verbs or prepositions.
Abstract: This paper presents a framework for localization or grounding of phrases in images using a large collection of linguistic and visual cues. We model the appearance, size, and position of entity bounding boxes, adjectives that contain attribute information, and spatial relationships between pairs of entities connected by verbs or prepositions. Special attention is given to relationships between people and clothing or body part mentions, as they are useful for distinguishing individuals. We automatically learn weights for combining these cues and at test time, perform joint inference over all phrases in a caption. The resulting system produces state of the art performance on phrase localization on the Flickr30k Entities dataset and visual relationship detection on the Stanford VRD dataset.

Proceedings ArticleDOI
20 May 2016
TL;DR: In this paper, a phrase-based SMT setup with task-specific parameter-tuning outperforms all previously published results for the CoNLL-2014 test set by a large margin (46.37% M^2 over previously 41.75%), while being trained on the same, publicly available data.
Abstract: In this work, we study parameter tuning towards the M^2 metric, the standard metric for automatic grammar error correction (GEC) tasks. After implementing M^2 as a scorer in the Moses tuning framework, we investigate interactions of dense and sparse features, different optimizers, and tuning strategies for the CoNLL-2014 shared task. We notice erratic behavior when optimizing sparse feature weights with M^2 and offer partial solutions. We find that a bare-bones phrase-based SMT setup with task-specific parameter-tuning outperforms all previously published results for the CoNLL-2014 test set by a large margin (46.37% M^2 over previously 41.75%, by an SMT system with neural features) while being trained on the same, publicly available data. Our newly introduced dense and sparse features widen that gap, and we improve the state-of-the-art to 49.49% M^2.

Proceedings ArticleDOI
Haitao Mi1, Zhiguo Wang1, Abe Ittycheriah1
01 Aug 2016
TL;DR: This paper introduces a sentence-level or batch-level vocabulary, which is only a very small sub-set of the full output vocabulary for each sentence or batch, which reduces both the computing time and the memory usage of neural machine translation models.
Abstract: In order to capture rich language phenomena, neural machine translation models have to use a large vocabulary size, which requires high computing time and large memory usage. In this paper, we alleviate this issue by introducing a sentence-level or batch-level vocabulary, which is only a very small subset of the full output vocabulary. For each sentence or batch, we only predict the target words in its sentence-level or batch-level vocabulary. Thus, we reduce both the computing time and the memory usage. Our method simply takes into account the translation options of each word or phrase in the source sentence, and picks a very small target vocabulary for each sentence based on a word-to-word translation model or a bilingual phrase library learned from a traditional machine translation model. Experimental results on the large-scale English-to-French task show that our method achieves better translation performance by 1 BLEU point over the large vocabulary neural machine translation system of Jean et al. (2015).
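The vocabulary-selection idea is simple to sketch: for each sentence, the decoder's softmax is restricted to the union of the translation options of the source words plus a small set of frequent target words. A minimal, hypothetical version:

```python
def sentence_vocabulary(source_tokens, word_translations, top_k=10, common_words=()):
    # word_translations: source word -> ranked list of target-word options,
    # from a word-to-word model or a bilingual phrase library.
    # The decoder then predicts only over this small per-sentence vocabulary.
    vocab = set(common_words)
    for tok in source_tokens:
        vocab.update(word_translations.get(tok, [])[:top_k])
    return vocab
```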

Book
22 Dec 2016
TL;DR: The authors provided a state-of-the-art survey of intonation and prosodic structure, showing how morpho-syntactic constituents are mapped to prosodic constituents according to well-formedness conditions.
Abstract: This book provides a state-of-the-art survey of intonation and prosodic structure. Taking a phonological perspective, it shows how morpho-syntactic constituents are mapped to prosodic constituents according to well-formedness conditions. Using a tone-sequence model of intonation, it explores individual tones and how they combine, and discusses how information structure affects intonation in several ways, showing tones and melodies to be 'meaningful' in that they add a pragmatic component to what is being said. The author also shows how, despite a superficial similarity, languages differ in how their tonal patterns arise from tone concatenation. Lexical tones, stress, phrase tones, and boundary tones are assigned differently in different languages, resulting in great variation in intonational grammar, both at the lexical and sentential level. The last chapter is dedicated to experimental studies of how we process prosody. The book will be of interest to advanced students and researchers in linguistics, and particularly in phonological theory.

Proceedings ArticleDOI
TL;DR: This paper investigated the use of hierarchical phrase-based SMT lattices in end-to-end neural machine translation (NMT), and showed that weight pushing transforms the Hiero scores for complete translation hypotheses, with the full translation grammar score and full n-gram language model score, into posteriors compatible with NMT predictive probabilities.
Abstract: We investigate the use of hierarchical phrase-based SMT lattices in end-to-end neural machine translation (NMT). Weight pushing transforms the Hiero scores for complete translation hypotheses, with the full translation grammar score and full n-gram language model score, into posteriors compatible with NMT predictive probabilities. With a slightly modified NMT beam-search decoder we find gains over both Hiero and NMT decoding alone, with practical advantages in extending NMT to very large input and output vocabularies.

Posted Content
TL;DR: This paper proposed phraseNet, a neural machine translator with a phrase memory that stores phrase pairs in symbolic form, mined from corpora or specified by human experts; for any given source sentence, phraseNet scans the phrase memory to determine candidate phrase pairs and integrates tagging information into the representation of the source sentence accordingly.
Abstract: In this paper, we propose phraseNet, a neural machine translator with a phrase memory which stores phrase pairs in symbolic form, mined from corpus or specified by human experts. For any given source sentence, phraseNet scans the phrase memory to determine the candidate phrase pairs and integrates tagging information in the representation of source sentence accordingly. The decoder utilizes a mixture of word-generating component and phrase-generating component, with a specifically designed strategy to generate a sequence of multiple words all at once. The phraseNet not only approaches one step towards incorporating external knowledge into neural machine translation, but also makes an effort to extend the word-by-word generation mechanism of recurrent neural network. Our empirical study on Chinese-to-English translation shows that, with carefully-chosen phrase table in memory, phraseNet yields 3.45 BLEU improvement over the generic neural machine translator.
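The "scan the phrase memory" step amounts to enumerating source n-grams and looking them up in the symbolic phrase table; a minimal sketch (the integration of tagging information into the encoder and the phrase-generating decoder component are not shown):

```python
def scan_phrase_memory(source_tokens, phrase_memory, max_len=4):
    # phrase_memory: dict mapping source-phrase tuples to candidate target
    # phrases. Returns (start, end, targets) triples: the candidate phrase
    # pairs with the source spans they tag.
    candidates = []
    n = len(source_tokens)
    for i in range(n):
        for j in range(i + 1, min(i + max_len, n) + 1):
            key = tuple(source_tokens[i:j])
            if key in phrase_memory:
                candidates.append((i, j, phrase_memory[key]))
    return candidates
```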

Proceedings ArticleDOI
15 Apr 2016
TL;DR: With a slightly modified NMT beam-search decoder, this work finds gains over both Hiero and NMT decoding alone, with practical advantages in extending NMT to very large input and output vocabularies.
Abstract: We investigate the use of hierarchical phrase-based SMT lattices in end-to-end neural machine translation (NMT). Weight pushing transforms the Hiero scores for complete translation hypotheses, with the full translation grammar score and full n-gram language model score, into posteriors compatible with NMT predictive probabilities. With a slightly modified NMT beam-search decoder we find gains over both Hiero and NMT decoding alone, with practical advantages in extending NMT to very large input and output vocabularies.

Proceedings ArticleDOI
07 Aug 2016
TL;DR: The authors explored methods of decode-time integration of attention-based neural translation models with phrase-based statistical machine translation and achieved state-of-the-art performance for English-Russian news translation.
Abstract: This paper describes the AMU-UEDIN submissions to the WMT 2016 shared task on news translation. We explore methods of decode-time integration of attention-based neural translation models with phrase-based statistical machine translation. Efficient batch-algorithms for GPU-querying are proposed and implemented. For English-Russian, our system stays behind the state-of-the-art pure neural models in terms of BLEU. Among restricted systems, manual evaluation places it in the first cluster tied with the pure neural model. For the Russian-English task, our submission achieves the top BLEU result, outperforming the best pure neural system by 1.1 BLEU points and our own phrase-based baseline by 1.6 BLEU. After manual evaluation, this system is the best restricted system in its own cluster. In follow-up experiments we improve results by additional 0.8 BLEU.

Proceedings ArticleDOI
01 Apr 2016
TL;DR: This work offers a new phrase-based method, NPFST, for enriching a unigram BOW, and compares it to both ngram and parsing methods in terms of yield, recall, and efficiency.
Abstract: Social scientists who do not have specialized natural language processing training often use a unigram bag-of-words (BOW) representation when analyzing text corpora. We offer a new phrase-based method, NPFST, for enriching a unigram BOW. NPFST uses a part-of-speech tagger and a finite state transducer to extract multiword phrases to be added to a unigram BOW. We compare NPFST to both ngram and parsing methods in terms of yield, recall, and efficiency. We then demonstrate how to use NPFST for exploratory analyses; it performs well, without configuration, on many different kinds of English text. Finally, we present a case study using NPFST to analyze a new corpus of U.S. congressional bills.
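A toy emulation of the pattern-matching core (NPFST itself is a finite state transducer over part-of-speech tags; the regex below is a simplified stand-in for one such pattern):

```python
import re

def extract_noun_phrases(tagged_tokens):
    # Match the pattern (ADJ|NOUN)* NOUN over a POS-tagged sentence and
    # return the matched multiword spans.
    # tagged_tokens: list of (word, coarse_tag) pairs.
    tags = "".join("A" if t == "ADJ" else "N" if t == "NOUN" else "x"
                   for _, t in tagged_tokens)
    phrases = []
    for m in re.finditer(r"[AN]*N", tags):
        if m.end() - m.start() > 1:  # keep multiword phrases only
            phrases.append(" ".join(w for w, _ in tagged_tokens[m.start():m.end()]))
    return phrases

print(extract_noun_phrases([("congressional", "ADJ"), ("budget", "NOUN"),
                            ("bill", "NOUN"), ("passed", "VERB")]))
# ['congressional budget bill']
```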

Proceedings ArticleDOI
25 Aug 2016
TL;DR: PUMA, an automated, phrase-based approach to extracting user opinions from app reviews, is proposed; in a study of two popular apps, it revealed severe problems reported in their user reviews.
Abstract: Mobile app reviews often contain useful user opinions like bug reports or suggestions. However, looking for those opinions manually in thousands of reviews is ineffective and time-consuming. In this paper, we propose PUMA, an automated, phrase-based approach to extract user opinions in app reviews. Our approach includes a technique to extract phrases in reviews using part-of-speech (PoS) templates; a technique to cluster phrases having similar meanings (each cluster is considered as a major user opinion); and a technique to monitor phrase clusters with negative sentiments for their outbreaks over time. We used PUMA to study two popular apps and found that it can reveal severe problems of those apps reported in their user reviews.

Proceedings Article
13 Dec 2016
TL;DR: This work proposes the use of imitation learning for structured prediction which learns an incremental model that handles the large search space by avoiding explicit enumeration of the outputs.
Abstract: Natural language generation (NLG) is the task of generating natural language from a meaning representation. Rule-based approaches require domain-specific and manually constructed linguistic resources, while most corpus based approaches rely on aligned training data and/or phrase templates. The latter are needed to restrict the search space for the structured prediction task defined by the unaligned datasets. In this work we propose the use of imitation learning for structured prediction which learns an incremental model that handles the large search space while avoiding explicitly enumerating it. We adapted the Locally Optimal Learning to Search (Chang et al., 2015) framework which allows us to train against non-decomposable loss functions such as the BLEU or ROUGE scores while not assuming gold standard alignments. We evaluate our approach on three datasets using both automatic measures and human judgements and achieve results comparable to the state-of-the-art approaches developed for each of them. Furthermore, we performed an analysis of the datasets which examines common issues with NLG evaluation.

Proceedings ArticleDOI
16 Oct 2016
TL;DR: A simple extension to the familiar mobile keyboard suggestion interface is introduced that presents phrase suggestions which can be accepted with a repeated-tap gesture; phrases were interpreted as suggestions and affected the content of what participants wrote more than conventional single-word suggestions did.
Abstract: A system capable of suggesting multi-word phrases while someone is writing could supply ideas about content and phrasing and allow those ideas to be inserted efficiently. Meanwhile, statistical language modeling has provided various approaches to predicting phrases that users type. We introduce a simple extension to the familiar mobile keyboard suggestion interface that presents phrase suggestions that can be accepted by a repeated-tap gesture. In an extended composition task, we found that phrases were interpreted as suggestions that affected the content of what participants wrote more than conventional single-word suggestions, which were interpreted as predictions. We highlight a design challenge: how can a phrase suggestion system make valuable suggestions rather than just accurate predictions?
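The prediction side of such an interface can be as simple as greedy continuation under a language model; the repeated-tap gesture then just commits one more word of the current suggestion per tap. A minimal sketch with a toy bigram model (the model and phrases are illustrative only):

```python
def suggest_phrase(context, bigram_model, length=3):
    # Greedily extend the last typed word using bigram probabilities.
    phrase = []
    prev = context[-1]
    for _ in range(length):
        candidates = bigram_model.get(prev)
        if not candidates:
            break
        nxt = max(candidates, key=candidates.get)
        phrase.append(nxt)
        prev = nxt
    return phrase

bigram_model = {"thank": {"you": 0.9}, "you": {"very": 0.5, "so": 0.4},
                "very": {"much": 0.8}}
print(suggest_phrase(["thank"], bigram_model))  # ['you', 'very', 'much']
```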

Proceedings Article
17 Oct 2016
TL;DR: The authors used phrase-based machine translation to pre-translate the input into the target language, and then a neural machine translation system generated the final hypothesis using the pre-translation.
Abstract: Recently, the development of neural machine translation (NMT) has significantly improved the translation quality of automatic machine translation. While most sentences are more accurate and fluent than translations by statistical machine translation (SMT)-based systems, in some cases, the NMT system produces translations that have a completely different meaning. This is especially the case when rare words occur. When using statistical machine translation, it has already been shown that significant gains can be achieved by simplifying the input in a preprocessing step. A commonly used example is the pre-reordering approach. In this work, we used phrase-based machine translation to pre-translate the input into the target language. Then a neural machine translation system generates the final hypothesis using the pre-translation. Thereby, we use either only the output of the phrase-based machine translation (PBMT) system or a combination of the PBMT output and the source sentence. We evaluate the technique on the English to German translation task. Using this approach we are able to outperform the PBMT system as well as the baseline neural MT system by up to 2 BLEU points. We analyzed the influence of the quality of the initial system on the final result.

Posted Content
TL;DR: This work proposes labelling a topic with a succinct phrase that summarises its theme or idea, using Wikipedia document titles as label candidates and computing neural embeddings for documents and words to select the most relevant labels for topics.
Abstract: Topics generated by topic models are typically represented as list of terms. To reduce the cognitive overhead of interpreting these topics for end-users, we propose labelling a topic with a succinct phrase that summarises its theme or idea. Using Wikipedia document titles as label candidates, we compute neural embeddings for documents and words to select the most relevant labels for topics. Compared to a state-of-the-art topic labelling system, our methodology is simpler, more efficient, and finds better topic labels.
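One minimal reading of the ranking step: embed a topic as the mean vector of its top terms and rank candidate labels by cosine similarity. The mean-of-terms topic embedding is an assumption for illustration; the paper combines document and word embeddings in its own way.

```python
import numpy as np

def label_topic(topic_terms, term_vecs, label_vecs, k=3):
    # term_vecs: word -> embedding; label_vecs: candidate label (e.g. a
    # Wikipedia title) -> document embedding. Assumes the topic's top terms
    # are in the vocabulary.
    topic_vec = np.mean([term_vecs[t] for t in topic_terms if t in term_vecs], axis=0)
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    ranked = sorted(label_vecs, key=lambda l: cos(topic_vec, label_vecs[l]),
                    reverse=True)
    return ranked[:k]
```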

Patent
Xin Wang1, Jun Hatori1
21 Sep 2016
TL;DR: In this paper, a user input comprising text of a first symbolic system is received, and the process determines, based on the text, a plurality of sets of one or more candidate words of a second symbolic system.
Abstract: The present disclosure generally relates to dynamic phrase expansion for language input. In one example process, a user input comprising text of a first symbolic system is received. The process determines, based on the text, a plurality of sets of one or more candidate words of a second symbolic system. The process determines, based on at least a portion of the plurality of sets of one or more candidate words, a plurality of expanded candidate phrases. Each expanded candidate phrase comprises at least one word of a respective set of one or more candidate words of the plurality of sets of one or more candidate words and one or more predicted words based on the at least one word of the respective set of one or more candidate words. One or more expanded candidate phrases of the plurality of expanded candidate phrases are displayed for user selection.

Posted Content
TL;DR: This paper used a neural network global lexicon model and neural network joint model to learn non-linear mappings and leverage contextual information from the source sentence more effectively for grammatical error correction.
Abstract: Phrase-based statistical machine translation (SMT) systems have previously been used for the task of grammatical error correction (GEC) to achieve state-of-the-art accuracy. The superiority of SMT systems comes from their ability to learn text transformations from erroneous to corrected text, without explicitly modeling error types. However, phrase-based SMT systems suffer from limitations of discrete word representation, linear mapping, and lack of global context. In this paper, we address these limitations by using two different yet complementary neural network models, namely a neural network global lexicon model and a neural network joint model. These neural networks can generalize better by using continuous space representation of words and learn non-linear mappings. Moreover, they can leverage contextual information from the source sentence more effectively. By adding these two components, we achieve statistically significant improvement in accuracy for grammatical error correction over a state-of-the-art GEC system.