Open Access, Proceedings Article

Generating titles for millions of browse pages on an e-Commerce site

TLDR
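Two approaches, one rule-based and one hybrid that adds a phrase-based statistical machine translation engine, generate browse page titles in five languages (English, German, French, Italian and Spanish); for English and German, an automatic post-editing approach learns how to post-edit the rule-based titles into curated titles.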
Abstract
We present two approaches to generate titles for browse pages in five different languages, namely English, German, French, Italian and Spanish. These browse pages are structured search pages in an e-commerce domain. We first present a rule-based approach to generate these browse page titles. In addition, we also present a hybrid approach which uses a phrase-based statistical machine translation engine on top of the rule-based system to assemble the best title. For the two languages English and German, we have access to a large amount of already available rule-based generated and curated titles. For these languages, we present an automatic post-editing approach which learns how to post-edit the rule-based titles into curated titles.
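
To make the rule-based idea concrete, the following is a minimal sketch of how a title might be assembled from the structured data behind a browse page. The abstract does not describe the actual rules, so everything here is an assumption made for illustration: the slot names (`color`, `brand`, `material`, `storage`), the ordering, the connecting words, and the function `rule_based_title` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical rules: the abstract does not specify slot names or ordering.
# Slots listed here are placed before the category name, in this order.
PREFIX_SLOTS: List[str] = ["color", "brand"]

# Slots listed here are placed after the category name with a connecting phrase.
SUFFIX_TEMPLATES: Dict[str, str] = {
    "material": "made of {value}",
    "storage": "with {value}",
}


def rule_based_title(category: str, slots: Dict[str, str]) -> str:
    """Assemble a browse page title from a category and slot/value pairs."""
    prefix = [slots[name] for name in PREFIX_SLOTS if name in slots]
    suffix = [
        template.format(value=slots[name])
        for name, template in SUFFIX_TEMPLATES.items()
        if name in slots
    ]
    # Slots not covered by any rule are silently ignored in this sketch.
    return " ".join(prefix + [category] + suffix)


title = rule_based_title(
    "Cell Phones", {"brand": "Apple", "color": "Black", "storage": "64 GB"}
)
print(title)  # Black Apple Cell Phones with 64 GB
```

In the hybrid and post-editing setups described in the abstract, a string like this would serve as the input that a statistical system rewrites toward a curated title; that step is not shown here.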


Citations
Proceedings Article

eSCAPE: a Large-scale Synthetic Corpus for Automatic Post-Editing

TL;DR: eSCAPE, a synthetic corpus for automatic post-editing, is described as the largest freely available corpus of its kind, consisting of millions of entries in which the MT element of the training triplets has been obtained by translating the source side of publicly available parallel corpora, and using the target side as an artificial human post-edit.
Posted Content

eSCAPE: a Large-scale Synthetic Corpus for Automatic Post-Editing

TL;DR: eSCAPE is presented, the largest freely-available Synthetic Corpus for Automatic Post-Editing released so far, and consists of millions of entries in which the MT element of the training triplets has been obtained by translating the source side of publicly-available parallel corpora, and using the target side as an artificial human post-edit.
Proceedings Article

Generating Titles for Web Tables

TL;DR: The proposed technique is the first to consider text-generation methods for table titles, and establishes a new state of the art for generating high-quality table titles using a sequence-to-sequence neural network model.
Proceedings Article

Generating E-Commerce Product Titles and Predicting their Quality

TL;DR: This work proposes approaches that automatically generate product titles and predict their quality; both automatic and human evaluations performed on real-world data indicate that these approaches are effective and applicable to existing e-commerce scenarios.
Posted Content

Generating Titles for Web Tables

TL;DR: In this article, a sequence-to-sequence neural network model is proposed to generate titles for web tables. The approach extracts text snippets that have potentially relevant information to the table, encodes them into an input sequence, and uses both copy and generation mechanisms in the decoder to balance relevance and readability of the generated title.
References
Proceedings Article

Bleu: a Method for Automatic Evaluation of Machine Translation

TL;DR: This paper proposed a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
Proceedings Article

Neural Machine Translation by Jointly Learning to Align and Translate

TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of the basic encoder-decoder architecture, and it is proposed to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
Posted Content

Neural Machine Translation by Jointly Learning to Align and Translate

TL;DR: In this paper, the authors propose to use a soft-searching model to find the parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
Proceedings Article

Moses: Open Source Toolkit for Statistical Machine Translation

TL;DR: An open-source toolkit for statistical machine translation whose novel contributions are support for linguistically motivated factors, confusion network decoding, and efficient data formats for translation models and language models.
Proceedings Article

KenLM: Faster and Smaller Language Model Queries

TL;DR: KenLM is a library that implements two data structures for efficient language model queries, reducing both time and memory costs and is integrated into the Moses, cdec, and Joshua translation systems.