Journal ISSN: 0032-6585

The Prague Bulletin of Mathematical Linguistics 

De Gruyter Open
About: The Prague Bulletin of Mathematical Linguistics is an academic journal published by De Gruyter Open. The journal publishes mainly in the areas of machine translation and treebanks. Its ISSN is 0032-6585, and it is open access. Over its lifetime it has published 317 papers, which have received 4,325 citations.


Papers
Journal Article
TL;DR: A productivity test of statistical machine translation post-editing in a typical localisation context; the results show a productivity increase for every participant, with significant variance across individuals.
Abstract: We evaluated the productivity increase of statistical MT post-editing as compared to traditional translation in a two-day test involving twelve participants translating from English to French, Italian, German, and Spanish. The test setup followed an empirical methodology. A random subset of the entire new content produced in our company during a given year was translated with statistical MT engines trained on data from the previous year. The translation environment recorded translation and post-editing times for each sentence. The results show a productivity increase for each participant, with significant variance across individuals.

240 citations
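
The per-participant comparison described above ultimately reduces to a throughput calculation over the recorded per-sentence times. The Python sketch below illustrates one way such a productivity increase could be computed; the data layout, function name, and all numbers are made up for illustration and are not taken from the paper.

```python
# Hypothetical illustration of a productivity comparison between
# translation from scratch and MT post-editing, based on recorded
# per-sentence (word_count, seconds_spent) measurements.
def words_per_hour(records):
    """records: list of (word_count, seconds_spent) per sentence."""
    words = sum(w for w, _ in records)
    hours = sum(s for _, s in records) / 3600.0
    return words / hours

translation = [(20, 95), (14, 60), (32, 150)]   # made-up measurements
post_editing = [(20, 55), (14, 40), (32, 90)]   # made-up measurements

gain = words_per_hour(post_editing) / words_per_hour(translation) - 1
print(f"productivity increase: {gain:.0%}")
```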

Journal Article
TL;DR: This article examines some of the critical parameters that affect final translation quality, memory usage, training stability, and training time, concluding each experiment with a set of recommendations for fellow researchers, and provides practical tips for improved training regarding batch size, learning rate, warmup steps, maximum sentence length, and checkpoint averaging.
Abstract: This article describes our experiments in neural machine translation using the recent Tensor2Tensor framework and the Transformer sequence-to-sequence model (Vaswani et al., 2017). We examine some of the critical parameters that affect the final translation quality, memory usage, training stability and training time, concluding each experiment with a set of recommendations for fellow researchers. In addition to confirming the general mantra "more data and larger models", we address scaling to multiple GPUs and provide practical tips for improved training regarding batch size, learning rate, warmup steps, maximum sentence length and checkpoint averaging. We hope that our observations will allow others to get better results given their particular hardware and data constraints.

180 citations
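
Two of the training tips mentioned in this abstract, learning-rate warmup and checkpoint averaging, are easy to state concretely. The Python sketch below shows the inverse-square-root warmup schedule introduced with the original Transformer (Vaswani et al., 2017) and a naive form of checkpoint averaging; the constants and function names are illustrative, not the settings the article recommends.

```python
# Sketch of the Transformer learning-rate schedule: linear warmup for
# `warmup_steps`, then decay proportional to 1/sqrt(step). Constants
# here are illustrative defaults, not the paper's tuned values.
def transformer_lr(step, d_model=512, warmup_steps=16000):
    step = max(step, 1)
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)

def average_checkpoints(checkpoints):
    """Average parameter values across several saved checkpoints, each
    given as a dict mapping parameter name -> list of floats."""
    names = checkpoints[0].keys()
    return {
        name: [sum(vals) / len(checkpoints)
               for vals in zip(*(c[name] for c in checkpoints))]
        for name in names
    }

# Example: learning rate at a few training steps.
for s in (1000, 16000, 100000):
    print(s, round(transformer_lr(s), 6))
```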

Journal Article
TL;DR: The quality of NMT systems is compared with statistical MT in three studies using automatic and human evaluation methods; the results show increases in fluency but inconsistent results for adequacy and post-editing effort.
Abstract: This paper discusses neural machine translation (NMT), a new paradigm in the MT field, comparing the quality of NMT systems with statistical MT by describing three studies using automatic and human evaluation methods. Automatic evaluation results presented for NMT are very promising, however human evaluations show mixed results. We report increases in fluency but inconsistent results for adequacy and post-editing effort. NMT undoubtedly represents a step forward for the MT field, but one that the community should be careful not to oversell.

157 citations
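
For readers unfamiliar with the automatic side of such comparisons, the snippet below sketches a BLEU-based system comparison using the sacreBLEU library. The paper does not name a specific toolkit, so the library choice and the toy data are assumptions made purely for illustration.

```python
# Minimal sketch of comparing two MT systems with corpus-level BLEU.
# Requires: pip install sacrebleu
import sacrebleu

nmt_output = ["the cat sat on the mat .", "he reads the book ."]
smt_output = ["the cat sits on mat .", "he read book ."]
references = [["the cat sat on the mat .", "he reads the book ."]]

nmt_bleu = sacrebleu.corpus_bleu(nmt_output, references)
smt_bleu = sacrebleu.corpus_bleu(smt_output, references)
print(f"NMT BLEU: {nmt_bleu.score:.1f}  SMT BLEU: {smt_bleu.score:.1f}")
```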

Journal Article
TL;DR: Z-MERT is shown to be extremely efficient, making it well-suited for time-sensitive pipelines, and the experiments provide insight into the tool's runtime in terms of several variables (size of the development set, size of the produced N-best lists, etc.).
Abstract: We introduce Z-MERT, a software tool for minimum error rate training of machine translation systems (Och, 2003). In addition to being an open source tool that is extremely easy to compile and run, Z-MERT is also agnostic regarding the evaluation metric, fully configurable, and requires no modification to work with any decoder. We describe Z-MERT and review its features, and report the results of a series of experiments that examine the tool's runtime. We establish that Z-MERT is extremely efficient, making it well-suited for time-sensitive pipelines. The experiments also provide an insight into the tool's runtime in terms of several variables (size of the development set, size of produced N-best lists, etc).

144 citations
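
The core idea behind minimum error rate training can be summarised briefly: choose feature weights so that the best-scoring candidate in each N-best list minimises a corpus-level error count. The toy Python sketch below illustrates that idea with a coarse coordinate-wise grid search; real MERT, and Z-MERT in particular, performs an exact line search per dimension and is far more efficient, so this is a conceptual illustration only.

```python
# Toy illustration of the MERT objective (Och, 2003): tune weights so
# that 1-best reranking of N-best lists minimises total error.
def rerank(weights, nbest):
    """Pick the candidate with the highest weighted model score.
    nbest is a list of (error_count, feature_vector) pairs."""
    return max(nbest, key=lambda c: sum(w * f for w, f in zip(weights, c[1])))

def corpus_error(weights, nbest_lists):
    """Total error of the 1-best candidates selected under these weights."""
    return sum(rerank(weights, nbest)[0] for nbest in nbest_lists)

def mert(nbest_lists, dims, iterations=5):
    """Coordinate-wise grid search (a crude stand-in for MERT's line search)."""
    weights = [1.0] * dims
    grid = [v / 10.0 for v in range(-20, 21)]
    for _ in range(iterations):
        for d in range(dims):
            weights[d] = min(grid, key=lambda v: corpus_error(
                weights[:d] + [v] + weights[d + 1:], nbest_lists))
    return weights

# Toy usage: two sentences, two candidates each, two features.
nbest_lists = [
    [(0, [1.0, 0.2]), (2, [0.5, 0.9])],
    [(1, [0.3, 0.8]), (0, [0.9, 0.1])],
]
print(mert(nbest_lists, dims=2))
```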

Journal Article
TL;DR: It is shown that a phrase-based statistical machine translation (SMT) system produces translations of higher quality when using word alignments from EFMARAL than from fast_align, and that translation quality is on par with what is obtained using GIZA++, a tool requiring orders of magnitude more processing time.
Abstract: We present efmaral, a new system for efficient and accurate word alignment using a Bayesian model with Markov Chain Monte Carlo (MCMC) inference. Through careful selection of data structures and mo ...

117 citations
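
To give a flavour of Bayesian word alignment with MCMC inference, the sketch below implements collapsed Gibbs sampling for a simple IBM Model 1-style alignment model with a Dirichlet prior. This is a conceptual illustration written for this summary, not efmaral's implementation, whose models (including an HMM alignment component) and data structures go well beyond it.

```python
# Collapsed Gibbs sampling for a toy IBM Model 1-style aligner with a
# symmetric Dirichlet(alpha) prior over translation distributions.
import random
from collections import defaultdict

def gibbs_align(bitext, iterations=50, alpha=0.01):
    """bitext: list of (src_tokens, tgt_tokens). Returns, per sentence,
    a list giving the aligned source position of each target token."""
    counts = defaultdict(int)   # (src_word, tgt_word) -> link count
    totals = defaultdict(int)   # src_word -> total links
    alignments = []
    for src, tgt in bitext:     # random initialisation
        links = [random.randrange(len(src)) for _ in tgt]
        alignments.append(links)
        for j, i in enumerate(links):
            counts[(src[i], tgt[j])] += 1
            totals[src[i]] += 1

    vocab_t = len({w for _, tgt in bitext for w in tgt})
    for _ in range(iterations):
        for (src, tgt), links in zip(bitext, alignments):
            for j, old_i in enumerate(links):
                # remove the current link before resampling it
                counts[(src[old_i], tgt[j])] -= 1
                totals[src[old_i]] -= 1
                # collapsed Dirichlet-multinomial posterior over source positions
                probs = [(counts[(s, tgt[j])] + alpha) /
                         (totals[s] + alpha * vocab_t) for s in src]
                r = random.random() * sum(probs)
                new_i = 0
                while r > probs[new_i]:
                    r -= probs[new_i]
                    new_i += 1
                links[j] = new_i
                counts[(src[new_i], tgt[j])] += 1
                totals[src[new_i]] += 1
    return alignments
```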

Performance Metrics

Number of papers from the journal in previous years:

Year    Papers
2023    3
2022    8
2021    1
2020    11
2019    5
2018    11