Getting Gender Right in Neural Machine Translation
Eva Vanmassenhove, Christian Hardmeier, Andy Way
pp. 3003–3008
TL;DR: This paper compiles large datasets with speaker information for 20 language pairs and incorporates a gender feature into NMT systems for multiple language pairs, finding that adding this feature significantly improves translation quality for some of them.

Abstract: Speakers of different languages must attend to and encode strikingly different aspects of the world in order to use their language correctly (Sapir, 1921; Slobin, 1996). One such difference is related to the way gender is expressed in a language. Saying "I am happy" in English does not encode any additional knowledge about the speaker who uttered the sentence. However, many other languages do have grammatical gender systems, and in those languages such knowledge would be encoded. In order to correctly translate such a sentence into, say, French, the inherent gender information needs to be retained or recovered: the same sentence would become either "Je suis heureux" for a male speaker or "Je suis heureuse" for a female one. Apart from morphological agreement, demographic factors (gender, age, etc.) also influence our use of language in terms of word choices or even syntactic constructions (Tannen, 1991; Pennebaker et al., 2003). We integrate gender information into NMT systems. Our contribution is twofold: (1) the compilation of large datasets with speaker information for 20 language pairs, and (2) a simple set of experiments that incorporate gender information into NMT for multiple language pairs. Our experiments show that adding a gender feature to an NMT system significantly improves translation quality for some language pairs.
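One common way to realize a speaker-gender feature of the kind described in the abstract is to prepend a reserved gender token to each source sentence before training, so the encoder can condition on it. The sketch below illustrates that idea only; the tag names and the function are hypothetical, not taken from the paper's implementation.

```python
def add_gender_tag(source_sentence: str, speaker_gender: str) -> str:
    """Prepend a speaker-gender token to a source sentence.

    The <MALE>/<FEMALE> tag names are illustrative placeholders; any
    reserved tokens added to the NMT vocabulary would serve the same
    purpose.
    """
    tags = {"M": "<MALE>", "F": "<FEMALE>"}
    if speaker_gender not in tags:
        raise ValueError(f"unknown gender label: {speaker_gender!r}")
    return f"{tags[speaker_gender]} {source_sentence}"


# Tagging the English->French example from the abstract for a
# female speaker:
tagged = add_gender_tag("I am happy", "F")
print(tagged)  # <FEMALE> I am happy
```

With data tagged this way, a standard encoder-decoder model needs no architectural changes: the gender token is simply another vocabulary item whose embedding the model learns to exploit when producing gender-marked target forms such as "heureuse".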
Citations
Posted Content
A Survey on Bias and Fairness in Machine Learning
TL;DR: This survey investigated different real-world applications that have shown biases in various ways, and created a taxonomy for fairness definitions that machine learning researchers have defined to avoid the existing bias in AI systems.
Posted Content
Language (Technology) is Power: A Critical Survey of "Bias" in NLP
TL;DR: The authors survey 146 papers analyzing "bias" in NLP systems, finding that their motivations are often vague, inconsistent, and lacking in normative reasoning, despite the fact that analyzing bias is an inherently normative process.
Proceedings ArticleDOI
Mitigating Gender Bias in Natural Language Processing: Literature Review
Tony Sun, Andrew Gaut, Shirlyn Tang, Yuxin Huang, Mai ElSherief, Jieyu Zhao, Diba Mirza, Elizabeth Belding, Kai-Wei Chang, William Yang Wang
TL;DR: This paper discusses gender bias based on four forms of representation bias and analyzes methods recognizing gender bias in NLP, and discusses the advantages and drawbacks of existing gender debiasing methods.
References
Proceedings ArticleDOI
Bleu: a Method for Automatic Evaluation of Machine Translation
TL;DR: This paper proposes a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
Proceedings Article
Neural Machine Translation by Jointly Learning to Align and Translate
TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of the basic encoder-decoder architecture, and the authors propose to extend it by allowing the model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
Proceedings ArticleDOI
Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation
Kyunghyun Cho, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio
TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
Proceedings Article
Sequence to Sequence Learning with Neural Networks
TL;DR: The authors used a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.
Europarl: A Parallel Corpus for Statistical Machine Translation
TL;DR: A corpus of parallel text in 11 languages from the proceedings of the European Parliament is collected and its acquisition and application as training data for statistical machine translation (SMT) is focused on.