Learning Gender-Neutral Word Embeddings
Jieyu Zhao, Yichao Zhou, Zeyu Li, Wei Wang, Kai-Wei Chang
pp. 4847–4853
TLDR
This article proposes a novel training procedure for learning gender-neutral word embeddings, which aims to preserve gender information in certain dimensions of word vectors while compelling other dimensions to be free of gender influence.
Abstract
Word embedding models have become a fundamental component in a wide range of Natural Language Processing (NLP) applications. However, embeddings trained on human-generated corpora have been demonstrated to inherit strong gender stereotypes that reflect social constructs. To address this concern, in this paper, we propose a novel training procedure for learning gender-neutral word embeddings. Our approach aims to preserve gender information in certain dimensions of word vectors while compelling other dimensions to be free of gender influence. Based on the proposed method, we generate a Gender-Neutral variant of GloVe (GN-GloVe). Quantitative and qualitative experiments demonstrate that GN-GloVe successfully isolates gender information without sacrificing the functionality of the embedding model.
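The abstract's core idea, reserving a few embedding dimensions for gender while pushing the remaining words' reserved dimensions toward zero, can be illustrated with a toy sketch. This is not the authors' actual objective: GN-GloVe adds such terms as regularizers to the full GloVe loss and trains jointly on a corpus, whereas the vocabulary, the single reserved dimension, and the penalty updates below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

dim, k = 50, 1                      # total dimensions; last k reserved for gender (assumption)
vocab = ["doctor", "nurse", "he", "she"]
idx = {w: i for i, w in enumerate(vocab)}
W = rng.normal(size=(len(vocab), dim))

neutral = ["doctor", "nurse"]       # words whose reserved dimensions should vanish
pair = ("he", "she")                # a definitional gender pair

lr = 0.1
for _ in range(200):
    # Pull gender-neutral words' reserved dimensions toward zero.
    for w in neutral:
        W[idx[w], -k:] -= lr * 2 * W[idx[w], -k:]
    # Push the definitional pair apart in the reserved dimensions,
    # up to a unit margin (the hinge keeps the update bounded).
    diff = W[idx[pair[0]], -k:] - W[idx[pair[1]], -k:]
    if np.sum(diff ** 2) < 1.0:
        W[idx[pair[0]], -k:] += lr * 2 * diff
        W[idx[pair[1]], -k:] -= lr * 2 * diff

print(abs(W[idx["doctor"], -1]))    # shrinks toward zero
```

In the actual method these penalties are weighted terms added to the co-occurrence objective; they are optimized in isolation here only to show the geometry of the split, with gender confined to the reserved coordinates and removed from the rest.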
Citations
Posted Content
A Survey on Bias and Fairness in Machine Learning
TL;DR: This survey investigates different real-world applications that have shown biases in various ways, and creates a taxonomy for fairness definitions that machine learning researchers have defined to avoid the existing bias in AI systems.
Journal ArticleDOI
A Survey on Bias and Fairness in Machine Learning
TL;DR: In this article, the authors present a taxonomy for fairness definitions that machine learning researchers have defined to avoid the existing bias in AI systems and examine different domains and subdomains in AI showing what researchers have observed with regard to unfair outcomes in the state-of-the-art methods and ways they have tried to address them.
Posted Content
Language (Technology) is Power: A Critical Survey of "Bias" in NLP
TL;DR: The authors survey 146 papers analyzing "bias" in NLP systems, finding that their motivations are often vague, inconsistent, and lacking in normative reasoning, despite the fact that analyzing bias is an inherently normative process.
Proceedings ArticleDOI
Balanced Datasets Are Not Enough: Estimating and Mitigating Gender Bias in Deep Image Representations
TL;DR: It is shown that trained models significantly amplify the association of target labels with gender beyond what one would expect from biased datasets, and an adversarial approach is adopted to remove unwanted features corresponding to protected variables from intermediate representations in a deep neural network.
Proceedings ArticleDOI
Mitigating Gender Bias in Natural Language Processing: Literature Review
Tony Sun, Andrew Gaut, Shirlyn Tang, Yuxin Huang, Mai ElSherief, Jieyu Zhao, Diba Mirza, Elizabeth Belding, Kai-Wei Chang, William Yang Wang
TL;DR: This paper discusses gender bias based on four forms of representation bias and analyzes methods recognizing gender bias in NLP, and discusses the advantages and drawbacks of existing gender debiasing methods.
References
Proceedings ArticleDOI
GloVe: Global Vectors for Word Representation
TL;DR: A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
Proceedings Article
Distributed Representations of Words and Phrases and their Compositionality
TL;DR: This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.
Posted Content
Efficient Estimation of Word Representations in Vector Space
TL;DR: This paper proposes two novel model architectures for computing continuous vector representations of words from very large data sets; the quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks.
Proceedings Article
Neural Machine Translation by Jointly Learning to Align and Translate
TL;DR: It is conjectured that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and it is proposed to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
Journal ArticleDOI
WordNet : an electronic lexical database
TL;DR: Presents the WordNet lexical database, including nouns in WordNet and a semantic network of English verbs, together with applications of WordNet such as building semantic concordances.