Posted Content

Data Decisions and Theoretical Implications when Adversarially Learning Fair Representations

TL;DR: An adversarial training procedure is used to remove information about the sensitive attribute from the latent representation learned by a neural network, and the data distribution empirically drives the adversary's notion of fairness.
Abstract: How can we learn a classifier that is "fair" for a protected or sensitive group, when we do not know if the input to the classifier belongs to the protected group? How can we train such a classifier when data on the protected group is difficult to obtain? In many settings, finding out the sensitive input attribute can be prohibitively expensive even during model training, and sometimes impossible during model serving. For example, in recommender systems, if we want to predict if a user will click on a given recommendation, we often do not know many attributes of the user, e.g., race or age, and many attributes of the content are hard to determine, e.g., the language or topic. Thus, it is not feasible to use a different classifier calibrated based on knowledge of the sensitive attribute. Here, we use an adversarial training procedure to remove information about the sensitive attribute from the latent representation learned by a neural network. In particular, we study how the choice of data for the adversarial training affects the resulting fairness properties. We find two interesting results: a small amount of data is needed to train these adversarial models, and the data distribution empirically drives the adversary's notion of fairness.
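
As a concrete illustration of the setup in the abstract, the sketch below shows one common way to implement it in TensorFlow 2: a shared encoder feeds a primary head that predicts the label and an adversarial head that tries to predict the sensitive attribute, and the adversary is trained only on the (possibly small) subset of examples where the sensitive attribute is observed. The layer sizes, the trade-off weight ADV_WEIGHT, and the alternating-update scheme are illustrative assumptions, not the authors' exact procedure; an excerpt quoted later on this page notes only that both heads use a logistic loss and Adagrad with step size 0.01.

    import tensorflow as tf

    # Illustrative sizes; the real feature set depends on the task.
    N_FEATURES, LATENT_DIM = 32, 16

    encoder = tf.keras.Sequential(
        [tf.keras.layers.Dense(LATENT_DIM, activation="relu", input_shape=(N_FEATURES,))])
    primary_head = tf.keras.layers.Dense(1)    # logit for the task label y
    adversary_head = tf.keras.layers.Dense(1)  # logit for the sensitive attribute z

    bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
    opt_main = tf.keras.optimizers.Adagrad(learning_rate=0.01)
    opt_adv = tf.keras.optimizers.Adagrad(learning_rate=0.01)
    ADV_WEIGHT = 1.0  # illustrative trade-off between accuracy and attribute removal

    def train_step(x, y, x_sens, z):
        """x, y: a regular batch (y as float tensor of shape [batch, 1]).
        x_sens, z: a batch -- possibly much smaller -- on which the sensitive
        attribute z is actually observed."""
        # 1) Update the adversary so it predicts z as well as it can from the
        #    current latent representation.
        with tf.GradientTape() as tape:
            adv_loss = bce(z, adversary_head(encoder(x_sens)))
        grads = tape.gradient(adv_loss, adversary_head.trainable_variables)
        opt_adv.apply_gradients(zip(grads, adversary_head.trainable_variables))

        # 2) Update encoder + primary head: predict y well while making the
        #    adversary's job hard (its loss enters with a negative sign).
        with tf.GradientTape() as tape:
            task_loss = bce(y, primary_head(encoder(x)))
            adv_loss = bce(z, adversary_head(encoder(x_sens)))
            total = task_loss - ADV_WEIGHT * adv_loss
        main_vars = encoder.trainable_variables + primary_head.trainable_variables
        grads = tape.gradient(total, main_vars)
        opt_main.apply_gradients(zip(grads, main_vars))
        return task_loss, adv_loss

Because the adversary only ever sees the small labeled batch (x_sens, z), the distribution of that batch shapes what the encoder is pushed to hide, which is consistent with the paper's observation that the data distribution empirically drives the adversary's notion of fairness.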
Citations
Proceedings ArticleDOI
27 Dec 2018
TL;DR: This work presents a framework for mitigating biases concerning demographic groups by including a variable for the group of interest and simultaneously learning a predictor and an adversary, which results in accurate predictions that exhibit less evidence of stereotyping Z.
Abstract: Machine learning is a tool for building models that accurately represent input training data. When undesired biases concerning demographic groups are in the training data, well-trained models will reflect those biases. We present a framework for mitigating such biases by including a variable for the group of interest and simultaneously learning a predictor and an adversary. The input to the network X, here text or census data, produces a prediction Y, such as an analogy completion or income bracket, while the adversary tries to model a protected variable Z, here gender or zip code. The objective is to maximize the predictor's ability to predict Y while minimizing the adversary's ability to predict Z. Applied to analogy completion, this method results in accurate predictions that exhibit less evidence of stereotyping Z. When applied to a classification task using the UCI Adult (Census) Dataset, it results in a predictive model that does not lose much accuracy while achieving very close to equality of odds (Hardt, et al., 2016). The method is flexible and applicable to multiple definitions of fairness as well as a wide range of gradient-based learning models, including both regression and classification tasks.
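
Read as a pair of competing objectives, the framework above is often summarized as follows (a hedged paraphrase of the abstract rather than the paper's exact update rule; α is an illustrative trade-off weight and ℓ a per-example loss such as cross-entropy):

    \min_{\theta_{\text{pred}}}\;
      \mathbb{E}\!\left[\ell\!\left(\hat{Y}(X;\theta_{\text{pred}}),\,Y\right)\right]
      \;-\;\alpha\,\mathbb{E}\!\left[\ell\!\left(\hat{Z}(X;\theta_{\text{pred}},\theta_{\text{adv}}),\,Z\right)\right]
    \qquad\text{while}\qquad
    \min_{\theta_{\text{adv}}}\;
      \mathbb{E}\!\left[\ell\!\left(\hat{Z}(X;\theta_{\text{pred}},\theta_{\text{adv}}),\,Z\right)\right]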

945 citations


Cites methods or results from "Data Decisions and Theoretical Impl..."

  • ...[2] apply an adversarial training method to achieve equality of opportunity in cases when the output variable is discrete....

    [...]

  • ...[2], and find we are able to better equalize the differences between the two groups, measured by both False Positive Rate and False Negative Rate (1 - True Positive Rate), although note that the previous work performs better overall for False Negative Rate....

    [...]

  • ...[2], we attempt to enforce equality of odds on a model for the task of predicting the income of a person – in particular, predicting whether the income is > $50k – given various attributes about the person, as made available in the UCI Adult dataset [1]....

    [...]

Proceedings ArticleDOI
Lucas Dixon, John Li, Jeffrey Sorensen, Nithum Thain, Lucy Vasserman
27 Dec 2018
TL;DR: A new approach to measuring and mitigating unintended bias in machine learning models is introduced, using a set of common demographic identity terms as the subset of input features on which to measure bias.
Abstract: We introduce and illustrate a new approach to measuring and mitigating unintended bias in machine learning models. Our definition of unintended bias is parameterized by a test set and a subset of input features. We illustrate how this can be used to evaluate text classifiers using a synthetic test set and a public corpus of comments annotated for toxicity from Wikipedia Talk pages. We also demonstrate how imbalances in training data can lead to unintended bias in the resulting models, and therefore potentially unfair applications. We use a set of common demographic identity terms as the subset of input features on which we measure bias. This technique permits analysis in the common scenario where demographic information on authors and readers is unavailable, so that bias mitigation must focus on the content of the text itself. The mitigation method we introduce is an unsupervised approach based on balancing the training dataset. We demonstrate that this approach reduces the unintended bias without compromising overall model quality.
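
The per-term measurement described above can be sketched in a few lines of Python; the function names and the (text, true_label, predicted_label) layout are assumptions for illustration, not the paper's code. The "error rate equality difference" mentioned in the excerpts below can then be read as the summed deviation of each term's error rate from the overall rate.

    def per_term_error_rates(examples, identity_terms):
        """examples: iterable of (text, true_label, predicted_label) with 0/1 labels.
        Returns {term: (false_positive_rate, false_negative_rate)} computed on the
        subset of examples whose text mentions that identity term."""
        rates = {}
        for term in identity_terms:
            subset = [(y, p) for text, y, p in examples if term in text.lower()]
            neg = [p for y, p in subset if y == 0]   # predictions on true negatives
            pos = [p for y, p in subset if y == 1]   # predictions on true positives
            fpr = sum(neg) / len(neg) if neg else float("nan")
            fnr = 1 - sum(pos) / len(pos) if pos else float("nan")
            rates[term] = (fpr, fnr)
        return rates

    def equality_difference(overall_rate, term_rates):
        """Sum of |overall_rate - term_rate| over identity terms (NaNs skipped);
        larger values suggest more unintended bias concentrated on some terms."""
        return sum(abs(overall_rate - r) for r in term_rates if r == r)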

549 citations


Cites background or methods from "Data Decisions and Theoretical Impl..."

  • ...This concept inspires the error rate equality difference metrics, which use the variation in these error rates between terms to measure the extent of unintended bias in the model, similar to the equality gap metric used in [2]....

    [...]

  • ...[2] presents a new mitigation technique using adversarial training that requires only a small amount of labeled demographic data....

    [...]

Journal ArticleDOI
TL;DR: The mechanisms by which a model's design, data, and deployment may lead to disparities are described; how different approaches to distributive justice in machine learning can advance health equity is explained; and guidance is given on which contexts are more appropriate for each equity approach in machine learning.
Abstract: Machine learning is used increasingly in clinical care to improve diagnosis, treatment selection, and health system efficiency. Because machine-learning models learn from historically collected data, populations that have experienced human and structural biases in the past-called protected groups-are vulnerable to harm by incorrect predictions or withholding of resources. This article describes how model design, biases in data, and the interactions of model predictions with clinicians and patients may exacerbate health care disparities. Rather than simply guarding against these harms passively, machine-learning systems should be used proactively to advance health equity. For that goal to be achieved, principles of distributive justice must be incorporated into model design, deployment, and evaluation. The article describes several technical implementations of distributive justice-specifically those that ensure equality in patient outcomes, performance, and resource allocation-and guides clinicians as to when they should prioritize each principle. Machine learning is providing increasingly sophisticated decision support and population-level monitoring, and it should encode principles of justice to ensure that models benefit all patients.

438 citations

Posted Content
TL;DR: This paper presents the first in-depth experimental demonstration of fair transfer learning and demonstrates empirically that the authors' learned representations admit fair predictions on new tasks while maintaining utility, an essential goal of fair representation learning.
Abstract: In this paper, we advocate for representation learning as the key to mitigating unfair prediction outcomes downstream. Motivated by a scenario where learned representations are used by third parties with unknown objectives, we propose and explore adversarial representation learning as a natural method of ensuring those parties act fairly. We connect group fairness (demographic parity, equalized odds, and equal opportunity) to different adversarial objectives. Through worst-case theoretical guarantees and experimental validation, we show that the choice of this objective is crucial to fair prediction. Furthermore, we present the first in-depth experimental demonstration of fair transfer learning and demonstrate empirically that our learned representations admit fair predictions on new tasks while maintaining utility, an essential goal of fair representation learning.
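
The three group-fairness criteria named in the abstract (demographic parity, equalized odds, equal opportunity) have standard empirical definitions; the sketch below, with illustrative names, computes the corresponding gaps for binary predictions and a binary group attribute.

    import numpy as np

    def fairness_gaps(y_true, y_pred, group):
        """y_true, y_pred, group: 1-D arrays of 0/1 values."""
        y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
        a, b = (group == 0), (group == 1)

        def rate(mask, label):
            sel = mask & (y_true == label)
            return y_pred[sel].mean() if sel.any() else np.nan

        dp_gap = abs(y_pred[a].mean() - y_pred[b].mean())   # positive-rate difference
        tpr_gap = abs(rate(a, 1) - rate(b, 1))              # equal opportunity
        fpr_gap = abs(rate(a, 0) - rate(b, 0))
        return {"demographic_parity": dp_gap,
                "equal_opportunity": tpr_gap,
                # one common scalar summary: equalized odds needs both gaps small
                "equalized_odds": max(tpr_gap, fpr_gap)}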

350 citations


Additional excerpts

  • ...Beutel et al. (2017) explored the particular fairness levels achieved by the algorithm from Edwards & Storkey (2016), and demonstrated that they can vary as a function of the demographic imbalance of the training data....

    [...]

Posted Content
TL;DR: This paper provides a review of various GAN methods from the perspectives of algorithms, theory, and applications, and compares the commonalities and differences of these GAN methods.
Abstract: Generative adversarial networks (GANs) have recently become a hot research topic. GANs have been widely studied since 2014, and a large number of algorithms have been proposed. However, there are few comprehensive studies explaining the connections among different GAN variants and how they have evolved. In this paper, we attempt to provide a review of various GAN methods from the perspectives of algorithms, theory, and applications. First, the motivations, mathematical representations, and structures of most GAN algorithms are introduced in detail. Furthermore, GANs have been combined with other machine learning algorithms for specific applications, such as semi-supervised learning, transfer learning, and reinforcement learning; this paper compares the commonalities and differences of these GAN methods. Second, theoretical issues related to GANs are investigated. Third, typical applications of GANs in image processing and computer vision, natural language processing, music, speech and audio, the medical field, and data science are illustrated. Finally, open research problems for GANs are pointed out.
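
For reference, the vanilla GAN minimax objective (Goodfellow et al., 2014) that the surveyed variants build on, with generator G, discriminator D, data distribution p_data, and noise prior p_z:

    \min_G \max_D \;
    \mathbb{E}_{x \sim p_{\text{data}}}\!\left[\log D(x)\right]
    + \mathbb{E}_{z \sim p_z}\!\left[\log\!\left(1 - D(G(z))\right)\right]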

344 citations

References
01 Jan 2007

17,341 citations

Proceedings Article
01 Jan 2010
TL;DR: Adaptive subgradient methods as discussed by the authors dynamically incorporate knowledge of the geometry of the data observed in earlier iterations to perform more informative gradient-based learning, which allows us to find needles in haystacks in the form of very predictive but rarely seen features.
Abstract: We present a new family of subgradient methods that dynamically incorporate knowledge of the geometry of the data observed in earlier iterations to perform more informative gradient-based learning. Metaphorically, the adaptation allows us to find needles in haystacks in the form of very predictive but rarely seen features. Our paradigm stems from recent advances in stochastic optimization and online learning which employ proximal functions to control the gradient steps of the algorithm. We describe and analyze an apparatus for adaptively modifying the proximal function, which significantly simplifies setting a learning rate and results in regret guarantees that are provably as good as the best proximal function that can be chosen in hindsight. We give several efficient algorithms for empirical risk minimization problems with common and important regularization functions and domain constraints. We experimentally study our theoretical analysis and show that adaptive subgradient methods outperform state-of-the-art, yet non-adaptive, subgradient algorithms.
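
The core of the method is the per-coordinate Adagrad step, sketched below in plain Python; the learning rate and the ε smoothing term are conventional choices, not values from the abstract.

    import numpy as np

    def adagrad_step(params, grad, accum, lr=0.01, eps=1e-8):
        """One Adagrad update: coordinates that have seen large gradients get
        smaller steps, while rarely updated (but predictive) coordinates keep
        relatively large steps."""
        accum = accum + grad ** 2                        # running sum of squared gradients
        params = params - lr * grad / (np.sqrt(accum) + eps)
        return params, accum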

7,244 citations

Journal Article
TL;DR: This work describes and analyze an apparatus for adaptively modifying the proximal function, which significantly simplifies setting a learning rate and results in regret guarantees that are provably as good as the best proximal functions that can be chosen in hindsight.
Abstract: We present a new family of subgradient methods that dynamically incorporate knowledge of the geometry of the data observed in earlier iterations to perform more informative gradient-based learning. Metaphorically, the adaptation allows us to find needles in haystacks in the form of very predictive but rarely seen features. Our paradigm stems from recent advances in stochastic optimization and online learning which employ proximal functions to control the gradient steps of the algorithm. We describe and analyze an apparatus for adaptively modifying the proximal function, which significantly simplifies setting a learning rate and results in regret guarantees that are provably as good as the best proximal function that can be chosen in hindsight. We give several efficient algorithms for empirical risk minimization problems with common and important regularization functions and domain constraints. We experimentally study our theoretical analysis and show that adaptive subgradient methods outperform state-of-the-art, yet non-adaptive, subgradient algorithms.

6,984 citations


"Data Decisions and Theoretical Impl..." refers methods in this paper

  • ...Both the adversarial head and the primary head are trained with a logistic loss function, and we use the Adagrad [4] optimizer in TensorFlow with step size 0.01 for 100,000 steps....

    [...]

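For reference, the configuration quoted above in current TensorFlow 2 syntax (a sketch; the original work predates this API):

    import tensorflow as tf

    # Logistic loss for both heads and Adagrad with step size 0.01, as quoted above;
    # training would then run for 100,000 steps.
    loss_fn = tf.keras.losses.BinaryCrossentropy(from_logits=True)
    optimizer = tf.keras.optimizers.Adagrad(learning_rate=0.01)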

Book ChapterDOI
TL;DR: In this article, a new representation learning approach for domain adaptation is proposed for settings in which data at training and test time come from similar but different distributions; training promotes features that are discriminative for the main learning task on the source domain while being unable to discriminate between the training (source) and test (target) domains.
Abstract: We introduce a new representation learning approach for domain adaptation, in which data at training and test time come from similar but different distributions. Our approach is directly inspired by the theory on domain adaptation suggesting that, for effective domain transfer to be achieved, predictions must be made based on features that cannot discriminate between the training (source) and test (target) domains. The approach implements this idea in the context of neural network architectures that are trained on labeled data from the source domain and unlabeled data from the target domain (no labeled target-domain data is necessary). As the training progresses, the approach promotes the emergence of features that are (i) discriminative for the main learning task on the source domain and (ii) indiscriminate with respect to the shift between the domains. We show that this adaptation behaviour can be achieved in almost any feed-forward model by augmenting it with a few standard layers and a new gradient reversal layer. The resulting augmented architecture can be trained using standard backpropagation and stochastic gradient descent, and can thus be implemented with little effort using any of the deep learning packages. We demonstrate the success of our approach for two distinct classification problems (document sentiment analysis and image classification), where state-of-the-art domain adaptation performance on standard benchmarks is achieved. We also validate the approach for a descriptor learning task in the context of a person re-identification application.
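
The gradient reversal layer described above is straightforward to sketch in TensorFlow 2 (a minimal illustration, not the authors' original implementation):

    import tensorflow as tf

    class GradientReversal(tf.keras.layers.Layer):
        """Identity on the forward pass; multiplies the incoming gradient by
        -weight on the backward pass, so the layers below are trained to confuse
        the classifier stacked on top."""
        def __init__(self, weight=1.0, **kwargs):
            super().__init__(**kwargs)
            self.weight = weight

        def call(self, x):
            @tf.custom_gradient
            def _reverse(x):
                def grad(dy):
                    return -self.weight * dy
                return tf.identity(x), grad
            return _reverse(x)

Placed between the shared feature extractor and the domain classifier, ordinary backpropagation then pushes the features to be useful for the source-domain task while being indiscriminate with respect to the domain.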

4,862 citations

Proceedings Article
05 Dec 2016
TL;DR: This work proposes a criterion for discrimination against a specified sensitive attribute in supervised learning, where the goal is to predict some target based on available features and shows how to optimally adjust any learned predictor so as to remove discrimination according to this definition.
Abstract: We propose a criterion for discrimination against a specified sensitive attribute in supervised learning, where the goal is to predict some target based on available features. Assuming data about the predictor, target, and membership in the protected group are available, we show how to optimally adjust any learned predictor so as to remove discrimination according to our definition. Our framework also improves incentives by shifting the cost of poor classification from disadvantaged groups to the decision maker, who can respond by improving the classification accuracy. We encourage readers to consult the more complete manuscript on the arXiv.
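
A simplified illustration of the post-processing idea in the abstract: choose a separate score threshold per protected group so that true positive rates match, which targets equality of opportunity. The function below is a hedged sketch; the paper's full construction also covers equalized odds and optimizes (possibly randomized) thresholds rather than fixing a target rate.

    import numpy as np

    def equal_opportunity_thresholds(scores, y_true, group, target_tpr=0.8):
        """Return {group_value: threshold} such that accepting scores >= threshold
        yields roughly the same true positive rate (target_tpr) in every group."""
        scores, y_true, group = map(np.asarray, (scores, y_true, group))
        thresholds = {}
        for g in np.unique(group):
            pos_scores = scores[(group == g) & (y_true == 1)]
            # The (1 - target_tpr) quantile of positives' scores accepts about
            # target_tpr of that group's true positives.
            thresholds[g] = np.quantile(pos_scores, 1 - target_tpr)
        return thresholds

Predictions are then y_hat = 1 if score >= thresholds[group] else 0.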

2,690 citations


"Data Decisions and Theoretical Impl..." refers background or methods in this paper

  • ...Recent literature sharpening the definition of fairness has relied on a calibration procedure that breaks this constraint [7, 8]....

    [...]

  • ...We will primarily work off of the definitions offered in [7]....

    [...]

  • ...Whereas [7] focuses on equality of outcomes, this method encourages unbiased latent representations inside the model....

    [...]

  • ...[7, 8] have both offered novel theoretical work explaining the trade-offs between demographic parity, previously focused on as “fair,” and alternative formulations focused more closely on model accuracy....

    [...]

  • ...[7] offers a method for achieving equality of opportunity, but does so through a post-processing algorithm, taking as input the model’s prediction and the sensitive attribute....

    [...]