Author

Yichao Zhou

Bio: Yichao Zhou is an academic researcher from University of California, Los Angeles. The author has contributed to research in topics: Computer science & Online advertising. The author has an h-index of 7 and has co-authored 28 publications receiving 457 citations. Previous affiliations of Yichao Zhou include Southeast University & University of California, Berkeley.

Papers
Proceedings ArticleDOI
01 Aug 2018
TL;DR: This article proposes a novel training procedure for learning gender-neutral word embeddings, aiming to preserve gender information in certain dimensions of word vectors while compelling other dimensions to be free of gender influence.
Abstract: Word embedding models have become a fundamental component in a wide range of Natural Language Processing (NLP) applications. However, embeddings trained on human-generated corpora have been demonstrated to inherit strong gender stereotypes that reflect social constructs. To address this concern, in this paper, we propose a novel training procedure for learning gender-neutral word embeddings. Our approach aims to preserve gender information in certain dimensions of word vectors while compelling other dimensions to be free of gender influence. Based on the proposed method, we generate a Gender-Neutral variant of GloVe (GN-GloVe). Quantitative and qualitative experiments demonstrate that GN-GloVe successfully isolates gender information without sacrificing the functionality of the embedding model.

319 citations
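
The core idea above lends itself to a compact sketch: keep the standard GloVe objective, reserve one coordinate of each word vector for gender, and add regularizers that push gender information into that coordinate and out of the rest. The snippet below is an illustrative PyTorch-style sketch under those assumptions; the function names, hyperparameters, and exact regularizer forms are placeholders rather than the authors' released GN-GloVe code.

# Illustrative sketch of a GN-GloVe-style objective (not the authors' code).
# Word vectors reserve their last coordinate w[:, -1] for gender; regularizers
# keep gender information out of the remaining coordinates.
import torch

def gn_glove_loss(w, w_tilde, b, b_tilde, cooc, pairs, neutral_ids,
                  lam_d=0.8, lam_e=0.8, x_max=100.0, alpha=0.75):
    """w, w_tilde: (V, d) word / context embeddings; b, b_tilde: (V,) biases.
    cooc: co-occurrence triples (i, j, X_ij) as index/index/value tensors.
    pairs: list of (male_id, female_id) gender-definition word pairs.
    neutral_ids: indices of words that should carry no gender signal."""
    i, j, x = cooc                                      # standard GloVe term
    f = torch.clamp(x / x_max, max=1.0) ** alpha
    dot = (w[i] * w_tilde[j]).sum(-1) + b[i] + b_tilde[j]
    j_glove = (f * (dot - torch.log(x)) ** 2).sum()

    # (1) gender-definition pairs should differ mostly in the reserved coordinate
    m, fem = zip(*pairs)
    m, fem = torch.tensor(m), torch.tensor(fem)
    j_d = -((w[m, -1] - w[fem, -1]) ** 2).sum()         # reward separation

    # (2) gender-neutral words should be near zero in that coordinate
    j_e = (w[neutral_ids, -1] ** 2).sum()

    return j_glove + lam_d * j_d + lam_e * j_e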

Posted Content
TL;DR: A novel training procedure for learning gender-neutral word embeddings that preserves gender information in certain dimensions of word vectors while compelling other dimensions to be free of gender influence is proposed.
Abstract: Word embedding models have become a fundamental component in a wide range of Natural Language Processing (NLP) applications. However, embeddings trained on human-generated corpora have been demonstrated to inherit strong gender stereotypes that reflect social constructs. To address this concern, in this paper, we propose a novel training procedure for learning gender-neutral word embeddings. Our approach aims to preserve gender information in certain dimensions of word vectors while compelling other dimensions to be free of gender influence. Based on the proposed method, we generate a Gender-Neutral variant of GloVe (GN-GloVe). Quantitative and qualitative experiments demonstrate that GN-GloVe successfully isolates gender information without sacrificing the functionality of the embedding model.

127 citations

Proceedings ArticleDOI
01 Nov 2019
TL;DR: A novel framework, learning to discriminate perturbations (DISP), is proposed to identify and adjust malicious perturbations, thereby blocking adversarial attacks on text classification models; in-depth analysis shows the robustness of DISP across different situations.
Abstract: Adversarial attacks against machine learning models have threatened various real-world applications such as spam filtering and sentiment analysis. In this paper, we propose a novel framework, learning to discriminate perturbations (DISP), to identify and adjust malicious perturbations, thereby blocking adversarial attacks for text classification models. To identify adversarial attacks, a perturbation discriminator validates how likely a token in the text is perturbed and provides a set of potential perturbations. For each potential perturbation, an embedding estimator learns to restore the embedding of the original word based on the context and a replacement token is chosen based on approximate kNN search. DISP can block adversarial attacks for any NLP model without modifying the model structure or training procedure. Extensive experiments on two benchmark datasets demonstrate that DISP significantly outperforms baseline methods in blocking adversarial attacks for text classification. In addition, in-depth analysis shows the robustness of DISP across different situations.

79 citations
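
A rough sketch of the three-step pipeline described above: flag suspicious tokens with a discriminator, estimate the original word's embedding from context, and recover a replacement via nearest-neighbor search. The discriminator and estimator are assumed to be supplied as callables; the names, threshold, and brute-force kNN below are illustrative placeholders, not the paper's exact components.

# Sketch of a DISP-style defense (illustrative; component models are placeholders).
import numpy as np

def defend(tokens, discriminator, estimator, vocab_embeddings, vocab_words,
           threshold=0.5, k=5):
    """tokens: list of input tokens.
    discriminator(tokens) -> per-token probability that the token was perturbed.
    estimator(tokens, i) -> estimated embedding of the original word at position i.
    vocab_embeddings: (V, d) array; vocab_words: list of V words."""
    scores = discriminator(tokens)                 # step 1: flag suspicious tokens
    recovered = list(tokens)
    for i, p in enumerate(scores):
        if p < threshold:
            continue
        e_hat = estimator(tokens, i)               # step 2: estimate original embedding
        # step 3: kNN search over the vocabulary (here the nearest of the top-k)
        d = np.linalg.norm(vocab_embeddings - e_hat, axis=1)
        nearest = np.argsort(d)[:k]
        recovered[i] = vocab_words[nearest[0]]
    return recovered                               # feed to the downstream classifier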

Proceedings ArticleDOI
25 Jul 2019
TL;DR: An attention-based recurrent neural network (RNN) is proposed which ingests a user activity trail and predicts the user's conversion probability along with attention weights for each activity (analogous to its position in the funnel).
Abstract: Paths of online users towards a purchase event (conversion) can be very complex, and guiding them through their journey is an integral part of online advertising. Studies in marketing indicate that a conversion event is typically preceded by one or more purchase funnel stages, viz., unaware, aware, interest, consideration, and intent. Intuitively, some online activities, including web searches, site visits and ad interactions, can serve as markers for the user's funnel stage. Identifying such markers can potentially refine conversion prediction, guide the design of ad creatives (text and images), and lead to higher ad effectiveness. We explore this hypothesis through a set of experiments designed for two tasks: (i) conversion prediction given a user's activity trail, and (ii) funnel stage specific targeting and creatives. To address challenges in the two tasks, we propose an attention based recurrent neural network (RNN) which ingests a user activity trail, and predicts the user's conversion probability along with attention weights for each activity (analogous to its position in the funnel). Specifically, we propose novel attention mechanisms, which maintain a global weight for each activity across all user trails, and also indicate the activity's funnel stage. Use of the proposed attention mechanisms for the first task of conversion prediction shows significant AUC lifts of 0.9% on a public dataset (RecSys 2015 challenge), and up to 3.6% on three proprietary datasets from a major advertising platform (Yahoo Gemini). To address the second task, the activity weights from the proposed mechanisms are used to automatically assign users to funnel stages via a scalable scoring method. Offline evaluation shows that such activity weights are more aligned with editorially tagged activity-funnel stages compared to weights from existing attention mechanisms and simpler conversion models like logistic regression. In addition, results of online ad campaigns in Yahoo Gemini with funnel specific user targeting and ad creatives show strong performance lifts further validating the connection across online activities, purchase funnel stages, stage-specific custom creatives, and conversions.

35 citations
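
The attention mechanism described above, with a global per-activity-type weight shared across all user trails, can be sketched roughly as follows. Layer sizes, the additive form of the attention score, and the module names are assumptions for illustration, not the paper's exact architecture.

# Minimal sketch of an attention-based RNN for conversion prediction
# (illustrative; layer sizes and the exact attention form are assumptions).
import torch
import torch.nn as nn

class ConversionRNN(nn.Module):
    def __init__(self, n_activity_types, emb_dim=32, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(n_activity_types, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden, batch_first=True)
        self.att = nn.Linear(hidden, 1)                   # trail-local attention score
        # one global weight per activity type, shared across all user trails
        self.global_w = nn.Parameter(torch.zeros(n_activity_types))
        self.out = nn.Linear(hidden, 1)

    def forward(self, trail):                             # trail: (B, T) activity ids
        h, _ = self.rnn(self.emb(trail))                  # (B, T, hidden)
        local = self.att(h).squeeze(-1)                   # (B, T)
        score = local + self.global_w[trail]              # add global per-type weight
        alpha = torch.softmax(score, dim=1)               # attention over activities
        ctx = (alpha.unsqueeze(-1) * h).sum(dim=1)        # (B, hidden)
        p_conv = torch.sigmoid(self.out(ctx)).squeeze(-1) # conversion probability
        return p_conv, alpha                              # alpha ~ funnel-stage signal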

Posted Content
TL;DR: A novel method, Clinical Temporal ReLation Extraction with Probabilistic Soft Logic Regularization and Global Inference (CTRL-PG), is proposed to tackle the problem at the document level and significantly outperforms baseline methods for temporal relation extraction.
Abstract: There has been a steady need in the medical community to precisely extract the temporal relations between clinical events. In particular, temporal information can facilitate a variety of downstream applications such as case report retrieval and medical question answering. Existing methods either require expensive feature engineering or are incapable of modeling the global relational dependencies among the events. In this paper, we propose a novel method, Clinical Temporal ReLation Extraction with Probabilistic Soft Logic Regularization and Global Inference (CTRL-PG), to tackle the problem at the document level. Extensive experiments on two benchmark datasets, I2B2-2012 and TB-Dense, demonstrate that CTRL-PG significantly outperforms baseline methods for temporal relation extraction.

24 citations
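
One way to read "probabilistic soft logic regularization" here is as a differentiable penalty that enforces temporal rules, such as transitivity of BEFORE, on the model's predicted probabilities. The sketch below shows a generic Łukasiewicz-style relaxation of one such rule; the paper's actual rule set, relaxation, and weighting may differ.

# Sketch of a PSL-style transitivity regularizer on predicted relation
# probabilities (illustrative; the paper's actual rules and weights differ).
import torch

def transitivity_penalty(p_ab, p_bc, p_ac):
    """p_xy: predicted probability that event x is BEFORE event y.
    Lukasiewicz relaxation of BEFORE(a,b) AND BEFORE(b,c) -> BEFORE(a,c):
    the rule's distance to satisfaction is max(0, p_ab + p_bc - 1 - p_ac)."""
    body = torch.clamp(p_ab + p_bc - 1.0, min=0.0)   # soft conjunction of the body
    return torch.clamp(body - p_ac, min=0.0)          # penalty if the head is weaker

# total loss = cross-entropy on labeled pairs + lambda * mean transitivity penalty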


Cited by
Posted Content
TL;DR: This survey investigates different real-world applications that have shown biases in various ways and creates a taxonomy of the fairness definitions that machine learning researchers have proposed to avoid the existing bias in AI systems.
Abstract: With the widespread use of AI systems and applications in our everyday lives, it is important to take fairness issues into consideration while designing and engineering these types of systems. Such systems can be used in many sensitive environments to make important and life-changing decisions; thus, it is crucial to ensure that the decisions do not reflect discriminatory behavior toward certain groups or populations. We have recently seen work in machine learning, natural language processing, and deep learning that addresses such challenges in different subdomains. With the commercialization of these systems, researchers are becoming aware of the biases that these applications can contain and have attempted to address them. In this survey we investigated different real-world applications that have shown biases in various ways, and we listed different sources of biases that can affect AI applications. We then created a taxonomy for fairness definitions that machine learning researchers have defined in order to avoid the existing bias in AI systems. In addition to that, we examined different domains and subdomains in AI showing what researchers have observed with regard to unfair outcomes in the state-of-the-art methods and how they have tried to address them. There are still many future directions and solutions that can be taken to mitigate the problem of bias in AI systems. We are hoping that this survey will motivate researchers to tackle these issues in the near future by observing existing work in their respective fields.

1,571 citations

Journal ArticleDOI
TL;DR: In this article, the authors present a taxonomy for fairness definitions that machine learning researchers have defined to avoid the existing bias in AI systems and examine different domains and subdomains in AI showing what researchers have observed with regard to unfair outcomes in the state-of-the-art methods and ways they have tried to address them.
Abstract: With the widespread use of artificial intelligence (AI) systems and applications in our everyday lives, accounting for fairness has gained significant importance in designing and engineering of such systems. AI systems can be used in many sensitive environments to make important and life-changing decisions; thus, it is crucial to ensure that these decisions do not reflect discriminatory behavior toward certain groups or populations. More recently some work has been developed in traditional machine learning and deep learning that address such challenges in different subdomains. With the commercialization of these systems, researchers are becoming more aware of the biases that these applications can contain and are attempting to address them. In this survey, we investigated different real-world applications that have shown biases in various ways, and we listed different sources of biases that can affect AI applications. We then created a taxonomy for fairness definitions that machine learning researchers have defined to avoid the existing bias in AI systems. In addition to that, we examined different domains and subdomains in AI showing what researchers have observed with regard to unfair outcomes in the state-of-the-art methods and ways they have tried to address them. There are still many future directions and solutions that can be taken to mitigate the problem of bias in AI systems. We are hoping that this survey will motivate researchers to tackle these issues in the near future by observing existing work in their respective fields.

899 citations

Posted Content
TL;DR: The authors survey 146 papers analyzing "bias" in NLP systems, finding that their motivations are often vague, inconsistent, and lacking in normative reasoning, despite the fact that analyzing bias is an inherently normative process.
Abstract: We survey 146 papers analyzing "bias" in NLP systems, finding that their motivations are often vague, inconsistent, and lacking in normative reasoning, despite the fact that analyzing "bias" is an inherently normative process. We further find that these papers' proposed quantitative techniques for measuring or mitigating "bias" are poorly matched to their motivations and do not engage with the relevant literature outside of NLP. Based on these findings, we describe the beginnings of a path forward by proposing three recommendations that should guide work analyzing "bias" in NLP systems. These recommendations rest on a greater recognition of the relationships between language and social hierarchies, encouraging researchers and practitioners to articulate their conceptualizations of "bias"---i.e., what kinds of system behaviors are harmful, in what ways, to whom, and why, as well as the normative reasoning underlying these statements---and to center work around the lived experiences of members of communities affected by NLP systems, while interrogating and reimagining the power relations between technologists and such communities.

465 citations

Proceedings ArticleDOI
01 Oct 2019
TL;DR: It is shown that trained models significantly amplify the association of target labels with gender beyond what one would expect from biased datasets, and an adversarial approach is adopted to remove unwanted features corresponding to protected variables from intermediate representations in a deep neural network.
Abstract: In this work, we present a framework to measure and mitigate intrinsic biases with respect to protected variables –such as gender– in visual recognition tasks. We show that trained models significantly amplify the association of target labels with gender beyond what one would expect from biased datasets. Surprisingly, we show that even when datasets are balanced such that each label co-occurs equally with each gender, learned models amplify the association between labels and gender, as much as if data had not been balanced! To mitigate this, we adopt an adversarial approach to remove unwanted features corresponding to protected variables from intermediate representations in a deep neural network – and provide a detailed analysis of its effectiveness. Experiments on two datasets: the COCO dataset (objects), and the imSitu dataset (actions), show reductions in gender bias amplification while maintaining most of the accuracy of the original models.

335 citations
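
The adversarial approach described above can be sketched with a gradient-reversal layer: an auxiliary head tries to predict the protected attribute from an intermediate representation, while reversed gradients push the encoder to discard that information. The module names and the binary-gender adversary below are illustrative assumptions, not the paper's exact implementation.

# Sketch of adversarial removal of a protected attribute from an intermediate
# representation via gradient reversal (illustrative; not the paper's exact setup).
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)
    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None

class DebiasedClassifier(nn.Module):
    def __init__(self, encoder, feat_dim, n_labels, lam=1.0):
        super().__init__()
        self.encoder = encoder                    # produces intermediate features
        self.task_head = nn.Linear(feat_dim, n_labels)
        self.adv_head = nn.Linear(feat_dim, 2)    # adversary predicts gender
        self.lam = lam

    def forward(self, x):
        z = self.encoder(x)
        y_task = self.task_head(z)
        # adversary sees z, but reversed gradients push z to hide gender
        y_adv = self.adv_head(GradReverse.apply(z, self.lam))
        return y_task, y_adv

# training: minimize task loss(y_task) + adversary loss(y_adv);
# the reversal layer makes the encoder maximize the adversary's loss.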

Proceedings ArticleDOI
01 Jul 2019
TL;DR: This paper discusses gender bias based on four forms of representation bias, analyzes methods for recognizing gender bias in NLP, and reviews the advantages and drawbacks of existing gender debiasing methods.
Abstract: As Natural Language Processing (NLP) and Machine Learning (ML) tools rise in popularity, it becomes increasingly vital to recognize the role they play in shaping societal biases and stereotypes. Although NLP models have shown success in modeling various applications, they propagate and may even amplify gender bias found in text corpora. While the study of bias in artificial intelligence is not new, methods to mitigate gender bias in NLP are relatively nascent. In this paper, we review contemporary studies on recognizing and mitigating gender bias in NLP. We discuss gender bias based on four forms of representation bias and analyze methods recognizing gender bias. Furthermore, we discuss the advantages and drawbacks of existing gender debiasing methods. Finally, we discuss future studies for recognizing and mitigating gender bias in NLP.

327 citations