DISCUSSION PAPER SERIES
IZA DP No. 11000
Friederike Mengel
Jan Sauermann
Ulf Zölitz
Gender Bias in Teaching Evaluations
SEPTEMBER 2017
Any opinions expressed in this paper are those of the author(s) and not those of IZA. Research published in this series may
include views on policy, but IZA takes no institutional policy positions. The IZA research network is committed to the IZA
Guiding Principles of Research Integrity.
The IZA Institute of Labor Economics is an independent economic research institute that conducts research in labor economics
and offers evidence-based policy advice on labor market issues. Supported by the Deutsche Post Foundation, IZA runs the
world’s largest network of economists, whose research aims to provide answers to the global labor market challenges of our
time. Our key objective is to build bridges between academic research, policymakers and society.
IZA Discussion Papers often represent preliminary work and are circulated to encourage discussion. Citation of such a paper
should account for its provisional character. A revised version may be available directly from the author.
Schaumburg-Lippe-Straße 5–9
53113 Bonn, Germany
Phone: +49-228-3894-0
Email: publications@iza.org www.iza.org
IZA – Institute of Labor Economics
Friederike Mengel
University of Essex and Lund University
Jan Sauermann
SOFI, Stockholm University, CCP, IZA and ROA
Ulf Zölitz
briq and IZA
ABSTRACT
Gender Bias in Teaching Evaluations*
This paper provides new evidence on gender bias in teaching evaluations. We exploit a
quasi-experimental dataset of 19,952 student evaluations of university faculty in a context
where students are randomly allocated to female or male instructors. Despite the fact that
neither students’ grades nor self-study hours are affected by the instructor’s gender, we find
that women receive systematically lower teaching evaluations than their male colleagues.
This bias is driven by male students’ evaluations, is larger for mathematical courses and
particularly pronounced for junior women. The gender bias in teaching evaluations we
document may have direct as well as indirect effects on the career progression of women
by affecting junior women’s confidence and through the reallocation of instructor resources
away from research and towards teaching.
JEL Classification: J16, J71, I23, J45
Keywords: gender bias, teaching evaluations, female faculty
Corresponding author:
Ulf Zölitz
Behavior and Inequality Research Institute (briq)
Schaumburg-Lippe-Str. 5-9
53113 Bonn
Germany
E-mail: ulf.zoelitz@briq-institute.org
* We thank Elena Cettolin, Kathie Coffman, Patricio Dalton, Luise Görges, Nabanita Datta Gupta, Charles Nouissar,
Björn Öckert, Anna Piil Damm, Robert Dur, Louis Raes, Daniele Paserman, three anonymous reviewers and seminar
participants in Stockholm, Tilburg, Nuremberg, Uppsala, Aarhus, the BGSE Summer Forum in Barcelona, the EALE/
SOLE conference in Montreal, the AEA meetings in San Francisco and the IZA reading group in Bonn for helpful
comments. We thank Sophia Wagner for providing excellent research assistance. Friederike Mengel thanks the Dutch
Science Foundation (NWO Veni grant 016.125.040) for financial support. Jan Sauermann thanks the Jan Wallanders
och Tom Hedelius Stiftelse for financial support (Grant number I2011-0345:1). The Online Appendix can be found on
the authors’ websites.
1 Introduction
Why are there so few female professors? Although the fraction of women
enrolling in graduate programs has steadily increased over recent decades,
the proportion of women who continue their careers in academia remains low.
Potential explanations for the much-debated question of why some academic
fields are so male dominated include differences in preferences (e.g.,
competitiveness), differences in child rearing responsibilities, and
gender discrimination.¹
One frequently used criterion for assessing faculty performance in academia
is student evaluations. In the competitive world of academia, these
teaching evaluations are often part of hiring, tenure and promotion decisions
and, thus, have a strong impact on career progression. Feedback from teaching
evaluations could also affect the confidence and beliefs of young academics and
may lead to a reallocation of scarce resources from research to teaching. This
reallocation of resources may in turn lead to research output of lower
quantity or quality.²
¹ The “leaking pipeline” in Economics is summarized by McElroy (2016), who reports
that in 2015 35% of new PhDs were female, 28% of assistant professors, 24% of tenured
associate professors and 12% of full professors. Similar results can be found in Kahn (1993),
Broder (1993), McDowell et al. (1999), European Commission (2009), or National Science
Foundation (2009). Possible explanations for these gender differences in labor market
outcomes are discussed by Heilman and Chen (2005), Croson and Gneezy (2009), Lalanne and
Seabright (2011), Hederos Eriksson and Sandberg (2012), Hernández-Arenaz and Iriberri
(2016) or Leibbrandt and List (2015), among others.
² Indeed, there is evidence that female university faculty allocate more time to teaching
compared to men (Link et al. 2008). Such reallocations of resources away from research can
be detrimental for women with both research and teaching contracts. For instructors with
teaching-only contracts the direct effects on promotion and tenure are likely to be even more
substantial.
In this paper we investigate whether there is a gender bias in university
teaching evaluations. Gender bias exists if women and men receive different
evaluations which cannot be explained by objective differences in teaching
quality. We exploit a quasi-experimental dataset of 19,952 evaluations of
instructors at Maastricht University in the Netherlands. To identify causal
effects, we exploit the institutional feature that within each course students are
randomly assigned to either female or male section instructors.³ In addition to
students’ subjective evaluations of their instructors’ performance, our dataset
also contains students’ course grades, which are mostly based on centralized
exams and are usually not graded by the section instructors whose evaluation
we are analyzing. This provides us with an objective measure of the
instructors’ performance. Furthermore, we observe a measure of effort, namely the
self-reported number of hours students spent studying for the course, which
allows us to test if students adjust their effort in response to female instructors.
Our results show that female faculty receive systematically lower teaching
evaluations than their male colleagues despite the fact that neither students’
current or future grades nor their study hours are affected by the gender of
the instructor. The lower teaching evaluations of female faculty stem mostly
from male students, who evaluate their female instructors 21% of a standard
deviation worse than their male instructors. Female students, by contrast,
rate female instructors about 8% of a standard deviation lower than male
instructors.
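To make the unit "percent of a standard deviation" concrete, the following is a minimal sketch of how a within-course gap between female and male instructors could be expressed in standard deviation units. It is an illustration only and not the estimation procedure used in this paper: the function name `standardized_gap`, the record layout, and the toy scores are all hypothetical and invented for this example.

```python
from statistics import mean, pstdev

def standardized_gap(records):
    """Female-minus-male gap in evaluation scores, in SD units.

    records: iterable of (course, instructor_is_female, score) tuples.
    Scores are demeaned within each course, so only differences between
    instructors teaching the same course contribute to the gap.
    (Hypothetical helper for illustration, not the authors' estimator.)
    """
    records = list(records)

    # Collect scores by course and compute each course's mean score.
    by_course = {}
    for course, _, score in records:
        by_course.setdefault(course, []).append(score)
    course_mean = {c: mean(s) for c, s in by_course.items()}

    # Subtract the course mean from every score.
    demeaned = [(female, score - course_mean[course])
                for course, female, score in records]
    female = [d for f, d in demeaned if f]
    male = [d for f, d in demeaned if not f]

    # Scale by the standard deviation of the raw scores.
    sd = pstdev([score for _, _, score in records])
    if sd == 0:
        raise ValueError("scores have zero variance; gap is undefined")
    return (mean(female) - mean(male)) / sd

# Invented toy data: two courses, each taught by female and male instructors.
toy_records = [
    ("A", True, 3), ("A", True, 4), ("A", False, 4), ("A", False, 5),
    ("B", True, 2), ("B", True, 3), ("B", False, 3), ("B", False, 4),
]
gap = standardized_gap(toy_records)
print(round(gap, 2))
```

A negative value means that female instructors receive lower scores than male instructors in the same course; a value of -0.21 would correspond to a gap of 21% of a standard deviation.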
³ Throughout this paper, we use the term instructor to describe all types of teachers
(students, PhD students, post-docs, assistant, associate and full professors) who are teaching
groups of students (sections) as part of a larger course.
When testing whether results differ by seniority, we find the effects to be
driven by junior instructors, particularly PhD students: female PhD students
receive teaching evaluations 28% of a standard deviation lower than their male
colleagues. Interestingly, we do not observe this gender bias for more senior
female instructors such as lecturers or professors. We do find, however, that the gender