Eric A. Hanushek
Margaret E. Raymond
Does School Accountability
Lead to Improved Student
Performance?
Journal of Policy Analysis and Management, Vol. 24, No. 2, 297–327 (2005)
© 2005 by the Association for Public Policy Analysis and Management
Published by Wiley Periodicals, Inc. Published online in Wiley InterScience (www.interscience.wiley.com)
DOI: 10.1002/pam.20091
Manuscript received May 2004; review complete August 2004; revision complete September 2004; accepted September
2004.
Abstract
The leading school reform policy in the United States revolves around strong
accountability of schools with consequences for performance. The federal govern-
ment’s involvement through the No Child Left Behind Act of 2001 reinforces the
prior movement of many states toward policies based on measured student
achievement. Analysis of state achievement growth as measured by the National
Assessment of Educational Progress shows that accountability systems introduced
during the 1990s had a clear positive impact on student achievement. This single
policy instrument did not, however, also lead to any narrowing in the Black-White
achievement gap (though it did narrow the Hispanic-White achievement gap).
Moreover, the Black-White gap appears to have been adversely impacted over the
decade by increasing minority concentrations in the schools. An additional issue
surrounding stronger accountability has been a concern about unintended out-
comes related to such things as higher exclusion rates from testing, increased drop-
out rates, and the like. Our analysis of special education placement rates, a fre-
quently identified area of concern, does not show any responsiveness to the
introduction of accountability systems.
The cornerstone of current federal educational policy has been expansion of school
accountability based on measured student test performance. Although many states
had already installed accountability systems by 2000, a central campaign theme of
George W. Bush was to expand this to all states, something that became a reality
with the No Child Left Behind Act of 2001 (NCLB). The policy has been controver-
sial for a variety of reasons, leading to assertions that it has distorted schools in
undesirable ways, that it has led to gaming and unintended outcomes, and that it
has not and will not accomplish its objectives of improving student achievement.
This paper provides evidence on the expected effects of NCLB not only on student
performance but also on other potential outcomes.
The landmark NCLB codified a developing policy view that standards, testing, and
accountability were the path to improved performance. Much of earlier educational
policy, both at the federal and state level, concentrated on providing greater
resources—especially for the education of disadvantaged students. But student out-
comes proved noticeably impervious to these policy initiatives. As a result, federal
policy made a distinct shift in focus to emphasizing performance objectives and
outcomes rather than school inputs.
1
It is nonetheless not possible to investigate the impact of NCLB directly. First, and
most importantly, the majority of states had already instituted some sort of
accountability system by the time the federal law took effect. Although only 12
states had accountability systems at the school level in 1996, 39 states did so by
2000. Thus, there is no ready comparison group that can indicate what might have
happened without any law. Second, the law has many facets, making it hard to iso-
late the effects of any single one. Finally, the common pace of NCLB implementa-
tion across the states eliminates any status quo alternatives for comparison.
Isolating the impact of state accountability policies is inherently difficult.
Because accountability invariably applies to entire states at an instant in time, the
variation across schools within a state cannot be employed to identify the impacts
of accountability; it is necessary to rely on state-level variation in student outcomes.
Yet, states differ not only in their accountability policies but also in a variety of other
ways involving both population characteristics and other school policies. If these
are not accounted for, they are likely to contaminate the estimates of the states’
accountability systems.
Our approach uses information about state differences in mathematics and read-
ing performance as identified by the National Assessment of Educational Progress
(NAEP). We pursue a number of strategies designed to isolate the effects of school
accountability on performance. First, we look at growth in performance between
fourth and eighth grades to eliminate fixed differences in circumstances and poli-
cies of each state. Second, we include explicit measures for major categories of time
varying inputs: parental education, school spending, and racial exposure in the
schools. Third, we estimate the growth models with state fixed effects to eliminate
any other policies that lead to trends up or down in student performance in each
state. Finally, to identify differences by race or ethnicity, we disaggregate the state
results for Whites, Blacks, and Hispanics.
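The first strategy, measuring growth for a cohort between fourth and eighth grade, can be sketched in a few lines. The following is a minimal illustration only: the state names, years, and scores are made up, and the helper names `cohort_growth` and `shift_state` are hypothetical rather than anything used in the analysis. It shows that a fixed state difference that raises scores equally at both grades drops out of the growth measure.

```python
# Hypothetical state-average scores keyed by (state, grade, year); all values are made up.
scores = {
    ("State A", 4, 1996): 220.0,
    ("State A", 8, 2000): 270.0,
    ("State B", 4, 1996): 231.0,
    ("State B", 8, 2000): 275.0,
}


def cohort_growth(table, state, start_year, years_between=4):
    """Grade 8 score minus the same cohort's grade 4 score, or None if either is missing."""
    early = table.get((state, 4, start_year))
    late = table.get((state, 8, start_year + years_between))
    if early is None or late is None:
        return None
    return late - early


def shift_state(table, state, offset):
    """Add a constant offset to every score of one state (a fixed state difference)."""
    return {k: (v + offset if k[0] == state else v) for k, v in table.items()}


growth_a = cohort_growth(scores, "State A", 1996)
# Raising all of State A's scores by the same amount leaves its growth unchanged.
growth_a_shifted = cohort_growth(shift_state(scores, "State A", 15.0), "State A", 1996)
print(growth_a, growth_a_shifted)
```

Differencing of this kind removes only influences that are constant across the two grades; inputs that change over time still have to be handled separately, which is the role of the explicit controls and state fixed effects described above.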
We find that the introduction of accountability systems into a state tends to lead
to larger achievement growth than would have occurred without accountability.
The analysis, however, indicates that just reporting results has minimal impact on
student performance and that the force of accountability comes from attaching
consequences such as monetary awards or takeover threats to school performance.
This finding supports the contested provisions of NCLB that impose sanctions on
failing schools.
Much of the explicit interest in accountability and the federal legislation, however,
focuses on low achievers. And, given the generally lower achievement by minority
groups, an implicit assumption is that accountability (as revealed through mandatory
disaggregation of performance for racial and ethnic groups) will simultaneously
close the large racial/ethnic achievement gaps along with improving overall
performance. When we look specifically at the performance of subgroups, we find that
Hispanic students gain most from accountability while Blacks gain least.
Since the widespread introduction of accountability, a parallel interest has
been whether more rigorous and consequential accountability also leads to other,
less desirable impacts. For example, does accountability lead to increased cheating,
more classifications of students as “special education,” or undesirable narrowing
of teaching? To address a subset of these issues, we analyze the rate of
placement into special education across states but find no evidence of reaction
in this dimension.
1
This switch to concentration on outcomes is often labeled the “standards movement.” See Smith and
O’Day (1990) for an early discussion of the precepts.
RELEVANT STRANDS OF LITERATURE
Any consideration of state accountability systems must recognize the multitude of
potential influences on student outcomes. The scientific challenge lies in separating
the influence of accountability from these other factors.
The vast production function literature on variations in student performance
provides a general backdrop for the analysis of achievement. This literature, dat-
ing from the Coleman Report (Coleman et al., 1966) and still being developed
today, suggests significant differences in student achievement based on both fam-
ily background and on schools (Hanushek, 2002).
2
A variety of controversies
exists, particularly about the impact of various school resources (see Hanushek,
2003a), but without going into detail about these it is sufficient to conclude that
there is a lack of consensus that any specific measures of school characteristics
adequately capture the relevant factors determining student performance. Simi-
lar ambiguities exist when considering the measurement of family influences,
even if there is strong consensus that families are very important in determining
achievement. This lack of consensus on the appropriate specification of the
determinants of student achievement motivates the analytical approach
described below.
Throughout the study of schools and achievement, considerable attention has
gone to the distribution of outcomes, and especially racial aspects of schooling. As
famously highlighted more than 50 years ago by Brown v. Board of Education, the
racial composition of schools may be relevant to achievement. The Coleman Report
itself was legislatively mandated in the Civil Rights Act of 1964 and spawned atten-
tion to the racial composition of schools (U.S. Commission on Civil Rights, 1967).
Although most of the subsequent analysis flowing from Brown has related directly
to the desegregation of schools (for example, Armor, 1995; Rossell, Armor, & Wal-
berg, 2002), recent attention has turned more to issues related to the composition
of schools.
Separating the effects of the racial composition of schools from other factors is
clearly difficult, in large part because measurement errors for other school and fam-
ily factors are likely to be correlated with racial composition. The analysis of
Hanushek, Kain, and Rivkin (2002) approaches the problem through a generalized
peer analysis that controls for family, school, and neighborhood effects through
exploiting the rich longitudinal data from stacked panel data on student perform-
ance in Texas. That analysis suggests that an increased Black concentration in
schools has a detrimental effect on Black achievement, although racial composition
does not seem to affect either Whites or Hispanics. This consideration is particularly
important given recent concern that racial concentration in the schools has been rising.
Partly because court supervision over school racial patterns is ending but more
importantly because White attendance in large urban systems has decreased, minority
concentration has grown throughout the 1990s (Orfield & Eaton, 1996; Clotfelter, 2004).
3
Thus, racial composition of schools may interact with efforts to improve
schools in ways that policy designers have not yet understood.
2
Much of this literature is reviewed elsewhere. Here we merely identify sources both of basic analysis
and of extended bibliographies on the relevant issues.
Each of these influences is embedded within school systems across the states that
are pursuing a variety of policy reforms. The difficulty is that these other reforms
are neither well specified nor readily measured, leading to considerable difficulty in
adequately differentiating the relevant components (Hanushek, 2002). Moreover, as
we look forward to an analysis of state level data, we know the potential damage of
missing key ingredients to performance is amplified with aggregate data
(Hanushek, Rivkin, & Taylor, 1996).
The final strand of relevant literature pertains to accountability itself. Although a
recent policy effort, policies related to accountability have already become quite
controversial—rising to the level of front page stories in the New York Times (Win-
ter, 2002). Much of the work is very new and has not appeared in journals yet. The
available studies generally support the view that accountability has had a positive
effect on student outcomes, although the limited observations introduce some
uncertainty (Carnoy & Loeb, 2002; Hanushek & Raymond, 2003b; Jacob, 2003;
Peterson & West, 2003).
4
The existing analyses of accountability and state differ-
ences in performance (Carnoy & Loeb, 2002; Hanushek & Raymond, 2003b), which
are closely related to our analysis here, rely on more limited NAEP samples (with
both stopping in 2000). The available data constrain the possible analyses, leading
to serious questions about the strategies employed to isolate separate effects. The
analysis of Carnoy and Loeb (2002) attempts to identify selection effects in the
introduction of accountability through a series of separate cross-sectional regres-
sions for each ethnic group. It identifies accountability effects by relating an index
of different components of accountability systems to changes of math scores of
eighth-graders (or fourth-graders) between 1996 and 2000, but the small sample
sizes (25–37 states) limit the ability to control for other possible influences on
achievement. Hanushek and Raymond (2003b) evaluate overall state math per-
formance but consider growth in scores between fourth and eighth grade for spe-
cific cohorts. They employ larger samples by combining data from the different
testing periods of the 1990s (see below) and introduce information about how long
accountability had been in place. Nonetheless, both analyses are subject to bias
from other omitted state changes or state educational policies and stand in contrast
to the work here that identifies effects from the changes within states that occur
from the introduction of accountability. Our extended analysis reported below
expands the sample with newly available testing data, introduces state fixed effects
to deal with unmeasured inputs and policies, and follows achievement over time for
Whites, Blacks, and Hispanics separately. These innovations permit much clearer
identification of accountability impacts along with providing details about impacts
on the different ethnic groups.
A larger body of work has concentrated on whether or not accountability has pro-
duced gaming and subsequent unintended outcomes. This available work, reviewed
in Hanushek and Raymond (2003b), tends to suggest some immediate reactions to
accountability in terms of focusing teaching on relevant subjects or even relevant
students near performance cutoffs; of increased exclusions from tests; of explicit
cheating on tests; and of like attempts to improve scores in ways other than improving
student learning. Nonetheless, as we return to below, little analysis provides
information on the longer run outcomes of this nature.
3
The increased racial concentrations in schools also occur at a time when residential segregation has
generally declined; see Iceland and Weinberg (2002).
4
Some variation also comes from analytical methods; see Amrein and Berliner (2002) and the analysis
in Raymond and Hanushek (2003).
STRATEGIES FOR DEALING WITH THE ANALYTICAL DIFFICULTIES
Analyzing the effects of accountability on student performance is difficult. Because
accountability systems are introduced across entire states, all local school districts
in a state face a common incentive structure. Thus, the only possible variation comes
from interstate differences in accountability. As noted above, however, states also
differ in ways other than accountability, and past research has not been very
informative about those differences. Because little progress has been made in
describing explicitly the different policies, regulations, and incentives that might
be important in determining student performance, statistical estimates of the
effect of accountability will be biased.
Fundamental educational policy is made at the state level and involves a wide
range of factors, including financial structure, collective bargaining rules and laws,
explicit regulations on educational processes, curricular specification, and so forth.
The analytical complications are immediately apparent.
Consider a simple model of achievement such as:
\[ O_{st} = f(X_{st}, R_{st}, \rho_s) \tag{1} \]
where O is the level of student outcomes in state s at time t, X is a vector of family
and nonschool inputs, R is a vector of school resources, and ρ captures the policies
of the state.
5
It is not possible to understand the impact of newly introduced
accountability systems without considering the range of other factors influencing
achievement.
A linearized version of this model is simply:
\[ O_{st} = \beta_0 + \beta_X X_{st} + \beta_R R_{st} + (\rho_s + \varepsilon_{st}) \tag{2} \]
where the β’s are unknown parameters of the educational process.
6
If, however, ρ is
not observed and the β’s are estimated with just information on X and R, correla-
tions with ρ obviously lead to bias in the estimation. When background factors (X)
and/or school resources (R) are correlated with state policies (ρ), these variables
will partially proxy for the other policies—leading to incorrect inferences about
what would happen if just X or R changed.
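The size and direction of this bias can be made concrete in a simplified case. As an illustrative sketch only, assume a single scalar input X, no R term, and an error ε that is uncorrelated with X; these simplifications are ours and are not part of the model above. Regressing O on X alone then gives, in large samples:

```latex
\[
\operatorname{plim}\,\hat{\beta}_X
  = \beta_X + \frac{\operatorname{Cov}(X_{st}, \rho_s)}{\operatorname{Var}(X_{st})}
\]
```

The second term vanishes only when X is uncorrelated with the unobserved state policies ρ. When states with more favorable policies also have higher values of X, the estimate overstates the effect of changing X alone.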
Now consider just adding A, a measure of whether or not accountability affects
incentives and thus student performance.
\[ O_{st} = \beta_0 + \beta_X X_{st} + \beta_R R_{st} + \gamma A_{st} + (\rho_s + \varepsilon_{st}) \tag{3} \]
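A small numerical sketch may help show why the unobserved ρ matters for γ and why comparisons within states can address it. The example below is hypothetical throughout: it drops X, R, and ε from equation (3), uses made-up values for γ, ρ, and the accountability indicator A, and the helper names are ours. It compares a pooled regression of O on A with one computed after subtracting each state's own means.

```python
# Hypothetical illustration of equation (3) with no X, R, or error terms.
GAMMA = 3.0  # assumed true accountability effect (made up)

# Made-up state effects rho_s and accountability indicators A_st over three periods.
# States with larger rho_s are assumed to adopt accountability earlier.
rho = {"s0": 0.0, "s1": 2.0, "s2": 4.0, "s3": 6.0}
acc = {"s0": [0, 0, 0], "s1": [0, 0, 1], "s2": [0, 1, 1], "s3": [1, 1, 1]}


def mean(values):
    return sum(values) / len(values)


def slope(x, y):
    """OLS slope of y on x (with an intercept) for a single regressor."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def demean(values):
    """Subtract the series mean, i.e. remove anything constant within a state."""
    m = mean(values)
    return [v - m for v in values]


pooled_a, pooled_o, within_a, within_o = [], [], [], []
for state, a_series in acc.items():
    o_series = [GAMMA * a + rho[state] for a in a_series]
    pooled_a += a_series
    pooled_o += o_series
    within_a += demean(a_series)
    within_o += demean(o_series)

pooled_gamma = slope(pooled_a, pooled_o)   # mixes gamma with differences in rho_s
within_gamma = slope(within_a, within_o)   # uses only changes within each state
print(f"pooled: {pooled_gamma:.3f}, within-state: {within_gamma:.3f}")
```

In this constructed case the pooled estimate overstates γ because early adopters also have higher ρ, while the within-state estimate recovers the assumed value. States whose accountability status never changes (here `s0` and `s3`) contribute nothing to the within-state estimate.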
5
It does not matter for this discussion that we begin with aggregate outcomes for a state instead of build-
ing up from the individual student level (where the outcomes are presumably generated). The more gen-
eral situation is discussed and developed in Hanushek, Rivkin, and Taylor (1996). Where the aggregation
is important, we discuss the implications.
6
The linear form is not particularly crucial but simply makes the exposition easier. An alternative model
where policies act as an efficiency parameter affecting the impact of resources is developed in Hanushek
and Somers (2001). Within the limited data for this study, however, it is virtually impossible to distin-
guish between the alternative models. The results of estimating the alternative form, discussed below,
are qualitatively very close to the included estimates.