
Accountability, incentives and behavior: the impact of high-stakes testing in the Chicago Public Schools

01 Jun 2005-Journal of Public Economics (North-Holland)-Vol. 89, Iss: 5, pp 761-796
TL;DR: The author examined the impact of an accountability policy implemented in the Chicago Public Schools in 1996-1997, using a panel of student-level administrative data, and found that math and reading achievement increased sharply following the introduction of the policy, in comparison both to prior achievement trends in the district and to changes experienced by other large, urban districts.
About: This article is published in Journal of Public Economics. The article was published on 2005-06-01 and is currently open access. It has received 554 citations to date. The article focuses on the topics: Accountability & Special education.

Summary

1. Introduction

  • If HST increased the general skill level, observed achievement gains should be reflected in other measures of student outcomes.
  • By placing low-performing students in special education programs, teachers are able to exempt them from most [...]
  • (Footnote 2) Achievement gains may also be due to increases in cheating on the part of students, teachers or administrators.
  • This paper addresses these questions in the context of a test-based accountability policy that was implemented in Chicago Public Schools in 1996-97.
  • On the one hand, the results provide strong empirical support for general incentive theories, including the multi-task theories of Holmstrom and Milgrom (1991).

2. Background

  • The evidence on school-based accountability programs and student performance is decidedly mixed.
  • Several studies note that Texas students have made substantial achievement gains since the implementation of that state’s accountability program (Grissmer and Flanagan 1998, Grissmer et al. 2000, Haney 2000, Klein et al. 2000, Toenjes et al. 2000, Deere and Strayer 2001).
  • Koretz and Barron (1998) find survey evidence that elementary teachers in Kentucky shifted the amount of time devoted to math and science across grades to correspond with the subjects tested in each grade.
  • Various studies suggest that test preparation associated with high-stakes testing may artificially inflate achievement, producing gains that are not generalizable to other exams (Linn and Graue 1990, Shepard 1990, Koretz et al. 1991, Koretz and Barron 1998, Stecher and Barron 1998, Klein et al. 2000).

2.2 High-Stakes Testing in Chicago

  • In 1996 the ChiPS introduced a comprehensive accountability policy designed to raise academic achievement.
  • The first component of the policy focused on holding students accountable for learning, by ending a practice commonly known as “social promotion” whereby students are advanced to the next grade regardless of ability or achievement level.
  • Students who again fail to meet the standard are required to repeat the grade, with the exception of 15-year-olds who attend newly created “transition” centers.
  • [...] the same whether one considers the eighth grade policy to have been implemented in 1996 or 1997.

3. Empirical strategy

  • Because Chicago instituted its accountability policy district-wide in 1996-97, it is difficult to identify the causal impact of the program with certainty.
  • Similarly, improvements in the economy or other time-varying factors coincident with the policy would bias the estimates.
  • Finally, one might be worried about other policies or programs in Chicago whose impact was felt at the same time as HST, so that $\mathrm{Cov}(\mathit{HighStakes}, \phi) \neq 0$.
  • This is essentially a difference-in-difference estimator where the first difference is a within student change over time and the second difference is a district-wide change from pre-policy to post-policy.
  • One might be particularly concerned about unobservable changes at the state or national level affecting student performance (e.g., implementation of state or federal school reform legislation).
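
The difference-in-difference logic described in this section can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's estimation code: the record layout (`prior`, `current`, `post`), the function name `did_estimate`, and the toy numbers are all hypothetical, and the sketch leaves out the covariates and trend controls that a real specification would include.

```python
from statistics import mean

def did_estimate(records):
    """Difference-in-difference on student test scores.

    First difference: each student's gain (current minus prior score).
    Second difference: mean gain of post-policy cohorts minus mean gain
    of pre-policy cohorts.
    """
    pre_gains = [r["current"] - r["prior"] for r in records if not r["post"]]
    post_gains = [r["current"] - r["prior"] for r in records if r["post"]]
    if not pre_gains or not post_gains:
        raise ValueError("need at least one pre-policy and one post-policy student")
    return mean(post_gains) - mean(pre_gains)

# Toy data invented for illustration; scores are in standard-deviation units.
toy = [
    {"prior": -0.50, "current": -0.40, "post": False},
    {"prior": -0.20, "current": -0.10, "post": False},
    {"prior": -0.50, "current": -0.25, "post": True},
    {"prior": -0.20, "current": 0.05, "post": True},
]
effect = did_estimate(toy)  # post-policy mean gain minus pre-policy mean gain
```

Because both differences are simple means here, anything that changes district-wide at the same time as the policy is absorbed into `effect`, which is exactly the identification concern the bullets above raise.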

4. Data

  • This study utilizes detailed administrative data from the ChiPS.
  • Student records include information on a student’s school, home address, demographic and family background characteristics, special education and bilingual placement, free lunch status, standardized test scores, grade retention and summer school attendance.
  • On the other hand, there was some increase in initial student achievement—e.g., prior reading achievement increased from an average of 0.89 grade equivalents below norms to 0.71 grade equivalents below norms.

5.2 The Heterogeneity of Effects Across Student and School Risk Level

  • If the improvements in student achievement were caused by the accountability policy, one might expect them to vary across students and schools.
  • Model 1 provides the average effect for all students in all of the post-policy cohorts, providing a baseline from which to compare the other results.
  • First, students in low-performing schools seem to have fared considerably better under the policy than comparable peers in higher-performing schools.
  • Moreover, the effect for marginal students appears somewhat stronger in reading than math, suggesting that there may be more intentional targeting of individual students in reading than in math, or that there is greater divisibility in the production of reading achievement.

5.3 Student-Focused versus School-Focused Accountability

  • Unlike most previous accountability systems, high-stakes testing in Chicago provided direct incentives for students as well as teachers.
  • Table 5 presents the policy effects for grades three, six and eight (i.e., promotional gate grades) versus grades four, five and seven (i.e., nongate grades).
  • Finally, it is possible that the first year effects were somewhat anomalous, perhaps because students and teachers were still adjusting to the policy or because the form change that year may have affected grades differentially.
  • Tables available from the author upon request.
  • The 1998 accountability effects are at least twice as large in grades three, six and eight compared with grade five (for example, 0.144 versus 0.067 s.d. gain in math), suggesting that the student accountability provisions may have played a large role in the overall policy in later years.

6. What factors are driving the improvements in performance in Chicago?

  • Even if a positive causal relationship between HST and student achievement can be established, it is important to understand what factors are driving the improvements in performance.
  • Critics of test-based accountability often argue that the primary impact of HST is to increase the time spent on test-specific preparation activities, which could improve test-specific skills at the expense of more general skills.
  • Others argue that test score gains reflect student motivation on the day of the exam.
  • Unfortunately, because such things as effort and test preparation are not directly observable, it is difficult to disentangle the factors underlying the achievement gains in Chicago.
  • This section attempts to shed some light on the factors driving the achievement gains in Chicago, first by comparing student performance across exams and then by examining the ITBS improvements in greater detail.

6.1 The Role of General Skills

  • Even the most comprehensive achievement exam can only cover a fraction of the possible skills and topics within a particular domain.
  • Differences in student effort across exams (or rather changes in student effort) also complicate the comparison of performance trends from one test to another.
  • The data for this analysis are drawn from school “report cards” compiled by the Illinois State Board of Education (ISBE), which provide average IGAP scores by grade and subject as well as background information on schools and districts.
  • To identify the comparison districts, I first identify districts in the top decile in terms of the percent of students receiving free or reduced price lunch, percent minority students, and total enrollment and in the bottom decile in terms of average student achievement (averaged over third, sixth and eighth grade reading and math scores) based on 1990 data.
  • The point estimates indicate that once the author takes into account district-specific pre-existing trends and demographics, HST appears to have a slight negative effect on IGAP achievement in Chicago.
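
The first screening step for comparison districts can be illustrated as a decile filter. This is a hedged sketch only: the field names (`pct_free_lunch`, `pct_minority`, `enrollment`, `achievement`), the function names, and the choice of the "inclusive" quantile method are assumptions made for illustration, and the summary does not say whether further selection steps followed this one.

```python
from statistics import quantiles

def decile_cutoffs(values):
    """Return (bottom-decile cutoff, top-decile cutoff) for a list of numbers."""
    cuts = quantiles(values, n=10, method="inclusive")  # nine cut points
    return cuts[0], cuts[-1]

def comparison_districts(districts):
    """Names of districts in the top decile on all three disadvantage and
    size measures and in the bottom decile on average achievement."""
    high_keys = ("pct_free_lunch", "pct_minority", "enrollment")
    top = {k: decile_cutoffs([d[k] for d in districts])[1] for k in high_keys}
    low_ach = decile_cutoffs([d["achievement"] for d in districts])[0]
    return [
        d["name"]
        for d in districts
        if all(d[k] >= top[k] for k in high_keys) and d["achievement"] <= low_ach
    ]
```

Requiring all four conditions at once is a strict filter, so in real data it would likely return only a handful of large, high-poverty, low-achieving districts.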

6.2 The Role of Specific Skills

  • Based on analysis of teacher survey data, Tepper (2002) concluded that ITBS-specific test preparation and curriculum alignment increased following the introduction of the accountability policy.
  • Column 1 classifies questions into two groups: those testing basic skills such as math computation and number concepts, and those testing more complex skills such as estimation, data interpretation and problem-solving (i.e., word problems).
  • Column 2 separates items into five categories—computation, number concept, data interpretation, estimation and problem-solving— and shows the same pattern.
  • The item difficulty measures are the percentage of students correctly answering the item in a nationally representative sample used by the test publisher to norm the exam.
  • This analysis suggests that test preparation may have played a large role in the math gains, but was perhaps less important in reading improvement.
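
The item-level comparison described above amounts to grouping test items by skill type and averaging the change in percent correct within each group. The sketch below is illustrative only: the dictionary keys, the `gain_by_group` name, and the two-way split into `basic` and `complex` are assumptions based on the categories named in the bullets, not the paper's actual item coding.

```python
from collections import defaultdict

# Categories the summary describes as basic skills; everything else
# (estimation, data interpretation, problem-solving) is treated as complex.
BASIC = {"computation", "number concept"}

def gain_by_group(items):
    """Average change in percent correct for basic versus complex items."""
    sums = defaultdict(lambda: [0.0, 0])  # group -> [total gain, item count]
    for item in items:
        group = "basic" if item["category"] in BASIC else "complex"
        sums[group][0] += item["pct_correct_post"] - item["pct_correct_pre"]
        sums[group][1] += 1
    return {group: total / count for group, (total, count) in sums.items()}
```

A larger average gain in the `basic` group than in the `complex` group would be the pattern the summary attributes to test preparation and curriculum alignment in math.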

6.3 The Role of Effort

  • Student effort is another likely candidate for explaining the large ITBS gains.
  • Test completion is one indicator of effort.
  • This pattern is true even among the lowest achieving students who left the greatest number of items blank prior to the accountability policy.
  • While increased guessing cannot explain a significant portion of the ITBS gains, other forms of effort may play a larger role.
  • Comparing the gain across item position groups, the author sees that 1998 students improved nearly 6.7 percentage points on the final 20 percent of items.
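
The item-position comparison can be sketched as follows: split the exam into equal-sized position groups and compute the average change in percent correct within each. The function name `gain_by_position`, the default of five groups (matching the "final 20 percent" in the text), and the input layout are assumptions made for illustration.

```python
def gain_by_position(pre, post, n_groups=5):
    """Average change in percent correct by item-position group.

    `pre` and `post` are lists of per-item percent-correct values, ordered
    by the item's position on the exam. The last element of the result is
    the gain on the final group of items.
    """
    if len(pre) != len(post) or len(pre) < n_groups:
        raise ValueError("pre and post must have equal length >= n_groups")
    n = len(pre)
    gains = []
    for g in range(n_groups):
        start, stop = g * n // n_groups, (g + 1) * n // n_groups
        diffs = [post[i] - pre[i] for i in range(start, stop)]
        gains.append(sum(diffs) / len(diffs))
    return gains
```

If effort or stamina is driving the improvement, the gains should rise toward the end of the list, since late items are the ones most likely to be skipped or rushed.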

6.4. Summary

  • The improvement in math achievement in Chicago appears to be driven largely by gains in specific skill areas such as math computation that make up a large portion of the ITBS, but are emphasized less on the IGAP.
  • This suggests that teachers aligned their math curriculum to more closely match the content of the high-stakes exam.
  • In reading, ITBS gains were equally distributed across item types, but were considerably larger among questions at the end of the exam.
  • This suggests that student effort or “stamina” played a larger role than test preparation in the observed reading improvements.
  • The fact that IGAP trends did not jump sharply following the introduction of the accountability policy confirms that the ITBS gains were not driven entirely by improvements in general skills.

7. Did educators respond strategically to high-stakes testing?

  • In evaluating the effectiveness of HST, it is important to understand whether teachers and administrators respond strategically to the incentives provided by the accountability policy.
  • Critics of test-based accountability worry about educator responses along a number of dimensions, ranging from changes in the rate of special education placements to substitution away from low-stakes subjects.
  • This section examines several of these issues.

7.1 Low-stakes versus high-stakes subjects

  • Given the consequences attached to test performance in certain subjects, one might expect teachers and students to shift resources and attention toward subjects included in the accountability program.
  • The author can test this theory by comparing trends in math and reading achievement after the introduction of HST with test score trends in social studies and science, subjects that are not included in the Chicago accountability policy.
  • Unfortunately, science and social studies exams are not given in every grade, and the grades in which these exams are given have changed over time.
  • The distribution of effects is also somewhat different for low versus high-stakes subjects.
  • As the author noted earlier, in math and reading, students in low-achieving schools experienced greater gains. However, conditional on school achievement, low-ability students appeared to make only slightly larger gains than their peers.

7.2 Special education placements

  • While the accountability policies in Chicago are designed to increase student achievement, they also create incentives for teachers and administrators to alter the pool of test-takers.
  • The sample only includes third, sixth and eighth grade students from 1994 to 2000 because some special education and reporting data is not available for the 1993 cohort.
  • Figures available from the author upon request.
  • Beginning in 1997, ChiPS began excluding the ITBS scores of students who had been enrolled in bilingual programs for three or fewer years to encourage teachers to test these students for [...]
  • [...] appears that the trend became steeper beginning in 1997, suggesting that the accountability policy may have influenced teacher and administrator behavior.
  • The lowest performing schools increased special education placements for high-risk sixth graders by 50 percent following the introduction of the accountability policy, compared with an increase of roughly 32 percent among moderate-achieving schools and no increase among the highest performing schools.

7.3 Grade retention

  • Another way for teachers to shield low-achieving students from the accountability mandates is to preemptively retain them—that is, hold them back before they enter grade three, six or eight.
  • Roderick et al. (2000) found that retention rates in kindergarten, first and second grades started to rise in 1996 and jumped sharply in 1997 among first and second graders.
  • [...] grade, 2.5 percent in second grade and a little over 1 percent in grades four, five and seven.
  • Retention rates began to increase in 1996, possibly in anticipation of the new standards the students would face in 1997.
  • The bottom panel controls for current achievement, age and special education status as well as demographic variables, thereby accounting for prior retention and giving a better sense of the marginal effect of the policy on the propensity to retain students.

7.4 Sensitivity analysis

  • To test the sensitivity of the findings presented in the previous sections, Table 13 presents comparable estimates for a variety of different specifications and samples.
  • The next three rows show that the results are not sensitive to including students who either were in that grade for the second time (e.g., retained students) or whose test scores were not included for official reporting purposes because of a special education or bilingual classification.
  • This should control for any changes in form difficulty that may confound the results.

8. Conclusions

  • When the federal legislation No Child Left Behind became law earlier this year, high-stakes testing took on a heightened level of importance for students, teachers and parents across the country.
  • If the author makes the conservative assumption that special education rates increased by two percentage points in all grades (mirroring the increases seen in grades three, six and eight), this would translate to an additional expenditure of $40 per pupil.
  • “Comparing State and District Results to National Norms: The Validity of the Claim that 'Everyone is Above Average'.” Educational Measurement: Issues and Practice 9(3): 5-14.
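
The per-pupil cost figure in the conclusions can be unpacked with simple arithmetic. The incremental cost per special education placement, written $c$ below, is not stated in this summary; the value shown is only what the two reported numbers jointly imply, and the symbol is introduced here for illustration.

```latex
% Spreading the extra cost of new placements over all pupils:
%   (increase in placement rate) x (incremental cost per placement)
%     = (additional expenditure per pupil)
\[
  0.02 \times c = \$40
  \quad\Longrightarrow\quad
  c = \frac{\$40}{0.02} = \$2{,}000 .
\]
```

Read this way, the $40 figure corresponds to roughly $2,000 of additional spending for each student newly placed in special education, averaged across every pupil in the district.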



References
Journal ArticleDOI
TL;DR: In this article, a principal-agent model that can explain why employment is sometimes superior to independent contracting even when there are no productive advantages to specific physical or human capital and no financial market imperfections to limit the agent's borrowings is presented.
Abstract: Introduction In the standard economic treatment of the principal–agent problem, compensation systems serve the dual function of allocating risks and rewarding productive work. A tension between these two functions arises when the agent is risk averse, for providing the agent with effective work incentives often forces him to bear unwanted risk. Existing formal models that have analyzed this tension, however, have produced only limited results. It remains a puzzle for this theory that employment contracts so often specify fixed wages and more generally that incentives within firms appear to be so muted, especially compared to those of the market. Also, the models have remained too intractable to effectively address broader organizational issues such as asset ownership, job design, and allocation of authority. In this article, we will analyze a principal–agent model that (i) can account for paying fixed wages even when good, objective output measures are available and agents are highly responsive to incentive pay; (ii) can make recommendations and predictions about ownership patterns even when contracts can take full account of all observable variables and court enforcement is perfect; (iii) can explain why employment is sometimes superior to independent contracting even when there are no productive advantages to specific physical or human capital and no financial market imperfections to limit the agent's borrowings; (iv) can explain bureaucratic constraints; and (v) can shed light on how tasks get allocated to different jobs.

5,678 citations

Journal ArticleDOI
TL;DR: This article estimates the effect of classroom training under the Manpower Development and Training Act (MDTA) on trainee earnings, matching program records to Social Security earnings histories for trainees who started training in the first three months of 1964.
Abstract: Governmental post-schooling training programs have become a permanent fixture of the U.S. economy in the last decade. These programs are typically advocated for diverse reasons: (1) to reduce inflation by the provision of more skilled workers to alleviate shortages, (2) to reduce unemployment of certain groups, and (3) to reduce poverty by increasing the skills of certain groups. All of these objectives require that training programs increase the earnings of trainees above what they otherwise would be. For example, alleviating shortages by training more highly skilled workers should increase the earnings of these workers. Likewise, the concern for unemployed workers is derived from a concern for the decreased earnings of these workers; and if trainees subsequently suffer less unemployment, their earnings should be higher. Finally, training programs are intended to reduce poverty by increasing the earnings of low income workers. Evaluating the success of training programs is thus inherently a quantitative assessment of the effect of training on trainee earnings. It is an important process both because it helps to inform discussions of public policy by shedding light on the past value of these programs as investments and because it can provide a means of testing our ability to augment the human capital of certain workers.
Although there have been many studies of the effect of post-school classroom training on earnings, it is by now rather widely agreed that very little is reliably known about the actual effects of these programs. Three main problems account for this state of affairs: (1) the large sample sizes required to detect relatively small anticipated program effects in a variable with such high variance as earnings, (2) the considerable expense required to keep track of trainees over a long enough period of time to measure the full inter-temporal impact of training, and (3) the extreme difficulty of implementing an adequate experimental design so as to obtain a group against which to reliably compare trainees. The purpose of this paper is to report on efforts to cope with this third problem using a data collection system that comes some way towards resolving the first two. The basic idea of this data system is to match the program record on each trainee with the trainee's Social Security earnings history. The Social Security Administration maintains a summary year-by-year earnings history for each Social Security account over the period since 1950 that may be used, under the appropriate confidentiality restrictions, for this purpose. In this paper I have concentrated on an analysis of all classroom trainees who started training under the Manpower Development and Training Act (MDTA) in the first 3 months of 1964 so as to ensure their having completed training in that year. In choosing to analyze trainees from so early a cohort something is clearly lost. On the one hand, the nature of the participants in these early years was considerably different than in the later years. In particular, programs geared [...]

1,456 citations

Journal ArticleDOI
TL;DR: Jacob and Levitt investigate the prevalence and predictors of teacher cheating in the Chicago Public Schools.
Source: NBER Working Paper 9413, "Rotten Apples: An Investigation of the Prevalence and Predictors of Teacher Cheating," by Brian A. Jacob and Steven D. Levitt, National Bureau of Economic Research, December 2002. http://www.nber.org/papers/w9413

660 citations

Journal ArticleDOI
TL;DR: In this article, the authors examined evidence on the effect of class size on student achievement and showed that the results of quantitative summaries of the literature depend critically on whether studies are accorded equal weight.
Abstract: This paper examines evidence on the effect of class size on student achievement. First, it is shown that results of quantitative summaries of the literature, such as Hanushek (1997), depend critically on whether studies are accorded equal weight. When studies are given equal weight, resources are systematically related to student achievement. When weights are in proportion to their number of estimates, resources and achievements are not systematically related. Second, a cost-benefit analysis of class size reduction is performed. Results of the Tennessee STAR class-size experiment suggest that the internal rate of return from reducing class size from 22 to 15 students is around 6%.

607 citations

Journal ArticleDOI
Walt Haney
TL;DR: In this article, the authors summarize the recent history of education reform and statewide testing in Texas, which led to the introduction of the Texas Assessment of Academic Skills (TAAS) in 1990-91.
Abstract: I summarize the recent history of education reform and statewide testing in Texas, which led to introduction of the Texas Assessment of Academic Skills (TAAS) in 1990-91. A variety of evidence in the late 1990s led a number of observers to conclude that the state of Texas had made near miraculous progress in reducing dropouts and increasing achievement. The passing scores on TAAS tests were arbitrary and discriminatory. Analyses comparing TAAS reading, writing and math scores with one another and with relevant high school grades raise doubts about the reliability and validity of TAAS scores. I discuss problems of missing students and other mirages in Texas enrollment statistics that profoundly affect both reported dropout statistics and test scores. Only 50% of minority students in Texas have been progressing from grade 9 to high school graduation since the initiation of the TAAS testing program. Since about 1982, the rates at which Black and Hispanic students are required to repeat grade 9 have climbed steadily, such that by the late 1990s, nearly 30% of Black and Hispanic students were "failing" grade 9. Cumulative rates of grade retention in Texas are almost twice as high for Black and Hispanic students as for White students. Some portion of the gains in grade 10 TAAS pass rates are illusory. The numbers of students taking the grade 10 tests who were classified as "in special education" and hence not counted in schools' accountability ratings nearly doubled between 1994 and 1998. A substantial portion of the apparent increases in TAAS pass rates in the 1990s are due to such exclusions. In the opinion of educators in Texas, schools are devoting a huge amount of time and energy preparing students specifically for TAAS, and emphasis on TAAS is hurting more than helping teaching and learning in Texas schools, particularly with at-risk students, and TAAS contributes to retention in grade and dropping out. 
Five different sources of evidence about rates of high school completion in Texas are compared and contrasted. The review of GED statistics indicated that there was a sharp upturn in numbers of young people taking the GED tests in Texas in the mid-1990s to avoid TAAS. A convergence of evidence indicates that during the 1990s, slightly less than 70% of students in Texas actually graduated from high school. Between 1994 and 1997, TAAS results showed a 20% increase in the percentage of students passing all three exit level TAAS tests (reading, writing and math), but TASP (a college readiness test) results showed a sharp decrease (from 65.2% to 43.3%) in the percentage of students passing all three parts (reading, math, and writing). As measured by performance on the SAT, the academic learning of secondary school students in Texas has not improved since the early 1990s, compared with SAT takers nationally. SAT-Math scores have deteriorated relative to students nationally. The gains on NAEP for Texas fail to confirm the dramatic gains apparent on TAAS. The gains on TAAS and the unbelievable decreases in dropouts during the 1990s are more illusory than real. The Texas "miracle" is more hat than cattle.

549 citations

Frequently Asked Questions (1)
Q1. What are the contributions in "Accountability, incentives and behavior: the impact of high-stakes testing in the Chicago Public Schools"?

This study examines the impact of an accountability policy implemented in the Chicago Public Schools in 1996-97.