Journal Article

Accountability, incentives and behavior: the impact of high-stakes testing in the Chicago Public Schools

01 Jun 2005-Journal of Public Economics (North-Holland)-Vol. 89, Iss: 5, pp 761-796
TL;DR: The authors examined the impact of an accountability policy implemented in the Chicago Public Schools in 1996-1997, using a panel of student-level, administrative data, and found that math and reading achievement increased sharply following the introduction of the accountability policy, in comparison to both prior achievement trends in the district and to changes experienced by other large, urban districts.
About: This article was published in the Journal of Public Economics on 2005-06-01 and is currently open access. It has received 554 citations to date. The article focuses on the topics: Accountability & Special education.

Summary

1. Introduction

  • If HST increased the general skill level, observed achievement gains should be reflected in other measures of student outcomes.
  • By placing low-performing students in special education programs, teachers are able to exempt them from most …
  • Achievement gains may also be due to increases in cheating on the part of students, teachers or administrators.
  • This paper addresses these questions in the context of a test-based accountability policy that was implemented in Chicago Public Schools in 1996-97.
  • On the one hand, they provide strong empirical support for general incentive theories, including the multi-task theories of Holmstrom and Milgrom (1991).

2. Background

  • The evidence on school-based accountability programs and student performance is decidedly mixed.
  • Several studies note that Texas students have made substantial achievement gains since the implementation of that state’s accountability program (Grissmer and Flanagan 1998, Grissmer et al. 2000, Haney 2000, Klein et al. 2000, Toenjes et al. 2000, Deere and Strayer 2001).
  • Koretz and Barron (1998) find survey evidence that elementary teachers in Kentucky shifted the amount of time devoted to math and science across grades to correspond with the subjects tested in each grade.
  • Various studies suggest that test preparation associated with high-stakes testing may artificially inflate achievement, producing gains that are not generalizable to other exams (Linn and Graue 1990, Shepard 1990, Koretz et al. 1991, Koretz and Barron 1998, Stecher and Barron 1998, Klein et al. 2000).

2.2 High-Stakes Testing in Chicago

  • In 1996 the ChiPS introduced a comprehensive accountability policy designed to raise academic achievement.
  • The first component of the policy focused on holding students accountable for learning, by ending a practice commonly known as “social promotion” whereby students are advanced to the next grade regardless of ability or achievement level.
  • Students who again fail to meet the standard are required to repeat the grade, with the exception of 15-year-olds who attend newly created “transition” centers.
  • The same whether one considers the eighth grade policy to have been implemented in 1996 or 1997.

3. Empirical strategy

  • Because Chicago instituted its accountability policy district-wide in 1996-97, it is difficult to identify the causal impact of the program with certainty.
  • Similarly, improvements in the economy or other time-varying factors coincident with the policy would bias their estimates.
  • Finally, one might be worried about other policies or programs in Chicago whose impact was felt at the same time as HST, so that $\mathrm{Cov}(HighStakes, \phi) \neq 0$.
  • This is essentially a difference-in-difference estimator where the first difference is a within student change over time and the second difference is a district-wide change from pre-policy to post-policy.
  • One might be particularly concerned about unobservable changes on the state or national level affecting student performance (e.g., implementation of state or federal school reform legislation).
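
The difference-in-difference logic described in this section can be illustrated with a minimal sketch. It is not the paper's regression specification, which also includes controls not listed in this summary; it only shows the two differences named above. The record layout, the function name `did_estimate` and the toy scores are all hypothetical.

```python
from statistics import mean

# Hypothetical toy records: (prior_score, current_score, post_policy).
# Scores are in arbitrary units and are NOT taken from the paper.
TOY_RECORDS = [
    (2.0, 3.0, False),
    (2.5, 3.0, False),
    (1.5, 3.0, False),
    (2.0, 3.25, True),
    (2.5, 3.5, True),
    (1.5, 3.0, True),
]


def did_estimate(records):
    """Return mean within-student gain post-policy minus pre-policy.

    First difference: each student's change over time (current - prior).
    Second difference: average gain of post-policy cohorts minus the
    average gain of pre-policy cohorts.
    """
    pre_gains = [cur - prior for prior, cur, post in records if not post]
    post_gains = [cur - prior for prior, cur, post in records if post]
    if not pre_gains or not post_gains:
        raise ValueError("need at least one pre-policy and one post-policy record")
    return mean(post_gains) - mean(pre_gains)


if __name__ == "__main__":
    print(did_estimate(TOY_RECORDS))
```

As the bullets above note, an estimate of this kind attributes any coincident district-wide change to the policy, which is why the paper also compares Chicago with other districts.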

4. Data

  • This study utilizes detailed administrative data from the ChiPS.
  • Student records include information on a student’s school, home address, demographic and family background characteristics, special education and bilingual placement, free lunch status, standardized test scores, grade retention and summer school attendance.
  • On the other hand, there was some increase in initial student achievement—e.g., prior reading achievement increased from an average of 0.89 grade equivalents below norms to 0.71 grade equivalents below norms.

5.2 The Heterogeneity of Effects Across Student and School Risk Level

  • If the improvements in student achievement were caused by the accountability policy, one might expect them to vary across students and schools.
  • Model 1 provides the average effect for all students in all of the post-policy cohorts, providing a baseline from which to compare the other results.
  • First, students in low-performing schools seem to have fared considerably better under the policy than comparable peers in higher-performing schools.
  • Moreover, the effect for marginal students appears somewhat stronger in reading than math, suggesting that there may be more intentional targeting of individual students in reading than in math, or that there is greater divisibility in the production of reading achievement.

5.3 Student-Focused versus School-Focused Accountability

  • Unlike most previous accountability systems, high-stakes testing in Chicago provided direct incentives for students as well as teachers.
  • Table 5 presents the policy effects for grades three, six and eight (i.e., promotional gate grades) versus grades four, five and seven (i.e., nongate grades).
  • Finally, it is possible that the first year effects were somewhat anomalous, perhaps because students and teachers were still adjusting to the policy or because the form change that year may have affected grades differentially.
  • Tables available from the author upon request.
  • The 1998 accountability effects are at least twice as large in grades three, six and eight compared with grade five (for example, 0.144 versus 0.067 s.d. gain in math), suggesting that the student accountability provisions may have played a large role in the overall policy in later years.

6. What factors are driving the improvements in performance in Chicago?

  • Even if a positive causal relationship between HST and student achievement can be established, it is important to understand what factors are driving the improvements in performance.
  • Critics of test-based accountability often argue that the primary impact of HST is to increase the time spent on test-specific preparation activities, which could improve test-specific skills at the expense of more general skills.
  • Others argue that test score gains reflect student motivation on the day of the exam.
  • Unfortunately, because such things as effort and test preparation are not directly observable, it is difficult to disentangle the factors underlying the achievement gains in Chicago.
  • This section attempts to shed some light on the factors driving the achievement gains in Chicago, first by comparing student performance across exams and then by examining the ITBS improvements in greater detail.

6.1 The Role of General Skills

  • Even the most comprehensive achievement exam can only cover a fraction of the possible skills and topics within a particular domain.
  • Differences in student effort across exams (or rather changes in student effort) also complicate the comparison of performance trends from one test to another.
  • The data for this analysis are drawn from school “report cards” compiled by the Illinois State Board of Education (ISBE), which provide average IGAP scores by grade and subject as well as background information on schools and districts.
  • To identify the comparison districts, I first identify districts in the top decile in terms of the percent of students receiving free or reduced price lunch, percent minority students, and total enrollment and in the bottom decile in terms of average student achievement (averaged over third, sixth and eighth grade reading and math scores) based on 1990 data.
  • The point estimates indicate that once the authors take into account district-specific pre-existing trends and demographics, HST appears to have a slight negative effect on IGAP achievement in Chicago.
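
The first screening step for comparison districts described above can be sketched as follows. This is an illustrative reading only: the summary does not say how decile cutoffs were computed or what later steps followed, so the cutoff method (`statistics.quantiles` with its default settings), the field names and the toy data are all assumptions.

```python
from statistics import quantiles

# Hypothetical toy data: 20 districts, NOT taken from the ISBE report cards.
TOY_DISTRICTS = [
    {
        "id": i,
        "pct_free_lunch": float(i),
        "pct_minority": float(i),
        "enrollment": 1000 * i,
        "achievement": 21 - i,
    }
    for i in range(1, 21)
]

TOP_DECILE_KEYS = ("pct_free_lunch", "pct_minority", "enrollment")


def select_comparison_districts(districts):
    """Return ids of districts in the top decile on every key in
    TOP_DECILE_KEYS and in the bottom decile on average achievement."""

    def decile_cuts(key):
        # quantiles(..., n=10) returns the 9 cut points between deciles.
        return quantiles([d[key] for d in districts], n=10)

    top_cut = {key: decile_cuts(key)[-1] for key in TOP_DECILE_KEYS}
    bottom_cut = decile_cuts("achievement")[0]
    return [
        d["id"]
        for d in districts
        if all(d[key] >= top_cut[key] for key in TOP_DECILE_KEYS)
        and d["achievement"] <= bottom_cut
    ]


if __name__ == "__main__":
    print(select_comparison_districts(TOY_DISTRICTS))
```

Requiring all four conditions at once yields a small comparison group by construction, which is consistent with the goal of finding districts that resemble Chicago on poverty, minority share, size and prior achievement.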

6.2 The Role of Specific Skills

  • Based on analysis of teacher survey data, Tepper (2002) concluded that ITBS-specific test preparation and curriculum alignment increased following the introduction of the accountability policy.
  • Column 1 classifies questions into two groups: those testing basic skills such as math computation and number concepts and those testing more complex skills such as estimation, data interpretation and problem-solving (i.e., word problems).
  • Column 2 separates items into five categories—computation, number concept, data interpretation, estimation and problem-solving— and shows the same pattern.
  • The item difficulty measures are the percentage of students correctly answering the item in a nationally representative sample used by the test publisher to norm the exam.
  • This analysis suggests that test preparation may have played a large role in the math gains, but was perhaps less important in reading improvement.

6.3 The Role of Effort

  • Student effort is another likely candidate for explaining the large ITBS gains.
  • Test completion is one indicator of effort.
  • This pattern is true even among the lowest achieving students who left the greatest number of items blank prior to the accountability policy.
  • While increased guessing cannot explain a significant portion of the ITBS gains, other forms of effort may play a larger role.
  • Comparing the gain across item position groups, the authors see that 1998 students improved nearly 6.7 percentage points on the final 20 percent of items.
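
The comparison of gains across item position groups can be sketched as below. This is an assumption-laden illustration rather than the paper's procedure: it splits items into five equal position groups and differences the average percent correct between two cohorts. The function name and all numbers are hypothetical and do not reproduce the 6.7 percentage point figure reported above.

```python
def gain_by_position_group(pct_before, pct_after, n_groups=5):
    """Return the change in average percent correct for each position group.

    pct_before / pct_after: percent of students answering each item
    correctly, ordered by the item's position on the exam.
    """
    if len(pct_before) != len(pct_after):
        raise ValueError("both cohorts must cover the same items")
    n_items = len(pct_before)
    if n_items < n_groups:
        raise ValueError("need at least one item per group")
    gains = []
    for g in range(n_groups):
        start = g * n_items // n_groups
        stop = (g + 1) * n_items // n_groups
        before = sum(pct_before[start:stop]) / (stop - start)
        after = sum(pct_after[start:stop]) / (stop - start)
        gains.append(after - before)
    return gains


# Hypothetical percent-correct values for a 10-item exam.
TOY_BEFORE = [80, 78, 75, 72, 70, 66, 62, 58, 50, 44]
TOY_AFTER = [81, 79, 77, 74, 73, 69, 66, 62, 57, 51]

if __name__ == "__main__":
    print(gain_by_position_group(TOY_BEFORE, TOY_AFTER))
```

A larger gain in the last group than in the earlier ones is the pattern the text interprets as increased effort or stamina toward the end of the exam.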

6.4. Summary

  • The improvement in math achievement in Chicago appears to be driven largely by gains in specific skill areas such as math computation that make up a large portion of the ITBS, but are emphasized less on the IGAP.
  • This suggests that teachers aligned their math curriculum to more closely match the content of the high-stakes exam.
  • In reading, ITBS gains were equally distributed across item types, but were considerably larger among questions at the end of the exam.
  • This suggests that student effort or “stamina” played a larger role than test preparation in the observed reading improvements.
  • The fact that IGAP trends did not jump sharply following the introduction of the accountability policy confirms that the ITBS gains were not driven entirely by improvements in general skills.

7. Did educators respond strategically to high-stakes testing?

  • In evaluating the effectiveness of HST, it is important to understand whether teachers and administrators respond strategically to the incentives provided by the accountability policy.
  • Critics of test-based accountability worry about educator responses along a number of dimensions, ranging from changes in the rate of special education placements to substitution away from low-stakes subjects.
  • This section examines several of these issues.

7.1 Low-stakes versus high-stakes subjects

  • Given the consequences attached to test performance in certain subjects, one might expect teachers and students to shift resources and attention toward subjects included in the accountability program.
  • The authors can test this theory by comparing trends in math and reading achievement after the introduction of HST with test score trends in social studies and science, subjects that are not included in the Chicago accountability policy.
  • Unfortunately, science and social studies exams are not given in every grade, and the grades in which these exams are given have changed over time.
  • The distribution of effects is also somewhat different for low versus high-stakes subjects.
  • As the authors noted earlier, in math and reading, students in low-achieving schools experienced greater gains. However, conditional on school achievement, low-ability students appeared to make only slightly larger gains than their peers.

7.2 Special education placements

  • While the accountability policies in Chicago are designed to increase student achievement, they also create incentives for teachers and administrators to alter the pool of test-takers.
  • The sample only includes third, sixth and eighth grade students from 1994 to 2000 because some special education and reporting data is not available for the 1993 cohort.
  • Figures available from the author upon request.
  • Beginning in 1997, ChiPS began excluding the ITBS scores of students who had been enrolled in bilingual programs for three or fewer years to encourage teachers to test these students for …
  • It appears that the trend became steeper beginning in 1997, suggesting that the accountability policy may have influenced teacher and administrator behavior.
  • The lowest performing schools increased special education placements for high-risk sixth graders by 50 percent following the introduction of the accountability policy, compared with an increase of roughly 32 percent among moderate-achieving schools and no increase among the highest performing schools.

7.3 Grade retention

  • Another way for teachers to shield low-achieving students from the accountability mandates is to preemptively retain them—that is, hold them back before they enter grade three, six or eight.
  • Roderick et al. (2000) found that retention rates in kindergarten, first and second grades started to rise in 1996 and jumped sharply in 1997 among first and second graders.
  • Grade, 2.5 percent in second grade and a little over 1 percent in grades four, five and seven.
  • Retention rates began to increase in 1996, possibly in anticipation of the new standards the students would face in 1997.
  • The bottom panel controls for current achievement, age and special education status as well as demographic variables, thereby accounting for prior retention and giving a better sense of the marginal effect of the policy on the propensity to retain students.

7.4 Sensitivity analysis

  • To test the sensitivity of the findings presented in the previous sections, Table 13 presents comparable estimates for a variety of different specifications and samples.
  • The next three rows show that the results are not sensitive to including students who either were in that grade for the second time (e.g., retained students) or whose test scores were not included for official reporting purposes because of a special education or bilingual classification.
  • This should control for any changes in form difficulty that may confound the results.

8. Conclusions

  • When the federal legislation No Child Left Behind became law earlier this year, high-stakes testing took on a heightened level of importance for students, teachers and parents across the country.
  • If the authors make the conservative assumption that special education rates increased by two percentage points in all grades (mirroring the increases they saw in grades three, six and eight), this would translate to an additional expenditure of $40 per pupil.
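
The $40 per-pupil figure in the conclusions can be unpacked with a back-of-envelope calculation. Assuming the figure is simply the change in the placement rate multiplied by an incremental cost $c$ per newly placed student (an assumption: this summary does not state the cost figure the author used), the implied value of $c$ is:

```latex
\[
  0.02 \times c = \$40
  \quad\Longrightarrow\quad
  c = \frac{\$40}{0.02} = \$2{,}000 \text{ per newly placed student}
\]
```

Under that reading, each additional percentage point of special education placement would add about $20 per pupil district-wide.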


Citations
Journal Article
TL;DR: In this article, the authors present empirical evidence on how education and training policies can be designed to advance both efficiency and equity at each educational stage, ranging from early childhood education and schools over vocational and higher education to training and lifelong learning.
Abstract: This paper reviews empirical evidence, especially from Europe, on how education and training policies can be designed to advance both efficiency and equity. Returns to educational investments tend to decrease over the life cycle. Moreover, they are the highest for disadvantaged children at early stages and for the well-off at late stages of the life cycle. This creates complementarities between efficiency and equity at early stages and trade-offs at late stages. The paper discusses specific policies for efficiency and equity at each educational stage, ranging from early childhood education and schools over vocational and higher education to training and lifelong learning. The available evidence suggests that both efficiency and equity can be enhanced by output-oriented reforms properly designed to each stage, where the state generally sets a regulatory framework that ensures accountability and funding, and uses the forces of choice and competition to deliver best results. Designed this way, education and training systems can advance efficiency and equity at the same time.

92 citations

Journal Article
TL;DR: The No Child Left Behind (NCLB) Act of 2001 has proven to be one of the most problematic education reforms in US history, prompting schools labeled “underperforming” to teach to the test, remove low-performing students from the testing pool, curtail “low-stakes” subjects, and artificially manipulate test scores and dropout rates.
Abstract: This article examines research on the impacts of high‐stakes accountability policies in the USA – in particular, the No Child Left Behind (NCLB) Act of 2001 – on Native American learners. NCLB's goals are laudable: close the achievement gap by making schools accountable for learning among all student groups, and by ensuring that all students are taught by highly qualified teachers. In practice, the policy has proven to be one of the most problematic education reforms in US history, prompting schools labeled “underperforming” to teach to the test, remove low‐performing students from the testing pool, curtail “low‐stakes” subjects, and artificially manipulate test scores and drop‐out rates. This article begins with a demographic, cultural, linguistic, and educational profile of Native American communities and an explanation of tribal sovereignty. The next sections provide an orientation to NCLB and an examination of empirical research on its impacts on Native American and other minoritized students. The fin...

90 citations

Journal Article
TL;DR: This paper proposed an incentive pay scheme for educators that links educator compensation to the ranks of their students within appropriately defined comparison sets, and showed that under certain conditions their scheme induces teachers to allocate socially optimal levels of effort to all students.
Abstract: We propose an incentive pay scheme for educators that links educator compensation to the ranks of their students within appropriately defined comparison sets, and we show that under certain conditions our scheme induces teachers to allocate socially optimal levels of effort to all students. Because this scheme employs only ordinal information, our scheme allows education authorities to employ completely new assessments at each testing date without ever having to equate various assessment forms. This approach removes incentives for teachers to teach to a particular assessment form and eliminates opportunities to influence reward pay by corrupting the equating process or the scales used to report assessment results. Our system links compensation to the outcomes of properly seeded contests rather than cardinal measures of achievement growth. Thus, education authorities can employ our incentive scheme for educators while employing a separate system for measuring growth in student achievement that involves no stakes for educators. This approach does not create direct incentives for educators to take actions that contaminate the measurement of student progress.

90 citations

Journal Article
TL;DR: In the fall of 2007, New York City began using student tests and other measures to assign each school a grade (A to F), and linked grades to rewards and consequences, including possible school closure.
Abstract: In the fall of 2007, New York City began using student tests and other measures to assign each school a grade (A to F), and linked grades to rewards and consequences, including possible school closure. These grades were released in late September, arguably too late for schools to make major changes in programs or personnel, and students were tested again in January (English) and March (math). Despite this time frame, regression discontinuity estimates indicate that receipt of a low grade significantly increased student achievement, more so in math than English, and improved parental evaluations of school quality. (JEL H75, I21, I28, J45)

88 citations

Journal Article
TL;DR: Braga et al. study the effects of educational reforms on school attainment. Characterizing each group of reforms by its impact on mean years of education, educational inequality and intergenerational persistence, they find support for the idea that left-wing parties support inclusive reforms, while right-wing parties prefer selective ones.
Abstract: In this paper we study the effects of educational reforms on school attainment. We construct a dataset of relevant reforms that occurred at the national level over the last century, and match individual information from 24 European countries to the most likely set-up faced when individual educational choices were undertaken. Our identification strategy relies on temporal and geographical variations in the institutional arrangements, controlling for time/country fixed effects, as well as for country specific time trend. By characterizing each group of reforms for their impact on mean years of education, educational inequality and intergenerational persistence, we show an ideal policy menu which has been available to policymakers. We distinguish between groups of policies that are either ‘inclusive’ or ‘selective’, depending on their diminishing or augmenting impact on inequality and persistence. Finally, we correlate these reform measures to political coalitions prevailing in parliament, finding support for the idea that left-wing parties support reforms that are inclusive, while right-wing parties prefer selective ones. — Michela Braga, Daniele Checchi and Elena Meschi

88 citations

References
Journal Article
TL;DR: In this article, a principal-agent model that can explain why employment is sometimes superior to independent contracting even when there are no productive advantages to specific physical or human capital and no financial market imperfections to limit the agent's borrowings is presented.
Abstract: Introduction In the standard economic treatment of the principal–agent problem, compensation systems serve the dual function of allocating risks and rewarding productive work. A tension between these two functions arises when the agent is risk averse, for providing the agent with effective work incentives often forces him to bear unwanted risk. Existing formal models that have analyzed this tension, however, have produced only limited results. It remains a puzzle for this theory that employment contracts so often specify fixed wages and more generally that incentives within firms appear to be so muted, especially compared to those of the market. Also, the models have remained too intractable to effectively address broader organizational issues such as asset ownership, job design, and allocation of authority. In this article, we will analyze a principal–agent model that (i) can account for paying fixed wages even when good, objective output measures are available and agents are highly responsive to incentive pay; (ii) can make recommendations and predictions about ownership patterns even when contracts can take full account of all observable variables and court enforcement is perfect; (iii) can explain why employment is sometimes superior to independent contracting even when there are no productive advantages to specific physical or human capital and no financial market imperfections to limit the agent's borrowings; (iv) can explain bureaucratic constraints; and (v) can shed light on how tasks get allocated to different jobs.

5,678 citations

Journal Article
TL;DR: This paper estimates the effect of government training programs on trainee earnings by matching the program records of classroom trainees who started training under the Manpower Development and Training Act (MDTA) in the first three months of 1964 with their Social Security earnings histories.
Abstract: Governmental post-schooling training programs have become a permanent fixture of the U.S. economy in the last decade. These programs are typically advocated for diverse reasons: (1) to reduce inflation by the provision of more skilled workers to alleviate shortages, (2) to reduce unemployment of certain groups, and (3) to reduce poverty by increasing the skills of certain groups. All of these objectives require that training programs increase the earnings of trainees above what they otherwise would be. For example, alleviating shortages by training more highly skilled workers should increase the earnings of these workers. Likewise, the concern for unemployed workers is derived from a concern for the decreased earnings of these workers; and if trainees subsequently suffer less unemployment, their earnings should be higher. Finally, training programs are intended to reduce poverty by increasing the earnings of low income workers. Evaluating the success of training programs is thus inherently a quantitative assessment of the effect of training on trainee earnings. It is an important process both because it helps to inform discussions of public policy by shedding light on the past value of these programs as investments and because it can provide a means of testing our ability to augment the human capital of certain workers.
Although there have been many studies of the effect of post-school classroom training on earnings it is by now rather widely agreed that very little is reliably known about the actual effects of these programs. Three main problems account for this state of affairs: (1) the large sample sizes required to detect relatively small anticipated program effects in a variable with such high variance as earnings, (2) the considerable expense required to keep track of trainees over a long enough period of time to measure the full inter-temporal impact of training, and (3) the extreme difficulty of implementing an adequate experimental design so as to obtain a group against which to reliably compare trainees. The purpose of this paper is to report on efforts to cope with this third problem using a data collection system that comes some way towards resolving the first two. The basic idea of this data system is to match the program record on each trainee with the trainee's Social Security earnings history. The Social Security Administration maintains a summary year-by-year earnings history for each Social Security account over the period since 1950 that may be used, under the appropriate confidentiality restrictions, for this purpose. In this paper I have concentrated on an analysis of all classroom trainees who started training under the Manpower Development and Training Act (MDTA) in the first 3 months of 1964 so as to ensure their having completed training in that year. In choosing to analyze trainees from so early a cohort something is clearly lost. On the one hand, the nature of the participants in these early years was considerably different than in the later years. In particular, programs geared …

1,456 citations

Journal Article
TL;DR: Jacob and Levitt investigate the prevalence and predictors of teacher cheating in the Chicago Public Schools.
Abstract: NBER Working Paper Series. Rotten Apples: An Investigation of the Prevalence and Predictors of Teacher Cheating. Brian A. Jacob and Steven D. Levitt. Working Paper 9413, http://www.nber.org/papers/w9413. National Bureau of Economic Research, 1050 Massachusetts Avenue, Cambridge, MA 02138, December 2002. We would like to thank Suzanne Cooper, Mark Duggan, Sue Dynarski, Arne Duncan, Michael Greenstone, James Heckman, Lars Lefgren, and seminar participants too numerous to mention for helpful comments and discussions. We also thank Arne Duncan, Phil Hansen, Carol Perlman, and Jessie Qualles of the Chicago Public Schools for their help and cooperation on the project. Financial support was provided by the National Science Foundation and the Sloan Foundation. All remaining errors are our own. The views expressed herein are those of the authors and not necessarily those of the National Bureau of Economic Research. © 2002 by Brian A. Jacob and Steven D. Levitt. All rights reserved. Short sections of text, not to exceed two paragraphs, may be quoted without explicit permission provided that full credit, including © notice, is given to the source.

660 citations

Journal Article
TL;DR: In this article, the authors examined evidence on the effect of class size on student achievement and showed that the results of quantitative summaries of the literature depend critically on whether studies are accorded equal weight.
Abstract: This paper examines evidence on the effect of class size on student achievement. First, it is shown that results of quantitative summaries of the literature, such as Hanushek (1997), depend critically on whether studies are accorded equal weight. When studies are given equal weight, resources are systematically related to student achievement. When weights are in proportion to their number of estimates, resources and achievements are not systematically related. Second, a cost-benefit analysis of class size reduction is performed. Results of the Tennessee STAR class-size experiment suggest that the internal rate of return from reducing class size from 22 to 15 students is around 6%.

607 citations

Journal Article
Walt Haney
TL;DR: In this article, the authors summarize the recent history of education reform and statewide testing in Texas, which led to the introduction of the Texas Assessment of Academic Skills (TAAS) in 1990-91.
Abstract: I summarize the recent history of education reform and statewide testing in Texas, which led to introduction of the Texas Assessment of Academic Skills (TAAS) in 1990-91. A variety of evidence in the late 1990s led a number of observers to conclude that the state of Texas had made near miraculous progress in reducing dropouts and increasing achievement. The passing scores on TAAS tests were arbitrary and discriminatory. Analyses comparing TAAS reading, writing and math scores with one another and with relevant high school grades raise doubts about the reliability and validity of TAAS scores. I discuss problems of missing students and other mirages in Texas enrollment statistics that profoundly affect both reported dropout statistics and test scores. Only 50% of minority students in Texas have been progressing from grade 9 to high school graduation since the initiation of the TAAS testing program. Since about 1982, the rates at which Black and Hispanic students are required to repeat grade 9 have climbed steadily, such that by the late 1990s, nearly 30% of Black and Hispanic students were "failing" grade 9. Cumulative rates of grade retention in Texas are almost twice as high for Black and Hispanic students as for White students. Some portion of the gains in grade 10 TAAS pass rates are illusory. The numbers of students taking the grade 10 tests who were classified as "in special education" and hence not counted in schools' accountability ratings nearly doubled between 1994 and 1998. A substantial portion of the apparent increases in TAAS pass rates in the 1990s are due to such exclusions. In the opinion of educators in Texas, schools are devoting a huge amount of time and energy preparing students specifically for TAAS, and emphasis on TAAS is hurting more than helping teaching and learning in Texas schools, particularly with at-risk students, and TAAS contributes to retention in grade and dropping out. 
Five different sources of evidence about rates of high school completion in Texas are compared and contrasted. The review of GED statistics indicated that there was a sharp upturn in numbers of young people taking the GED tests in Texas in the mid-1990s to avoid TAAS. A convergence of evidence indicates that during the 1990s, slightly less than 70% of students in Texas actually graduated from high school. Between 1994 and 1997, TAAS results showed a 20% increase in the percentage of students passing all three exit level TAAS tests (reading, writing and math), but TASP (a college readiness test) results showed a sharp decrease (from 65.2% to 43.3%) in the percentage of students passing all three parts (reading, math, and writing). As measured by performance on the SAT, the academic learning of secondary school students in Texas has not improved since the early 1990s, compared with SAT takers nationally. SAT-Math scores have deteriorated relative to students nationally. The gains on NAEP for Texas fail to confirm the dramatic gains apparent on TAAS. The gains on TAAS and the unbelievable decreases in dropouts during the 1990s are more illusory than real. The Texas "miracle" is more hat than cattle.

549 citations

Frequently Asked Questions (1)
Q1. What are the contributions in "NBER Working Paper Series: Accountability, Incentives and Behavior: The Impact of High-Stakes Testing in the Chicago Public Schools"?

This study examines the impact of an accountability policy implemented in the Chicago Public Schools in 1996-97.