Book Chapter

The standards for educational and psychological testing.

01 Jan 2013
TL;DR: The extracted abstract is a table of total units sold by period, covering FY 1999 through FY 2014 (including a half-year period, 7/1/06-12/31/06), with a reported total of 49,710 units.
Abstract:
Period              Notes     No. of Units
FY 1999             est.      1,768
FY 2000             est.      3,797
FY 2001             est.      3,755
FY 2002             est.      5,592
FY 2003             est.      3,310
FY 2004             est.      3,218
FY 2005             Actual    3,803
FY 2006             Actual    3,888
7/1/06-12/31/06     Actual    2,144
FY 2007             Actual    3,077
FY 2008             Actual    3,358
FY 2009             Actual    2,590
FY 2010             Actual    3,043
FY 2011             Actual    2,132
FY 2012             Actual    1,649
FY 2013             Actual    1,732
FY 2014             Actual    855
Total Units Sold              49,710
Citations
Journal Article
TL;DR: The extracted abstract contains only the journal's copyright notice: the first or sole author retains copyright and grants right of first publication to Practical Assessment, Research & Evaluation (PARE).
Abstract: Copyright is retained by the first or sole author, who grants right of first publication to the Practical Assessment, Research & Evaluation. Permission is granted to distribute this article for nonprofit, educational purposes if it is copied in its entirety and the journal is credited. PARE has the right to authorize third party reproduction of this article in print, electronic and database forms.

992 citations

Journal Article
TL;DR: Although a lack of coherence and clarity around the construct has impeded its theory development, this article presents an integrative and comprehensive review of the 285 articles on servant leadership spanning 20 years (1998-2018).
Abstract: Notwithstanding the proliferation of servant leadership studies with over 100 articles published in the last four years alone, a lack of coherence and clarity around the construct has impeded its theory development. We provide an integrative and comprehensive review of the 285 articles on servant leadership spanning 20 years (1998–2018), and in so doing extend the field in four different ways. First, we provide a conceptual clarity of servant leadership vis-a-vis other value-based leadership approaches and offer a new definition of servant leadership. Second, we evaluate 16 existing measures of servant leadership in light of their respective rigor of scale construction and validation. Third, we map the theoretical and nomological network of servant leadership in relation to its antecedents, outcomes, moderators, mediators. We finally conclude by presenting a detailed future research agenda to bring the field forward encompassing both theoretical and empirical advancement. All in all, our review paints a holistic picture of where the literature has been and where it should go into the future.

689 citations

Journal Article
TL;DR: It is concluded that debate over the optimal name for this broad category of personal qualities obscures substantial agreement about the specific attributes worth measuring and medium-term innovations that may make measures of these personal qualities more suitable for educational purposes are highlighted.
Abstract: There has been perennial interest in personal qualities other than cognitive ability that determine success, including self-control, grit, growth mindset, and many others. Attempts to measure such qualities for the purposes of educational policy and practice, however, are more recent. In this article, we identify serious challenges to doing so. We first address confusion over terminology, including the descriptor "non-cognitive." We conclude that debate over the optimal name for this broad category of personal qualities obscures substantial agreement about the specific attributes worth measuring. Next, we discuss advantages and limitations of different measures. In particular, we compare self-report questionnaires, teacher-report questionnaires, and performance tasks, using self-control as an illustrative case study to make the general point that each approach is imperfect in its own way. Finally, we discuss how each measure's imperfections can affect its suitability for program evaluation, accountability, individual diagnosis, and practice improvement. For example, we do not believe any available measure is suitable for between-school accountability judgments. In addition to urging caution among policymakers and practitioners, we highlight medium-term innovations that may make measures of these personal qualities more suitable for educational purposes.

687 citations

Journal Article
TL;DR: The authors pointed out that ProPublica's report was based on faulty statistics and data analysis, and that the report failed to show that the COMPAS itself is racially biased, let alone that other risk instruments are biased.
Abstract: The validity and intellectual honesty of conducting and reporting analysis are critical, since the ramifications of published data, accurate or misleading, may have consequences for years to come. - Marco and Larkin, 2000, p. 692
PROPUBLICA RECENTLY RELEASED a much-heralded investigative report claiming that a risk assessment tool (known as the COMPAS) used in criminal justice is biased against black defendants. The report heavily implied that such bias is inherent in all actuarial risk assessment instruments (ARAIs).
We think ProPublica's report was based on faulty statistics and data analysis, and that the report failed to show that the COMPAS itself is racially biased, let alone that other risk instruments are biased. Not only do ProPublica's results contradict several comprehensive existing studies concluding that actuarial risk can be predicted free of racial and/or gender bias, a correct analysis of the underlying data (which we provide below) sharply undermines ProPublica's approach.
Our reasons for writing are simple. It might be that the existing justice system is biased against poor minorities due to a wide variety of reasons (including economic factors, policing patterns, prosecutorial behavior, and judicial biases), and therefore, regardless of the degree of bias, risk assessment tools informed by objective data can help reduce racial bias from its current level. It would be a shame if policymakers mistakenly thought that risk assessment tools were somehow worse than the status quo. Because we are at a time in history when there appears to be bipartisan political support for criminal justice reform, one poorly executed study that makes such absolute claims of bias should not go unchallenged. The gravity of this study's erroneous conclusions is exacerbated by the large-market outlet in which it was published (ProPublica).
Before we expand further into our criticisms of the ProPublica piece, we describe some context and characteristics of the American criminal justice system and risk assessments.
Mass Incarceration and ARAIs
The United States is clearly the worldwide leader in imprisonment. The prison population in the United States has declined by small percentages in recent years and at year-end 2014 the prison population was the smallest it had been since 2004. Yet, we still incarcerated 1,561,500 individuals in federal and state correctional facilities (Carson, 2015). By sheer numbers, or rates per 100,000 inhabitants, the United States incarcerates more people than just about any country in the world that reports reliable incarceration statistics (Wagner & Walsh, 2016).
Further, it appears that there is a fair amount of racial disproportion when comparing the composition of the general population with the composition of the prison population. The 2014 United States Census population projection estimates that, across the U.S., the racial breakdown of the 318 million residents comprised 62.1 percent white, 13.2 percent black or African American, and 17.4 percent Hispanic. In comparison, 37 percent of the prison population was categorized as black, 32 percent was categorized as white, and 22 percent as Hispanic (Carson, 2015). Carson (2015:15) states that, "As a percentage of residents of all ages at yearend 2014, 2.7 percent of black males (or 2,724 per 100,000 black male residents) and 1.1 percent of Hispanic males (1,090 per 100,000 Hispanic males) were serving sentences of at least 1 year in prison, compared to less than 0.5 percent of white males (465 per 100,000 white male residents)."
Aside from the negative effects caused by imprisonment, there is a massive financial cost that extends beyond official correctional budgets. A recent report by The Vera Institute of Justice (Henrichson & Delaney, 2012) indicated that the cost of prison operations (including such things as pension and insurance contributions, capital costs, legal fees, and administrative fees) in 40 states participating in their study was 39. …

679 citations

Journal Article
TL;DR: The authors present seven principles that have guided their thinking about emotional intelligence, some of them new, and reformulate their original ability model in light of these principles.
Abstract: This article presents seven principles that have guided our thinking about emotional intelligence, some of them new. We have reformulated our original ability model here guided by these principles,...

642 citations

References
Journal Article
TL;DR: In this article, the adequacy of the conventional cutoff criteria and several new alternatives for various fit indexes used to evaluate model fit in practice was examined, and the results suggest that, for the ML method, a cutoff value close to .95 for TLI, BL89, CFI, RNI, and G...
Abstract: This article examines the adequacy of the “rules of thumb” conventional cutoff criteria and several new alternatives for various fit indexes used to evaluate model fit in practice. Using a 2‐index presentation strategy, which includes using the maximum likelihood (ML)‐based standardized root mean squared residual (SRMR) and supplementing it with either Tucker‐Lewis Index (TLI), Bollen's (1989) Fit Index (BL89), Relative Noncentrality Index (RNI), Comparative Fit Index (CFI), Gamma Hat, McDonald's Centrality Index (Mc), or root mean squared error of approximation (RMSEA), various combinations of cutoff values from selected ranges of cutoff criteria for the ML‐based SRMR and a given supplemental fit index were used to calculate rejection rates for various types of true‐population and misspecified models; that is, models with misspecified factor covariance(s) and models with misspecified factor loading(s). The results suggest that, for the ML method, a cutoff value close to .95 for TLI, BL89, CFI, RNI, and G...
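The two-index presentation strategy described in this abstract can be sketched roughly as follows: SRMR is paired with one supplemental fit index, and each is compared with its own cutoff. This is only an illustrative sketch. The function name, the rule that a model is rejected when either index fails, and the SRMR cutoff used in the example call are assumptions; the abstract as shown here gives only the value close to .95 for TLI, BL89, CFI, and RNI.

```python
def two_index_reject(srmr, supplemental, srmr_cutoff,
                     supplemental_cutoff=0.95, higher_is_better=True):
    """Return True if a model is rejected under a two-index rule.

    Assumption of this sketch: the model is rejected when EITHER the SRMR
    or the supplemental index falls on the wrong side of its cutoff.
    """
    # SRMR is a residual-based index, so larger values mean worse fit.
    srmr_fails = srmr > srmr_cutoff
    # Indexes such as TLI or CFI improve as they rise; RMSEA improves as it falls.
    if higher_is_better:
        supplemental_fails = supplemental < supplemental_cutoff
    else:
        supplemental_fails = supplemental > supplemental_cutoff
    return srmr_fails or supplemental_fails


# Hypothetical fit values; the SRMR cutoff of 0.10 is a placeholder chosen
# for illustration and is not taken from the abstract.
print(two_index_reject(srmr=0.04, supplemental=0.97, srmr_cutoff=0.10))
print(two_index_reject(srmr=0.04, supplemental=0.91, srmr_cutoff=0.10))
```

Both cutoffs are arguments rather than constants so that any combination from the ranges the study examined can be tried.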

76,383 citations

Journal Article
TL;DR: This transmutability of the validation matrix argues for the comparisons within the heteromethod block as the most generally relevant validation data, and illustrates the potential interchangeability of trait and method components.
Abstract: [Table residue: correlation matrix for Memory (Learning Ability), Comprehension, and Vocabulary; layout not recoverable.] As judged against these latter values, comprehension (.48) and vocabulary (.47), but not memory (.31), show some specific validity. This transmutability of the validation matrix argues for the comparisons within the heteromethod block as the most generally relevant validation data, and illustrates the potential interchangeability of trait and method components. Some of the correlations in Chi's (1937) prodigious study of halo effect in ratings are appropriate to a multitrait-multimethod matrix in which each rater might be regarded as representing a different method. While the published report does not make these available in detail because it employs averaged values, it is apparent from a comparison of his Tables IV and VIII that the ratings generally failed to meet the requirement that ratings of the same trait by different raters should correlate higher than ratings of different traits by the same rater. Validity is shown to the extent that of the correlations in the heteromethod block, those in the validity diagonal are higher than the average heteromethod-heterotrait values. A conspicuously unsuccessful multitrait-multimethod matrix is provided by Campbell (1953, 1956) for rating of the leadership behavior of officers by themselves and by their subordinates. Only one of 11 variables (Recognition Behavior) met the requirement of providing a validity diagonal value higher than any of the heterotrait-heteromethod values, that validity being .29. For none of the variables were the validities higher than heterotrait-monomethod values. A study of attitudes toward authority and nonauthority figures by Burwen and Campbell (1957) contains a complex multitrait-multimethod matrix, one symmetrical excerpt from which is shown in Table 6. Method variance was strong for most of the procedures in this study.
Where validity was found, it was primarily at the level of validity diagonal values higher than heterotrait-heteromethod values. As illustrated in Table 6, attitude toward father showed this kind of validity, as did attitude toward peers to a lesser degree. Attitude toward boss showed no validity. There was no evidence of a generalized attitude toward authority which would include father and boss, although such values as the ...
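The requirement mentioned in this abstract, that a validity diagonal value be higher than the heterotrait-heteromethod values, can be illustrated with a small check on one heteromethod block. This is a sketch under stated assumptions: the function name and the correlation values are hypothetical, and each validity value is compared only with the off-diagonal entries that share its row or column.

```python
def validity_diagonal_check(hetero_block):
    """Check each trait's validity value in one heteromethod block.

    hetero_block[i][j] is the correlation between trait i measured by
    method A and trait j measured by method B. The diagonal holds the
    validity values (same trait, different methods).
    """
    n = len(hetero_block)
    passes = []
    for i in range(n):
        validity = hetero_block[i][i]
        # Heterotrait-heteromethod values in the same row and column as trait i.
        same_row = [hetero_block[i][j] for j in range(n) if j != i]
        same_col = [hetero_block[j][i] for j in range(n) if j != i]
        passes.append(validity > max(same_row + same_col))
    return passes


# Hypothetical 3-trait block, invented for illustration only.
block = [
    [0.50, 0.20, 0.10],
    [0.30, 0.40, 0.45],
    [0.10, 0.20, 0.60],
]
print(validity_diagonal_check(block))  # [True, False, True]
```

In the hypothetical block, the second trait fails because one heterotrait-heteromethod value (.45) exceeds its validity value (.40).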

15,795 citations

Journal Article
TL;DR: In this paper, the authors examined how 20 goodness-of-fit indexes (GFIs) change when cross-group constraints are imposed on a measurement model, and recommended Δcomparative fit index, ΔGamma hat, and ΔMcDonald's Noncentrality Index, which are independent of both model complexity and sample size, for evaluating measurement invariance.
Abstract: Measurement invariance is usually tested using Multigroup Confirmatory Factor Analysis, which examines the change in the goodness-of-fit index (GFI) when cross-group constraints are imposed on a measurement model. Although many studies have examined the properties of GFI as indicators of overall model fit for single-group data, there have been none to date that examine how GFIs change when between-group constraints are added to a measurement model. The lack of a consensus about what constitutes significant GFI differences places limits on measurement invariance testing. We examine 20 GFIs based on the minimum fit function. A simulation under the two-group situation was used to examine changes in the GFIs (ΔGFIs) when invariance constraints were added. Based on the results, we recommend using Δcomparative fit index, ΔGamma hat, and ΔMcDonald's Noncentrality Index to evaluate measurement invariance. These three ΔGFIs are independent of both model complexity and sample size, and are not correlated with the o...
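The ΔGFI quantities described in this abstract are differences in a fit index between a model with invariance constraints and the same model without them. A minimal sketch follows, assuming the sign convention constrained minus unconstrained; the function name, the dictionary keys, and the fit values are hypothetical, and no decision threshold is applied because none appears in the abstract as shown here.

```python
def delta_fit_indexes(unconstrained, constrained):
    """Return the change in each fit index after constraints are added.

    Both arguments map an index name to its value for one fitted model.
    Assumed sign convention: delta = constrained - unconstrained, so a
    negative delta means a higher-is-better index dropped.
    """
    return {
        name: constrained[name] - unconstrained[name]
        for name in unconstrained
        if name in constrained
    }


# Hypothetical fit values for a two-group model, invented for illustration.
unconstrained_fit = {"CFI": 0.97, "GammaHat": 0.98, "Mc": 0.95}
constrained_fit = {"CFI": 0.95, "GammaHat": 0.97, "Mc": 0.92}

for name, delta in delta_fit_indexes(unconstrained_fit, constrained_fit).items():
    print(f"delta {name}: {delta:+.3f}")
```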

10,597 citations

Journal Article
TL;DR: The establishment of measurement invariance across groups is a logical prerequisite to conducting substantive cross-group comparisons (e.g., tests of group mean differences, invariance of structura, etc.).
Abstract: The establishment of measurement invariance across groups is a logical prerequisite to conducting substantive cross-group comparisons (e.g., tests of group mean differences, invariance of structura...

6,086 citations