Author

Valen E. Johnson

Bio: Valen E. Johnson is an academic researcher from Texas A&M University. The author has contributed to research on topics including Bayesian probability and Bayesian inference. The author has an h-index of 43 and has co-authored 155 publications receiving 9,541 citations. Previous affiliations of Valen E. Johnson include the University of North Carolina at Chapel Hill and Lanzhou University.


Papers
Journal Article
Daniel J. Benjamin1, James O. Berger2, Magnus Johannesson3, Magnus Johannesson1, Brian A. Nosek4, Brian A. Nosek5, Eric-Jan Wagenmakers6, Richard A. Berk7, Kenneth A. Bollen8, Björn Brembs9, Lawrence D. Brown7, Colin F. Camerer10, David Cesarini11, David Cesarini12, Christopher D. Chambers13, Merlise A. Clyde2, Thomas D. Cook14, Thomas D. Cook15, Paul De Boeck16, Zoltan Dienes17, Anna Dreber3, Kenny Easwaran18, Charles Efferson19, Ernst Fehr20, Fiona Fidler21, Andy P. Field17, Malcolm R. Forster22, Edward I. George7, Richard Gonzalez23, Steven N. Goodman24, Edwin J. Green25, Donald P. Green26, Anthony G. Greenwald27, Jarrod D. Hadfield28, Larry V. Hedges14, Leonhard Held20, Teck-Hua Ho29, Herbert Hoijtink30, Daniel J. Hruschka31, Kosuke Imai32, Guido W. Imbens24, John P. A. Ioannidis24, Minjeong Jeon33, James Holland Jones34, Michael Kirchler35, David Laibson36, John A. List37, Roderick J. A. Little23, Arthur Lupia23, Edouard Machery38, Scott E. Maxwell39, Michael A. McCarthy21, Don A. Moore40, Stephen L. Morgan41, Marcus R. Munafò42, Shinichi Nakagawa43, Brendan Nyhan44, Timothy H. Parker45, Luis R. Pericchi46, Marco Perugini47, Jeffrey N. Rouder48, Judith Rousseau49, Victoria Savalei50, Felix D. Schönbrodt51, Thomas Sellke52, Betsy Sinclair53, Dustin Tingley36, Trisha Van Zandt16, Simine Vazire54, Duncan J. Watts55, Christopher Winship36, Robert L. Wolpert2, Yu Xie32, Cristobal Young24, Jonathan Zinman44, Valen E. Johnson1, Valen E. Johnson18 
Affiliations: 1 University of Southern California; 2 Duke University; 3 Stockholm School of Economics; 4 Center for Open Science; 5 University of Virginia; 6 University of Amsterdam; 7 University of Pennsylvania; 8 University of North Carolina at Chapel Hill; 9 University of Regensburg; 10 California Institute of Technology; 11 New York University; 12 Research Institute of Industrial Economics; 13 Cardiff University; 14 Northwestern University; 15 Mathematica Policy Research; 16 Ohio State University; 17 University of Sussex; 18 Texas A&M University; 19 Royal Holloway, University of London; 20 University of Zurich; 21 University of Melbourne; 22 University of Wisconsin-Madison; 23 University of Michigan; 24 Stanford University; 25 Rutgers University; 26 Columbia University; 27 University of Washington; 28 University of Edinburgh; 29 National University of Singapore; 30 Utrecht University; 31 Arizona State University; 32 Princeton University; 33 University of California, Los Angeles; 34 Imperial College London; 35 University of Innsbruck; 36 Harvard University; 37 University of Chicago; 38 University of Pittsburgh; 39 University of Notre Dame; 40 University of California, Berkeley; 41 Johns Hopkins University; 42 University of Bristol; 43 University of New South Wales; 44 Dartmouth College; 45 Whitman College; 46 University of Puerto Rico; 47 University of Milan; 48 University of California, Irvine; 49 Paris Dauphine University; 50 University of British Columbia; 51 Ludwig Maximilian University of Munich; 52 Purdue University; 53 Washington University in St. Louis; 54 University of California, Davis; 55 Microsoft
TL;DR: It is proposed that the default P-value threshold for statistical significance be changed from 0.05 to 0.005 for claims of new discoveries, in order to reduce false-positive rates and improve the reproducibility of scientific research.
Abstract: We propose to change the default P-value threshold for statistical significance from 0.05 to 0.005 for claims of new discoveries.
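As a rough illustration of the reasoning behind a stricter default threshold, the false-positive risk among claimed discoveries depends on the significance level, the statistical power, and the prior odds that a tested effect is real. The Python sketch below works through that calculation with assumed values (power 0.8, prior odds 1:10) chosen purely for illustration; none of these numbers are taken from the paper.

```python
def false_positive_risk(alpha: float, power: float, prior_odds: float) -> float:
    """P(null is true | test is significant), with prior_odds = P(H1)/P(H0)."""
    return alpha / (alpha + power * prior_odds)

# Assumed power and prior odds of a true effect, for illustration only
for alpha in (0.05, 0.005):
    risk = false_positive_risk(alpha, power=0.8, prior_odds=0.1)
    print(f"alpha = {alpha}: false positive risk ~ {risk:.2f}")
```

Under these assumptions the false-positive risk drops from roughly 0.4 at the 0.05 threshold to under 0.06 at 0.005, which is the kind of shift the proposal aims for.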

1,586 citations

Posted Content
TL;DR: This article proposes changing the default P-value threshold for statistical significance for claims of new discoveries from 0.05 to 0.005.
Abstract: We propose to change the default P-value threshold for statistical significance for claims of new discoveries from 0.05 to 0.005.

1,415 citations

Journal Article
TL;DR: Modifications of common standards of evidence are proposed to reduce the rate of non-reproducibility of scientific research by a factor of 5 or greater, by correcting the practice of conducting significance tests at unjustifiably high levels of significance.
Abstract: Recent advances in Bayesian hypothesis testing have led to the development of uniformly most powerful Bayesian tests, which represent an objective, default class of Bayesian hypothesis tests that have the same rejection regions as classical significance tests. Based on the correspondence between these two classes of tests, it is possible to equate the size of classical hypothesis tests with evidence thresholds in Bayesian tests, and to equate P values with Bayes factors. An examination of these connections suggests that recent concerns over the lack of reproducibility of scientific studies can be attributed largely to the conduct of significance tests at unjustifiably high levels of significance. To correct this problem, evidence thresholds required for the declaration of a significant finding should be increased to 25–50:1, and to 100–200:1 for the declaration of a highly significant finding. In terms of classical hypothesis tests, these evidence standards mandate the conduct of tests at the 0.005 or 0.001 level of significance.
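To see the correspondence the abstract describes in a concrete case, consider the one-sided test of a normal mean with known variance, where the UMPBT rejection region "Bayes factor > γ" reduces to z > sqrt(2 ln γ). The Python sketch below is a simplified, assumption-laden illustration of that special case, not the paper's general result; it converts an evidence threshold γ into the matching one-sided significance level.

```python
from math import sqrt, log
from scipy.stats import norm

def equivalent_alpha(gamma: float) -> float:
    """One-sided significance level matching the UMPBT rejection region
    BF > gamma for a normal mean with known variance, where the rejection
    boundary is z = sqrt(2 * ln(gamma))."""
    z = sqrt(2.0 * log(gamma))
    return norm.sf(z)  # P(Z > z) under the null

for gamma in (25, 50, 100, 200):
    print(f"evidence threshold {gamma}:1 -> alpha ~ {equivalent_alpha(gamma):.4f}")
```

With these assumptions, evidence thresholds of 25–50:1 land near the 0.005 significance level and 100–200:1 near the 0.001 level, consistent with the standards recommended in the abstract.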

671 citations

Journal Article
TL;DR: It is demonstrated that large sets of landmark point pairs can be used to assess DIR spatial accuracy within a narrow uncertainty range, and that comparative evaluation based on fewer than the required validation landmarks misrepresents the relative spatial accuracy of the algorithms.
Abstract: Expert landmark correspondences are widely reported for evaluating deformable image registration (DIR) spatial accuracy. In this report, we present a framework for objective evaluation of DIR spatial accuracy using large sets of expert-determined landmark point pairs. Large samples (>1100) of pulmonary landmark point pairs were manually generated for five cases. Estimates of inter- and intra-observer variation were determined from repeated registration. Comparative evaluation of DIR spatial accuracy was performed for two algorithms, a gradient-based optical flow algorithm and a landmark-based moving least-squares algorithm. The uncertainty of spatial error estimates was found to be inversely proportional to the square root of the number of landmark point pairs and directly proportional to the standard deviation of the spatial errors. Using the statistical properties of this data, we performed sample size calculations to estimate the average spatial accuracy of each algorithm with 95% confidence intervals within a 0.5 mm range. For the optical flow and moving least-squares algorithms, the required sample sizes were 1050 and 36, respectively. Comparative evaluation based on fewer than the required validation landmarks results in misrepresentation of the relative spatial accuracy. This study demonstrates that landmark pairs can be used to assess DIR spatial accuracy within a narrow uncertainty range.
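The sample-size reasoning in the abstract follows from the standard normal-approximation confidence interval for a mean: a 95% interval of total width 0.5 mm requires roughly n = (2 · 1.96 · σ / 0.5)² landmark pairs, where σ is the standard deviation of the spatial errors. The sketch below illustrates the calculation with hypothetical σ values that are not taken from the study.

```python
import math

def required_landmarks(sigma_mm: float, ci_width_mm: float = 0.5, z: float = 1.96) -> int:
    """Landmark pairs needed so the 95% CI for mean spatial error has total
    width ci_width_mm; the CI half-width is z * sigma / sqrt(n)."""
    half_width = ci_width_mm / 2.0
    return math.ceil((z * sigma_mm / half_width) ** 2)

# Hypothetical per-landmark error standard deviations, for illustration only
print(required_landmarks(sigma_mm=4.0))  # larger error spread -> on the order of 1000 pairs
print(required_landmarks(sigma_mm=0.8))  # smaller error spread -> a few dozen pairs
```

Because the required n grows with the square of σ, an algorithm with widely scattered errors needs far more validation landmarks than one with tightly clustered errors, which is why the two algorithms in the study required such different sample sizes.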

513 citations

Book
30 Mar 1999
TL;DR: A book covering classical and Bayesian inference, Bayesian computation, regression models for binary and ordinal data, data from multiple raters, item response modeling, and graded response models, with a case study of undergraduate grade data.
Abstract: Review of Classical and Bayesian Inference.- Review of Bayesian Computation.- Regression Models for Binary Data.- Regression Models for Ordinal Data.- Analyzing Data from Multiple Raters.- Item Response Modeling.- Graded Response Models: A Case Study of Undergraduate Grade Data.

484 citations


Cited by
Book
24 Aug 2012
TL;DR: This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach, and is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.
Abstract: Today's Web-enabled deluge of electronic data calls for automated methods of data analysis. Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach. The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic methods, the book stresses a principled model-based approach, often using the language of graphical models to specify models in a concise and intuitive way. Almost all the models described have been implemented in a MATLAB software package--PMTK (probabilistic modeling toolkit)--that is freely available online. The book is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.

8,059 citations

Journal Article
TL;DR: A fatal flaw of NHST is reviewed, some benefits of Bayesian data analysis are introduced, and illustrative examples of multiple comparisons in Bayesian analysis of variance and Bayesian approaches to statistical power are presented.
Abstract: Bayesian methods have garnered huge interest in cognitive science as an approach to models of cognition and perception. On the other hand, Bayesian methods for data analysis have not yet made much headway in cognitive science against the institutionalized inertia of 20th century null hypothesis significance testing (NHST). Ironically, specific Bayesian models of cognition and perception may not long endure the ravages of empirical verification, but generic Bayesian methods for data analysis will eventually dominate. It is time that Bayesian data analysis became the norm for empirical methods in cognitive science. This article reviews a fatal flaw of NHST and introduces the reader to some benefits of Bayesian data analysis. The article presents illustrative examples of multiple comparisons in Bayesian analysis of variance and Bayesian approaches to statistical power. Copyright © 2010 John Wiley & Sons, Ltd. For further resources related to this article, please visit the WIREs website.

6,081 citations

Journal Article
TL;DR: The American Statistical Association (ASA) released a policy statement on p-values and statistical significance in 2016, developed by a group of experts convened by the ASA Board in response to widespread concerns about the reproducibility and replicability of scientific conclusions.
Abstract: Cobb’s concern was a long-worrisome circularity in the sociology of science based on the use of bright lines such as p < 0.05: “We teach it because it’s what we do; we do it because it’s what we teach.” This concern was brought to the attention of the ASA Board. The ASA Board was also stimulated by highly visible discussions over the last few years. For example, ScienceNews (Siegfried 2010) wrote: “It’s science’s dirtiest secret: The ‘scientific method’ of testing hypotheses by statistical analysis stands on a flimsy foundation.” A November 2013 article in Phys.org Science News Wire (2013) cited “numerous deep flaws” in null hypothesis significance testing. A ScienceNews article (Siegfried 2014) on February 7, 2014, said “statistical techniques for testing hypotheses... have more flaws than Facebook’s privacy policies.” A week later, statistician and “Simply Statistics” blogger Jeff Leek responded. “The problem is not that people use P-values poorly,” Leek wrote, “it is that the vast majority of data analysis is not performed by people properly trained to perform data analysis” (Leek 2014). That same week, statistician and science writer Regina Nuzzo published an article in Nature entitled “Scientific Method: Statistical Errors” (Nuzzo 2014). That article is now one of the most highly viewed Nature articles, as reported by altmetric.com (http://www.altmetric.com/details/2115792#score). Of course, it was not simply a matter of responding to some articles in print. The statistical community has been deeply concerned about issues of reproducibility and replicability of scientific conclusions. Without getting into definitions and distinctions of these terms, we observe that much confusion and even doubt about the validity of science is arising. Such doubt can lead to radical choices, such as the one taken by the editors of Basic and Applied Social Psychology, who decided to ban p-values (null hypothesis significance testing) (Trafimow and Marks 2015). Misunderstanding or misuse of statistical inference is only one cause of the “reproducibility crisis” (Peng 2015), but to our community, it is an important one. When the ASA Board decided to take up the challenge of developing a policy statement on p-values and statistical significance, it did so recognizing this was not a lightly taken step. The ASA has not previously taken positions on specific matters of statistical practice. The closest the association has come to this is a statement on the use of value-added models (VAM) for educational assessment (Morganstein and Wasserstein 2014) and a statement on risk-limiting post-election audits (American Statistical Association 2010). However, these were truly policy-related statements. The VAM statement addressed a key educational policy issue, acknowledging the complexity of the issues involved, citing limitations of VAMs as effective performance models, and urging that they be developed and interpreted with the involvement of statisticians. The statement on election auditing was also in response to a major but specific policy issue (close elections in 2008), and said that statistically based election audits should become a routine part of election processes. By contrast, the Board envisioned that the ASA statement on p-values and statistical significance would shed light on an aspect of our field that is too often misunderstood and misused in the broader research community, and, in the process, provide the community a service.
The intended audience would be researchers, practitioners, and science writers who are not primarily statisticians. Thus, this statement would be quite different from anything previously attempted. The Board tasked Wasserstein with assembling a group of experts representing a wide variety of points of view. On behalf of the Board, he reached out to more than two dozen such people, all of whom said they would be happy to be involved. Several expressed doubt about whether agreement could be reached, but those who did said, in effect, that if there was going to be a discussion, they wanted to be involved. Over the course of many months, group members discussed what format the statement should take, tried to more concretely visualize the audience for the statement, and began to find points of agreement. That turned out to be relatively easy to do, but it was just as easy to find points of intense disagreement. The time came for the group to sit down together to hash out these points, and so in October 2015, 20 members of the group met at the ASA Office in Alexandria, Virginia. The 2-day meeting was facilitated by Regina Nuzzo, and by the end of the meeting, a good set of points around which the statement could be built was developed. The next 3 months saw multiple drafts of the statement, reviewed by group members, by Board members (in a lengthy discussion at the November 2015 ASA Board meeting), and by members of the target audience. Finally, on January 29, 2016, the Executive Committee of the ASA approved the statement. The statement development process was lengthier and more controversial than anticipated. For example, there was considerable discussion about how best to address the issue of multiple potential comparisons (Gelman and Loken 2014). We debated at some length the issues behind the words “a p-value near 0.05 taken by itself offers only weak evidence against the null

4,361 citations

Journal Article
TL;DR: Progress in the field of medical image analysis over the last 20 years is reviewed and some of the challenges that remain for the years to come are suggested.
Abstract: The analysis of medical images has been woven into the fabric of the pattern analysis and machine intelligence (PAMI) community since the earliest days of these Transactions. Initially, the efforts in this area were seen as applying pattern analysis and computer vision techniques to another interesting dataset. However, over the last two to three decades, the unique nature of the problems presented within this area of study has led to the development of a new discipline in its own right. Examples of these include: the types of image information that are acquired, the fully three-dimensional image data, the nonrigid nature of object motion and deformation, and the statistical variation of both the underlying normal and abnormal ground truth. In this paper, we look at progress in the field over the last 20 years and suggest some of the challenges that remain for the years to come.

4,249 citations

Journal Article
TL;DR: A survey of factor-analytic studies of human cognitive abilities is presented.
Abstract: (1998). Human cognitive abilities: A survey of factor analytic studies. Gifted and Talented International: Vol. 13, No. 2, pp. 97-98.

2,388 citations