
How persuasive is a good fit? A comment on theory testing.

Seth Roberts and Harold Pashler
Psychological Review, 2000, Vol. 107, No. 2, pp. 358-367
Abstract
Quantitative theories with free parameters often gain credence when they closely fit data. This is a mistake. A good fit reveals nothing about the flexibility of the theory (how much it cannot fit), the variability of the data (how firmly the data rule out what the theory cannot fit), or the likelihood of other outcomes (perhaps the theory could have fit any plausible result), and a reader needs all 3 pieces of information to decide how much the fit should increase belief in the theory. The use of good fits as evidence is supported neither by philosophers of science nor by the history of psychology; there seem to be no examples of a theory supported mainly by good fits that has led to demonstrable progress. A better way to test a theory with free parameters is to determine how the theory constrains possible outcomes (i.e., what it predicts), assess how firmly actual outcomes agree with those constraints, and determine if plausible alternative outcomes would have been inconsistent with the theory, allowing for the variability of the data.


eScholarship permalink: https://escholarship.org/uc/item/5vt0z72k

Psychological Review (in press)
Running head: Testing Theories With Free Parameters
How Persuasive is a Good Fit?
Seth Roberts
University of California, Berkeley
Harold Pashler
University of California, San Diego
Quantitative theories with free parameters often gain credence when they "fit" data closely. This is a mistake, we argue. A good fit reveals nothing about (a) the flexibility of the theory (how much it cannot fit), (b) the variability of the data (how firmly the data rule out what the theory cannot fit), and (c) the likelihood of other outcomes (perhaps the theory could have fit any plausible result), and a reader needs to know all three to decide how much the fit should increase belief in the theory. As far as we can tell, the use of good fits as evidence receives no support from philosophers of science or from the history of psychology; we have been unable to find examples of a theory supported mainly by good fits that has led to demonstrable progress. We consider and rebut arguments used to defend the use of good fits as evidence, for example, that a good fit is meaningful when the number of free parameters is small compared to the number of data points, or when one model fits better than others. A better way to test a theory with free parameters is to (a) determine how the theory constrains possible outcomes (i.e., what it predicts); (b) assess how firmly actual outcomes agree with those constraints; and (c) determine whether plausible alternative outcomes would have been inconsistent with the theory, allowing for the variability of the data.
How Persuasive is a Good Fit?
Many quantitative psychological theories with free parameters are supported mainly or entirely by demonstrations that they can "fit" data: that the parameters can be adjusted so that the output of the theory resembles actual results. The similarity is often shown via a graph with two functions: one labeled observed (or data), the other labeled predicted (or theory or simulated). That the theory fits data is supposed to show that the theory should be taken seriously (should be published, for example).
This type of argument is common;
judging from a search of Psychological
Abstracts, the research literature probably
contains thousands of examples. Early
instances involved sensory processes
(Hecht, 1934) and animal learning (Hull,
1943), but it is now used in many areas.
Here are three recent examples:
1. Cohen, Dunbar, and McClelland
(1990) proposed a parallel-distributed-
processing model to explain the Stroop
effect and related data. The model was
meant to embody a "continuous" view of
automaticity, in contrast to an "all-or-none"
(p. 332) view. The model contained many
adjustable parameters, including number of
units per module, ratio of training
frequencies, learning rate, maximum
response time, initial input weights, indirect
pathway strengths, cascade rate, noise,
magnitude of attentional influence (two
parameters), and response-mechanism
parameters (three). The model was fit to six
data sets. Some parameters (e.g., number of
units per module) were separately adjusted
for each data set; other parameters were
adjusted based on one data set and held
constant for the rest. The function relating
cycle time (model) to average reaction time
(observed) was always linear, but its slope and intercept varied from one data set to the
next. That the model could fit several data
sets led the authors to conclude that
compared to the all-or-none view, "a more
useful approach is to consider automaticity
in terms of a continuum" (Cohen et al.,
1990, p. 357), although they did not try to fit
a model based on the all-or-none view.
2. Zhuikov, Couvillon, & Bitterman
(1994) presented a theory to explain goldfish
avoidance conditioning. It is a quantitative
version of Mowrer’s two-process theory, in
which some responses are generated by fear,
some by reinforcement. When some
simplifying assumptions are made, the
theory has three equations and six adjustable
parameters. The authors fit the theory to data
from four experiments, and concluded that
"the good fit suggests that the theory is
worth developing further" (Zhuikov,
Couvillon, & Bitterman, 1994, p. 32).
3. Rodgers and Rowe (1993)
proposed a theory that explains how
teenagers come to engage in various sexual
behaviors for the first time. It emphasizes
contact with other teenagers, a "contagion"
(p. 479) explanation. The theory has eight
equations with twelve free parameters.
Rodgers and Rowe fitted the theory to
survey data about the prevalence of kissing,
petting, and intercourse in boys and girls of
different ages and races and concluded that
the theory "appears to have successfully
captured many of the patterns in two
empirical data sets" (p. 505). This success
was the main support for the theory.
Why the Use of Good Fits as Evidence is
Wrong
This type of argument has three
serious problems. First, what the theory predicts (how much it constrains the fitted data) is unclear. Theorists who use good fits as evidence seem to reason as follows: if our theory is correct, it will be able to fit the data; our theory fits the data; therefore it is more likely that our theory is correct. However, if a theory does not constrain possible outcomes, the fit is meaningless.
A prediction is a statement of what a
theory does and does not allow. When a
theory has adjustable parameters, a
particular fit is just one example of what it
allows. To know what a theory predicts for a
particular measurement you need to know
all of what it allows (what else it can fit) and
all of what it does not allow (what it cannot
fit). For example, suppose two measures are positively correlated, and it is shown that a certain theory can produce such a relation, that is, can fit the data. This does not show that the theory predicts the correlation. A theory predicts such a relation only if it could not fit other possible relations between the two measures (zero correlation, negative correlation), and this is not shown by fitting a positive correlation.
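One way to make this concrete is to sweep a model's free parameters and record every correlation it can produce; the wider that attainable range, the less the model predicts. The following sketch does this for a toy two-parameter model of our own construction (it stands in for no particular published theory) and assumes Python with numpy available:

    # Sketch: map what a hypothetical two-parameter theory "allows" by
    # sweeping its free parameters and recording the correlation between
    # measure A (x) and measure B (y) that each setting produces.
    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=200)  # stand-in values of measure A

    def toy_theory(x, a, b):
        # Hypothetical model: measure B as a parameterized function of A.
        return a * x + b * x**2

    correlations = []
    for a in np.linspace(-2, 2, 41):       # sweep free parameter a
        for b in np.linspace(-2, 2, 41):   # sweep free parameter b
            y = toy_theory(x, a, b)
            if np.std(y) > 0:              # skip the degenerate a = b = 0 case
                correlations.append(np.corrcoef(x, y)[0, 1])

    print(f"attainable correlations: {min(correlations):.2f} to {max(correlations):.2f}")

Because this toy model can produce strongly negative, near-zero, and strongly positive correlations, fitting one observed positive correlation would tell us nothing; an analogous sweep over any theory's parameter space shows how much, or how little, it actually predicts.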
When a theory does constrain
possible outcomes, it is necessary to know
how much. The more constraint (the narrower the prediction), the more impressive a confirmation of the constraint
(e.g., Meehl, 1997). Without knowing how
much a theory constrains possible outcomes,
you cannot know how impressed to be when
observation and theory are consistent.
Second, the variability of the data
(e.g., between-subject variation) is unclear.
How firmly do the data agree with the
predictions of the theory? Are they
compatible with the outcomes that the
theory rules out? The more conclusively the
data rule out what the theory rules out, the more impressive the confirmation. For example, suppose a theory predicts that a
certain measure should be greater than zero.
If the measure is greater than zero, the
shorter the confidence interval, the more
impressive the confirmation. That a theory
fits data does not show how firmly the data
rule out outcomes inconsistent with the
theory; without this information, you cannot
know how impressed to be that theory and
observation are consistent.
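As a minimal numerical illustration of this point (our example, not the authors'; it assumes Python with numpy and scipy), consider two samples with the same positive mean but different variability, for a theory that predicts a measure greater than zero:

    # Sketch: the same positive mean agrees with the prediction "greater
    # than zero" firmly or weakly depending on the variability of the data.
    import numpy as np
    from scipy import stats

    def ci95(sample):
        # 95% confidence interval for the mean of a sample (t-based).
        m = np.mean(sample)
        half = stats.t.ppf(0.975, len(sample) - 1) * stats.sem(sample)
        return (m - half, m + half)

    rng = np.random.default_rng(1)
    tight = rng.normal(loc=0.5, scale=0.2, size=30)  # low variability
    loose = rng.normal(loc=0.5, scale=2.0, size=30)  # high variability

    print("low-variability CI: ", ci95(tight))  # narrow, well above zero
    print("high-variability CI:", ci95(loose))  # wide, may include zero

Only the narrow interval conclusively rules out the outcomes the theory rules out (values at or below zero); the wide interval is consistent with the theory and with its negation.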
Adding error bars may not solve this
problem; it is variability on the constrained
dimension(s) that matters. For example,
suppose a theory predicts that several points
will lie on a straight line. To judge the
accuracy of this prediction, the reader needs
to know the variability of a measure of
curvature (or some other measure of non-
linearity). Adding vertical error bars to each
point is a poor substitute (unless the answer,
linear or non-linear, is very clear); the
vertical position of the points is not what the
theory predicts.
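A sketch of this suggestion (our construction; the authors prescribe the measure, not any code) is to bootstrap the variability of a curvature measure directly, here the quadratic coefficient of a second-degree polynomial fit, again assuming Python with numpy:

    # Sketch: estimate the variability of a curvature measure (the
    # quadratic coefficient) by resampling subjects, rather than drawing
    # vertical error bars on each point.
    import numpy as np

    rng = np.random.default_rng(2)
    x = np.linspace(0, 1, 10)
    # 50 simulated subjects whose true mean function is linear (slope 2).
    y = 2.0 * x + rng.normal(scale=0.1, size=(50, x.size))

    def curvature(mean_y):
        # Quadratic coefficient of a 2nd-degree fit; 0 for a perfect line.
        return np.polyfit(x, mean_y, 2)[0]

    boot = []
    for _ in range(2000):
        idx = rng.integers(0, y.shape[0], y.shape[0])  # resample subjects
        boot.append(curvature(y[idx].mean(axis=0)))

    lo, hi = np.percentile(boot, [2.5, 97.5])
    print(f"95% interval for curvature: [{lo:.4f}, {hi:.4f}]")

An interval that tightly brackets zero says the data firmly agree with the linearity prediction; a wide interval says the data could not have distinguished linear from curved, however good the fit looks.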
Figure 1. Four possible relationships between theory and data. Measures A and B are both measures of behavior. For both measures, the axes cover the whole range of possible values. The dotted areas indicate the range of outcomes that would be consistent with the theory. The error bars indicate standard errors. In every case, the theory can closely fit the data, but only when both theory and data provide substantial constraints does this provide significant evidence for the theory.
To further illustrate these points,
Figure 1 shows four ways a "two-
dimensional" predictiona constraint
involving two measures at oncecan be
compatible with data. Measures A and B in
Figure 1 are both derived from
measurements of behavior. Either might be
quite simple (e.g., trials to criterion) or
relatively complex (the quadratic component
of a fitted function); it does not matter. The
axis of each measure covers the entire range
of plausible values of the measure before the
experiment is done (e.g., from 0 to 1, if the
measure is a probability). The dotted area
shows the predictions of the theory, the
range of outcomes that are consistent with
the theory. In the two upper panels of Figure
1, the theory tightly constrains possible
outcomes; in the two lower panels, it does
not. In each case there is one data point. In
the two left-hand panels, the observations
tightly constrain the population value; in the
two right-hand panels, they do not. In every
case, the data are consistent with the theory
(the data point is within the dotted area),
which means in every case the theory can
closely fit the data. But only the situation in
the upper left panel is substantial evidence
for the theory.
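The logic of the four panels can be summarized crudely in code (a sketch under our own simplifying assumptions: that the theory's allowed region and the data's confidence region can each be reduced to a fraction of the plausible outcome space, with an arbitrary 10% cutoff for "tight"):

    # Sketch of the Figure 1 logic: a consistent fit is persuasive only
    # when the theory allows little and the data pin down a lot.
    def fit_is_persuasive(theory_fraction, data_fraction, consistent=True):
        # theory_fraction: share of plausible outcome space the theory allows.
        # data_fraction: share of that space compatible with the data.
        if not consistent:
            return "theory disconfirmed"
        tight = 0.10  # arbitrary illustrative cutoff
        if theory_fraction < tight and data_fraction < tight:
            return "substantial evidence (upper-left panel)"
        return "little or no evidence (other three panels)"

    for t, d in [(0.05, 0.05), (0.05, 0.60), (0.60, 0.05), (0.60, 0.60)]:
        print(f"theory allows {t:.0%}, data allow {d:.0%}:",
              fit_is_persuasive(t, d))

The point is not the particular numbers but the conjunction: tight theoretical constraint and tight data are both required before a close fit counts as evidence.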
Third, the a priori likelihood that the theory will fit (the likelihood it will fit whether or not it is true) is ignored. Perhaps the theory could fit any plausible result. It is well known that a theory gains more support
from the correct prediction of an unlikely
event than from the correct prediction of
something that was expected anyway.
Lakatos (1978) made this point vividly: "It
is no success for Newtonian theory that
stones, when dropped, fall towards the earth, no matter how often this is repeated."
