
Detecting Faking on a Personality Instrument Using Appropriateness Measurement

Abstract
Research has demonstrated that people can and often do consciously manipulate scores on personality tests. Test constructors have responded by using social desirability and lying scales in order to identify dishonest respondents. Unfortunately, these approaches have had limited success. This study evaluated the use of appropriateness measurement for identifying dishonest respondents. A dataset was analyzed in which respondents were instructed either to answer honestly or to fake good. The item response theory approach classified a higher number of faking respondents at low rates of misclassification of honest respondents (false positives) than did a social desirability scale. At higher false positive rates, the social desirability approach did slightly better. Implications for operational testing and suggestions for further research are provided.

Michael J. Zickar and Fritz Drasgow
University of Illinois, Urbana-Champaign

Applied Psychological Measurement, Vol. 20, No. 1, March 1996, pp. 71-87. © 1996 Applied Psychological Measurement Inc.

Index terms: appropriateness measurement, detecting faking, item response theory, lying scales, person fit, personality measurement.

Much literature has linked measures of personality traits with behavior in organizational settings. For example, Sparks (1983) found consistent relationships between scores on a standardized personality scale and measures of job success, job effectiveness, and management potential. Personality variables such as conscientiousness and anxiety have been found to correlate with absenteeism and turnover (Bernardin, 1977). Army enlistees who were low in traits such as emotional stability and nondelinquency had a higher drop-out rate during a four-year army term (White, Nord, Mael, & Young, 1993). In a meta-analysis of previous research, Barrick & Mount (1991) discovered a small but consistent relationship (r = .22) between conscientiousness and a wide variety of criteria across a broad range of jobs. Extroversion was also a significant predictor of job-related behaviors for both sales and management positions (r = .15 and r = .18, respectively). Although these relationships are lower than validity coefficients typical of cognitive ability tests, personality measures assess quite different human attributes, thus providing incremental validity when combined with cognitive ability measures. Given this potential, personality constructs would be expected to be prevalent in personnel selection programs.

However, companies have been reluctant to include personality instruments in their programs; instead, they primarily use ability tests and interviews. One reason for this reluctance is the possibility of faking on personality measures. Past research has established that respondents are able to significantly distort scores on a wide variety of personality measures (e.g., Gillis, Rogers, & Dickes, 1990; Krahe, 1989). Respondents who are instructed to answer personality measures in a pattern that will present themselves in a favorable light typically receive higher scores than respondents instructed to answer honestly or than those given no instructions. Thus, it seems clear that personality scales can be consciously manipulated. However, there is some disagreement on the prevalence of faking in real-life operational situations. In an Army sample, Hough, Eaton, Dunnette, Kamp, & McCloy (1990) compared respondents who had no motivation to distort responses with actual applicants and found similar scores between groups.

Contrary to those findings, Anderson, Warner, & Spector (1984) found that almost half of the job applicants for a variety of positions claimed that they had experience performing at least one of several imaginary tasks that the researchers had invented, such as "matrixing solvency files." Job applicants who claimed experience on these spurious tasks also inflated their responses on items related to experience on real tasks. Regardless of the prevalence of faking, organizations will continue to resist using personality measures in operational programs in which errors have a high cost, as long as the potential to fake with impunity exists.

Detecting Faking in Personality Measurement

One approach to countering faking is to write items that are difficult to fake. Becker & Colquitt (1992) found that respondents distort less on personality items for which the answers can potentially be verified. For example, a question such as "Do you enjoy talking to people?" is difficult to verify, but the question "Were you a member of any social clubs while in high school?" could be verified by school records, yearbooks, and so forth. According to Becker & Colquitt (1992), there would be more faking on the former, less objective item.

Another approach is to ask questions that are ambiguous or less transparent about what is being measured (Edwards, 1970). These subtle items usually are generated by an external validation procedure; items are selected that have mean differences between different groups. Unfortunately, there are problems with the external validation technique: (1) items may function differently in cross-validation samples, and (2) research has indicated that subtle items often have lower validity than more transparent items (e.g., Burkhart, Gynther, & Fromuth, 1980; Duff, 1965; Wiener, 1948).

If personality instruments cannot easily be made resistant to faking, then an alternative solution would be to identify those respondents who have distorted their responses. Research on this approach goes back to the 1940s with the seminal work on the Minnesota Multiphasic Personality Inventory (MMPI). The MMPI was originally designed for psychodiagnostic evaluation (Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989). Three scales designed to detect invalid responses were embedded within the MMPI (Meehl & Hathaway, 1946).

Scale F was composed of 64 items that were answered with an extremely low frequency in one direction by a normal sample. For example, a respondent who answers false to "I believe in law enforcement" (the key was formed in the 1940s), as well as answering other items in a similarly unlikely manner, would score high on the F Scale. The MMPI K Scale uses items that differentiated patients with known psychological disorders whose MMPI profiles appeared normal from respondents with no psychological disorders. Gough (1950) proposed an additional detection index composed of an individual's score on the F scale minus the score on the K scale (i.e., F - K).

One additional scale, the MMPI L Scale, consists of items that have a socially desirable answer that cannot honestly be answered in the extreme direction by more than a small number of individuals. An example is "I read all the editorials in the newspaper every day." If a number of these questions are answered in the affirmative, there is reason to believe that the respondent is not answering honestly. The L Scale was rationally constructed, as opposed to the other two scales, which were constructed using purely empirical methods. Many detection scales used on personality inventories, often called social desirability scales, are rationally constructed, such as the L Scale.

Numerous studies (e.g., Bagby, Buis, & Nicholson, 1995; Gillis et al., 1990; Gough, 1950; Lanyon, 1993) have examined the usefulness of detection scales for identifying individuals who consciously distort responses. Much of this research has been conducted using the MMPI.

In the work done on the MMPI, a primary concern has been detecting respondents who are exaggerating mental symptoms in order to attract attention (i.e., faking bad). For instance, Gillis et al. (1990) used cut scores on the F - K index that in previous research had differentiated between normal individuals and psychiatric patients responding honestly versus normal individuals feigning psychopathological symptoms. The recommended cut score of the F - K index identified 92% of the fakers but misclassified 13% of the honest sample. Similarly, Gough (1950) found that 58% of the faking normals were correctly identified, and 1% of the honest sample was misclassified, using the F - K index. Thus, there appears to be some utility in using the F - K score to identify individuals explicitly manipulating responses. However, it is difficult to generalize from detection of individuals feigning mental illness (or its absence) to the detection of normal individuals faking good on personality items.

Lanning (1989) suggested that it may be possible to detect normal respondents who are faking good. Lanning computed a regression equation using scores on a "good impression" scale along with other substantive personality scales on the California Personality Inventory to differentiate between a sample of college students asked to fake good and a heterogeneous sample of normal individuals. The scores derived from the regression equation achieved a hit rate of 67% with a 1% false positive (FP) rate. In a faking context, the hit rate refers to the percentage of fakers correctly classified with a particular cut score on a detection index; the FP rate refers to the percentage of honest respondents incorrectly identified as fakers with that same score.

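To make these two rates concrete, here is a minimal sketch (Python, with entirely hypothetical detection-index scores; not the authors' data or code) of how a hit rate and FP rate pair is computed for each candidate cut score:

```python
import numpy as np

def hit_and_fp_rates(faker_scores, honest_scores, cut):
    """Flag anyone scoring at or above `cut` as a faker.

    Hit rate: percentage of true fakers correctly flagged.
    FP rate:  percentage of honest respondents incorrectly flagged.
    """
    hit_rate = 100 * np.mean(faker_scores >= cut)
    fp_rate = 100 * np.mean(honest_scores >= cut)
    return hit_rate, fp_rate

# Hypothetical index scores: fakers assumed to score higher on average
rng = np.random.default_rng(0)
fakers = rng.normal(1.0, 1.0, size=550)
honest = rng.normal(0.0, 1.0, size=324)

for cut in (0.5, 1.0, 1.5, 2.0):
    h, f = hit_and_fp_rates(fakers, honest, cut)
    print(f"cut = {cut:3.1f}: hit rate {h:5.1f}%, FP rate {f:5.1f}%")
```

Sweeping the cut score and recording each (FP rate, hit rate) pair traces out the trade-off described in the abstract: one detection index can dominate at low FP rates while another does better at higher FP rates.
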
However, the generalizability of results based on a regression equation is difficult to determine when the two samples also differ in composition. The high hit rate at such a low FP rate may suggest that the regression equation also differentiated between college students and others, a feature that aided differentiation of fakers in this context but that would be irrelevant in other situations. Thus, the effectiveness of social desirability scales in detecting faking in an organizational context is still somewhat unclear.

Detecting Other Kinds of Unusual Responses

There has also been research developing scales designed to detect response patterns that appear to be random. The Variable Response Inconsistency scale (VRIN; Butcher et al., 1989) was designed for detecting inconsistent responses on the MMPI. Wetter, Baer, Berry, Smith, & Larsen (1988) administered the MMPI under four experimental conditions. Respondents were instructed to either answer honestly, simulate a moderate psychological disturbance, simulate a severe psychological disturbance, or answer randomly. In the random response condition, respondents filled out the answer sheet without access to the questionnaire items.

Although VRIN identified individuals in the random response condition, there were no mean differences between individuals in the simulated psychological disturbance conditions and the honest condition. Thus, this scale may be useful for detecting individuals who are responding to items in a manner that suggests lack of comprehension, misgridding (e.g., an individual who misgrids an optical scanning sheet by answering Item 10 in the Item 11 blank and continues answering in the wrong blanks), or idiosyncratic personality trait structures (e.g., see Reise & Waller, 1993; Waller & Reise, 1992). VRIN seems, however, to have little power to detect intentional faking.

Another strategy used to detect distorted responses is based on response latencies. It has been hypothesized that respondents who distort their responses take a longer time to respond to individual items presented on a computer (Hsu, Santelli, & Hsu, 1989). This strategy may have limited practical value because respondents with low dexterity or unfamiliarity with computers may have a higher rate of being classified as fakers. Little research has been conducted using this detection technique.

External procedures for detecting falsified responses use information for the classification decision (i.e., faker vs. honest) that is distinct from the information used for the substantive classification (e.g., high vs. low self-esteem). Research on the success of external techniques has been mixed, prompting a search for better techniques. In addition, with external techniques there is the possibility that respondents could be given sophisticated training or coaching to thwart such aberrance classification. Meehl & Hathaway (1946) stated, "One may conclude that the intent to deceive is not often detectable by [MMPI Scale] L when the subjects are relatively normal and sophisticated" (p. 538).

Appropriateness Measurement

An internal aberrance detection technique simply uses the information contained in the substantive scale item responses to detect respondents who distort their responses. An internal technique that has developed from item response theory (IRT) may be useful in addressing the problem of faking on personality tests. Appropriateness measurement (one of a number of procedures for determining person fit) is a technique introduced by Levine & Rubin (1979) to identify mismeasured individuals on a test or scale that provides adequate measurement for a large majority of individuals. For instance, an individual who misgrids an optical scanning sheet will present a confusing pattern of responses with little obvious psychological meaning. Another example would be an examinee who copies a small number of answers from a high-ability neighbor when a test administrator leaves the examination room. In the personality testing domain, a respondent who answers verifiable items in an honest manner but answers transparent items in a socially desirable manner will present a seemingly inconsistent pattern of responses that may be possible to identify using appropriateness measurement.

Appropriateness measurement quantifies the difference between an examinee's observed pattern of item responses and the responses expected on the basis of that person's standing on the latent trait θ and a set of item response functions (IRFs), as specified by some IRT model. IRFs are functions that relate θ to the probability of affirming an item. An examinee whose pattern of responses greatly differs from the expected pattern will have an extreme appropriateness index.

Levine & Drasgow (1988) developed an approach to optimal statistical analysis for appropriateness measurement. Optimal indexes provide the most powerful statistics for detecting aberrant responses. Based on the Neyman-Pearson Lemma, a most powerful statistic uses a likelihood ratio test consisting of the likelihood of a response pattern under a model for aberrant responding and the likelihood of the response pattern under a model for nonaberrant responding. Thus,

LR(u) = P_a(u) / P_n(u),

where P_a(u) is the likelihood of an observed response pattern u given a certain model of aberrant responding, and P_n(u) refers to the corresponding likelihood for the nonaberrant model. Models for nonaberrant and aberrant responding should be determined by the characteristics of the test and the nature of the individuals who complete the test or scale.

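For concreteness, here is a minimal sketch (illustrative only, not the authors' implementation) of LR(u) for dichotomous items. The nonaberrant model is a two-parameter logistic IRT model (described under Models below), with its likelihood marginalized over a standard normal trait distribution; the aberrant model is, purely for the sake of example, blind random responding (a faking-good model would be specified differently):

```python
import numpy as np

def irf_2pl(theta, a, b):
    """Two-parameter logistic IRF: P(u = 1 | theta)."""
    return 1.0 / (1.0 + np.exp(-1.7 * a * (theta - b)))

def marginal_likelihood(u, a, b):
    """P_n(u): likelihood of pattern u under the nonaberrant 2PLM,
    marginalized over a standard normal theta via a simple grid."""
    theta = np.linspace(-4.0, 4.0, 81)
    weights = np.exp(-0.5 * theta**2)
    weights /= weights.sum()
    p = irf_2pl(theta[:, None], a, b)                  # grid points x items
    pattern_like = np.prod(np.where(u == 1, p, 1.0 - p), axis=1)
    return float(np.sum(weights * pattern_like))

def lr_index(u, a, b):
    """LR(u) = P_a(u) / P_n(u), with the aberrant model taken to be
    random responding, P(u = 1) = .5 per item -- an assumption made
    only for this illustration."""
    p_aberrant = 0.5 ** len(u)
    return p_aberrant / marginal_likelihood(u, a, b)

# Hypothetical item parameters and one observed response pattern
a = np.array([1.2, 0.8, 1.5, 1.0, 0.9])
b = np.array([-1.0, 0.0, 0.5, 1.0, -0.5])
u = np.array([1, 0, 1, 1, 0])
print(lr_index(u, a, b))   # larger values flag the pattern as more aberrant
```

Classifying a respondent as aberrant then amounts to comparing LR(u) with a cut score chosen to hold the FP rate at an acceptable level.
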
Purpose

The objective of this study was to examine the effectiveness of IRT appropriateness measurement techniques in detecting respondents who were faking on a personality inventory. Previous work using appropriateness measurement has generally been limited to simulation data because of large sample size requirements and the inherent difficulty of gaining access to an identifiable set of aberrant response patterns. In a noted exception, Reise & Waller (1993) used a practical (i.e., nonoptimal) appropriateness statistic, l_z (Levine & Drasgow, 1982), to identify individuals with seemingly idiosyncratic response patterns on a personality questionnaire. Reise and Waller, however, were not able to judge aberrance classification accuracy because their dataset did not have independently identifiable aberrant response patterns. In this study, an Army dataset that provided clearly delineated nonaberrant and aberrant samples, each with an adequate sample size, was analyzed. Consequently, the effectiveness of the appropriateness indexes in correctly classifying honest and faking good respondents could be directly tested. Moreover, the IRT approach was compared to a traditional approach to detecting faking good because a social desirability scale was included in the inventory.

Method

ABLE Dataset

The United States Army constructed a large personality inventory as part of its Project A (Peterson, Hough, Dunnette, Rosse, Houston, Toquam, & Wing, 1990). The Assessment of Biographical and Life Events (ABLE) consists of 11 content scales that measure separate personality or temperamental constructs. The ABLE, developed with a factor-analytic approach, was designed to predict attrition in the first-term enlistment of new Army recruits. The ABLE includes a social desirability scale (SOD; 13 items), designed to detect respondents who answer questions in a socially desirable fashion (i.e., faking good). Respondents who chose the most socially desirable option received a score of 1 for that item; all other options were given a score of 0. Respondents with high scores on this scale might be asked to verify their answers.

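Read literally, that scoring rule is a simple sum of per-item indicators. A minimal sketch follows (the option key below is invented; the actual ABLE key is not reproduced here):

```python
# Hypothetical SOD scoring: 1 when the respondent picks the most
# socially desirable option for an item, 0 otherwise, summed over
# the 13 SOD items.  The key below is made up for illustration.
MOST_DESIRABLE_OPTION = [2, 0, 1, 2, 2, 0, 1, 2, 0, 1, 2, 2, 1]

def sod_score(responses):
    """responses: chosen option index (0, 1, or 2) for each SOD item."""
    return sum(int(choice == key)
               for choice, key in zip(responses, MOST_DESIRABLE_OPTION))

print(sod_score([2, 0, 1, 0, 2, 1, 1, 2, 0, 1, 2, 2, 1]))  # -> 11
```
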
Six substantive scales were selected from the ABLE for analysis in the present study: Emotional Stability (ES; 17 items), Cooperativeness (COOP; 18 items), Nondelinquency (NOND; 20 items), Work Orientation (WO; 19 items), Internal Control (IC; 16 items), and Energy Level (EL; 21 items).

Datasets

Two datasets from a large-scale Army research project were made available for this research (White et al., 1993). The first dataset, hereafter called the validation dataset, consisted of 48,725 respondents who were administered the ABLE inventory by paper-and-pencil upon entrance to the U.S. Army. Respondents were told that personnel decisions (e.g., promotion or dismissal) would not be based on ABLE scores.

The second set of respondents (N = 1,987) took part in an experimental study with several conditions: respondents were instructed either to answer in a fashion that would make them "look good" or to answer honestly. All examinees in this experiment were informed that their responses would not be used in future personnel decisions. N = 324 respondents were asked to answer all questions honestly (the honest condition). Two fake good conditions were investigated. In one condition, respondents (N = 550) were asked simply to present themselves in a "good light" (the adlib faking condition). In the second condition, respondents (N = 550) were asked to present themselves in a "good light" and then were coached on how to respond to items in a fashion that would do so (the coached faking condition). The coaching consisted of feedback on three practice items.

Models

Recent research has examined the use of IRT models, developed in the context of ability testing, for personality assessment (e.g., see Drasgow & Hulin, 1990; Muraki, 1990; Reise & Waller, 1990, 1993; Waller & Reise, 1992). The two-parameter logistic model (2PLM) has been used extensively on personality and attitude scales because of its simplicity and attractive properties for this type of data. This model has been demonstrated to provide reasonable fit to personality data (Reise & Waller, 1990).

The 2PLM is a model for dichotomously scored responses; it incorporates an item response function (IRF), which denotes the probability of selecting the positively keyed option given θ. The 2PLM has the form

P(u_i = 1 | θ) = 1 / (1 + exp[-1.7 a_i (θ - b_i)]),

where a_i is the discrimination parameter for item i, i = 1, ..., n; b_i is the location parameter for item i; u_i is the response of the person with trait level θ to item i; and 1.7 is a scaling constant.

Because the 2PLM is for dichotomous responses, if there are more than two answer options for an item, there must be an artificial dichotomization so that one or more options are recoded to be the single positive category and all remaining options are recoded to a single negative category. Because the ABLE has three options, the first two options were negatively keyed (i.e., scored 0) and the third option was positively keyed (i.e., scored 1).

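For concreteness, a minimal sketch of the 2PLM IRF and this recoding (the parameter values are invented for illustration; the article's estimated parameters are not reproduced here):

```python
import numpy as np

def p_2pl(theta, a, b):
    """2PLM IRF: probability of the positively keyed response given theta,
    with discrimination a, location b, and the 1.7 scaling constant."""
    return 1.0 / (1.0 + np.exp(-1.7 * a * (theta - b)))

def dichotomize_able(raw_options):
    """Recode three-option ABLE responses (coded 0, 1, 2) as described
    above: options 0 and 1 become 0 (negatively keyed), option 2
    becomes 1 (positively keyed)."""
    return (np.asarray(raw_options) == 2).astype(int)

print(dichotomize_able([0, 1, 2, 2, 1]))   # -> [0 0 1 1 0]
print(p_2pl(theta=0.0, a=1.2, b=-0.5))     # ~= 0.73 for these made-up values
```
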
Although the 2PLM has the advantage of simplicity, some important information may be lost when polytomous responses are dichotomized. Therefore, the data were also analyzed with polytomous IRT models.

References

Barrick, M. R., & Mount, M. K. (1991). The Big Five personality dimensions and job performance: A meta-analysis.
Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories.
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores.