Content Analysis in Mass Communication: Assessment and Reporting of Intercoder Reliability
Citations
3,381 citations
Cites background from "Content Analysis in Mass Communicat..."
...This complexity, combined with the lack of consensus among communication researchers on which measures are appropriate, led Lombard, Snyder-Duch, and Bracken (2002, 2004) to call for a reliability standard that can span the variable nature of available data....
[...]
2,101 citations
Cites background or methods from "Content Analysis in Mass Communicat..."
...…(2002, p. 163) who merely quotes a concern expressed elsewhere about the appropriateness of using different coders for coding different but overlapping sets of units, Lombard et al. (2002) make it a point of recommending against this attractive possibility (p. 602) – without justification, however....
[...]
...In a recent paper published in a special issue of Human Communication Research devoted to methodological topics (Vol. 28, No. 4), Lombard, Snyder-Duch, and Bracken (2002) presented their findings of how reliability was treated in 200 content analyses indexed in Communication Abstracts between 1994…...
[...]
...In a recent article published in this journal, Lombard, Snyder-Duch, and Bracken (2002) surveyed 200 content analyses for their reporting of reliability tests; compared the virtues and drawbacks of five popular reliability measures; and proposed guidelines and standards for their use....
[...]
...This highly undesirable property benefits coders who disagree on these margins over those who agree, and it clearly contradicts what its proponents (Cohen, 1960; Fleiss, 1975) argued and what Lombard et al. (2002) have found to be the dominant opinion in the literature....
[...]
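The property criticized in the excerpt above concerns how kappa estimates chance agreement from each coder's own marginal distribution. The following is a minimal two-coder sketch (illustrative data and function names, not taken from the article) showing where the marginals enter the calculation:

```python
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders over the same units (nominal data).

    Chance agreement is built from each coder's *own* marginal distribution,
    which is the property criticized in the excerpt above: for the same
    observed agreement, coders whose marginals diverge get a smaller chance
    term and therefore a larger kappa.
    """
    n = len(codes_a)
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    pa, pb = Counter(codes_a), Counter(codes_b)
    expected = sum((pa[c] / n) * (pb[c] / n) for c in set(pa) | set(pb))
    return (observed - expected) / (1 - expected)

# Hypothetical coders labelling 10 units into categories "x"/"y".
coder1 = ["x", "x", "x", "x", "x", "x", "y", "y", "y", "y"]
coder2 = ["x", "x", "x", "x", "y", "y", "y", "y", "y", "y"]
print(round(cohens_kappa(coder1, coder2), 3))  # 0.615
```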
...As already mentioned, Lombard et al. (2002) applied the following criterion for accepting content analysis findings as sufficiently reliable: ≥ .70, otherwise %-agreement ≥ .90 (p. 596)....
[...]
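The criterion quoted above can be read as a two-tier decision rule. A hedged sketch follows; the helper is hypothetical and only the two cutoffs come from the excerpt:

```python
def acceptably_reliable(alpha=None, percent_agreement=None):
    """Two-tier acceptance rule paraphrasing the excerpt above: accept if a
    chance-corrected coefficient reaches .70; otherwise fall back on raw
    percent agreement and require at least .90. Hypothetical helper."""
    if alpha is not None:
        return alpha >= 0.70
    if percent_agreement is not None:
        return percent_agreement >= 0.90
    raise ValueError("supply alpha or percent_agreement")

print(acceptably_reliable(alpha=0.72))              # True
print(acceptably_reliable(percent_agreement=0.85))  # False: below the .90 bar
```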
1,668 citations
1,058 citations
Cites background from "Content Analysis in Mass Communicat..."
...Next, human coders are unleashed on the data, and the numerical estimates for each document are compared across coders (Lombard et al., 2006; Artstein and Poesio, 2008)....
[...]
934 citations
Cites background or methods from "Content Analysis in Mass Communicat..."
...It can accommodate any number of coders, but it has a major weakness: it fails to account for agreement by chance (Lombard et al., 2002; Neuendorf, 2002)....
[...]
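Percent agreement extends to any number of coders, for instance by averaging over coder pairs, but nothing in the calculation corrects for chance. A minimal sketch with hypothetical data:

```python
from itertools import combinations

def pairwise_percent_agreement(codes_by_coder):
    """Average pairwise percent agreement for any number of coders.

    `codes_by_coder` holds one equally long list of codes per coder. Note
    that no term corrects for agreement expected by chance, which is the
    weakness noted in the excerpt above.
    """
    per_pair = [
        sum(a == b for a, b in zip(c1, c2)) / len(c1)
        for c1, c2 in combinations(codes_by_coder, 2)
    ]
    return sum(per_pair) / len(per_pair)

codes = [
    ["a", "a", "b", "c", "b"],  # coder 1
    ["a", "b", "b", "c", "b"],  # coder 2
    ["a", "a", "b", "c", "c"],  # coder 3
]
print(round(pairwise_percent_agreement(codes), 3))  # 0.733
```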
...Percent agreement is considered an overly liberal index by some researchers, and the indices which do account for chance agreement, such as Krippendorff's alpha, are considered overly conservative and often too restrictive (Lombard et al., 2002; Rourke et al., 2001)....
[...]
...When it is calculated across a set of variables, it is not considered a good measure because it can veil variables with unacceptably low levels of reliability (Lombard et al., 2002)....
[...]
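One way to see why a single figure computed across variables misleads is to report reliability per variable alongside the pooled average. A small sketch with hypothetical two-coder data:

```python
def percent_agreement(codes_a, codes_b):
    return sum(a == b for a, b in zip(codes_a, codes_b)) / len(codes_a)

# Hypothetical two-coder codes for three variables on the same 10 units.
data = {
    "source_type": (["g"] * 10, ["g"] * 9 + ["n"]),     # 0.9
    "tone":        (["+", "-"] * 5, ["+", "-"] * 5),     # 1.0
    "frame":       (["a"] * 10, ["a"] * 4 + ["b"] * 6),  # 0.4
}

per_variable = {v: percent_agreement(a, b) for v, (a, b) in data.items()}
pooled = sum(per_variable.values()) / len(per_variable)

print(per_variable)      # {'source_type': 0.9, 'tone': 1.0, 'frame': 0.4}
print(round(pooled, 2))  # 0.77: looks respectable but veils 'frame'
```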
...Following Lombard and colleagues (Lombard et al., 2002), the “biggest drawback to its use has been its complexity and the resulting difficulty of by hand calculations, especially for interval and ratio level variables”....
[...]
...Krippendorff's alpha takes into account the magnitude of the misses, adjusting for whether the variable is measured as nominal, ordinal, interval, or ratio (Krippendorff, 1980; Lombard et al., 2002; Neuendorf, 2002)....
[...]
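The sensitivity to the magnitude of misses comes from the difference (delta) function that Krippendorff's alpha uses, which changes with the level of measurement. Below is a simplified two-coder, no-missing-data sketch of the alpha = 1 − Do/De form with hypothetical ratings; it is not the full algorithm, which also handles many coders and missing values:

```python
def nominal_delta(a, b):
    """All mismatches count equally."""
    return 0.0 if a == b else 1.0

def interval_delta(a, b):
    """Mismatches count by squared distance, so near-misses weigh less."""
    return (a - b) ** 2

def krippendorff_alpha(coder1, coder2, delta):
    """Simplified Krippendorff's alpha for two coders with no missing data:
    alpha = 1 - D_observed / D_expected, parameterized by a delta function."""
    n = len(coder1)
    # Observed disagreement: average mismatch within each unit's pair of values.
    d_obs = sum(delta(a, b) for a, b in zip(coder1, coder2)) / n
    # Expected disagreement: average mismatch over all pairs of values pooled
    # across coders and units (drawn without replacement).
    pooled = list(coder1) + list(coder2)
    m = len(pooled)
    d_exp = sum(delta(pooled[i], pooled[j])
                for i in range(m) for j in range(m) if i != j) / (m * (m - 1))
    return 1.0 - d_obs / d_exp

ratings_1 = [1, 2, 3, 3, 2, 1, 4, 1, 2, 5]
ratings_2 = [1, 2, 3, 3, 2, 2, 4, 1, 2, 5]
print(round(krippendorff_alpha(ratings_1, ratings_2, nominal_delta), 3))
print(round(krippendorff_alpha(ratings_1, ratings_2, interval_delta), 3))
```

Swapping nominal_delta for interval_delta (or an ordinal or ratio variant) is all that changes between levels of measurement.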
References
34,965 citations
"Content Analysis in Mass Communicat..." refers background or methods in this paper
...Percent agreement, Scott’s pi, Cohen’s kappa, and Krippendorff’s alpha were all used to assess intercoder reliability for each variable coded. A beta version of the software package PRAM (Program for Reliability Assessment with Multiple-coders, Skymeg Software, 2002) was used to calculate the first three of these. A beta version of a separate program, Krippendorff’s Alpha 3.12, was used to calculate the fourth. Holsti’s (1969) method was not calculated because, in the case of two coders who evaluate the same reliability sample, the results are identical to those for percent agreement....
[...]
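Holsti's coefficient is CR = 2M / (N1 + N2), where M is the number of decisions the two coders agree on and N1 and N2 are the numbers of decisions each coder made. When both coders evaluate the same reliability sample, N1 = N2 = N and the formula collapses to M / N, i.e. plain percent agreement, which is why it was not computed separately in the study quoted above. A short sketch with hypothetical counts:

```python
def holsti_cr(agreements, n_coder1, n_coder2):
    """Holsti's coefficient of reliability: CR = 2M / (N1 + N2)."""
    return 2 * agreements / (n_coder1 + n_coder2)

# Same reliability sample for both coders: N1 == N2 == 50, M == 45,
# so CR = 90 / 100 = 0.9, identical to percent agreement (45 / 50).
print(holsti_cr(agreements=45, n_coder1=50, n_coder2=50))  # 0.9
```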
...is the simple percentage of agreement”; they call Cohen’s kappa “the most widely used measure of interjudge reliability across the behavioral science literature” (p. 137). Hughes and Garrett (1990) coded 68 articles in Journal of Marketing Research, Journal of Marketing, and Journal of Consumer Research during 1984– 1987 that contained reports of intercoder reliability and found 65% used percent agreement. Kolbe and Burnett (1991) coded 128 articles from consumer behavior research in 28 journals, three proceedings and one anthology between 1978 and 1989. Most of the authors were in marketing departments (only 12.2% were from communication, advertising, and journalism schools or departments). Percent agreement was reported in 32% of the studies, followed by Krippendorff’s alpha (7%), and Holsti’s method (4%); often the calculation method wasn’t specified, and in 31% of the articles no reliability was reported. Also, 36% of the studies reported only an overall reliability, which can hide variables with unacceptably low agreement. Consistent with these findings, Kang et al. (1993) reviewed the 22 articles published in the Journal of Advertising between 1981 and 1990 that employed content analysis and found that 78% “used percentage agreement or some other inappropriate measure” (p. 18). Pasadeos, Huhman, Standley, and Wilson (1995) coded 163 content analyses of news-media messages in four journals (Journalism & Mass Communication Quarterly, Newspaper Research Journal, Journal of Broadcasting and Electronic Media, and Journal of Communication) for the 6-year period of 1988–1993. They wrote that “we were not able to ascertain who specifically had done the coding in approximately 55% of the studies; a similar number had not reported on whether coding was done independently or by consensus; and more than 80% made no mention of coder training” (p. 8). In their study 51% of the articles did not address reliability at all, 31% used percent agreement, 10% used Scott’s pi, and 6% used Holsti’s method. Only 19% gave reliability figures for all variables while 20% gave only an overall figure. In a study of content analyses published in Journalism & Mass Communication Quarterly between 1971 and 1995, Riffe and Freitag (1997) found that out of 486 articles, only 56% reported intercoder reliability and of those most only reported an overall figure, while only 10% “explicitly specified random sampling in reliability tests” (p....
[...]
25,749 citations
Additional excerpts
...Again, there are no established standards, but Neuendorf (2002) reviews “rules of thumb” set out by several methodologists (including Banerjee, Capozzoli, McSweeney, & Sinha, 1999; Ellis, 1994; Frey, Botan, & Kreps, 2000; Krippendorff, 1980; Popping, 1988; and Riffe, Lacy, & Fico, 1998) and concludes that “coefficients of .90 or greater would be acceptable to all,…...
[...]
7,877 citations
"Content Analysis in Mass Communicat..." refers background or methods in this paper
...), but this technique has also been questioned (Neuendorf, 2002). With the coding data in hand, the researcher calculates and reports one or more indices of reliability. Popping (1988) identified 39 different “agreement indices” for coding nominal categories, which excludes several techniques for ratio and interval level data, but only a handful of techniques are widely used....
[...]
...The result is often calculated not for a single variable but across a set of variables, a very poor practice which can hide variables with unacceptably low levels of reliability (Kolbe & Burnett, 1991; Neuendorf, 2002)....
[...]
...This index also does not account for differences in how the individual coders distribute their values across the coding categories, a potential source of systematic bias; that is, it assumes the coders have distributed their values across the categories identically and if this is not the case, the formula fails to account for the reduced agreement (Craig, 1981; Hughes & Garrett, 1990; Neuendorf, 2002)....
[...]
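The assumption described above characterizes Scott's pi, whose chance term is computed from a single distribution pooled across both coders. A minimal sketch with hypothetical data:

```python
from collections import Counter

def scotts_pi(codes_a, codes_b):
    """Scott's pi for two coders (nominal data).

    Chance agreement uses one distribution *pooled* over both coders, so the
    index behaves as if the coders used the categories with identical
    frequencies; a systematic difference between their actual marginals is
    not reflected in the chance term, per the excerpt above.
    """
    n = len(codes_a)
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    pooled = Counter(codes_a) + Counter(codes_b)
    expected = sum((count / (2 * n)) ** 2 for count in pooled.values())
    return (observed - expected) / (1 - expected)

coder1 = ["x", "x", "x", "x", "x", "x", "y", "y", "y", "y"]
coder2 = ["x", "x", "x", "x", "y", "y", "y", "y", "y", "y"]
print(round(scotts_pi(coder1, coder2), 3))  # 0.6
```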
...In some cases the coders evaluate different but overlapping units (e.g., coder 1 codes units 1–20, coder 2 codes units 11–30, etc.), but this technique has also been questioned (Neuendorf, 2002)....
[...]
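A design with different but overlapping units amounts to a units-by-coders matrix with missing cells; only coefficients that tolerate missing data (Krippendorff's alpha among them) can use the rows judged by a single coder. A hypothetical layout:

```python
# Hypothetical reliability data layout for the overlapping design in the
# excerpt above: coder 1 codes units 1-20, coder 2 codes units 11-30.
# None marks "not coded by this coder"; the code values are placeholders.
matrix = {
    u: {
        "coder1": f"code_{u}" if u <= 20 else None,
        "coder2": f"code_{u}" if u >= 11 else None,
    }
    for u in range(1, 31)
}

pairable = [u for u, row in matrix.items() if None not in row.values()]
print(pairable)  # only units 11-20 carry a judgement from both coders
```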
7,604 citations
"Content Analysis in Mass Communicat..." refers methods in this paper
...Cohen (1968) proposed a weighted kappa to account for different types of disagreements; however, as with the other indices discussed so far, this measure is generally used only for nominal level variables....
[...]
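Weighted kappa lets the analyst supply a disagreement weight for every pair of categories so near-misses count less than distant ones. A simplified two-coder sketch with hypothetical ordered categories and linear weights:

```python
from collections import Counter

def weighted_kappa(codes_a, codes_b, categories, weight):
    """Cohen's weighted kappa for two coders (simplified sketch).

    `weight(c, d)` is a disagreement weight in [0, 1] for the pair of
    categories c and d: 0 for full agreement, 1 for the worst disagreement.
    """
    n = len(codes_a)
    pa, pb = Counter(codes_a), Counter(codes_b)
    # Observed weighted disagreement.
    d_obs = sum(weight(a, b) for a, b in zip(codes_a, codes_b)) / n
    # Weighted disagreement expected from the two coders' marginals.
    d_exp = sum((pa[c] / n) * (pb[d] / n) * weight(c, d)
                for c in categories for d in categories)
    return 1.0 - d_obs / d_exp

cats = [1, 2, 3, 4]  # ordered categories

def linear_weight(c, d):
    return abs(c - d) / (len(cats) - 1)

coder1 = [1, 2, 2, 3, 4, 4, 1, 3, 2, 4]
coder2 = [1, 2, 3, 3, 4, 3, 1, 3, 2, 4]
print(round(weighted_kappa(coder1, coder2, cats, linear_weight), 3))
```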
7,318 citations
"Content Analysis in Mass Communicat..." refers background in this paper
...The index has been adapted for multiple coders and cases in which different coders evaluate different units (Fleiss, 1971)....
[...]
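Fleiss' (1971) adaptation keeps the chance-corrected logic but requires only that each unit receive the same number of ratings, not that the same individuals rate every unit. A minimal sketch with hypothetical ratings:

```python
from collections import Counter

def fleiss_kappa(ratings_per_unit):
    """Fleiss' kappa: `ratings_per_unit` is a list of lists, each holding the
    category labels assigned to one unit (same number of ratings per unit;
    the raters need not be the same individuals across units)."""
    n_units = len(ratings_per_unit)
    m = len(ratings_per_unit[0])  # ratings per unit
    totals = Counter(c for unit in ratings_per_unit for c in unit)
    p_cat = {c: count / (n_units * m) for c, count in totals.items()}

    def unit_agreement(unit):
        # Proportion of agreeing rater pairs within one unit.
        counts = Counter(unit)
        return sum(k * (k - 1) for k in counts.values()) / (m * (m - 1))

    p_obs = sum(unit_agreement(u) for u in ratings_per_unit) / n_units
    p_exp = sum(p * p for p in p_cat.values())
    return (p_obs - p_exp) / (1 - p_exp)

ratings = [
    ["a", "a", "b"],
    ["a", "a", "a"],
    ["b", "b", "a"],
    ["b", "b", "b"],
    ["a", "b", "b"],
]
print(round(fleiss_kappa(ratings), 3))  # approx. 0.196
```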