
Automated image-based tracking and its application in ecology

Anthony I. Dell 1, John A. Bender 2, Kristin Branson 3, Iain D. Couzin 4, Gonzalo G. de Polavieja 5, Lucas P.J.J. Noldus 6, Alfonso Pérez-Escudero 5, Pietro Perona 7, Andrew D. Straw 8, Martin Wikelski 9,10, and Ulrich Brose 1

1 Systemic Conservation Biology, Department of Biology, Georg-August University Göttingen, Göttingen, Germany
2 HasOffers Inc., 2220 Western Ave, Seattle, WA, USA
3 Howard Hughes Medical Institute, Janelia Farm Research Campus, Ashburn, VA, USA
4 Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, USA
5 Instituto Cajal, CSIC, Av. Doctor Arce, 37, Madrid, Spain
6 Noldus Information Technology BV, Nieuwe Kanaal 5, 6709 PA Wageningen, The Netherlands
7 Computation and Neural Systems Program, California Institute of Technology, Pasadena, CA, USA
8 Research Institute of Molecular Pathology (IMP), Vienna, Austria
9 Max Planck Institute for Ornithology, Radolfzell, Germany
10 Biology Department, University of Konstanz, Konstanz, Germany
The behavior of individuals determines the strength and outcome of ecological interactions, which drive population, community, and ecosystem organization. Bio-logging, such as telemetry and animal-borne imaging, provides essential individual viewpoints, tracks, and life histories, but requires capture of individuals and is often impractical to scale. Recent developments in automated image-based tracking offer opportunities to remotely quantify and understand individual behavior at scales and resolutions not previously possible, providing an essential supplement to other tracking methodologies in ecology. Automated image-based tracking should continue to advance the field of ecology by enabling better understanding of the linkages between individual and higher-level ecological processes, via high-throughput quantitative analysis of complex ecological patterns and processes across scales, including analysis of environmental drivers.
Measuring behavior

Individual behavior (see Glossary) underlies almost all aspects of ecology [1–5]. Accurate and highly resolved behavioral data are therefore critical for obtaining a mechanistic and predictive understanding of ecological systems [5]. Historically, direct observation by trained biologists was used to quantify behavior [6,7]. However, the extent and resolution to which direct observations can be made is highly constrained [8], and the number of individuals that can be observed simultaneously is small. In addition, an exact record of events is not preserved, only the biologist's subjective account of them.

Recent technological advances in tracking now make it possible to collect large amounts of highly precise and accurate behavioral data. For many organisms, equipment can be attached that provides information about the individuals' spatiotemporal position, orientation, and physiology. This 'bio-logging' allows remote reconstruction of behavior over large spatiotemporal extents, providing essential individual viewpoints, tracks, and life histories, and thus important ecological and evolutionary insights [9–11].

Glossary

Background subtraction: a method used by software to compare the current video frame with a stored picture of the background; any pixel of the current frame that is significantly different from the corresponding pixel in the background is likely to be associated with the body of an animal. Useful in situations where the background is unchanging, for example, when the surface of the background is rigid and lighting does not change.

Behavior: the actions of individuals, often in response to stimuli. Behavior can involve movement of the individual's body through space, such as walking or chasing, or can occur while the animal is stationary, such as grooming or eating.

Bio-logging: attachment or implantation of equipment to organisms to provide information about their identity, location, behavior, or physiology (e.g., global positioning systems, accelerometers, video cameras, telemetry tags).

Ecological interaction: any interaction between an organism and its environment, or between two organisms (i.e., including interactions between conspecifics).

Fingerprinting: a method used to identify unmarked individuals using natural variation in their physical and/or behavioral appearance. The method works by transforming the images of each individual into a characteristic 'fingerprint', which can then be used to distinguish individual organisms both within and across videos.

FPS (frames per second): the number of frames in an image sequence collected per second.

Image: any measurement of the spatiotemporal position or pose of organisms that can be recast into a digital image and analyzed using computer vision techniques (see Box 2).

Machine learning: a set of techniques that allow computer software to learn from empirical data, user assumptions, or manual annotation. These approaches are becoming increasingly common in the analysis of behavior, where users can tag behavior in short sequences of images and the software can predict occurrences of these behaviors throughout the entire image sequence.

Marking: the attachment of artificial 'marks' to organisms to maintain their identity, such as paint or barcodes.

Occlusion: when the view of any individual in an image is disrupted either by another individual or physical habitat (i.e., the occluding object lies in a straight line between the focus individual and the camera).

Pixel: a physical point in a 2D digital image, and therefore the smallest controllable element of a picture represented on the screen. The equivalent of a pixel in 3D space is a voxel.

Pose: any additional geometrical quantity of interest other than the center of the main body of the animal, such as orientation, wing positions, body curvature, etc.

Position: the center of body mass of an individual in time and space.

Resolution: the number of pixels/voxels in a digital image.

0169-5347/ © 2014 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.tree.2014.05.004
Corresponding author: Dell, A.I. (adell@gwdg.de, tonyidell@gmail.com).
Keywords: behavior; bio-logging; ecological interactions; tracking; automated image-based tracking.
Trends in Ecology & Evolution, July 2014, Vol. 29, No. 7
Image-based tracking, for example with video, is another tracking method that shows great potential in ecology. Similar to bio-logging, image-based tracking involves digital recording of data, meaning an objective view of events is maintained, increasing repeatability of studies and allowing biologists to mine data for quantities not originally considered. Image-based tracking can be used when individuals are too small to attach bio-loggers, or if the equipment itself changes behavior, and all visible and sufficiently resolved individuals within the imaged area can be tracked, not just those with loggers attached. Also, image-based tracking generally allows for higher spatiotemporal resolution of behavioral data than bio-logging, and many imaging methods allow extraction of quantitative information about the environment, such as its temperature or topography.

Currently, constraints on the acquisition, processing, and storage of digital information limit the spatiotemporal extent of image-based tracking, and extracting the position and pose of every individual in each image is difficult in complex habitat and at high densities. Nonetheless, constraints are rapidly being overcome, and image-based tracking now provides a valuable tool to undertake rigorous hypothesis-driven research in ecology (Box 1). Here we review the state-of-the-art of image-based tracking, its strengths and limitations when applied to ecological research, and its application to solve relevant ecological questions.
Automated image-based tracking

Initial applications of image-based tracking required manual analysis [12,13], which is effort intensive, often leads to poor spatiotemporal resolution, and is open to observer effects such as subjective decisions about which information to record. Recent advances in automation are overcoming these issues [14–16], and there now exist several image-based systems capable of extracting individual behavior with minimal or zero manual intervention (Table S1 in the supplementary material online). Tracking over ecologically relevant spatiotemporal scales is becoming easier, owing to advances in imaging and computing technologies, and by the development of software that can track in real time [17–19] and recognize individuals across image sequences [20,21]. Biologists now employ a wide range of imaging methods (e.g., near infrared, thermal infrared, sonar, 3D) that permit tracking in environments where optical video is unsuitable (Box 2).

To date, automated image-based tracking has primarily been undertaken in the laboratory, where biologists have examined genetic and physiochemical drivers of behavior in model species (Table S1 in the supplementary material online) (Box 1). However, the past decade has seen expansion of these methods into the field, and automated image-based tracking has now been undertaken on a wide diversity of species, including plants, worms, spiders, insects, fish, birds, mammals, and more (Table S1 in the supplementary material online).

Automated image-based tracking involves three main steps (Figure 1): (i) acquisition of image sequences (Box 2); (ii) detection of individuals and their pose in each image and appropriate 'linking' of detections in consecutive images to create trajectories through time (Box 3); and (iii) analysis of behavioral data (Box 4).
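Step (iii) often begins with simple kinematic quantities derived from trajectories, such as the per-frame body velocity plotted in Figure 1. A minimal sketch of such a computation (the function name and toy trajectory are illustrative, not from the article):

```python
import numpy as np

def body_velocity(positions, fps):
    """Estimate per-frame body velocity (units/s) from a trajectory.

    positions: (n, 2) sequence of x, y centroids, one row per frame.
    fps: frames per second of the image sequence.
    """
    positions = np.asarray(positions, dtype=float)
    # Displacement between consecutive frames, scaled to units per second.
    step = np.diff(positions, axis=0)           # (n-1, 2)
    return np.linalg.norm(step, axis=1) * fps   # (n-1,)

# A stationary-then-moving toy trajectory sampled at 25 FPS.
track = [(0, 0), (0, 0), (1, 0), (2, 0)]
print(body_velocity(track, fps=25))  # per-frame speeds: 0, 25, 25 units/s
```

From such per-frame speeds one can build the frequency histograms and behavioral categorizations sketched in Figure 1, for example by thresholding speed to separate stationary from moving behaviors.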
Real-time tracking is performed as images are acquired, removing the need for storing large amounts of digital information [17–19] and allowing researchers to influence the animal's environment in real time through virtual reality, robotics, or other dynamical stimulus regimes [22–24]. Even under controlled laboratory conditions with small numbers of individuals, automated image-based tracking is a difficult computer vision problem. Biological organisms are highly deformable objects which behave in unconstrained and variable ways [25], and the environments within which they exist are complex and dynamic. Ultimately, in automated image-based tracking there is a trade-off between the difficulty of the tracking problem (horizontal axis in Figure 2) and the quality of tracking output (vertical axis in Figure 2).

Box 1. Ecological insights from automated image-based tracking

We see three key areas where considerable intellectual progress has been made in ecology using automated image-based tracking. First, the kinematics of animal behavior [17–19,23,24,34,42,57,66,69,70,74,76–81], including the role of the internal state of animals, such as their physiology or genes, and the external environment. Recent breakthroughs in remote quantification of physical landscapes [58–60] and 3D imaging [29] should be especially helpful for these questions. Second, collective behavior in animal groups [1,26,33,38,40,43,45,62,82,83], including understanding how information about the physical and biological environment transfers between individuals. Generally, this research centers on intraspecific groups comprising large numbers of similar-sized individuals. Third, determinants of social behavior [8,27–29,31,53,54,67,71,73,84]. Research in this last category usually focuses on a small number of individuals, because identifying the detailed pose required for automated behavioral analysis is difficult in larger groups. Tracking over short durations (minutes) has aided in our understanding of the genetic basis of social behavior, such as aggression or courtship [8,85], where the high throughput that automation allows provides enhanced power for uncovering patterns in behavioral data [27]. Research over longer times can uncover complex temporal linkages between social behaviors [8,28], and experiments over the order of weeks provide unique insight into the social and behavioral development of individuals in intraspecific groups [31,53,54].

Enormous potential exists for automated image-based tracking to address other key issues in ecology. One area in which we expect significant growth is the study of interspecific interactions, which are critical to ecological systems [1–5]. For example, biologists recently used automated analysis of sonar images to reveal how coordinated hunting by predators leads to increased fragmentation and irregularities in the spatial structure of prey groups, and thus inhibition of information transfer among prey [4]. Laboratory research alone provides much scope for experimentally testing basic ideas about ecology, such as the role of body size or predator density in determining trophic interaction strength (Movie S3 in the supplementary material online) (A.I. Dell, unpublished). Image-based tracking can also address more applied questions, such as the role of fragmentation in population dynamics (A.I. Dell, unpublished) or determining the size of animal populations that are historically difficult to measure [52]. Integrating automated tracking techniques into images already collected by trigger-based cameras to assess species occurrence and population abundances [21] would provide important information about the behavior of organisms in natural ecosystems.
Difficulty of the tracking problem

Tracking is easiest in laboratory-based systems with a simple environmental landscape and low numbers of individuals (left panel in Figure 2), and most difficult in the field, where many individuals from many different species interact across a complex environmental landscape (right panel in Figure 2).
From individuals to interactions

Monitoring the behavior of individuals as they interact with each other is difficult for several reasons.
Box 2. Obtaining an image sequence

The first step in automated image-based tracking involves obtaining a machine-readable sequence of images that accurately represents the real world. This translation between the real and digital world is a critical step, and time spent optimizing the image (such as ensuring sufficient contrast between foreground and background) pays substantial dividends during subsequent steps (see Figure 1 in main text). Optical video is commonly used owing to its accessibility and low cost, but other imaging technologies considerably expand the range of environmental contexts within which tracking can be undertaken (Figure I). These include infrared (Figure IA,B), thermal infrared [50] (Figure IC; Movie S7 in the supplementary material online), X-ray microtomography [55] (Figure ID), and sonar [4] (Figure IE; Movie S9 in the supplementary material online). Light-field (Figure IF) and multi-scale gigapixel [86] (Figure IG) imaging should permit tracking and scene reconstruction in 3D from a single image viewpoint. Although frame rates of gigapixel cameras are increasing (S.D. Feller, unpublished), at three frames per minute [86] they are currently too slow for most automated tracking applications. Light-field cameras work at higher frame rates, and several laboratories are exploring whether they can be successfully incorporated into automated tracking systems (I.D. Couzin and G.G. de Polavieja, unpublished).

Ultimately, decisions about which imaging method to use should be determined by the specific needs of the project. Automated tracking generally requires a high-contrast image so that computer vision algorithms can adequately discern organisms and their appendages from the surrounding background (Box 3). A common and low-cost method of obtaining such images is to construct an artificial arena for tracking experiments, which is often colored in contrast with the animals, and brightly and uniformly lit with diffuse lighting (Figure IA,B). Deciding on the spatial and temporal resolution of images is also a key consideration. Higher resolutions generally result in better tracking results and more precise quantification of behavior, but bottlenecks during the transmission, storage, and processing of digital information can limit high temporal resolution to low spatial resolution and/or short durations.
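The scale of this bottleneck is easy to estimate from the quantities defined in the Glossary (resolution and FPS): the uncompressed data rate is simply pixels per frame × frames per second × bytes per pixel. A back-of-the-envelope sketch (the camera parameters are illustrative, not from the article):

```python
def raw_data_rate_mb_s(width, height, fps, bytes_per_pixel=1):
    """Uncompressed data rate of an image stream, in decimal MB/s."""
    return width * height * fps * bytes_per_pixel / 1e6

# An 8-bit monochrome 2048x2048 sensor recording at 100 FPS.
rate = raw_data_rate_mb_s(2048, 2048, 100)
print(f"{rate:.0f} MB/s, {rate * 3600 / 1000:.1f} GB per hour")
# → 419 MB/s, 1509.9 GB per hour
```

Numbers of this magnitude explain why codec choice and real-time tracking (which avoids storing raw frames at all) matter so much in practice.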
Constraints on low spatial resolutions can be overcome by integrating output from multiple cameras [18] and should become less important as technology advances. Recording software is another important consideration, such as the choice of codec for encoding and compressing digital data, or ensuring that accurate time stamps are obtained and that frames are not silently dropped; robust open source [87,88] and commercial [Noldus Information Technology, Media Recorder, 2013 (http://www.noldus.com/media-recorder); Norpix, StreamPix, 2013 (http://www.norpix.com/products/streampix/streampix.php)] options are available.
Figure I. A growing number of technologies allow capturing of digital images for automated image-based tracking. (A) The most common is optical or near-infrared video, most often used in simple 2D laboratory settings (left panel in Figure 1) (Movie S1–S4, Movie S5, Movie S10, Movie S11, Movie S14, and Movie S17 in the supplementary material online). (B) Images from multiple cameras allow tracking in 3D, even with some degree of habitat complexity present (Movie S6 and Movie S15 in the supplementary material online). (C) Thermal imaging allows tracking in complete darkness, but requires that tracked animals have a surface temperature different from the surrounding landscape (Movie S7 in the supplementary material online). (D) High-resolution X-ray microtomography permits imaging through complex habitat structure, such as soil (burrowing invertebrate highlighted by red arrow). (E) Acoustic imaging (sonar) can also image in habitats where optical video would be unusable, such as this image of predators foraging for schooling bait fish in a turbid estuary [4] (Movie S9 in the supplementary material online). (F) Light-field cameras allow for post-hoc selection of focal points, thus potentially allowing tracking and construction of the scene in 3D from a single image point. The three panels in (F) were obtained from a single light-field image, each panel representing a different focal point (highlighted by a red arrow). (G) Newly developed gigapixel technologies also permit capturing of images from a single image point with very high spatial resolutions and at multiple scales, again allowing for 3D tracking from a single image point [86]. The three lower panels in (G) are enlarged sections of the main image. See Acknowledgments for credits and permissions.

First, organisms often move rapidly when interacting (Movie S13 in the supplementary material online), requiring data with high spatiotemporal resolution. Second, because multiple individuals are involved, interactions are prone to occlusions, made especially worse because interactions often involve close physical contact. Occlusions cause identity errors, which are not local but propagate throughout the remaining image sequence. Manual corrections of these errors are labor intensive. Customized automated algorithms which predict identity based on the relative speed and direction of movement can reduce mistakes, and thus dramatically reduce the number of manual interventions needed [26,27], but error propagation is still unavoidable because of the stochastic behavior of organisms [15] (Box 3). 'Fingerprinting' somewhat resolves this problem (see below), but maintaining identities always becomes more difficult as the number of close individuals scales with increasing density. Tracking individuals during occlusions is an additional problem and can be partly overcome when prior knowledge about the shape of the organisms is incorporated into the system [26–28]. Recent approaches utilizing multiple 3D depth cameras are especially useful in this regard [29] (Movie S22 in the supplementary material online) and could eventually be integrated with fingerprinting to assist in resolving identities during occlusions.
Most current attempts to track multiple individuals involve organisms that are similar in size and shape (Table S1 in the supplementary material online). In nature, however, interactions between species often involve individuals that differ greatly in size and shape [30] (Movie S13 in the supplementary material online). Although such differences can be useful for distinguishing individuals [8,20], many tracking systems rely on knowledge about the typical shape of individuals to aid in the segmentation and analysis of images [27,28,31]. Even if shape issues are overcome, it remains a difficult task for computer vision algorithms to separate small animals from the body and appendages of larger animals. Algorithm features allowing tracking of differently sized and shaped organisms, such as more sophisticated contour representations or fingerprinting, would greatly enhance the usefulness of image-based tracking to ecologists (Box 5).
Tracking in three dimensions

Automated image-based tracking in 2D environments is substantially more straightforward than in 3D (Figure 2). Therefore, many tracking systems are limited to simple 2D arenas and either involve organisms that naturally move in 2D or quasi-2D, or work by constraining normally 3D individuals to only move in 2D. This latter method can be achieved by modifying organisms directly, such as by wing clipping [27], or by using physical boundaries to constrain behavior to near 2D [1,20,27,32,33] (Movie S1, Movie S4, Movie S5, and Movie S10 in the supplementary material online). In nature, however, most organisms incorporate at least some degree of movement in 3D, which influences ecological interactions [3]. Tracking systems designed for 2D can provide some resolution for behavior in a third spatial dimension [34], but ultimately developers must produce tracking systems that can successfully track large numbers of animals in 3D space (Movie S8 in the supplementary material online).

Tracking unconstrained flying or swimming animals can be achieved in several ways, but most often multiple cameras are employed [18,29,35–45] (Movie S6, Movie S8, and Movie S22 in the supplementary material online). Although only two calibrated cameras taking images of the same point in space are required for triangulation, information from additional cameras can incrementally improve localization, especially if some cameras are limited by occlusion or low contrast [18].
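The triangulation step itself is standard multi-view geometry: given each camera's 3×4 projection matrix and the pixel coordinates of the same point in both views, the 3D position is the least-squares solution of a small linear system (the direct linear transform). A minimal numpy sketch under those assumptions (the toy camera matrices and point are illustrative, not from the article):

```python
import numpy as np

def triangulate(P1, P2, uv1, uv2):
    """Linear (DLT) triangulation of one 3D point from two calibrated views.

    P1, P2: 3x4 camera projection matrices.
    uv1, uv2: (u, v) pixel coordinates of the same point in each image.
    """
    A = np.vstack([
        uv1[0] * P1[2] - P1[0],
        uv1[1] * P1[2] - P1[1],
        uv2[0] * P2[2] - P2[0],
        uv2[1] * P2[2] - P2[1],
    ])
    # The homogeneous solution is the right singular vector of A with
    # the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

# Two toy cameras: one at the origin, one shifted 1 unit along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.2, 4.0])
uv = lambda P, X: (P @ np.append(X, 1.0))[:2] / (P @ np.append(X, 1.0))[2]
print(triangulate(P1, P2, uv(P1, X_true), uv(P2, X_true)))
# recovers approximately [0.5, 0.2, 4.0]
```

With more than two cameras, extra rows are simply appended to A, which is how additional views incrementally improve localization.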
Synchronizing multiple cameras requires additional hardware and more complicated software that relates equivalent objects between image sequences; however, this complexity can be hidden from the user by dedicated multi-camera systems [18]. Triangulation is optimized when cameras are positioned with maximally divergent locations, which in the field can introduce problems because arranging unobstructed cameras at multiple locations can be difficult, as can be obtaining multiple views of every location of interest. Some technologies allow 3D tracking from a single imaging device, which could solve many of these issues. For example, 3D images can be reconstructed from a single image of reflections or shadows on a 3D surface [46,47],
[Figure 1 panel annotations: (i) Imaging — data saved as digital image sequences with defined spatial (pixel) and temporal (FPS) resolutions. (ii) Tracking — tracking software uses computer vision algorithms to isolate individuals (foreground) from the surrounding landscape (background), using methods such as background subtraction (see Box 3); positions and orientations of individuals are then integrated across image sequences to form trajectories through time. (iii) Analysis — analysis of trajectories quantifies individual (e.g., body velocity, turning rates, search strategy) and inter-individual (e.g., attack distance) traits; further analysis, which can be automated if behaviors are stereotyped, can condense these high-dimensional quantities into behavioral categories (e.g., sleeping, mating, foraging, eating, walking).]
Figure 1. The three general steps involved in automated image-based tracking of behavior are: (i) imaging (Box 2); (ii) detection of individuals and their pose in the image and appropriate 'linking' of detections to create separate tracks through time for each individual (Box 3); and (iii) analysis of trajectory and behavioral data (Box 4). To date, imaging is often done in the laboratory (left panel), which can more easily provide a clean, crisp image that minimizes tracking errors. Each of these steps is strongly interlinked, and time spent optimizing one step (e.g., imaging) can pay huge dividends in time and effort saved at later steps (e.g., reducing tracking errors).

Box 3. Identifying individuals and behaviors in images

Once a set of suitable images has been obtained (Box 2), the position of individuals, and often their pose, must be automatically computed to form trajectories through time. First, the software must determine whether and where individuals are present in each image. How easily this is done varies with the type and quality of images (Box 2), as well as how accurately each individual's position can be predicted from its previous behavior (see below). Detection is straightforward when the contrast between individuals and the background is substantial and when the background is known or does not change throughout the entire image sequence; in this case it is most easily performed by background subtraction (Figure IA–C). The physical complexity of natural systems will ultimately require more advanced techniques, such as those which constantly update their background image [18], or visual recognition methods [21,63–67], where the distinctive pattern associated with an individual's body and its motion can be recognized against the clutter of the background. The output of the detection stage is an estimate of the pixels associated with individuals in each image.
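The glossary's definition of background subtraction reduces to a per-pixel threshold on the difference between the current frame and a stored background image. A minimal numpy sketch, assuming a static background and a fixed threshold (the function name and values are illustrative):

```python
import numpy as np

def detect_foreground(frame, background, threshold=30):
    """Flag pixels that differ substantially from a stored background image.

    frame, background: 2D uint8 grayscale arrays of equal shape.
    Returns a boolean mask where True marks likely animal (foreground) pixels.
    """
    # Cast to int before subtracting to avoid uint8 wraparound.
    diff = np.abs(frame.astype(int) - background.astype(int))
    return diff > threshold

# A dark animal on a uniformly bright, unchanging background.
background = np.full((4, 4), 200, dtype=np.uint8)
frame = background.copy()
frame[1:3, 1:3] = 40  # a 2x2 dark blob
mask = detect_foreground(frame, background)
print(mask.sum())  # → 4 foreground pixels
```

Production systems layer much more on top of this (adaptive backgrounds, morphological cleanup, shadow handling), but the core comparison is the same.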
The position and pose of organisms with stiff and simple-shaped bodies can be computed by fitting a shape contour to the image of the organism [8,27] (Figure ID), including determining whether clumps of pixels should be separated into multiple individuals (Figure IE–I). The situation is more complex when the body is flexible and multiple degrees of freedom are of interest, such as wing angles or head orientation (Figure IJ). Algorithms for learning and computing an individual's pose are an active area of research, and involve either explicit modeling of the body, or learning associations between image brightness patterns and pose parameters [68,72,76].
Finally, the position of each individual must be linked over multiple frames to form trajectories (Figure IL–P). This is relatively simple for single individuals, although false and missed detections become more likely when detection is problematic. Constructing trajectories for multiple individuals often involves parameterization of a movement model which includes information from previous frames, such as the acceleration of each individual or their preferred direction of motion [89,90]. Movement models also improve the detection phase of tracking, but ultimately suffer from error propagation and thus can be labor intensive.
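One of the simplest movement models of this kind predicts each individual's next position from its most recent displacement (constant velocity) and then greedily claims the nearest detection. A sketch under those assumptions (all names and thresholds are illustrative; published trackers use far more elaborate models and assignment schemes [89,90]):

```python
import numpy as np

def link_frame(tracks, detections, max_dist=10.0):
    """Greedily assign new detections to tracks via predicted positions.

    tracks: list of trajectories, each a list of (x, y) positions.
    detections: (m, 2) array of centroids in the current frame.
    """
    detections = [np.asarray(d, float) for d in detections]
    unused = set(range(len(detections)))
    for track in tracks:
        if not unused:
            break
        pts = np.asarray(track, float)
        # Predict: last position plus last displacement (constant velocity).
        pred = pts[-1] + (pts[-1] - pts[-2] if len(pts) > 1 else 0.0)
        j = min(unused, key=lambda k: np.linalg.norm(detections[k] - pred))
        if np.linalg.norm(detections[j] - pred) <= max_dist:
            track.append(tuple(map(float, detections[j])))
            unused.remove(j)
    return tracks

# Two individuals moving right and left, respectively; the detections
# arrive in arbitrary order but are assigned by predicted position.
tracks = [[(0, 0), (1, 0)], [(10, 0), (9, 0)]]
link_frame(tracks, np.array([[8.2, 0.0], [2.1, 0.0]]))
print(tracks[0][-1], tracks[1][-1])  # → (2.1, 0.0) (8.2, 0.0)
```

The failure mode the main text describes is visible here: once a wrong assignment is made (e.g., after an occlusion), every subsequent prediction builds on it, so the identity error propagates through the rest of the sequence.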
Fingerprinting identifies individuals from their image structure (see main text) and therefore recovers identities after occlusion [20] (Figure IK; Movie S5 in the supplementary material online).
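The published fingerprinting system cited here [20] builds rich intensity and contrast maps for each individual; as a deliberately simplified illustration of the idea — reduce each individual's image to a compact signature, then match new detections against stored signatures — consider this sketch (all names and parameters are illustrative, not the method of [20]):

```python
import numpy as np

def fingerprint(image_patch, bins=16):
    """Reduce an individual's image patch to a normalized intensity histogram.

    A crude stand-in for the richer intensity/contrast 'fingerprints' used
    by published systems, which exploit far more image structure.
    """
    hist, _ = np.histogram(image_patch, bins=bins, range=(0, 256))
    return hist / hist.sum()

def match(query, references):
    """Return the index of the reference fingerprint closest to the query."""
    dists = [np.abs(query - r).sum() for r in references]
    return int(np.argmin(dists))

rng = np.random.default_rng(0)
# Two 'individuals' with subtly different brightness distributions.
a = rng.normal(90, 10, (20, 20)).clip(0, 255)
b = rng.normal(130, 10, (20, 20)).clip(0, 255)
refs = [fingerprint(a), fingerprint(b)]
query = rng.normal(130, 10, (20, 20)).clip(0, 255)  # a new view of 'b'
print(match(fingerprint(query), refs))  # → 1
```

Because the signature depends only on the image, not on the track history, identities can be re-established after an occlusion, which is exactly the property that makes fingerprinting complementary to movement-model linking.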
Figure I. After imaging (Box 2), computer vision software must automatically detect the position, and sometimes pose, of individuals in the image to create trajectories. (A–C) A common approach for detecting individuals is background subtraction, where detection of individuals in raw images is achieved by removing an estimated background-only image, resulting in isolation of foreground pixels. (D) Contours, denoting individuals, can then be mapped onto clusters of these foreground pixels. How many individuals are within a pixel cluster can be determined in a number of ways. The cluster of pixels in (E–H) can be grouped as one, two, three, or four individuals, with (I) the optimal grouping being three individuals based on some quantifiable measure. When overlaps are large or body shapes are non-rigid, other methods using past and future dynamics are more suitable (see main text). (J) More complex contours can precisely map the pose of individuals, such as swimming in Caenorhabditis elegans [19] (Movie S2 in the supplementary material online), wing positioning in Drosophila [8] (Movie S14 in the supplementary material online), or body posturing of mice during social interactions [28] (Movie S11 in the supplementary material online). (K) Fingerprinting allows for maintenance of identities through time by analyzing the complete image structure, often using differences between individuals that are undetectable to the human eye, such as these zebrafish [20] (Movie S5 in the supplementary material online). Once individuals are detected and identified, their positions are linked across frames to form trajectories. (L) This could be a single individual in a 2D landscape [27], (M) a single individual in a 3D landscape (shown here with some habitat complexity) [18] (Movie S6 in the supplementary material online), (N) multiple individuals in a simple 2D landscape [27] (Movie S1 in the supplementary material online), or (O) multiple individuals in a 3D landscape (Movie S8 in the supplementary material online). (P) Trajectories throughout complex habitat can also be obtained, such as this woodlouse navigating for 1 h between two habitat patches connected by a dispersal corridor (A.I. Dell, unpublished). See Acknowledgments for credits and permissions.

Citations
Journal ArticleDOI

DeepLabCut: markerless pose estimation of user-defined body parts with deep learning

TL;DR: Using a deep learning approach to track user-defined body parts during various behaviors across multiple species, the authors show that their toolbox, called DeepLabCut, can achieve human accuracy with only a few hundred frames of training data.
Journal ArticleDOI

Using DeepLabCut for 3D markerless pose estimation across species and behaviors

TL;DR: This protocol describes how to use an open-source toolbox, DeepLabCut, to train a deep neural network to precisely track user-defined features with limited training data, which allows noninvasive behavioral tracking of movement.
Journal ArticleDOI

Toward a Science of Computational Ethology

TL;DR: This work explores the opportunities and long-term directions of research in the new field of Computational Ethology, made possible by advances in technology, mathematics, and engineering that allow scientists to automate the measurement and the analysis of animal behavior.
Journal ArticleDOI

DeepPoseKit, a software toolkit for fast and robust animal pose estimation using deep learning

TL;DR: A new easy-to-use software toolkit, DeepPoseKit, is introduced that addresses animal pose estimation problems using an efficient multi-scale deep-learning model, called Stacked DenseNet, and a fast GPU-based peak-detection algorithm for estimating keypoint locations with subpixel precision.
Journal ArticleDOI

Applications of machine learning in animal behaviour studies

TL;DR: This review aims to introduce animal behaviourists unfamiliar with machine learning (ML) to the promise of these techniques for the analysis of complex behavioural data and illustrate key ML approaches by developing data analytical pipelines for three different case studies that exemplify the types of behavioural and ecological questions ML can address.
References
Book

Multiple view geometry in computer vision

TL;DR: In this article, the authors provide comprehensive background material and explain how to apply the methods and implement the algorithms directly in a unified framework, including geometric principles and how to represent objects algebraically so they can be computed and applied.

Journal ArticleDOI

Observational study of behavior: sampling methods.

TL;DR: Seven major types of sampling for observational studies of social behavior have been found in the literature and the major strengths and weaknesses of each method are pointed out.
Proceedings ArticleDOI

KinectFusion: Real-time dense surface mapping and tracking

TL;DR: A system for accurate real-time mapping of complex and arbitrary indoor scenes in variable lighting conditions, using only a moving low-cost depth camera and commodity graphics hardware, which fuse all of the depth data streamed from a Kinect sensor into a single global implicit surface model of the observed scene in real- time.
Frequently Asked Questions (16)
Q1. What are the contributions in "Automated image-based tracking and its application in ecology" ?

Automated image-based tracking and its application in ecology. Anthony I. Dell, John A. Bender, Kristin Branson, Iain D. Couzin, Gonzalo G. de Polavieja, Lucas P.J.J. Noldus, Alfonso Pérez-Escudero, Pietro Perona, Andrew D. Straw, Martin Wikelski, and Ulrich Brose. (1) Systemic Conservation Biology, Department of Biology, Georg-August University Göttingen, Göttingen, Germany; (2) HasOffers Inc., 2220 Western Ave, Seattle, WA, USA; (3) Howard Hughes Medical Institute, Janelia Farm Research Campus, Ashburn, VA, USA; (4) Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, USA; (5) Instituto Cajal, CSIC, Av. Doctor Arce, 37, Madrid, Spain; (6) Noldus Information Technology BV, Nieuwe Kanaal 5, 6709 PA Wageningen, The Netherlands; (7) Computation and Neural Systems Program, California Institute of Technology, Pasadena, CA, USA; (8) Research Institute of Molecular Pathology (IMP), Vienna, Austria; (9) Max Planck Institute for Ornithology, Radolfzell, Germany; (10) Biology Department, University of Konstanz, Konstanz, Germany.

Unsupervised techniques offer the advantage of decreased subjectivity, and increased throughput, repeatability, and the chance of finding rare behaviors [68,74,75]. 

(A–C) A common approach for detecting individuals is background subtraction, where detection of individuals in raw images is achieved by removing an estimated background-only image, resulting in isolation of foreground pixels. 

(F) Light-field cameras allow for post-hoc selection of focal points, thus potentially allowing tracking and construction of the scene in 3D from a single image point. 

Methods for quantifying the physical structure of 3D landscapes are rapidly advancing [58– 60] and can be used for rendering features of natural habitats, such as trees or streams. 

The final step in automated image-based tracking is analysis, where position and pose data are analyzed to understand relevant biological, and ecological, patterns and processes. 

In addition, the storage and management issues that arise from the huge amounts of digital data that are easily produced by imaging must be addressed. 

Image-based tracking can also address more applied questions, such as the role of fragmentation in population dynamics (A.I. Dell, unpublished) or determining the size of animal populations that are historically difficult to measure [52]. 

This often involves application of artificial markings; however, natural variation in the morphology of individuals can also be used to maintain identities throughout image sequences, even following occlusion (Table S1 in the supplementary material online). 

Remote quantification of the environment can easily be accomplished by imaging in the appropriate sensory regime, such as optical video cameras for quantifying light conditions and thermal cameras for quantifying thermal landscapes.

Light-field cameras work at higher frame rates and several laboratories are exploring whether they can be successfully incorporated into automated tracking systems (I.D. Couzin and G.G. de Polavieja, unpublished).

Once coordinates (and pose estimates if available) are produced, then even very simple analysis can address basic ecological questions such as where and how animals behave and interact [4,8] (Figure IA–C). 
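As a concrete (and hypothetical) illustration of how simple such analysis can be, per-step speeds and total path length follow directly from the trajectory coordinates once the frame rate is known:

```python
import numpy as np

def trajectory_stats(xy, fps):
    """Per-step speeds and total path length from an (n, 2) coordinate array.

    xy:  sequence of (x, y) positions, one per frame.
    fps: frames per second of the recording.
    """
    xy = np.asarray(xy, dtype=float)
    steps = np.linalg.norm(np.diff(xy, axis=0), axis=1)  # displacement per frame
    speeds = steps * fps                                 # distance units per second
    return speeds, steps.sum()

# Hypothetical 2D track recorded at 25 frames per second.
track = [(0, 0), (0, 3), (4, 3), (4, 7)]
speeds, path_length = trajectory_stats(track, fps=25)
print(path_length)    # 11.0
print(speeds.mean())  # about 91.67
```

From the same coordinate arrays one can equally derive turning angles, home ranges, or inter-individual distances, which is why trajectories are such a flexible currency for the ecological questions discussed in the main text.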

As in 2D, multiple 3D imaging cameras can be employed simultaneously to provide additional resolution and to cope with occlusions [29]. 

The position and pose of organisms with stiff and simple-shaped bodies can be computed by fitting a shape contour to the image of the organism [8,27] (Figure ID), including determining whether clumps of pixels should be separated into multiple individuals (Figure IE–I).

Constraints on the acquisition, processing, and storage of digital information limit the spatiotemporal extent of image-based tracking, and extracting the position and pose of every individual in each image is difficult in complex habitat and at high densities.

General traits can be sufficient for maintaining identities at low densities or when individuals vary greatly in size or shape, but in many other instances in ecology individuals are likely to be similarly sized or shaped.