Proceedings of the 16th International Conference on Pattern Recognition, Quebec, Canada, September 2002
General-purpose Object Recognition in 3D Volume Data Sets using Gray-Scale Invariants - Classification of Airborne Pollen-Grains Recorded with a Confocal Laser Scanning Microscope
Olaf Ronneberger / Hans Burkhardt
Albert-Ludwigs-University of Freiburg
Computer Science Department
79110 Freiburg, Germany
Olaf@Ronneberger.net / burkhardt@informatik.uni-freiburg.de
Eckart Schultz
German Weather Service
Human Biometeorology
79104 Freiburg, Germany
Eckart.Schultz@dwd.de
Abstract
A technique is described which may be employed to establish a fully automated system for the recognition of airborne pollen. As the different pollen taxa have only marginal differences, a full 3D volume data set of the pollen grain was recorded with a confocal laser scanning microscope (LSM) at a voxel size of about (0.2 µm)³. This represents an intrinsic and complete data set. Fourteen invariant gray-scale features, based on an integration over the 3D Euclidean transformation group with nonlinear kernels, were extracted from these volume data sets. The classification was done with support vector machines. The use of these general gray-scale features allows one to easily adapt the system to other objectives (e.g., pollen of a special area) or even to objects other than pollen (e.g., spores, bacteria, etc.) just by exchanging the reference database. When using a reference database with the 26 most important German pollen taxa (385 samples), the recognition rate is 92%. With a special database for allergological purposes, recognizing only Corylus, Alnus, Betula, Poaceae, Secale, Artemisia and "allergologically non-relevant", the recognition rate is 97.4%.
1 Introduction
About 10% of the human population are allergic to pollen. Today's pollen forecasts are based on time-consuming and expensive "manual" pollen counts done by experienced microscopists. Real-time data on the actual pollen concentration are not available with that technique. In contrast to the microscopist, a pollen-recognition system based on image recognition techniques could be integrated into a pollen trap to provide such real-time data.
Even though pattern recognition on images is widely used in several biological applications, there are only very few papers in the literature dealing with pollen recognition, and most of them focus on fossil pollen and 2D data [6, 7, 4]. Bonton et al. [1] proceed like human microscopists, i.e., they use multiple images of the pollen from different focus planes and extract the same "high-level" features as humans do (e.g., the number of pores) by means of a multitude of highly pollen-taxon-specific or even pollen-grain-orientation-specific algorithms for feature extraction, and they employ (to the authors' knowledge) a simple hard-coded classifier. Using a set of 350 pollen grains from 30 different taxa, the corresponding recognition rate is about 73%.
In contrast to such approaches, we use 3D volume data and a general-purpose feature extraction, namely Euclidean gray-scale invariants [11, 3], and we employ a classifier (a set of support vector machines [13]) that is trained automatically with a labeled reference database. In our programs the only a priori assumption is that the objects are rigid and have random orientations and positions. In other words, the system can easily be adapted to objects other than pollen (e.g., spores, bacteria, etc.) just by exchanging the reference database.
2 Material and Methods
2.1 Sampling, Preparation and Recording
To set up a reference database, the pollen grains were directly collected from the plants of interest in order to prepare pure samples of the following pollen taxa: Acer, Artemisia, Alnus, Alnus viridis, Betula, Carpinus, Corylus, Chenopodium, Compositae, Cruciferae, Fagus, Quercus, Aesculus, Juglans, Fraxinus, Plantago, Platanus, Poaceae, Secale, Rumex, Populus, Salix, Taxus, Tilia, Ulmus, Urtica.
In conventional pollen counting, translucent microscopy is used.

Figure 1. Alnus (alder) pollen grain: slices of the volume data set recorded with a confocal laser scanning microscope, at z = -7.5 µm, -5.0 µm, -2.5 µm, 0.0 µm, +2.5 µm, +5.0 µm, +7.5 µm.

The recognition of pollen collected in the open air is complicated, since one is confronted with a huge variety of particles, not only of biological origin. The strong primary fluorescence of pollen provides an easily accessible feature which allows one to reliably separate them from the background and from the other, mostly inorganic, particles.
Even for a human pollen counter it is hard to recognize a pollen grain from a single 2D view at certain unfavorable orientations. As today's computer codes are still far inferior to the object-recognition capabilities of a human observer, the identification of all pollen from a single 2D image is extremely unlikely [8].
To obtain sufficient information for the recognition, the microscopist focuses the microscope on different planes of the pollen grain. Similarly, we record 2D images from several focus planes and stack them into a volume data set. Translucent microscopy is not well suited for this purpose, because the recorded images are the result of complicated integrals of light diffraction and refraction due to the inhomogeneous refraction coefficient of the pollen grain and its surroundings. In fluorescence microscopy, however, all fluorescence-active molecules of the pollen act as small light sources. The resulting image can therefore be regarded as a measurement of the local fluorescence activity, which is largely independent of the viewing and illumination directions.
As a general problem, conventional imaging systems generate a superposition of the desired image of the focused plane and out-of-focus images of neighboring planes. To eliminate the contribution of these non-focused planes, one can either use confocal microscopy, which removes the unwanted light with hardware components and provides images of the highest possible quality, though at very high cost, or deconvolution techniques (e.g., a Wiener filter), which remove the out-of-focus light by post-processing digital images taken with a conventional fluorescence microscope. For the development of the recognition system we started with high-quality images from a confocal laser scanning microscope (Fig. 1).
2.2 Pattern Recognition with gray-scale invariants
A quite simple but very powerful way of general feature extraction is the computation of so-called "gray-scale invariants", which were first described for two-dimensional image data [11, 3] but can be straightforwardly extended to three-dimensional volumetric data [10]. These gray-scale invariants do not need any segmentation within the object, but operate directly on the gray values of the data set.

The advantageous property of such an invariant feature is the following: the set of all possible 3D volume data sets of one individual pollen grain, scanned in all possible positions and orientations (Euclidean motion), is an equivalence class. An invariant transformation maps all elements of this equivalence class to one point in the feature space, which therefore represents one piece of information about the intrinsic structure of the considered pollen grain, independent of its position and orientation (Fig. 2).

Figure 2. Invariant transformation: all representations of an object (here: the object at any orientation or position) in the volume data set (3-dim.) are mapped to the same point in the feature space (n-dim.).

The basic recipe for calculating these invariants is to take a non-linear kernel function of local support, $f(\mathbf{X})$, in order to relate or combine the gray-scale values of some neighboring pixels or voxels, and to integrate the result of this function over all possible representations of the object in the equivalence class [11].
$$
T[f](\mathbf{X}) := \int_G f(g\mathbf{X})\, dg \qquad (1)
$$

where $f$ is the kernel function, $\mathbf{X}$ the gray-value image or volume data, $G$ the transformation group and $g$ one element of the transformation group.
For the sake of clarity, the 2D version of the formulas (for images) is presented in the following; by replacing the 2D translations and rotations with their 3D counterparts and the images with volume data sets, one obtains the 3D versions of the formulae. For rigid objects, which is a fair approximation for pollen grains in the present context, the different elements of the equivalence class can be described by a Euclidean transformation (rotation and translation) of the object:
$$
T[f](\mathbf{X}) := \int_{\vec{x}=\vec{0}}^{\vec{x}_{\max}} \int_{\varphi=0}^{2\pi} f\!\left(g_{\vec{x},\varphi}\,\mathbf{X}\right) d\varphi\, d\vec{x} \qquad (2)
$$

where $\vec{x}_{\max}$ denotes the extension of the image.
Actually, it is not necessary to apply the transformation to the full image; instead, the kernel function can be transformed, which considerably speeds up the computation and results in a linear complexity $O(N)$ of the algorithm. This is illustrated by an example in figure 3.
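To make this computation concrete, the following Python/NumPy sketch evaluates equation (2) directly for the two-point kernel of figure 3: the gray value at each pixel is multiplied with the bilinearly interpolated gray value at distance |q|, summed over a discretized set of angles and over all positions. This is an illustration only, not the original implementation; the kernel span q and the number of angular steps are assumed parameters.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def invariant_2d_direct(image, q=3.0, n_angles=36):
    """Direct evaluation of T[f](X) for the kernel f(X) = X(0) * X(q):
    combine each pixel with the gray value at distance q, rotated over
    all angles (fig. 3b) and summed over all positions (fig. 3c)."""
    X = np.asarray(image, dtype=float)
    ny, nx = X.shape
    ys, xs = np.mgrid[0:ny, 0:nx].astype(float)
    total = 0.0
    for phi in np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False):
        # coordinates of the second kernel point, rotated by phi
        y2 = ys + q * np.sin(phi)
        x2 = xs + q * np.cos(phi)
        # bilinear interpolation at fractional pixel positions
        shifted = map_coordinates(X, [y2, x2], order=1, mode='constant')
        # combine with the first kernel point and sum over all positions
        total += np.sum(X * shifted)
    return total / n_angles   # average over the discretized angles
```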
A further speedup of this still expensive computation is accomplished for a special class of kernel functions by using a convolution with the image of a circle (or, in 3D, of a spherical surface) $C$. This convolution may be computed by means of the Fast Fourier Transform (FFT). For kernel functions of the type

$$
f(\mathbf{X}) = f_a\!\left(\mathbf{X}(\vec{0})\right) \cdot f_b\!\left(\mathbf{X}(\vec{q})\right) \qquad (3)
$$

where $f_a, f_b$ are any functions that transform the gray values and $\vec{q}$ is the span of the kernel function,
one can rewrite equation 2 for the two-dimensional case, using $A := f_a(\mathbf{X})$ and $B := f_b(\mathbf{X})$, as

$$
T[f](\mathbf{X}) := \int_{x=0}^{N_x}\!\int_{y=0}^{N_y} A(x,y) \int_{\varphi=0}^{2\pi} B\!\left(x + |\vec{q}|\cos\varphi,\; y + |\vec{q}|\sin\varphi\right) d\varphi\, dx\, dy
$$

Figure 3. Calculation of a 2D gray-scale invariant: (a) non-linear kernel function for combining some neighboring pixels, e.g., the multiplication of two gray values at distance 3; (b) evaluation for all angles: the results are summed up to become invariant to rotations of the object, and gray values at fractional pixel positions are bilinearly interpolated; (c) evaluation at all possible positions of the image: again the results are summed up, to become invariant to translations of the object.


Figure 4. Fast calculation of a special class of gray-scale invariants:
$$
T = \sum_{\vec{x}} \mathbf{X} \cdot (\mathbf{X} \ast C) = \sum_{\vec{x}} \mathbf{X} \cdot \mathrm{FFT}^{-1}\!\left(\mathrm{FFT}(\mathbf{X}) \cdot \mathrm{FFT}(C)\right)
$$
The sequential evaluation of the rotated kernel functions (as shown in fig. 3b) is split into two steps. Step 1: the gray values touched by the second kernel point within the rotation are summed up. Step 2: the result is multiplied with the gray value of the first kernel point. The evaluation of step 1 for all positions in the image is a simple convolution, which can be calculated efficiently by means of the Fast Fourier Transform (FFT).

which then can also be written as

$$
T[f](\mathbf{X}) := \int_{x=0}^{N_x}\!\int_{y=0}^{N_y} A(x,y) \cdot (B \ast C)(x,y)\, dx\, dy \qquad (4)
$$

where

$$
C(x,y) = \begin{cases} 1 & \text{if } \sqrt{x^2+y^2} = |\vec{q}| \\ 0 & \text{otherwise} \end{cases}
$$

and $\ast$ denotes a convolution. This is again illustrated for one example in fig. 4. Besides saving computing time, we are released from the decision by which angular steps we
should proceed in the computation, which is not trivial, particularly in 3D volumes [10].
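A minimal sketch of this FFT shortcut in 2D, for the special kernel f(X) = X(0)·X(q) (i.e., with f_a and f_b the identity), might look as follows. The circle rasterization, the normalization of the angular integral, and the periodic boundary handling implied by the FFT are simplifications assumed here, not taken from the paper.

```python
import numpy as np

def invariant_2d_fft(image, q=3.0):
    """Evaluate T = sum_x A(x) * (B * C)(x) from equation (4) with
    A = B = X and C a one-pixel-wide circle of radius |q| (cf. fig. 4)."""
    X = np.asarray(image, dtype=float)
    ny, nx = X.shape
    # distances from the origin with wrap-around, because the FFT-based
    # convolution below is circular (periodic boundary conditions)
    yy, xx = np.mgrid[0:ny, 0:nx]
    dy = np.minimum(yy, ny - yy)
    dx = np.minimum(xx, nx - xx)
    r = np.sqrt(dx**2 + dy**2)
    C = (np.abs(r - q) < 0.5).astype(float)
    C /= C.sum()                      # normalize the angular integration
    # step 1 (fig. 4): sum the gray values touched by the second kernel
    # point, for all positions at once, via an FFT convolution
    B_conv_C = np.real(np.fft.ifft2(np.fft.fft2(X) * np.fft.fft2(C)))
    # step 2: multiply with the gray value at the first kernel point
    return np.sum(X * B_conv_C)
```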
A more general method for saving computing costs has been described in [12]: the considered features are computed only approximately (Monte Carlo integration). Once the permissible error is fixed, this results in a constant complexity, independent of the size of the image.
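A Monte Carlo version of the same invariant could be sketched as follows; the sample count and the two-point kernel are placeholder choices, not the parameters used in [12].

```python
import numpy as np
from scipy.ndimage import map_coordinates

def invariant_2d_monte_carlo(image, q=3.0, n_samples=20000, rng=None):
    """Approximate the invariant for f(X) = X(0) * X(q) by averaging over
    randomly drawn positions and rotation angles (Monte Carlo integration)."""
    rng = np.random.default_rng() if rng is None else rng
    X = np.asarray(image, dtype=float)
    ny, nx = X.shape
    y0 = rng.uniform(0.0, ny - 1.0, n_samples)
    x0 = rng.uniform(0.0, nx - 1.0, n_samples)
    phi = rng.uniform(0.0, 2.0 * np.pi, n_samples)
    # second kernel point, rotated by a random angle around each sample
    y1 = y0 + q * np.sin(phi)
    x1 = x0 + q * np.cos(phi)
    v0 = map_coordinates(X, [y0, x0], order=1, mode='constant')
    v1 = map_coordinates(X, [y1, x1], order=1, mode='constant')
    # a fixed permissible error fixes n_samples, independent of image size
    return float(np.mean(v0 * v1))
```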
Even though these features were designed to be invariant only to Euclidean transformations, they are also quite robust against other transformations such as articulated motion or even slight topological deformations, owing to the finite support of the kernel [3].
Different kernel functions can be used to develop a set of gray-scale invariants adapted to a given problem. In fact, it is not difficult to construct features that provide the required discrimination power. Using a small-scale kernel results in a feature which is sensitive to small-scale structures of the object, for example coarse- or fine-grained plasm. Correspondingly, large-scale kernels sense the large-scale structure of the object, e.g., the difference between spherical and ellipsoidal objects.
In the case of pollen recognition, the following features turned out to provide a high discrimination performance: a vector of 14 features is constructed by evaluating the two kernel functions

$$
f(\mathbf{X}) = \mathbf{X}(0,0,0) \cdot \mathbf{X}(0,0,2) \quad \text{and} \quad f(\mathbf{X}) = \sqrt{\mathbf{X}(0,0,0)} \cdot \sqrt{\mathbf{X}(0,0,2)}
$$

at 7 different scalings of the object (1:1, 1:2, 1:4, 1:8, 1:16, 1:32 and 1:64). Since the gray-scale values of the input volume data sets were normalized to unit variance, the elements of the feature vector lie in the range [-1, 1], corresponding to normalized correlation coefficients.
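As an illustration, this 14-element feature vector could be assembled as sketched below, reusing a 3D analogue of the FFT shortcut from section 2.2 (a spherical-surface kernel C). The concrete form of the unit-variance normalization, the per-scale rescaling of the sums, and the down-sampling via scipy's zoom are assumptions of this sketch, not a reproduction of the paper's code.

```python
import numpy as np
from scipy.ndimage import zoom

def invariant_3d_fft(volume, q=2.0):
    """3D analogue of equation (4): sum_x A(x) * (B * C)(x), with C a
    spherical surface of radius |q| voxels (FFT-based convolution)."""
    V = np.asarray(volume, dtype=float)
    grids = np.meshgrid(*[np.arange(n) for n in V.shape], indexing='ij')
    r = np.sqrt(sum(np.minimum(g, n - g) ** 2 for g, n in zip(grids, V.shape)))
    C = (np.abs(r - q) < 0.5).astype(float)
    if C.sum() == 0:              # radius larger than the down-scaled volume
        return 0.0
    C /= C.sum()
    conv = np.real(np.fft.ifftn(np.fft.fftn(V) * np.fft.fftn(C)))
    return np.sum(V * conv)

def pollen_features(volume):
    """14 features: the kernels X(0)*X(q) and sqrt(X(0))*sqrt(X(q)),
    each evaluated at 7 scalings of the volume (1:1 ... 1:64)."""
    # assumed form of the unit-variance normalization; intensities are
    # taken to be non-negative so that the square roots are defined
    v = np.asarray(volume, dtype=float) / np.std(volume)
    features = []
    for k in range(7):
        scaled = zoom(v, 1.0 / 2**k, order=1) if k > 0 else v
        n = scaled.size
        features.append(invariant_3d_fft(scaled) / n)              # X(0)*X(2)
        features.append(invariant_3d_fft(np.sqrt(scaled)) / n)     # sqrt kernel
    return np.array(features)
```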
Due to the non-linearity of the transformation and the particular shape of the resulting clusters in the feature space, a simple MAP classifier based on normal distributions does not perform satisfactorily. A much better recognition rate is achieved by so-called support vector machines (SVMs) [13]. The principal idea behind the support vector machine is to identify the clusters by searching for the thickest hyperplane which separates one cluster from the remaining points. A good introduction to the theory of SVMs is given by the tutorial of C. J. Burges [2].
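To make the classification rule concrete, the following sketch trains one such hyperplane classifier per class and assigns a sample to the class whose SVM places it deepest on its positive side. It uses scikit-learn's SVC as a stand-in for the SVM implementation; the Gaussian-kernel width gamma and the maximal-decision-function rule are assumptions of this sketch.

```python
import numpy as np
from sklearn.svm import SVC

def train_one_vs_rest(features, labels, gamma=1.0):
    """One SVM with a Gaussian (RBF) kernel per pollen taxon, each trained
    to separate that taxon from all remaining samples."""
    labels = np.asarray(labels)
    return {c: SVC(kernel='rbf', gamma=gamma).fit(features, labels == c)
            for c in np.unique(labels)}

def classify(svms, feature_vector):
    """Assign the taxon whose SVM places the sample deepest on its
    positive side (largest decision-function value)."""
    x = np.asarray(feature_vector).reshape(1, -1)
    scores = {c: float(svm.decision_function(x)[0]) for c, svm in svms.items()}
    return max(scores, key=scores.get)
```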
2.3 Measuring the recognition rate
In order to measure the quality of our recognition system, we used a reference database with the 26 most relevant German pollen taxa. 3D volume data sets of about 15 samples from each pollen taxon were recorded at a resolution of about 5 voxels/µm in each direction, using a confocal laser scanning microscope with a 40x oil objective, an excitation wavelength of 450-490 nm and an emission wavelength greater than 510 nm.
With these 385 high-quality volume data sets we tested our recognition system using the "leave-one-out" technique. As classifier we use a set of 26 SVMs with a Gaussian kernel, where each SVM was trained to separate one particular class from the rest. The radius of the Gaussian kernel was determined by optimizing the recognition rate.
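A leave-one-out evaluation in this spirit could be sketched as follows, again with scikit-learn standing in for the original SVM implementation; the grid of Gaussian-kernel widths is an assumed placeholder.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_score

def leave_one_out_rate(features, labels, gammas=(0.03, 0.1, 0.3, 1.0, 3.0)):
    """Leave-one-out recognition rate of a bank of one-vs-rest SVMs with a
    Gaussian kernel; the kernel width is chosen by maximizing that rate."""
    best_rate, best_gamma = 0.0, None
    for gamma in gammas:
        clf = OneVsRestClassifier(SVC(kernel='rbf', gamma=gamma))
        rate = cross_val_score(clf, features, labels, cv=LeaveOneOut()).mean()
        if rate > best_rate:
            best_rate, best_gamma = rate, gamma
    return best_rate, best_gamma
```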
3 Results and Discussion
The achieved recognition rate over all 26 taxa was about 92%; the details are listed in table 1. For pollen forecasts, however, we are interested only in the allergologically relevant pollen, so it does not matter if the computer cannot distinguish, for example, between Ulmus and Platanus. We can therefore put all the allergologically irrelevant taxa into one class, resulting in a recognition rate for allergologically relevant pollen of 97.4%.
With regard to this high recognition rate, one has to keep in mind that the examined pollen may show less variation in size and morphology than airborne pollen, because the pollen of each taxon were taken from just one plant. Furthermore, our reference pollen are not expected to show deformations due to sampling stress in the pollen trap [5], and there are no contaminated or agglomerated pollen grains. Nevertheless, the nearly perfect performance of the automatic recognition on these high-quality pollen images is encouraging enough to test the technique with reduced data quality, i.e., with a normal fluorescence microscope and subsequent deconvolution, and with real-world air samples containing deformed or contaminated pollen. Last but not least, we can use a pollen calendar to reduce the reference database to the seasonally possible set of pollen, which should further increase the recognition rate.
For establishing this system in a laboratory environment, one main aspect is the time needed for the analysis. The imaging with the LSM currently takes about 40 s per object (depending on its size), and the calculation of the 14 gray-scale invariants for a 128³-voxel volume takes about 15 s on a 400 MHz Pentium II dual-processor PC, so that we end up with a recognition time of about 1 min per object. This time will be drastically reduced by using a conventional fluorescence microscope, which can record the same 3D volume in a few seconds. On the computational side, the use of a faster processor and a reduction of the resolution by a factor of 2 in each direction may finally reduce the recognition time to a few seconds per object.
Our current work also focuses on a 2D pre-recognition of the objects, so that only pollen with an unfavorable orientation or other doubtful objects have to be subjected to the relatively time-consuming 3D recognition.

Table 1. Classification results using 3D LSM data (leave-one-out classification)

Acer: 14 correct, 1 Tilia
Artemisia a): 13 correct, 1 Compositae, 1 Platanus
Alnus a): 15 correct, none wrong
Alnus viridis a): 12 correct, none wrong
Betula a): 13 correct, 2 Plantago
Carpinus: 14 correct, none wrong
Corylus a): 13 correct, 1 Alnus
Chenopodium: 12 correct, 1 Quercus, 1 Plantago, 1 Populus
Compositae: 15 correct, none wrong
Cruciferae: 13 correct, 1 Acer, 1 Populus
Fagus: 15 correct, none wrong
Quercus: 11 correct, 1 Acer, 2 Chenopodium, 1 Plantago
Aesculus: 15 correct, none wrong
Juglans: 13 correct, 1 Carpinus, 1 Poaceae
Fraxinus: 12 correct, 2 Compositae, 1 Plantago
Plantago: 13 correct, 2 Fraxinus
Platanus: 15 correct, none wrong
Poaceae a): 15 correct, none wrong
Secale a): 11 correct, 3 Fagus, 1 Tilia
Rumex: 15 correct, none wrong
Populus: 14 correct, 1 Chenopodium
Salix: 15 correct, none wrong
Taxus: 15 correct, none wrong
Tilia: 14 correct, 1 Poaceae
Ulmus: 12 correct, 2 Platanus, 1 Populus
Urtica: 14 correct, 1 Platanus

a) Allergologically relevant pollen

References

The Nature of Statistical Learning Theory (book).
A Tutorial on Support Vector Machines for Pattern Recognition (journal article).
An automatic volumetric spore trap (journal article).
Invariant Features for Gray Scale Images (book chapter).
Computerized identification of pollen grains by texture analysis (journal article).