
Book ChapterDOI

3D invariants with high robustness to local deformations for automated pollen recognition

12 Sep 2007, pp. 425-435

TL;DR: A new technique for extracting features from 3D volumetric data sets based on group integration is presented; the features are robust to local arbitrary deformations and nonlinear gray value changes, but remain sensitive to fine structures.

3D Invariants with High Robustness to Local Deformations for Automated Pollen Recognition

Olaf Ronneberger, Qing Wang, and Hans Burkhardt

Albert-Ludwigs-Universität Freiburg, Institut für Informatik, Lehrstuhl für Mustererkennung und Bildverarbeitung, Georges-Köhler-Allee Geb. 052, 79110 Freiburg, Germany
{ronneber,qwang,burkhardt}@informatik.uni-freiburg.de
Abstract. We present a new technique for the extraction of features from 3D volumetric data sets based on group integration. The features are invariant to translation, rotation and global radial deformations. They are robust to local arbitrary deformations and nonlinear gray value changes, but are still sensitive to fine structures. On a data set of 389 confocally scanned pollen from 26 species we get a precision/recall of 99.2% with a simple 1NN classifier. On volumetric transmitted light data sets of about 180,000 airborne particles, containing about 22,700 pollen grains from 33 species, recorded with a low-cost optic in a fully automated online pollen monitor, the mean precision for allergenic pollen is 98.5% (recall: 86.5%) and for the other pollen 97.5% (recall: 83.4%).
1 Introduction
Nearly all worldwide pollen forecasts are still based on manual counting of pollen in air samples under the microscope. Within the BMBF-funded project “OMNIBUSS” a first demonstrator of a fully automated online pollen monitor was developed that integrates the collection, preparation and microscopic analysis of air samples. Due to commercial interests, no details of the developed pattern recognition algorithms were published within the last three years. This is the first time that we show how this machine works behind the scenes.
Challenges in pollen recognition. Due to the great intra-class variability and only very subtle inter-class differences, automated pollen recognition is a very challenging but still largely unsolved problem. As most pollen grains are nearly spherical and the subtle differences are mainly found near the surface, a pollen expert needs the full 3D information (usually obtained by “focussing through” the transparent pollen grain). An additional difficulty is that pollen grains are often agglomerated and that the air samples contain lots of other airborne particles. For a reliable measurement of highly allergenic pollen (e.g. Artemisia; a few such pollen grains per m³ of air can already cause allergic reactions), the avoidance of false positives is one of the most important requirements for a fully automated system.
State of the art. Almost all published articles concerning pollen recognition deal with very low numbers of pollen grains from only a few species and use manually prepared pure pollen samples, e.g. [1]. Only [4] used a data set from real air samples containing a reasonable number of pollen grains (3686) from 27 species. But even on a reduced data set containing only 8 species and dust particles, the recall was only 64.9% with a precision of 30%.
Main Contribution. In this paper we describe the extension of the Haar-integration framework [9,6,7,8] (further denoted as “HI framework”) to global and local deformations. This is achieved by creating synthetic channels containing the segmentation borders and employing special parameterized kernel functions. Due to the sparsity of non-zero values in the synthetic channels, the resulting integral features are highly localized in real space, while the framework automatically guarantees the desired invariance properties.

For efficient computation of these integrals we make use of the sparsity of the data in the synthetic channels and use a Fourier or spherical harmonics (“SH”) series expansion (for the desired rotation invariance) to compute multiple features at the same time.
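To illustrate the Fourier route to rotation invariance in the planar case, the sketch below samples the gray values on a ring around the object center, expands this angular signal in a Fourier series, and keeps the coefficient magnitudes, which do not change under rotations around the center. This is a hedged illustration of the general idea only; the function names and parameters are our own, not the authors' code.

```python
import numpy as np

def ring_fourier_features(image, center, radius, n_angles=128, n_coeffs=8):
    """Planar-rotation-invariant features from one ring around `center`.

    Samples the image on a circle, expands the angular signal in a
    Fourier series and returns the coefficient magnitudes, which are
    invariant to a cyclic shift of the signal, i.e. to a rotation of
    the object around `center`. (Illustrative sketch.)
    """
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    ys = center[0] + radius * np.sin(angles)
    xs = center[1] + radius * np.cos(angles)
    # nearest-neighbor sampling keeps the sketch dependency-free
    signal = image[np.round(ys).astype(int) % image.shape[0],
                   np.round(xs).astype(int) % image.shape[1]]
    coeffs = np.fft.rfft(signal)
    return np.abs(coeffs[:n_coeffs])
```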
Fig. 1. 3D recordings of Betula pollen grains: a) volume rendering of confocal data set; b) horizontal and vertical cuts of confocal data set; c) horizontal and vertical cuts of transmitted light data set. In transmitted light microscopy the recording properties in z-direction (the direction of the optical axis) are significantly different from those in the xy-direction, because the effects of diffraction, refraction and absorption depend on the direction of the transmitted light. Furthermore there is a significant loss of information in z-direction due to the low-pass property of the optical transfer function.
2 Material and Methods
Data Sets. To demonstrate the generality of the proposed invariants and compare them to earlier results, we use two different pollen data sets in this article. Both contain 3D volumetric recordings of pollen grains.

The “confocal data set” contains 389 pollen grains from 26 German pollen taxa, recorded with a confocal laser scanning microscope (fig. 1a,b). For further details on this data set refer to [6].
The “pollen monitor data set” contains about 180,000 airborne particles including about 22,700 pollen grains from air samples that were collected, prepared and recorded with transmitted light microscopy by the online pollen monitor from March to September 2006 in Freiburg and Zürich (fig. 1c). All 180,000 particles were manually labeled by pollen experts.
Segmentation. To find the 3D surface of the pollen grains in the confocal data set, we use the graph cut algorithm described in [2]. The original data were first scaled down. The edge costs to source and sink were modeled by a Gaussian distribution relative to the mean and minimum gray value. We added voxel-to-voxel edges to the 124-neighborhood, where the weight was a Gaussian of the gray value differences. The resulting binary mask was then smoothly scaled up to the original size.
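As an illustration of how such costs can be assembled, the hedged sketch below computes Gaussian unary (source/sink) costs from the mean and minimum gray value and Gaussian pairwise weights from voxel-to-voxel gray value differences, here for a 6-neighborhood instead of the paper's 124-neighborhood. The resulting arrays would be handed to a min-cut/max-flow solver such as the one of [2]; the function name and σ parameters are our assumptions.

```python
import numpy as np

def graphcut_costs(vol, sigma_t=30.0, sigma_e=15.0):
    """Gaussian terminal and edge costs for a graph-cut segmentation.

    Illustrative sketch only: unary costs follow a Gaussian around the
    mean (object/source) and the minimum (background/sink) gray value;
    pairwise weights are a Gaussian of gray value differences along the
    three axes (a 6-neighborhood instead of the paper's 124).
    """
    vol = vol.astype(float)
    mu, lo = vol.mean(), vol.min()
    src = np.exp(-((vol - mu) ** 2) / (2 * sigma_t ** 2))   # affinity to object
    snk = np.exp(-((vol - lo) ** 2) / (2 * sigma_t ** 2))   # affinity to background
    pair = [np.exp(-np.diff(vol, axis=a) ** 2 / (2 * sigma_e ** 2))
            for a in range(3)]                              # one array per axis
    return src, snk, pair  # feed into a max-flow solver, e.g. [2]
```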
The first step in processing the pollen monitor data set is the detection of circular objects with voxel-wise vector based gray-scale invariants, similar to those in [8]. For each detected circular object the precise border in the sharpest layer is searched: as parts of the object border are often missing or unclear, we use snakes to find a smooth and complete border. To avoid the common problem of snakes being attracted to undesired edges (if the plain gradient magnitude is used as force field), we take the steps depicted in fig. 2.
Fig. 2. Segmentation of transmitted light microscopic images: a) sharpest layer; b) found edges; c) weighted edges; d) final snake.

1. Applying modified Canny edge detection. As pollen grains have a nearly round shape, the edges that are approximately perpendicular to the radial direction are more relevant. We replace the gradient with its radial component in the original Canny edge detection algorithm.

2. Model-based weighting of the edges. The curvatures and relative locations of the edges are analyzed and each edge is given a different weight. Some edges are even eliminated. As a result, a much clearer weighted edge image is obtained.

3. Employing snakes to find the final border. The initial contour is chosen to be the circle found in the detection step. The external force field is the so-called “gradient vector flow” [10] computed from the weighted edge image.
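The radial-gradient modification in step 1 can be sketched as follows: instead of using the gradient magnitude, the gradient is projected onto the radial direction from the detected center, so that edges roughly perpendicular to the radius respond strongly while tangential clutter is suppressed. This is a hedged numpy sketch under our own naming; the non-maximum suppression and hysteresis stages of Canny are omitted.

```python
import numpy as np

def radial_edge_response(img, center):
    """Radial component of the image gradient, as in the modified Canny
    step: edges perpendicular to the radial direction score high.
    (Sketch; non-maximum suppression and hysteresis omitted.)
    """
    gy, gx = np.gradient(img.astype(float))
    yy, xx = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    ry, rx = yy - center[0], xx - center[1]
    norm = np.hypot(ry, rx) + 1e-9          # avoid division by zero at the center
    # projection of the gradient onto the unit radial vector
    return (gy * ry + gx * rx) / norm
```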
2.1 Construction of Invariants
For the construction of invariants we use the combination of a normalization and Haar-integration [9,6,7,8] (see eq. (1)) over a transformation group containing rotations and deformations (Haar-integration has nothing to do with Haar wavelets). In contrast to the very general approach in [6], we now use the object center and the outer border found in the segmentation step to extract more distinctive features describing certain regions of the object.

$$T[f](\mathbf{X}) := \int_G f(g\mathbf{X})\, dg \qquad (1)$$

where $G$ is the transformation group, $g$ one element of the transformation group, $dg$ the Haar measure, $f$ a nonlinear kernel function, and $\mathbf{X}$ an n-dimensional, multi-channel data set.
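Numerically, eq. (1) for a planar rotation group amounts to averaging the kernel output over sampled rotations of the data. The hedged sketch below does exactly that for a 2D image using scipy.ndimage.rotate; the example kernel and the sampling density are illustrative choices of ours, not the paper's.

```python
import numpy as np
from scipy.ndimage import rotate

def haar_integral(image, kernel, n_samples=36):
    """Approximate T[f](X) = integral over G of f(gX) dg for G = planar
    rotations by averaging the kernel over equally spaced angles. Any
    kernel yields a rotation-invariant value in the limit of dense
    sampling. (Illustrative sketch.)
    """
    vals = [kernel(rotate(image, angle, reshape=False, order=1))
            for angle in np.linspace(0.0, 360.0, n_samples, endpoint=False)]
    return np.mean(vals)

# usage: a simple (hypothetical) two-point kernel sensing two offsets
# from the image center; assumes the image is large enough
def two_point_kernel(img):
    cy, cx = img.shape[0] // 2, img.shape[1] // 2
    return img[cy, cx + 10] * img[cy + 5, cx]
```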
Invariance to translations. Invariance to translations is achieved by moving the center of mass of the segmentation mask to the origin. The final features are quite insensitive to errors in this normalization step, because they are computed “far” away from this center and only the direction to it is used.
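This normalization is straightforward with standard tools; a minimal hedged sketch, with our own naming, that shifts the center of mass of the mask onto the geometric center of the array:

```python
import numpy as np
from scipy.ndimage import center_of_mass, shift

def center_volume(vol, mask):
    """Translate the volume so that the center of mass of the
    segmentation mask lands on the center of the array.
    (Our illustration of the normalization step.)"""
    com = np.array(center_of_mass(mask))
    target = (np.array(vol.shape) - 1) / 2.0
    return shift(vol, target - com, order=1)
```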
Invariance to rotation. Invariance to rotation around the object center is achieved by integration over the rotation group. In the confocal data set we can model a 3D rotation of a real-world object by a 3D rotation of the recorded volumetric data set (see fig. 1b). In contrast to this, the transmitted light microscopic image stacks from the pollen monitor data set show very different characteristics in xy- and z-direction (see fig. 1c). A rotation around the x- or y-axis of the real-world object results in such different gray value distributions that it is more reasonable to model only the rotation around the z-axis, resulting in a planar rotation invariance.
Invariance to global deformations and robustness to local deformations. The deformation model consists of two parts. The global deformations are modeled by a simple shift in radial direction $\mathbf{e}_r$, which depends only on the angular coordinates (see fig. 3a). For full 3D rotations described in spherical coordinates $\mathbf{x} = (x_r, x_\varphi, x_\vartheta)$ this model is

$$\mathbf{x}' = \mathbf{x} + \boldsymbol{\gamma}(\mathbf{x}) \quad\text{with}\quad \boldsymbol{\gamma}(\mathbf{x}) = \gamma(x_\varphi, x_\vartheta) \cdot \mathbf{e}_r(x_\varphi, x_\vartheta)\,. \qquad (2)$$
For rotations around the z-axis described in cylindrical coordinates $\mathbf{x} = (x_r, x_\varphi, x_z)$ we get

$$\mathbf{x}' = \mathbf{x} + \boldsymbol{\gamma}(\mathbf{x}) \quad\text{with}\quad \boldsymbol{\gamma}(\mathbf{x}) = \gamma(x_\varphi) \cdot \mathbf{e}_r(x_\varphi)\,. \qquad (3)$$
Please note that this deformation is well defined only for $x_r > \gamma(x_\varphi)$, which is no problem in the present application, because the features are computed “far” away from the center.
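A hedged sketch of the cylindrical model of eq. (3): a 2D image is resampled so that every pixel is shifted by γ(ϕ) along the radial unit vector, using scipy.ndimage.map_coordinates with backward warping. The deformation profile in the usage line is an arbitrary illustration of ours.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def radial_deform(img, gamma, center):
    """Apply the global radial deformation x' = x + gamma(phi) * e_r(phi)
    of eq. (3) to a 2D image by backward warping: each output pixel is
    sampled at its radially shifted preimage. (Illustrative sketch.)
    """
    yy, xx = np.mgrid[0:img.shape[0], 0:img.shape[1]].astype(float)
    ry, rx = yy - center[0], xx - center[1]
    phi = np.arctan2(ry, rx)
    r = np.hypot(ry, rx)
    src_r = r - gamma(phi)               # inverse of the radial shift
    src_y = center[0] + src_r * np.sin(phi)
    src_x = center[1] + src_r * np.cos(phi)
    return map_coordinates(img, [src_y, src_x], order=1, mode='nearest')

# usage with an arbitrary illustrative deformation profile:
# warped = radial_deform(img, lambda phi: 4.0 * np.cos(2 * phi), (64, 64))
```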
The smaller local deformations are described by an arbitrary displacement field $\mathbf{D}(\mathbf{x})$ such that

$$\mathbf{x}' = \mathbf{x} + \mathbf{D}(\mathbf{x}) \qquad (4)$$

(see fig. 3b). For the later partial Haar-integration [3] over all possible realizations of this displacement field, it is sufficient to know only the probability for the occurrence of a certain relative displacement $\mathbf{r}$ within this field,

$$p\bigl(\mathbf{D}(\mathbf{x}+\mathbf{d}) - \mathbf{D}(\mathbf{x}) = \mathbf{r}\bigr) = p_d(\mathbf{r}; \mathbf{d}) \quad \forall\, \mathbf{x}, \mathbf{d} \in \mathbb{R}^3\,, \qquad (5)$$

Fig. 3. Possible realizations of the deformation models: a) global deformation model (radial); b) local deformation model (arbitrary).
where we select $p_d(\mathbf{r}; \mathbf{d})$ to be a rotationally symmetric Gaussian distribution with a standard deviation $\sigma = \|\mathbf{d}\| \cdot \sigma_d$.
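To make the statistical model of eq. (5) concrete, the hedged sketch below draws a smooth random displacement field and measures the spread of the relative displacements D(x+d) − D(x) for a fixed lag d; under the paper's Gaussian model this spread grows with ‖d‖. The field construction and all parameters are our own illustration.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)

# smooth random 2D displacement field (one component per axis)
field = np.stack([gaussian_filter(rng.normal(size=(128, 128)), sigma=8)
                  for _ in range(2)])

# relative displacements for the lag d = (0, 16)
lag = 16
rel = field[:, :, lag:] - field[:, :, :-lag]
print("empirical std of D(x+d) - D(x):", rel.std())
```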
While we achieve full invariance to radial deformations by full Haar-integration, we can only reach robustness to local deformations by partial Haar-integration. But this non-invariance in the second case is exactly the desired behavior: in combination with appropriate kernel functions it results in a continuous mapping of objects (with weak or strong local deformations) into the feature space.
The kernel functions. Instead of selecting a certain fixed number of kernel functions, we introduce parameterized kernel functions here. Embedded into the HI framework, each new combination of kernel parameters results in a new invariant feature. For multiple kernel parameters, we now have a multidimensional invariant feature array describing the object.
Robustness to gray value transformations. To become robust to gray value transformations, the information is split into gradient direction (which is very robust even under nonlinear gray value transformations) and gradient magnitude. This was already successfully applied to the HI framework in [8] and to confocal pollen data sets in [5].
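A hedged sketch of that split for a 3D volume, with our own naming: the gradient is separated into a unit direction field, which is stable under monotone gray value changes, and a magnitude field that is kept separate.

```python
import numpy as np

def gradient_direction_magnitude(vol, eps=1e-9):
    """Split the 3D gradient into a unit direction field and a magnitude
    field. The direction survives nonlinear monotone gray value changes;
    the magnitude does not and is treated separately.
    (Illustrative sketch.)
    """
    g = np.stack(np.gradient(vol.astype(float)))  # shape (3, Z, Y, X)
    mag = np.sqrt((g ** 2).sum(axis=0))
    direction = g / (mag + eps)                   # unit vectors, safe at zero
    return direction, mag
```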
Synthetic channels with segmentation results. To feed the segmentation information into the HI framework, we simply render the surface (confocal data set) or the contour of the sharpest layer (transmitted light data set) as delta-peaks into a new channel S, and extend the kernel function with two additional points that sense the gray value in this channel. The only condition for this technique is that the computation of the synthetic channel and the action of the transformation group can be exchanged without the result being changed (i.e., we must get the same result if we first extract the surface and then rotate and deform the volume, and vice versa).
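A hedged sketch of such a synthetic channel, with helper names of our own: the border voxels of a binary segmentation mask are written as delta peaks (value 1, everything else 0) into a new channel. The sparsity of this channel is what later makes the integral features cheap to compute.

```python
import numpy as np
from scipy.ndimage import binary_erosion

def render_border_channel(mask):
    """Synthetic channel S: delta peaks on the segmentation border.

    The border is the set of mask voxels that an erosion removes; every
    other voxel is zero, so the channel is very sparse. (Illustrative
    sketch of the construction described in the text.)
    """
    mask = mask.astype(bool)
    border = mask & ~binary_erosion(mask)
    return border.astype(float)   # 1.0 on the surface, 0.0 elsewhere
```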
Resulting kernel function. To achieve the requested properties we construct 4-point kernels, where two points of the kernel, $a_1$ and $a_2$, sense the segmentation

Citations

Journal ArticleDOI
Kun Liu, Henrik Skibbe, Thorsten Schmidt, Thomas Blein, et al.
TL;DR: This paper presents a method to build rotation-invariant HOG descriptors using Fourier analysis in polar/spherical coordinates, which are closely related to the irreducible representation of the 2D/3D rotation groups.
Abstract: The histogram of oriented gradients (HOG) is widely used for image description and proves to be very effective. In many vision problems, rotation-invariant analysis is necessary or preferred. Popular solutions are mainly based on pose normalization or learning, neglecting some intrinsic properties of rotations. This paper presents a method to build rotation-invariant HOG descriptors using Fourier analysis in polar/spherical coordinates, which are closely related to the irreducible representation of the 2D/3D rotation groups. This is achieved by considering a gradient histogram as a continuous angular signal which can be well represented by the Fourier basis (2D) or spherical harmonics (3D). As rotation-invariance is established in an analytical way, we can avoid discretization artifacts and create a continuous mapping from the image to the feature space. In the experiments, we first show that our method outperforms the state-of-the-art in a public dataset for a car detection task in aerial images. We further use the Princeton Shape Benchmark and the SHREC 2009 Generic Shape Benchmark to demonstrate the high performance of our method for similarity measures of 3D shapes. Finally, we show an application on microscopic volumetric data.

128 citations


Cites methods from "3D invariants with high robustness ..."

  • ...A fundamental method to compute such invariant features is Group Integration (Burkhardt and Siggelkow 2001; Ronneberger et al. 2002, 2007)....



Journal ArticleDOI
Luke Mander, Surangi W. Punyasena
Abstract: Premise of research. Pollen and spores (sporomorphs) are a valuable record of plant life and have provided information on subjects ranging from the nature and timing of evolutionary events to the relationship between vegetation and climate. However, sporomorphs can be morphologically similar at the species, genus, or family level. Studies of extinct plant groups in pre-Quaternary time often include dispersed sporomorph taxa whose parent plant is known only to the class level. Consequently, sporomorph records of vegetation suffer from limited taxonomic resolution and typically record information about plant life at a taxonomic rank above species.Methodology. In this article, we review the causes of low taxonomic resolution, highlight examples where this has hampered the study of vegetation, and discuss the strategies researchers have developed to overcome the low taxonomic resolution of the sporomorph record. Based on this review, we offer our views on how greater taxonomic precision might be attained in f...

35 citations


Book ChapterDOI
31 Aug 2011
TL;DR: This work utilizes the Harmonic Filter to create a generic rotation invariant object detection system that benefits from both the highly discriminative representation of local image patches in terms of histograms of oriented gradients and an adaptable trainable voting scheme that forms the filter.
Abstract: We present a method for densely computing local spherical histograms of oriented gradients (SHOG) in volumetric images. The descriptors are based on the continuous representation of the orientation histograms in the harmonic domain, which we compute very efficiently via spherical tensor products and the Fast Fourier Transformation. Building upon these local spherical histogram representations, we utilize the Harmonic Filter to create a generic rotation invariant object detection system that benefits from both the highly discriminative representation of local image patches in terms of histograms of oriented gradients and an adaptable trainable voting scheme that forms the filter. We exemplarily demonstrate the effectiveness of such dense spherical 3D descriptors in a detection task on biological 3D images. In a direct comparison to existing approaches, our new filter reveals superior performance.

24 citations


Proceedings ArticleDOI
Henrik Skibbe, Marco Reisert
02 May 2012
TL;DR: A new system for generic rotation invariant 2D object detection based on circular Fourier HOG features with advantages of a dense voting scheme as it is used in the Holomorphic Filter framework with features based on local orientation statistics is presented.
Abstract: In this paper we present a new system for generic rotation invariant 2D object detection based on circular Fourier HOG features. Our system combines the advantages of a dense voting scheme as it is used in the Holomorphic Filter framework with features based on local orientation statistics. Experiments on two different biological datasets show superior detection performance over four state-of-the-art reference approaches.

23 citations


Cites background from "3D invariants with high robustness ..."

  • ...In the first experiment we aim at detecting porates in pollen grains [9], small pores on the surface of the grain which are crucial for the determination of the species....



Proceedings ArticleDOI
Olaf Ronneberger, Qing Wang, Hans Burkhardt
14 May 2008
TL;DR: The system is based on a voting procedure that finds the centers and radii of the particles and a subsequent precise segmentation with an active contour approach, to meet the demands of an online pollen monitor for high speed and low memory consumption.
Abstract: In this article we present an approach for a precise segmentation of spherical particles in transmitted light image stacks. A main goal was its fast operation and a high robustness to occlusions and agglomerations of the particles. The system is based on a voting procedure that finds the centers and radii of the particles and a subsequent precise segmentation with an active contour approach. To meet the demands of an online pollen monitor for high speed and low memory consumption a multi-scale approach was applied. The proposed techniques successfully segmented the pollen grains in a vast amount of different air samples (about 2.7 TB of raw data). The results on one of the most cluttered samples are presented in this paper.

16 citations


Cites background or methods from "3D invariants with high robustness ..."

  • ...Our work with this dataset was first presented in [1], where the emphasis was on the derivation of invariant features....


  • ...The good results of the pollen recognition [1] is also an indication of the success of the segmentation....



References

Journal ArticleDOI
Yuri Boykov, Vladimir Kolmogorov
TL;DR: This paper compares the running times of several standard algorithms, as well as a new algorithm that is recently developed that works several times faster than any of the other methods, making near real-time performance possible.
Abstract: Minimum cut/maximum flow algorithms on graphs have emerged as an increasingly useful tool for exact or approximate energy minimization in low-level vision. The combinatorial optimization literature provides many min-cut/max-flow algorithms with different polynomial time complexity. Their practical efficiency, however, has to date been studied mainly outside the scope of computer vision. The goal of this paper is to provide an experimental comparison of the efficiency of min-cut/max-flow algorithms for applications in vision. We compare the running times of several standard algorithms, as well as a new algorithm that we have recently developed. The algorithms we study include both Goldberg-Tarjan style "push-relabel" methods and algorithms based on Ford-Fulkerson style "augmenting paths." We benchmark these algorithms on a number of typical graphs in the contexts of image restoration, stereo, and segmentation. In many cases, our new algorithm works several times faster than any of the other methods, making near real-time performance possible. An implementation of our max-flow/min-cut algorithm is available upon request for research purposes.

4,298 citations


Journal ArticleDOI
Chenyang Xu, Jerry L. Prince
TL;DR: This paper presents a new external force for active contours, which is computed as a diffusion of the gradient vectors of a gray-level or binary edge map derived from the image, and has a large capture range and is able to move snakes into boundary concavities.
Abstract: Snakes, or active contours, are used extensively in computer vision and image processing applications, particularly to locate object boundaries. Problems associated with initialization and poor convergence to boundary concavities, however, have limited their utility. This paper presents a new external force for active contours, largely solving both problems. This external force, which we call gradient vector flow (GVF), is computed as a diffusion of the gradient vectors of a gray-level or binary edge map derived from the image. It differs fundamentally from traditional snake external forces in that it cannot be written as the negative gradient of a potential function, and the corresponding snake is formulated directly from a force balance condition rather than a variational formulation. Using several two-dimensional (2-D) examples and one three-dimensional (3-D) example, we show that GVF has a large capture range and is able to move snakes into boundary concavities.

3,977 citations


Book ChapterDOI
Yuri Boykov, Vladimir Kolmogorov
03 Sep 2001
TL;DR: The goal of this paper is to provide an experimental comparison of the efficiency of min-cut/max flow algorithms for applications in vision, comparing the running times of several standard algorithms, as well as a new algorithm that is recently developed.
Abstract: After [10, 15, 12, 2, 4] minimum cut/maximum flow algorithms on graphs emerged as an increasingly useful tool for exact or approximate energy minimization in low-level vision. The combinatorial optimization literature provides many min-cut/max-flow algorithms with different polynomial time complexity. Their practical efficiency, however, has to date been studied mainly outside the scope of computer vision. The goal of this paper is to provide an experimental comparison of the efficiency of min-cut/max flow algorithms for energy minimization in vision. We compare the running times of several standard algorithms, as well as a new algorithm that we have recently developed. The algorithms we study include both Goldberg-style "push-relabel" methods and algorithms based on Ford-Fulkerson style augmenting paths. We benchmark these algorithms on a number of typical graphs in the contexts of image restoration, stereo, and interactive segmentation. In many cases our new algorithm works several times faster than any of the other methods making near real-time performance possible.

2,967 citations


Journal ArticleDOI
TL;DR: This system was developed to classify 12 categories of particles found in human urine; it achieves a 93.2% correct classification rate in this application and this performance is considered good.
Abstract: A simple and general-purpose system to recognize biological particles is presented. It is composed of four stages: First (if necessary) promising locations in the image are detected and small regions containing interesting samples are extracted using a feature finder. Second, differential invariants of the brightness are computed at multiple scales of resolution. Third, after point-wise non-linear mappings to a higher dimensional feature space, this information is averaged over the whole region thus producing a vector of features for each sample that is invariant with respect to rotation and translation. Fourth, each sample is classified using a classifier obtained from a mixture-of-Gaussians generative model. This system was developed to classify 12 categories of particles found in human urine; it achieves a 93.2% correct classification rate in this application. It was subsequently trained and tested on a challenging set of images of airborne pollen grains where it achieved an 83% correct classification rate for the three categories found during one month of observation. Pollen classification is challenging even for human experts and this performance is considered good.

109 citations


Book ChapterDOI
Hanns Schulz-Mirbach
13 Sep 1995
TL;DR: This paper considers image rotations and translations and presents algorithms for constructing invariant features and develops algorithms for recognizing several objects in a single scene without the necessity to segment the image beforehand.
Abstract: Invariant features are image characteristics which remain unchanged under the action of a transformation group. We consider in this paper image rotations and translations and present algorithms for constructing invariant features. After briefly sketching the theoretical background we develop algorithms for recognizing several objects in a single scene without the necessity to segment the image beforehand. The objects can be rotated and translated independently. Moderate occlusions are tolerable. Furthermore we show how to use these techniques for the recognition of articulated objects. The methods work directly with the gray values and do not rely on the extraction of geometric primitives like edges or corners in a preprocessing step. All algorithms have been implemented and tested both on synthetic and real image data. We present some illustrative experimental results.

82 citations


Performance Metrics

Number of citations received by the paper in previous years:

Year  Citations
2019  1
2018  1
2015  3
2014  5
2013  2
2012  4