Learning to Detect Basal Tubules of Nematocysts in SEM Images
Michael Lam, Janardhan Rao Doppa, Xu Hu, Sinisa Todorovic, and Thomas Dietterich
Oregon State University
Department of EECS
{lamm,doppa,huxu,sinisa,tgd}@eecs.oregonstate.edu
Abigail Reft and Marymegan Daly
Ohio State University
Department of Evolution, Ecology and Organismal Biology
{reft.1,daly.66}@osu.edu
Abstract
This paper presents a learning approach for detecting
nematocysts in Scanning Electron Microscope (SEM) im-
ages. The image dataset was collected and made avail-
able to us by biologists for the purposes of morphological
studies of corals, jellyfish, and other species in the phylum
Cnidaria. Challenges for computer vision presented by this
biological domain are rarely seen in general images of nat-
ural scenes. We formulate nematocyst detection as labeling
of a regular grid of image patches. This structured pre-
diction problem is specified within two frameworks: CRF
and HC-Search. The CRF uses graph cuts for inference.
The HC-Search approach is based on search in the space of
outputs. It uses a learned heuristic function (H ) to uncover
high-quality candidate labelings of image patches, and then
uses a learned cost function (C) to select the final prediction
among the candidates. While locally optimal CRF inference
may be sufficient for images of natural scenes, our results
demonstrate that CRF with graph cuts performs poorly on
the nematocyst images, and that HC-Search outperforms
CRF with graph cuts. This suggests biological images of
flexible objects present new challenges requiring further ad-
vances of, or alternatives to existing methods.
1. Introduction
This paper addresses the problem of object detection in
scanning electron microscope (SEM) images for the pur-
poses of morphological characterization of cnidae. This
work focuses on nematocysts, one kind of cnida, illustrated
in Figure 1.
A cnida (plural cnidae) is an explosive sub-cellular capsule that fires toxins when it discharges. It is produced by a special cell called a cnidocyte. Because cnidae manifest both extreme morphological cell-level simplicity and wide biological diversity, cnidae provide a great opportunity to investigate fundamental questions in biology, including constraints and convergence in morphology [1]. Of particular interest is a morphological characterization of the basal tubules of nematocysts, marked yellow in the images shown in Figure 1. This is because surfaces of the basal tubules are characterized by spines whose shapes, lengths, and density of placement along the surface represent important phenomic characters for evolutionary studies [7].

Figure 1: Example images of nematocysts from our dataset. Detecting textured, elongated, highly deformable basal tubules of nematocysts (marked yellow) against background clutter is very challenging.
Biological studies of nematocyst images are currently
conducted by visual inspection and manual annotation, tak-
ing prohibitive amounts of expert time. This, in turn, typi-
cally limits the studies to small image collections of narrow
scope. In this paper, we explore an opportunity for com-
puter vision to help biologists in their analysis of nemato-
cyst images by automatically detecting the basal tubules. As
the image resolution (i.e., pixel size) is calibrated to the real
size of observed specimens, detection of the basal tubules
readily gives information about the size and shape of the
nematocyst useful for morphological studies.
As can be seen in Figure 1, images of nematocysts
present significant challenges to the state of the art in com-
puter vision. The basal tubules are relatively thin, elon-
gated, and highly deformable objects covered with spines.
They are typically imaged against significant background
clutter, consisting of mucus and cellular debris. The clutter
is unavoidable, since it is extremely difficult to isolate indi-
vidual nematocysts during image acquisition. Thin, elon-
gated particles of debris appear very similar to the basal
tubule. The texture of debris appears very similar to the
texture of spines along the surface of the basal tubule. In ad-
dition, some images may not show the entire basal tubule,
because it may be partially occluded by clutter, or extend
beyond the image frame. Nematocysts are often damaged
naturally and sometimes damaged through preparation, so
that large parts of the basal tubules may not be physically
present in the image. Rarely do we see the aforementioned
challenges in general images of natural scenes.
Related work mostly focuses on image classification for
accelerating biological studies [6]. In contrast, this paper
focuses on object detection and localization for accelerat-
ing biological studies. We formulate detection of the basal
tubules as binary labeling of a regular grid of image patches.
Patches that fall on the basal tubule are assigned label “1”,
and patches that fall on background are assigned label “0”.
One solution for this problem is to learn a binary classifier
to predict each patch label independently. However, this
approach is limited, since it does not account for relation-
ships among neighboring patches. An alternative is to spec-
ify object detection as a structured prediction problem. To
this end, we employ two state-of-the-art structured predic-
tion frameworks: CRFs (e.g., [5, 4]), and HC-Search [3, 2].
HC-Search has a number of advantages over CRFs in our
detection problem. For example, HC-Search allows us to
use higher-order features with negligible overhead.
Our evaluation on the nematocyst images demonstrates
that locally-optimal CRF inference produces poor detection
results. This is in contrast to the literature, which usually
reports very good CRF performance on images of outdoor-
and indoor-scenes. Our results demonstrate that HC-Search
outperforms CRFs. Although both CRFs and HC-Search
are considered the most powerful, state-of-the-art frame-
works for structured prediction, their relatively modest per-
formance on the nematocyst images suggests that, in gen-
eral, these kinds of biological images present new chal-
lenges for computer vision.
Our key contributions include: (I) Addressing new vi-
sion challenges in SEM images; and (II) Evaluating the
most powerful structured prediction approaches, namely CRFs and HC-Search, on these images, and identifying
key advantages and weaknesses of each approach.
In the following, we describe approaches that we use for
our detection problem: IID Classifier in Sec. 2.1, CRFs in
Sec. 2.2, and HC-Search in Sec. 2.3. Sec. 3 presents the
dataset of nematocyst images and our results.
2. Technical Approach
In this section, we first state the formal problem setup,
and then describe the different approaches used in this work.
Problem Setup. We are provided with a training set of input-output pairs {(x, y*)}, where input x ∈ X is the regular grid of patches of a nematocyst image and output y* ∈ Y corresponds to the ground-truth binary labeling of the patches. Let L be a non-negative loss function such that L(x, y, y*) is the loss associated with labeling a particular input x by output y when the true output is y* (e.g., Hamming and F1 loss). Our goal is to learn a predictor from inputs to outputs whose predicted outputs have low loss.
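To make the loss functions concrete, here is a minimal Python sketch (our own illustration, not code from the paper) that computes the Hamming and F1 losses for a predicted patch labeling, assuming labelings are stored as 0/1 NumPy arrays:

```python
import numpy as np

def hamming_loss(y_pred, y_true):
    """Fraction of patches whose predicted label disagrees with the ground truth."""
    y_pred, y_true = np.asarray(y_pred), np.asarray(y_true)
    return float(np.mean(y_pred != y_true))

def f1_loss(y_pred, y_true):
    """1 - F1, where F1 is computed over the positive (basal tubule) patches."""
    y_pred, y_true = np.asarray(y_pred).astype(bool), np.asarray(y_true).astype(bool)
    tp = np.sum(y_pred & y_true)
    fp = np.sum(y_pred & ~y_true)
    fn = np.sum(~y_pred & y_true)
    if tp == 0:
        return 0.0 if (fp == 0 and fn == 0) else 1.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 1.0 - 2 * precision * recall / (precision + recall)
```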
2.1. IID Classifier
A simple baseline approach for our problem is to learn
an IID classifier (e.g., SVM, Logistic Regression) on patch
features, and make independent predictions for every image
patch. This solution is unsatisfactory, as it does not account
for relationships among neighboring patches.
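As a point of reference, the IID baseline could be sketched as follows (a hedged example using scikit-learn's LogisticRegression; the array names and the 0.5 decision threshold are our assumptions, not details from the paper):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# X_train: (num_patches_total, 128) SIFT descriptors; y_train: 0/1 patch labels.
def train_iid_classifier(X_train, y_train):
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_train, y_train)
    return clf

def predict_patch_grid(clf, patch_descriptors, grid_shape):
    """Independently label every patch (row-major order) and reshape to the image grid."""
    probs = clf.predict_proba(patch_descriptors)[:, 1]   # P(tubule) per patch
    labels = (probs >= 0.5).astype(int)
    return labels.reshape(grid_shape), probs.reshape(grid_shape)
```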
Structured approaches such as Conditional Random
Fields (CRFs) [5, 4] and HC-Search [3] leverage the struc-
ture in the problem by accounting for relationships between
inputs and outputs. In what follows, we formulate the basal
tubule detection problem within the framework of CRFs and
HC-Search.
2.2. Conditional Random Fields (CRFs)
The CRF is one of the most popular models for structured learning and inference in computer vision [5, 4]. A CRF defines a parametric posterior distribution over the outputs (labels), y, given observed image features, x, in a factored form: P(y|x, w) = (1/Z(x, w)) exp(w · φ(x, y)), where w are the parameters, Z(x, w) is the partition function, and the features, φ(x, y), decompose over the cliques in the underlying graphical model.

Inference is typically posed as finding the joint MAP assignment that maximizes the posterior distribution, ŷ = arg max_{y ∈ Y} P(y|x, w), which is generally intractable. Parameter learning is usually formulated as minimizing the negative conditional log-likelihood of the data. It involves repeated calls to the inference procedure, and thus is also generally intractable. Well-known approximate inference algorithms in vision include Loopy Belief Propagation (LBP), Iterated Conditional Modes (ICM), and Graph Cuts.

In our model, the patches are organized in a graph, G = (V, E), where V and E are sets of nodes and edges. The nodes i = 1, 2, ..., |V| correspond to patches in the image, and edges (i, j) ∈ E capture their spatial relations as a regular grid with 4-connected neighbors. Every node i is described by a 128-dimensional SIFT descriptor vector, Ψ_u(x_i, y_i), referred to as the unary feature. Every edge (i, j) ∈ E is described by a pairwise feature, Ψ_pair(x_i, x_j, y_i, y_j), indicating the compatibility between patches i and j with the corresponding labeling y_i and y_j:

\[
\Psi_{\mathrm{pair}}(x_i, x_j, y_i, y_j) =
\begin{cases}
0, & \text{if } y_i = y_j, \\
\exp\!\left(-\beta \, |x_i - x_j|^2\right), & \text{if } y_i \neq y_j,
\end{cases}
\tag{1}
\]

where β is a parameter. Ψ_pair(x_i, x_j, y_i, y_j) encourages neighboring patches to take the same label.

Let the set of all patch descriptors be denoted x = {x_i : i = 1, ..., |V|}, and let the set of all patch labels be denoted y = {y_i : i = 1, ..., |V|}, where y_i ∈ {0, 1}. We investigate two different CRF formulations, referred to as pairwise CRFs and pyramid CRFs, as explained below.
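For concreteness, the pairwise feature of Eq. (1) could be computed as in the following sketch (our own minimal Python rendering; the function name and NumPy representation are assumptions, not the paper's implementation):

```python
import numpy as np

def pairwise_feature(x_i, x_j, y_i, y_j, beta=1.0):
    """Contrast-sensitive pairwise feature of Eq. (1).

    Returns 0 when neighboring patches take the same label, and a value that
    decays with appearance difference when their labels disagree.
    """
    if y_i == y_j:
        return 0.0
    diff = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)
    return float(np.exp(-beta * np.dot(diff, diff)))
```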
Pairwise CRF. The pairwise CRF, given by (2), corresponds to the formulation that contains the unary and pairwise features of image patches, with the standard 4-connected neighborhood of every patch on the image lattice:

\[
w \cdot \phi(x, y) = \sum_{i \in V} w_u \cdot \Psi_u(x_i, y_i)
+ \sum_{i \in V} \sum_{j \in N_i} w_{\mathrm{pair}} \cdot \Psi_{\mathrm{pair}}(x_i, x_j, y_i, y_j).
\tag{2}
\]
Pyramid CRFs. The pyramid CRF, given by (3), contains additional pyramid features, Ψ_pyr(x_i, x_k, y_i, y_k). The graphical model now contains a grid of patches from an image downsampled by a factor of 2, in order to approximate higher-order features. Each node i from the downsampled layer is connected to its four corresponding child nodes k ∈ C_i in the original image:

\[
w \cdot \phi(x, y) = \sum_{i \in V} w_u \cdot \Psi_u(x_i, y_i)
+ \sum_{i \in V} \sum_{j \in N_i} w_{\mathrm{pair}} \cdot \Psi_{\mathrm{pair}}(x_i, x_j, y_i, y_j)
+ \sum_{i \in V} \sum_{k \in C_i} w_{\mathrm{pyr}} \cdot \Psi_{\mathrm{pyr}}(x_i, x_k, y_i, y_k).
\tag{3}
\]
We investigate these two CRF models combined with the
well-known inference algorithms: ICM, LBP, and Graph-
Cuts.
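To illustrate how one of these inference algorithms operates on the pairwise model, the sketch below implements a basic ICM pass (our own simplified example, not the paper's implementation; `unary_score` and `pairwise_score` are hypothetical callbacks returning the contributions of w_u · Ψ_u and w_pair · Ψ_pair). ICM sweeps the grid and flips each patch label to whichever value locally increases w · φ(x, y), holding all other labels fixed.

```python
import itertools
import numpy as np

def icm_inference(unary_score, pairwise_score, grid_shape, num_sweeps=10):
    """Iterated Conditional Modes on a 4-connected grid of binary patch labels.

    unary_score(i, j, label)               -> contribution of w_u . Psi_u for that patch
    pairwise_score(i, j, i2, j2, l, l2)    -> contribution of w_pair . Psi_pair for that edge
    """
    H, W = grid_shape
    y = np.zeros((H, W), dtype=int)        # start from an all-background labeling

    def neighbors(i, j):
        return [(i + di, j + dj)
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1))
                if 0 <= i + di < H and 0 <= j + dj < W]

    for _ in range(num_sweeps):
        changed = False
        for i, j in itertools.product(range(H), range(W)):
            scores = []
            for label in (0, 1):
                s = unary_score(i, j, label)
                s += sum(pairwise_score(i, j, i2, j2, label, y[i2, j2])
                         for i2, j2 in neighbors(i, j))
                scores.append(s)
            best = int(np.argmax(scores))
            if best != y[i, j]:
                y[i, j] = best
                changed = True
        if not changed:                    # converged to a local optimum
            break
    return y
```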
2.3. HC-Search
The key elements of HC-Search [3] include the search space over complete outputs, S_o; the search strategy, A; the heuristic function H : X × Y → ℝ that guides the search towards high-quality outputs; and the cost function C : X × Y → ℝ that scores the candidate outputs generated by the search procedure. A high-level overview of the HC-Search framework is shown in Figure 2. Below we explain all these elements and then describe how to learn the heuristic and cost functions.
Search Space. Every state in S_o consists of an input-output pair, (x, y), representing the possibility of predicting y as the output for input image x (see Figure 2). Such a search space is defined in terms of two functions: 1) Initial state function, I, such that I(x) returns an initial state for input x; and 2) Successor function, S, such that for any state (x, y), S((x, y)) returns a set of next states {(x, y_1), ..., (x, y_k)} that share the same input x.

The specific search space that we investigate leverages the IID classifier. Our I(x) corresponds to the predictions made by a logistic regression classifier. S generates a set of next states by computing a set of image patches where the classifier has low confidence and generating one successor for each patch with the corresponding y value flipped. We use the conditional probability of the logistic regression IID classifier as the confidence measure. This search space is similar to the Flipbit space defined in [2].
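A Flipbit-style search space of this kind could be sketched as follows (our own illustration; `num_successors` and the NumPy layout are assumptions rather than details given in the paper):

```python
import numpy as np

def initial_state(probs):
    """I(x): threshold the IID logistic-regression probabilities."""
    return (probs >= 0.5).astype(int)

def successors(y, probs, num_successors=16):
    """S((x, y)): flip the labels of the patches the IID classifier is least sure about.

    Confidence is measured by |P(tubule) - 0.5|; low values mean low confidence.
    """
    confidence = np.abs(probs.ravel() - 0.5)
    flip_candidates = np.argsort(confidence)[:num_successors]
    next_states = []
    for idx in flip_candidates:
        y_next = y.copy()
        y_next.flat[idx] = 1 - y_next.flat[idx]    # flip one patch label
        next_states.append(y_next)
    return next_states
```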
The effectiveness of HC-Search depends critically on the quality of the search space being used. The quality of a search space can be understood in terms of the expected number of search steps needed to uncover the target output y*. For most search procedures, the time required to find y* will grow as the depth of the target in the search space increases. Thus, one way to quantify the expected amount of search, independently of the specific search strategy, is by considering the expected depth of target outputs y*. In particular, for a given input-output pair (x, y*), the target depth d is defined as the minimum depth at which we can find a state corresponding to the target output y*. By this definition, the expected target depth of our search space is equal to the expected number of errors in the output corresponding to the initial state.
Search Strategy. The role of the search procedure is to uncover high-quality outputs, guided by the heuristic function H. Prior work [2, 3] has shown that greedy search works quite well when used with an effective search space. We investigate HC-Search with greedy search. Given an input x, greedy search traverses a path of length τ through the search space, selecting as the next state the best successor of the current state according to the heuristic. Specifically, if s_i is the state at search step i, greedy search selects s_{i+1} = arg min_{s ∈ S(s_i)} H(s), where s_0 = I(x).
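The greedy procedure is short enough to sketch directly (a hypothetical Python rendering that reuses the `initial_state` and `successors` helpers from the previous sketch; `heuristic` stands in for the learned H and scores a labeling directly):

```python
def greedy_search(probs, heuristic, tau):
    """Traverse a path of length tau, always moving to the successor with lowest H."""
    state = initial_state(probs)
    visited = [state]
    for _ in range(tau):
        candidates = successors(state, probs)
        if not candidates:
            break
        state = min(candidates, key=heuristic)   # s_{i+1} = argmin_{s in S(s_i)} H(s)
        visited.append(state)
    return visited                               # all states seen along the trajectory
```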
Making Predictions. Given an input image x, and a prediction time bound τ, HC-Search traverses the search space starting at I(x), using the search procedure A, guided by the heuristic function H, until the time bound is exceeded. It then scores each visited state s according to C(s) and returns the ŷ of the lowest-cost state as the predicted output.

Let y_H denote the best output that HC-Search could possibly return when using H, and let ŷ denote the output that it actually returns. Also, let Y_H(x) be the set of candidate outputs generated using heuristic H for a given input x. Then, we define

\[
y_H = \operatorname*{arg\,min}_{y \in Y_H(x)} L(x, y, y^*),
\qquad
\hat{y} = \operatorname*{arg\,min}_{y \in Y_H(x)} C(x, y).
\tag{4}
\]
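Putting the pieces together, prediction could then look like the following sketch (again our own hedged illustration; `cost` stands in for the learned C and, for simplicity, scores a labeling directly):

```python
def hc_search_predict(probs, heuristic, cost, tau=100):
    """Run H-guided greedy search for tau steps, then let C pick among visited outputs."""
    candidates = greedy_search(probs, heuristic, tau)   # Y_H(x), uncovered by H
    return min(candidates, key=cost)                    # y_hat = argmin_{y in Y_H(x)} C(x, y)
```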

Figure 2: A high-level overview of HC-Search. Given input x and a search space, S_o, we first instantiate a search space over complete outputs. Each search node in this space consists of an input-output pair (i.e., input image and basal tubule detection). Next, we run a search procedure A guided by the heuristic function H for a time bound τ (number of search steps). The highlighted nodes correspond to the search trajectory traversed by the search procedure, in this case greedy search. We return the least-cost output ŷ (basal tubule detection) that is uncovered during the search as the prediction for input x.
Heuristic and Cost Function Learning. The error of HC-Search, ε_HC, for a given H and C can be decomposed into two parts: 1) Generation error, ε_H, due to H not generating high-quality outputs; and 2) Selection error, ε_{C|H}, the additional error (conditional on H) due to C not selecting the best loss output generated by H. Guided by the error decomposition in (5), the learning approach optimizes the overall error, ε_HC, in a greedy stage-wise manner by first training H to minimize ε_H, and then training C to minimize ε_{C|H} conditioned on H.

\[
\epsilon_{HC} = \underbrace{L(x, y_H, y^*)}_{\epsilon_H}
+ \underbrace{L(x, \hat{y}, y^*) - L(x, y_H, y^*)}_{\epsilon_{C\mid H}}.
\tag{5}
\]
H is trained by imitating the search decisions made by the true loss function (available only for training data). We run the search procedure A for a time bound of τ for input x using a heuristic equal to the true loss function, i.e., H(x, y) = L(x, y, y*), and record a set of ranking constraints that are sufficient to reproduce the search behavior. For greedy search, at every search step i, we include one ranking constraint for every node (x, y) ∈ C_i \ (x, y_best), such that H(x, y_best) < H(x, y), where (x, y_best) is the best node in the candidate set C_i (ties are broken by a random tie breaker). The aggregate set of ranking examples is given to a rank learner (e.g., SVM-Rank) to learn H.
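This constraint-collection step could be sketched as follows (our own illustration reusing the `initial_state` and `successors` helpers from above; representing each ranking constraint as a (better, worse) pair of labelings is an assumption about the data format, not a detail from the paper):

```python
import random

def collect_ranking_constraints(probs, y_true, loss, tau):
    """Run loss-guided greedy search and record (better, worse) state pairs for H.

    Each pair says the best node of a candidate set should receive a lower
    heuristic value than every other node in that set.
    """
    state = initial_state(probs)
    constraints = []
    for _ in range(tau):
        candidates = successors(state, probs)
        if not candidates:
            break
        losses = [loss(y, y_true) for y in candidates]
        best_loss = min(losses)
        best_idx = random.choice([i for i, l in enumerate(losses) if l == best_loss])
        best = candidates[best_idx]
        for i, y in enumerate(candidates):
            if i != best_idx:
                constraints.append((best, y))    # require H(best) < H(y)
        state = best
    return constraints
```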
C is trained to score the outputs Y_H(x) generated by H according to their true losses. Specifically, this training is formulated as a bi-partite ranking problem to rank all the best loss outputs Y_best higher than all the non-best loss outputs Y_H(x) \ Y_best.
Advantages of HC-Search relative to other structured prediction approaches, including CRFs, are as follows. First, it scales gracefully with the complexity of the dependency structure of features. In particular, we are free to increase the complexity of H and C (e.g., by including higher-order features) without considering its impact on the inference complexity. [2, 3] show that the use of higher-order features results in significant improvements. Second, the terms of the error decomposition in (5) can be easily measured for a learned (H, C) pair, which allows for an assessment of which function is more responsible for the overall error. Third, HC-Search makes minimal assumptions about the loss function, requiring only that we have a “blackbox” evaluation of any candidate output. Theoretically, it can even work with non-decomposable loss functions, such as F1 loss.
3. Experiments and Results
We evaluate IID classifiers (Sec. 2.1), CRFs (Sec. 2.2), and HC-Search (Sec. 2.3) on a dataset of SEM images con-
taining nematocysts. The image dataset was prepared by an
expert biologist. Fresh specimens of cnidarian tissue were:
(a) Exposed to 1M sodium citrate for 10 minutes; (b) Rinsed
in water; (c) Preserved in 70% ethanol; (d) Dehydrated in
a graded series; (e) Sputter-coated with gold palladium in a
Cressington sputter coater; and, finally, (f) Imaged using a
FEI NOVA nanoSEM microscope. The dataset consists of
130 images, each with resolution of 1024×864 pixels. The
images often show multiple instances of nematocysts within
cluttered background, as illustrated in Figures 1, 4, 5. The
dataset is very challenging. First, the background clutter
consists of mucus and debris. These appear quite similar to
the target basal tubules. Mucus and debris often latch onto
parts of nematocysts, which may partially occlude the basal
tubules or create foreground-background confusion even to
the human eye. Parts of nematocysts may also be physi-
cally missing, or may simply be out of the field of view.
SEM images suffer from low contrast. The ground truth
for each image is manually annotated by dividing the image
into a regular grid of 32x32 pixel patches, and labeling each
patch as belonging to the basal tubule of a nematocyst or
background.
Evaluation Setup and Metrics. We use 80 images
for training, 20 for hold-out validation, and 30 for testing.
Given a test image, our structured prediction assigns one
of the classes to each image patch on a regular grid. Per-
formance is evaluated by precision, recall, and F1 measure,
where true positives are patches that fall on the ground truth
basal tubules. For HC-Search, we evaluate our sensitivity to
the time bound (τ ), the number of greedy search steps that
are allowed before making the final prediction.
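One way to make this protocol concrete is the sketch below (a hedged example; pooling true and false positives over all patches of all test images is our reading of the setup, not a detail stated in the paper):

```python
import numpy as np

def evaluate_detections(pred_grids, true_grids):
    """Pooled patch-level precision, recall, and F1 over a set of test images."""
    tp = fp = fn = 0
    for y_pred, y_true in zip(pred_grids, true_grids):
        y_pred, y_true = np.asarray(y_pred, bool), np.asarray(y_true, bool)
        tp += np.sum(y_pred & y_true)        # predicted tubule patches on the ground truth
        fp += np.sum(y_pred & ~y_true)
        fn += np.sum(~y_pred & y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```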
Methods. An image is divided into a regular grid of
patches. Each patch is described by a 128-dimensional
SIFT descriptor. Assigning labels to the patches is per-
formed using the following methods. IID Classifier applies
either SVM or Logistic Regression independently on each
image patch. Pairwise CRF is the standard CRF that mod-
els the image using the unary and pairwise potentials of the
image patches. Pyramid CRF augments the pairwise po-
tential with hierarchical relationships between (larger) par-
ent patches and their (embedded smaller) children patches.
The notations w/ ICM, w/ LBP, and w/ GraphCuts indi-
cate that inference of CRF is conducted using ICM, LBP,
or Graph-Cuts algorithms, respectively. HC-Search uses
the following variants: No Global, Max Global, and Sum
Global, which differ in the feature representation for the
heuristic and cost functions. No Global uses only the
unary and pairwise features of image patches, given by (1). Max Global additionally uses a higher-order feature
describing the largest connected component of positive de-
tections. Sum Global additionally uses a higher-order term
describing all connected components of positive detections.
The higher-order feature is defined as the standard Bag-of-
Words (BoW) of 300 codewords, found by K-means over
SIFTs of all image patches from the entire dataset.
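The BoW construction could be sketched as follows (our own example using scikit-learn's KMeans; the paper does not specify its implementation):

```python
import numpy as np
from sklearn.cluster import KMeans

def build_codebook(all_sift_descriptors, num_codewords=300, seed=0):
    """Cluster SIFT descriptors of all patches in the dataset into a codebook."""
    kmeans = KMeans(n_clusters=num_codewords, random_state=seed, n_init=10)
    kmeans.fit(all_sift_descriptors)          # (N, 128) array of descriptors
    return kmeans

def bow_feature(kmeans, component_descriptors):
    """Normalized 300-bin histogram of codeword assignments for one connected component."""
    words = kmeans.predict(component_descriptors)
    hist = np.bincount(words, minlength=kmeans.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)
```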
Table 1 presents the detection results of IID Classifiers, CRF, and HC-Search. The results of Logistic Regression are reported for the detection threshold set at the maximum F1 score. The HC-Search results are obtained for time bound τ = 100 (greedy search steps). Table 1 shows that HC-Search outperforms the two types of IID Classifiers, improving upon the initial prediction of logistic regression. Also, HC-Search yields higher recall and F1 than all variants of CRFs. Interestingly, the CRFs with ICM inference gave better recall and F1 than the CRFs with LBP and Graph-Cuts inference. From Table 1, the inclusion of standard higher-order features (BoW) in HC-Search does not lead to significant performance improvements. This contrasts with common reports in the literature and requires further investigation.
We also test sensitivity to (i) Image patch size, (ii)
Choice of the descriptor used for patches, and (iii) Train-
ing time bound τ for HC-Search.
First, for patch sizes of 16x16 and 64x64 pixels, and ap-
propriately adjusted ground truth, all the approaches under-
perform relative to the results presented in Table 1. For all
the approaches, for 16x16 pixels, F1 decreases by 8%–11%,
and, for 64x64 pixels, F1 decreases by 8%–9%. Thus, our
default patch size of 32x32 pixels empirically works best.
Second, when replacing SIFTs with 496-dimensional HOG descriptors, the F1 of all the approaches decreases by 2%–4%.

(a) IID Classifier Results
                        Precision   Recall    F1
SVM                       .675       .147     .241
Logistic Regression       .605       .129     .213

(b) CRF Results
                        Precision   Recall    F1
Pairwise w/ ICM           .432       .360     .393
Pairwise w/ LBP           .545       .091     .156
Pairwise w/ GraphCuts     .537       .070     .124
Pyramid w/ ICM            .565       .258     .354
Pyramid w/ LBP            .500       .013     .025
Pyramid w/ GraphCuts      .732       .013     .026

(c) HC-Search Results
                        Precision   Recall    F1
No Global                 .472       .545     .506
Max Global                .445       .508     .475
Sum Global                .457       .533     .492

Table 1: Performance on the nematocyst images.
Finally, Figure 3 shows the plots of precision, recall, and F1 of HC-Search No Global for increasing time bounds τ. The plots show four types of curves: LL-Search, HL-Search, LC-Search, and HC-Search. LL-Search uses the loss function as both the heuristic and the cost function, and thus serves as an upper bound on the performance of the selected search architecture. HL-Search uses the learned heuristic function, and the loss function as the cost function, and thus serves to illustrate how well the learned heuristic performs in terms of the quality of generated outputs. LC-Search uses the loss function as an oracle heuristic, and learns a cost function to score the outputs generated by the oracle heuristic. From Figure 3, for HC-Search, we see that as τ increases, precision drops, but recall and F1 improve up to a certain point before decreasing. This is understandable, because as τ increases, the generation error (ε_H) will monotonically decrease, since strictly more outputs will be encountered. Simultaneously, the difficulty of cost function learning can increase as τ grows, since it must learn to distinguish among a larger set of candidate outputs. In addition, we can see that the LC-Search curve is very close to the LL-Search curve, while the HL-Search curve is far below the LL-Search curve. This suggests that the overall error of HC-Search, ε_HC, is dominated by the heuristic error ε_H. A better heuristic is thus likely to lead to better performance overall.
We also report the error decomposition results of HC-
Search in Table 2. Recall that from Equation 5, we can

References

Robust higher order potentials for enforcing label consistency.
Efficiently selecting regions for scene understanding.
Dictionary-free categorization of very similar objects via stacked evidence trees.
Morphology, distribution, and evolution of apical structure of nematocysts in Hexacorallia.