
Degraf-Flow: Extending Degraf Features for Accurate and Efficient Sparse-To-Dense Optical Flow Estimation

TL;DR: Evaluation on established real-world benchmark datasets shows test performance in an autonomous vehicle setting where DeGraF-Flow gives promising results in terms of accuracy with competitive computational efficiency among non-GPU based methods, including a marked increase in speed over the conceptually similar EpicFlow approach.
Abstract: Modern optical flow methods make use of salient scene feature points detected and matched within the scene as a basis for sparse-to-dense optical flow estimation. Current feature detectors, however, either give sparse, non-uniform point clouds (resulting in flow inaccuracies) or lack the efficiency for frame-rate real-time applications. In this work we use the novel Dense Gradient Based Features (DeGraF) as the input to a sparse-to-dense optical flow scheme. This consists of three stages: 1) efficient detection of uniformly distributed Dense Gradient Based Features (DeGraF) [1]; 2) feature tracking via robust local optical flow [2]; and 3) edge-preserving flow interpolation [3] to recover overall dense optical flow. The tunable density and uniformity of DeGraF features yield superior dense optical flow estimation compared to other popular feature detectors within this three-stage pipeline. Furthermore, the comparable speed of feature detection also lends itself well to the aim of real-time optical flow recovery. Evaluation on established real-world benchmark datasets shows test performance in an autonomous vehicle setting where DeGraF-Flow shows promising results in terms of accuracy with competitive computational efficiency among non-GPU based methods, including a marked increase in speed over the conceptually similar EpicFlow approach [3].

Summary (2 min read)

1. INTRODUCTION

  • Optical flow estimation is the recovery of the motion fields between temporally adjacent images within a sequence.
  • Such matching of image points can accurately recover long-range motions; however, because the matches are sparse (covering only a fraction of the image) they can lead to a loss of accuracy at motion boundaries [3] in the refined dense optical flow.
  • This shows improved results compared against previous interpolation methods [19] and is currently used as a popular post-processing step in contemporary state-of-the-art dense optical flow estimation methods [20, 21, 22].
  • By contrast, recent work on the use of Dense Gradient Based Features [1] addresses these issues.
  • This approach provides spatially tunable feature density and uniformity in addition to sub-pixel accuracy.

2. APPROACH

  • Dense optical flow is recovered from two temporally adjacent images (Figure 2A) using a three-step process: point detection on the first image is carried out by calculating an even grid of DeGraF points [1], shown in Figure 2B.
  • A sliding window is passed over the image with step size of δ.
  • For an image region I of dimensions w × h containing grayscale pixels, two centroids, C_pos and C_neg, are defined, giving a gradient vector from C_pos to C_neg for the region.
  • The key-point in each region is taken as the location of the most stable centroid, i.e. the negative centroid if S_neg > S_pos, and vice versa.
  • The sparse optical flow vectors recovered are shown in Figure 2C.

3. EVALUATION

  • Statistical accuracy is measured using the established End-Point Error (EPE) metric [10, 13].
  • The authors' method (DeGraF / DeGraF-Flow) was implemented in C++ and all experiments were run on a Core i7 using four CPU cores.
  • All timings reported are for the run-time of the algorithm excluding image input/output and display.
  • For RLOF, the global motion and illumination models are used as per [2] with the adaptive cross based support region described in [29].

3.1. Comparison of Feature Detectors

  • To justify the use of DeGraF points, the authors compare with established feature detectors for dense flow computation on KITTI 2012 [10].
  • As is shown in [26], when using RLOF and interpolation to recover dense optical flow, a uniform grid of points is a superior input compared to other feature point detectors.
  • Here the authors repeat the experiment of [26] but with the addition of DeGraF (Table 1).
  • To allow meaningful comparison, each detector is tuned to ensure a comparable number of points are detected.
  • DeGraF gives the lowest EPE, with a detection time equal to that of FAST.

3.2. Benchmark Comparison - KITTI 2012

  • Semi dense (50%) ground truth, calculated using a LiDAR, is provided.
  • The result of standalone Pyramidal Lucas-Kanade (PLK) [24] is shown as a baseline reference.
  • The first two columns give the percentage of estimated flow vectors that have an EPE of more than 3px.
  • DeGraF-Flow shows promising results in terms of balancing computational efficiency (run-time, Table 2) and accuracy.
  • In Table 2 it ranks thirteenth in accuracy (Out-Noc) and is the fourth fastest CPU method with a non-occluded outlier percentage below 10%.

3.3. Benchmark Comparison - KITTI 2015

  • The KITTI 2015 scenes exhibit far larger pixel displacements in some areas, resulting in lower algorithm performance on KITTI 2015 (Table 3) than on KITTI 2012 (Table 2).
  • Table 3 shows the results of DeGraF-Flow on the test set compared to DeepFlow [18] and EpicFlow [3] (which both use DeepMatch as the sparse matching technique).
  • The percentage of flow vectors with an EPE greater than 3px are shown, with fg and bg referring to foreground objects and the background scene respectively.
  • As with the KITTI 2012 benchmark results their approach shows comparable accuracy with significantly reduced runtime over EpicFlow and DeepFlow.
  • In terms of accuracy it places 13th (out of 22) among CPU methods that take less than 10 seconds to process an image pair.

4. CONCLUSION

  • The authors present DeGraF-Flow, a novel optical flow estimation method.
  • DeGraF-Flow uses a rapidly computed grid of Dense Gradient based Features and then combines an existing state-of-the-art sparse point tracker (RLOF [2]) and interpolator (EPIC [3]) to recover dense flow.
  • With only minimal impact on accuracy (within 2% of EPICFlow [3] and DeepFlow [18] across all metrics) their approach offers significant gains in computational performance for dense optic flow estimation.
  • On the KITTI 2012 and 2015 benchmarks [10, 13] their method shows competitive run-time and comparable accuracy with other CPU methods.


Durham Research Online (deposited 05 June 2019; accepted, peer-reviewed version)
Citation for published item: Stephenson, F., Breckon, T.P. and Katramados, I. (2019) 'DeGraF-Flow: extending DeGraF features for accurate and efficient sparse-to-dense optical flow estimation', in 2019 IEEE International Conference on Image Processing (ICIP): Proceedings. Piscataway, NJ: IEEE, pp. 1277-1281.
Publisher's website: https://doi.org/10.1109/ICIP.2019.8803739

DEGRAF-FLOW: EXTENDING DEGRAF FEATURES FOR ACCURATE AND EFFICIENT SPARSE-TO-DENSE OPTICAL FLOW ESTIMATION

Felix Stephenson¹, Toby P. Breckon¹, Ioannis Katramados²
¹ Durham University, UK | ² NHL Stenden University of Applied Sciences, Netherlands
ABSTRACT

Modern optical flow methods make use of salient scene feature points detected and matched within the scene as a basis for sparse-to-dense optical flow estimation. Current feature detectors, however, either give sparse, non-uniform point clouds (resulting in flow inaccuracies) or lack the efficiency for frame-rate real-time applications. In this work we use the novel Dense Gradient Based Features (DeGraF) as the input to a sparse-to-dense optical flow scheme. This consists of three stages: 1) efficient detection of uniformly distributed Dense Gradient Based Features (DeGraF) [1]; 2) feature tracking via robust local optical flow [2]; and 3) edge-preserving flow interpolation [3] to recover overall dense optical flow. The tunable density and uniformity of DeGraF features yield superior dense optical flow estimation compared to other popular feature detectors within this three-stage pipeline. Furthermore, the comparable speed of feature detection also lends itself well to the aim of real-time optical flow recovery. Evaluation on established real-world benchmark datasets shows test performance in an autonomous vehicle setting where DeGraF-Flow shows promising results in terms of accuracy with competitive computational efficiency among non-GPU based methods, including a marked increase in speed over the conceptually similar EpicFlow approach [3].

Index Terms— optical flow, Dense Gradient Based Features, DeGraF, automotive vision, feature points
1. INTRODUCTION

Optical flow estimation is the recovery of the motion fields between temporally adjacent images within a sequence. Since its conception over 35 years ago [4, 5] it remains an area of intense interest in computer vision in terms of accurate and efficient computation for numerous applications [6].

Dense optical flow estimation aims to accurately recover per-pixel motion vectors from every pixel in a video frame to the corresponding locations in the subsequent (or previous) image frame in the sequence. These vector fields form the basis for applications such as scene segmentation [7], object detection and tracking [8], structure from motion and visual odometry [9]. The development of autonomous vehicles has revealed the necessity for real-time scene understanding [10]. This has led to increased pressure to improve both the quality and computational efficiency of dense optical flow estimation.

Fig. 1. Dense optical flow results and error maps from KITTI 2012 [10] (left) and KITTI 2015 [13] (right).

To date, recent progress in this area has been driven by the introduction of increasingly challenging benchmark datasets [11, 12, 10, 13] providing accurate ground truth optical flow for comparison. Approaches which are robust to scene discontinuities (occlusions, motion boundaries) and appearance changes (illumination, chromaticity) have shown strong results under static scene conditions [11]. However, the recent KITTI optical flow benchmarks [10, 13], which comprise video sequences of dynamic real-world urban driving scenarios, present additional challenges. In particular, significant motion vectors between subsequent video frames, due to vehicle velocity through the scene, present a key problem in accurate optical flow estimation [14] for such large displacement vectors.
To cope with such challenges, contemporary optical flow methods use a sparse-to-dense estimation scheme, whereby a sparse set of points on a video frame are matched to points in the subsequent frame. The sparse set of optical flow vectors recovered from this matching are then used as input to a refinement scheme to recover dense optical flow [14, 15, 16, 3, 17, 18]. Such matching of image points can accurately recover long-range motions; however, as the matches are sparse (covering only a fraction of the image) they can lead to a loss of accuracy at motion boundaries [3] in the refined dense optical flow. To address this issue, the recent work of [3] (EpicFlow) incorporates a novel state-of-the-art interpolator, Edge Preserving Interpolation of Correspondences (EPIC), which recovers dense flow from sparse matches using edge detection to preserve accuracy at motion boundaries. This shows improved results compared with previous interpolation methods [19] and is currently used as a popular post-processing step in contemporary state-of-the-art dense optical flow estimation methods [20, 21, 22].

Many of these sparse matching techniques are highly accurate but are notably incapable of the real-time performance required for applications such as vehicle autonomy [3, 17, 18, 23]. By contrast, computational efficiency is more readily achievable via sparse point tracking, whereby flow estimation only takes place on a fraction of the image. Interestingly, the seminal Lucas-Kanade [5] sparse point tracker proposed over 30 years ago still forms the basis for many contemporary state-of-the-art sparse flow techniques [24, 25, 2]. Robust Local Optical Flow (RLOF) [2] is one such derivative that shows state-of-the-art accuracy on the KITTI benchmark [10, 13]. The most recent work on RLOF [26] demonstrates that combining sparse flow fields with interpolators can achieve both efficient and accurate dense optical flow.
All methods that employ sparse feature tracking require a well-defined feature set upon which matching can take place. Furthermore, sparse flow vectors with uniform spatial coverage are ideal for accurate dense optical flow recovery [26], making uniform feature distribution across the scene a key conduit to success. Common feature choices are Harris [27] or FAST key-points [28] due to their relative speed. However, these detectors do not guarantee uniform spatial feature distribution as they locate points only on highly textured image regions (e.g. corners and edges). To address this issue, a current state-of-the-art sparse flow method, denoted Fast Semi Dense Epipolar Flow (FSDEF) [25], forces uniformity of FAST key-point features by use of a block-wise selection; this, however, adversely affects saliency, causing increased erroneous matches. Further to this, FAST points do not provide sub-pixel precision, so further refinement of the point matches is required to recover accurate optical flow. By contrast, recent work on the use of Dense Gradient Based Features (DeGraF) [1] addresses these issues. This approach provides spatially tunable feature density and uniformity in addition to sub-pixel accuracy. DeGraF has comparable speed to FAST and has been shown to be superior in terms of noise and illumination invariance [1].

Motivated by these desirable attributes, our proposed method, DeGraF-Flow, takes a spatially uniform grid of DeGraF feature points as an input to a sparse-to-dense optical flow estimation scheme. Given two temporally adjacent images in a sequence, DeGraF points are detected in the first image and then efficiently tracked to the subsequent image using RLOF [2]. Finally, dense optical flow is recovered using the established EPIC interpolation approach [3].
2. APPROACH

Dense optical flow is recovered from two temporally adjacent images (Figure 2A) using a three-step process.

Point detection on the first image is carried out by calculation of an even grid of DeGraF points [1], shown in Figure 2B. A sliding window is passed over the image with step size δ. A key-point is detected within the window at each step as follows.
Fig. 2. An overview of the DeGraF-Flow pipeline.

For an image region I of dimensions w × h containing grayscale pixels, two centroids, C_pos and C_neg, are defined, which define a gradient vector from C_pos to C_neg for the region. C_pos is computed as the spatially weighted average pixel value:

$$ C_{pos} = (x_{pos},\, y_{pos}) = \left( \frac{\sum_{i=0}^{h-1}\sum_{j=0}^{w-1} i\, I(i,j)}{S_{pos}},\ \frac{\sum_{i=0}^{h-1}\sum_{j=0}^{w-1} j\, I(i,j)}{S_{pos}} \right), \quad (1) $$

where $S_{pos} = \sum_{i=0}^{h-1}\sum_{j=0}^{w-1} I(i,j)$. The negative centroid C_neg is similarly defined as the weighted average of inverted pixel values:

$$ C_{neg} = (x_{neg},\, y_{neg}) = \left( \frac{\sum_{i=0}^{h-1}\sum_{j=0}^{w-1} i\,(1 + m - I(i,j))}{S_{neg}},\ \frac{\sum_{i=0}^{h-1}\sum_{j=0}^{w-1} j\,(1 + m - I(i,j))}{S_{neg}} \right), \quad (2) $$

where $S_{neg} = \sum_{i=0}^{h-1}\sum_{j=0}^{w-1} (1 + m - I(i,j))$ and $m = \max_{(i,j)} I(i,j)$. Inverted pixel values lie in the range 1–256, avoiding division by zero.
The key-point in each region is taken as the location of the most stable centroid, i.e. if S_neg > S_pos then the key-point is at (x_neg, y_neg), and vice versa. This choice is made because the larger of S_neg and S_pos is less sensitive to noise, and so the corresponding centroid is more robust.
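For concreteness, the point-detection stage above can be sketched in a few lines of NumPy. This is an illustrative re-implementation rather than the authors' C++ code: the function names (degraf_keypoint, degraf_grid) are our own, row index i is assumed to map to y and column index j to x, and a small epsilon is added to S_pos to guard the all-zero-window case, which the paper does not discuss.

```python
import numpy as np

def degraf_keypoint(I):
    """DeGraF key-point for one h x w grayscale window I (float array), per Eqs. (1)-(2)."""
    h, w = I.shape
    ii, jj = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    S_pos = I.sum() + 1e-9                    # epsilon guards an all-zero window (our addition)
    c_pos = ((ii * I).sum() / S_pos, (jj * I).sum() / S_pos)

    m = I.max()
    inv = 1.0 + m - I                         # inverted pixels are >= 1, so S_neg is never zero
    S_neg = inv.sum()
    c_neg = ((ii * inv).sum() / S_neg, (jj * inv).sum() / S_neg)

    # keep the most stable centroid: the one with the larger normalising sum
    return c_neg if S_neg > S_pos else c_pos

def degraf_grid(gray, win=3, step=9):
    """Slide a win x win window with stride `step` (paper: w = h = 3, delta = 9)
    and return sub-pixel key-points as (x, y) image coordinates."""
    pts = []
    H, W = gray.shape
    g = gray.astype(np.float64)
    for y0 in range(0, H - win + 1, step):
        for x0 in range(0, W - win + 1, step):
            cy, cx = degraf_keypoint(g[y0:y0 + win, x0:x0 + win])
            pts.append((x0 + cx, y0 + cy))
    return np.array(pts, dtype=np.float32)
```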
Sparse point tracking of each DeGraF point to the subsequent image is carried out using the Robust Local Optical Flow approach of [2]. The sparse optical flow vectors recovered are shown in Figure 2C.
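A tracking stage equivalent in spirit can be expressed with OpenCV. The paper uses RLOF(IM-GM) [2]; because an RLOF implementation ships only in the opencv-contrib optflow module, the sketch below substitutes the standard pyramidal Lucas-Kanade tracker so it runs on a stock OpenCV build. The window size and pyramid depth are illustrative values, not the paper's settings.

```python
import cv2
import numpy as np

def track_points(gray0, gray1, pts):
    """Track points from frame 0 to frame 1 and return only the matched pairs.

    Stand-in for the paper's RLOF tracker: pyramidal Lucas-Kanade with a
    validity mask built from the returned status flags.
    """
    p0 = pts.reshape(-1, 1, 2).astype(np.float32)
    p1, status, _err = cv2.calcOpticalFlowPyrLK(
        gray0, gray1, p0, None, winSize=(21, 21), maxLevel=3)
    ok = status.ravel() == 1
    return p0[ok].reshape(-1, 2), p1[ok].reshape(-1, 2)
```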

Interpolation of sparse vectors to recover the dense flow field in Figure 2D is achieved using EPIC (Edge Preserving Interpolation of Correspondences) [3]. An affine transformation is fitted to the k nearest support flow vectors, estimated using a geodesic distance which penalises crossing of image edges. This work uses image gradients instead of structured edge maps for defining image edges. The result is a dense optical flow field estimation as illustrated in Figure 2D.
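The interpolation stage also has a close counterpart in opencv-contrib: the ximgproc EdgeAwareInterpolator implements an EPIC-style sparse-to-dense scheme. The sketch below assumes an opencv-contrib build and sets k = 128 as in the paper; the exact interpolate() argument layout should be checked against the installed version, and this is offered as a sketch rather than the authors' pipeline.

```python
import cv2
import numpy as np

def sparse_to_dense(img0, img1, p0, p1, k=128):
    """EPIC-style interpolation of sparse matches (p0 -> p1) to a dense flow field.

    An affine model is fitted to the k nearest support vectors under an
    edge-aware distance, following [3]; k = 128 matches the paper's setting.
    """
    interp = cv2.ximgproc.createEdgeAwareInterpolator()
    interp.setK(k)
    flow = interp.interpolate(img0, p0.astype(np.float32),
                              img1, p1.astype(np.float32), None)
    return flow  # H x W x 2 array of per-pixel (dx, dy) vectors
```

Chaining the three sketches gives a minimal DeGraF-Flow-like pipeline: detect with degraf_grid, track with track_points, then call sparse_to_dense on the surviving matches.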
3. EVALUATION

Evaluation is carried out on the KITTI optic flow estimation benchmark data sets (denoted as KITTI 2012 [10] and KITTI 2015 [13]).

Statistical accuracy is measured using the established End-Point Error (EPE) metric [10, 13]. For a predicted optical flow vector u^p at every pixel with corresponding ground truth flow vector u^gt, the EPE is defined as the average difference between the predicted and ground truth vectors over the image:

$$ EPE = \frac{1}{N} \sum_{i} \left\| u^{p}_{i} - u^{gt}_{i} \right\|_{2}, \quad (3) $$

where N is the number of pixels; EPE is hence measured in pixels.
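As a concrete reading of Eq. (3), a minimal NumPy implementation of the metric is given below. The optional validity mask reflects the fact that the KITTI ground truth is semi-dense; the mask handling is our addition rather than part of the equation.

```python
import numpy as np

def end_point_error(flow_pred, flow_gt, valid=None):
    """Average End-Point Error (Eq. 3): mean Euclidean distance between
    predicted and ground-truth flow vectors, in pixels."""
    err = np.linalg.norm(flow_pred - flow_gt, axis=-1)  # per-pixel ||u_p - u_gt||_2
    if valid is not None:
        err = err[valid]                                # evaluate only where GT exists
    return float(err.mean())
```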
Our method (DeGraF / DeGraF-Flow) was implemented in C++ and all experiments were run on a Core i7 using four CPU cores. All timings reported are for the run-time of the algorithm, excluding image input/output and display.

Parameters: DeGraF window size w = h = 3 and step size δ = 9. For RLOF, the global motion and illumination models are used as per [2] with the adaptive cross-based support region described in [29]. This algorithm is termed RLOF(IM-GM) in the results reported in Table 2. For EPIC we use k = 128 [3].
3.1. Comparison of Feature Detectors

To justify the use of DeGraF points, we compare with established feature detectors for dense flow computation on KITTI 2012 [10].

As is shown in [26], when using RLOF and interpolation to recover dense optical flow, a uniform grid of points is a superior input compared to other feature point detectors. Here we repeat the experiment of [26] but with the addition of DeGraF (Table 1).

Table 1 shows the EPE on a KITTI image pair when using popular feature detectors for the first stage of optical flow estimation. To allow meaningful comparison, each detector is tuned to ensure a comparable number of points are detected. DeGraF gives the lowest EPE, with a detection time equal to that of FAST.
3.2. Benchmark Comparison - KITTI 2012

The KITTI 2012 benchmark [10] comprises 194 training and 194 test image pairs (1240 × 376 pixels) depicting static road scenes. Semi-dense (50%) ground truth, calculated using a LiDAR, is provided.

Table 1. Comparison of differing point detectors for dense optical flow computation on KITTI 2012 example #164, with each tuned to detect approximately 5400 points per image.

Point Detector | # Points | EPE  | Detection Time (s)
DeGraF [1]     | 5400     | 1.34 | 0.07
SURF [30]      | 5282     | 1.43 | 0.90
SIFT [31]      | 5400     | 1.77 | 1.80
AGAST [32]     | 5624     | 1.92 | 0.11
FAST [28]      | 5562     | 2.88 | 0.07
ORB [33]       | 5400     | 5.60 | 0.28
Table 2 shows the results on the KITTI test set for CPU methods that can process an image pair in under 20 seconds. All such methods that perform better than DeGraF-Flow in terms of EPE are included in the table; not all less accurate methods are shown. Below the dividing line the best sparse flow methods are shown. Note how, although these methods have very low error, this error is only reported over a greatly reduced density of points. The result of standalone Pyramidal Lucas-Kanade (PLK) [24] is shown as a baseline reference. The first two columns give the percentage of estimated flow vectors that have an EPE of more than 3px. The next two columns give average EPE values over the test set. Noc denotes statistics on only the non-occluded pixels, where occluded pixels are those which appear in the first image but not in the second. Methods are ranked in order of increasing non-occluded outlier percentage (Out-Noc).

DeGraF-Flow shows promising results in terms of balancing computational efficiency (run-time, Table 2) and accuracy. In Table 2 it ranks thirteenth in accuracy (Out-Noc) and is shown to be the fourth fastest CPU method with a non-occluded outlier percentage of less than 10%. Other faster methods such as PCA-Flow and DIS-Fast show significantly higher EPE. PCA-Flow and PCA-Layers both employ an interpolation scheme for computing dense flow. The superior results of DeGraF-Flow show that the EPIC interpolator is the correct choice, which agrees with the findings in [26]. At the time of testing, DeGraF-Flow was the 18th fastest overall, including GPU methods, with an overall accuracy rank of 56 from a total of 95 submissions within KITTI 2012.

EpicFlow and RLOF(IM-GM) are highlighted in Table 2 as these are the two constituent components of our method. DeGraF-Flow is outperformed by EpicFlow but runs almost five times faster. RLOF(IM-GM) represents near state of the art in sparse optical flow, making it an excellent candidate for tracking DeGraF points. RLOF is second only to FSDEF [25], which is contemporary work to that presented here.
3.3. Benchmark Comparison - KITTI 2015

The KITTI 2015 benchmark [13] comprises 200 training and 200 test image pairs (1242 × 375 pixels) with the increased challenges of dynamic scene objects (vehicles).

Method Out-Noc Out-All Avg-Noc Avg-All Density Run-time Environment
SPS-Fl [23] 3.38 % 10.06 % 0.9 px 2.9 px 100.00 % 11 s 1 core @ 3.5 Ghz
SDF [34] 3.80 % 7.69 % 1.0 px 2.3 px 100.00 % TBA 1 core @ 2.5 Ghz
MotionSLIC [35] 3.91 % 10.56 % 0.9 px 2.7 px 100.00 % 11 s 1 core @ 3.0 Ghz
RicFlow [36] 4.96 % 13.04 % 1.3 px 3.2 px 100.00 % 5 s 1 core @ 3.5 Ghz
CPM2 [37] 5.60 % 13.52 % 1.3 px 3.3 px 100.00 % 4 s 1 core @ 2.5 Ghz
CPM-Flow [37] 5.79 % 13.70 % 1.3 px 3.2 px 100.00 % 4.2s 1 core @ 3.5 Ghz
MEC-Flow [38] 6.95 % 17.91 % 1.8 px 6.0 px 100.00 % 3 s 1 core @ 2.5 Ghz
DeepFlow [18] 7.22 % 17.79 % 1.5 px 5.8 px 100.00 % 17 s 1 core @ 3.6Ghz
RecSPy+ [39] 7.51 % 15.96 % 1.6 px 3.6 px 100.00 % 0.16 s 1 core @ 2.5 Ghz
RDENSE(anon) 7.72 % 14.02 % 1.9 px 4.6 px 100.00 % 0.5 s 4 cores @ 2.5 Ghz
EpicFlow [3] 7.88 % 17.08 % 1.5 px 3.8 px 100.00 % 15 s 1 core @ 3.6 Ghz
SparseFlow [17] 9.09 % 19.32 % 2.6 px 7.6 px 100.00 % 10 s 1 core @ 3.5 Ghz
DeGraF-Flow 9.41 % 16.93 % 2.5 px 8.4 px 100.00 % 3.2 s 4 cores @ 2.5 Ghz
PCA-Layers [19] 12.02 % 19.11 % 2.5 px 5.2 px 100.00 % 3.2 s 1 core @ 2.5 Ghz
PCA-Flow [19] 15.67 % 24.59 % 2.7 px 6.2 px 100.00 % 0.19 s 1 core @ 2.5 Ghz
DB-TV-L1 [40] 30.87 % 39.25 % 7.9 px 14.6 px 100.00 % 16 s 1 core @ 2.5 Ghz
DIS-FAST [41] 38.58 % 46.21 % 7.8 px 14.4 px 100.00 % 0.023s 1 core @ 4 Ghz
FSDEF [25] 1.07 % 1.17 % 0.7 px 0.7 px 41.81 % 0.26s 4 cores @ 3.5 Ghz
RLOF(IM-GM) [2] 2.48 % 2.64 % 0.8 px 1.0 px 11.84 % 3.7 s 4 core @ 3.4 Ghz
RLOF [42] 3.14 % 3.39 % 1.0 px 1.2 px 14.76 % 0.488 s GPU @ 700 Mhz
BERLOF [43] 3.31 % 3.60 % 1.0 px 1.2 px 15.26 % 0.231 s GPU @ 700 Mhz
PLK [24] 27.44 % 31.04 % 11.3 px 17.3 px 92.33 % 1.3 s 4 cores @ 3.5 Ghz
Table 2. KITTI 2012 Benchmark Results - comparison of the best CPU methods that can process an image pair in under 20
seconds. Below the line = sparse flow algorithms; Noc = pixels that are not occluded in the second image; First two columns =
percentage of flow vectors that have an EPE of greater than 3px; methods are ordered by Out-Noc. Full table of results can be
found on the KITTI benchmark website [44].
Rank Method Fl-bg Fl-fg Fl-all Run-time Environment
65 EpicFlow [3] 25.81 % 28.69 % 26.29 % 15 s 1 core @ 3.6 Ghz
67 DeepFlow [18] 27.96 % 31.06 % 28.48 % 17 s 1 core @ 3.5 Ghz
68 DeGraF-Flow 28.78 % 29.69 % 28.94 % 3.2 s 4 cores @ 2.5 Ghz
Table 3. KITTI 2015 Benchmark Results - the percentage of pixels with an EPE of more than 3 pixels is given; fg and bg refer to motion of foreground moving vehicles and the static background scene respectively.
These scenes exhibit far larger pixel displacements in some areas, resulting in lower algorithm performance on KITTI 2015 (Table 3) than on KITTI 2012 (Table 2).

Table 3 shows the results of DeGraF-Flow on the test set compared to DeepFlow [18] and EpicFlow [3] (which both use DeepMatch as the sparse matching technique). The percentage of flow vectors with an EPE greater than 3px is shown, with fg and bg referring to foreground objects and the background scene respectively.

As with the KITTI 2012 benchmark results, our approach shows comparable accuracy with significantly reduced run-time over EpicFlow and DeepFlow. Over all submissions to the KITTI 2015 benchmark, DeGraF-Flow is the 10th fastest CPU method. In terms of accuracy it places 13th (out of 22) among CPU methods that take less than 10 seconds to process an image pair. Over the total of 90 submissions DeGraF-Flow places 68th in terms of Fl-all and 20th in terms of run-time.
4. CONCLUSION

In this paper we present DeGraF-Flow, a novel optical flow estimation method. DeGraF-Flow uses a rapidly computed grid of Dense Gradient based Features (DeGraF) and then combines an existing state-of-the-art sparse point tracker (RLOF [2]) and interpolator (EPIC [3]) to recover dense flow. With only minimal impact on accuracy (within 2% of EpicFlow [3] and DeepFlow [18] across all metrics) our approach offers significant gains in computational performance for dense optic flow estimation.

We show that the invariability, density and uniformity of DeGraF points yield superior dense flow results compared to other popular point detectors. The rapid end-to-end optic flow estimation time is also conducive to real-time applications such as scene understanding for future autonomous vehicle applications.

On the KITTI 2012 and 2015 benchmarks [10, 13] our method shows competitive run-time and comparable accuracy with other CPU methods. Future work will exploit the tracking of DeGraF features for applications in an autonomous vehicle setting.


References

J.-Y. Bouguet (1999): technical description of the pyramidal implementation of the Lucas-Kanade feature tracker. Given two grayscale images I and J and a point u in the first image, the tracker finds the location v = u + d in the second image that minimises a residual over a 2D neighbourhood, the displacement d being the optical flow at u. Cited in this paper as the PLK baseline [24].

Journal article: integrates rich descriptors into the variational optical flow setting, recovering dense flow with near-variational accuracy while relaxing the requirement of dense temporal sampling (e.g. for fast-moving small body parts). Cited in this paper among the sparse-to-dense refinement approaches [14].

P. Weinzaepfel, Z. Harchaoui and C. Schmid (2013), "DeepFlow: Large displacement optical flow with deep matching": blends a descriptor matching algorithm built on a multi-stage architecture (interleaving convolutions and max-pooling) with a variational approach, efficiently handling the large displacements that occur in realistic video. Cited in this paper as [18].

Conference paper (2015), EpicFlow (Edge-Preserving Interpolation of Correspondences): dense matching by edge-aware interpolation from a sparse set of matches under a geodesic distance tailored to occlusions and motion boundaries, followed by variational energy minimisation. Cited in this paper as [3].

Book chapter (2010), AGAST: generalises the accelerated segment test underlying FAST by finding the optimal decision tree in an extended configuration space and combining specialised trees into an adaptive, generic corner test. Cited in this paper as [32].
