Journal ArticleDOI

Embedded tracking algorithm based on multi-feature crowd fusion and visual object compression

01 Jan 2017-Eurasip Journal on Embedded Systems (Springer International Publishing)-Vol. 2016, Iss: 1, pp 16
TL;DR: An embedded tracking algorithm based on multi-feature fusion and visual object compression is proposed to improve the positioning accuracy and the quality of tracking service and has high precision, low energy consumption, and low delay.


RESEARCH Open Access
Embedded tracking algorithm based on
multi-feature crowd fusion and visual
object compression
Zheng Wenyi^{1,2*} and Dong Decun^{1}
Abstract
The poor accuracy and poor real-time performance of moving-object tracking in dynamic, complex environments have become the bottleneck of target location and tracking. In order to improve the positioning accuracy and the quality of tracking service, we propose an embedded tracking algorithm based on multi-feature fusion and visual object compression. On the one hand, according to the features of the target, the optimal feature matching method is selected, and a multi-feature crowd fusion location model is proposed. On the other hand, to reduce the dimension of the multidimensional space composed of the moving object's visual frames and to compress the visual object, the embedded tracking algorithm is established. Experimental results show that the proposed tracking algorithm has high precision, low energy consumption, and low delay.
Keywords: Visual object compression, Embedded tracking, Crowd fusion, Multi-feature systems
1 Introduction
Moving target tracking is one of the most active areas in the development of science and technology, and target tracking algorithms have been widely valued worldwide [1]. With the continuous improvement and expansion of their performance, location and tracking algorithms have been successfully applied in industry, agriculture, health care, and the service industry [2], and have proven their worth in urban security, defense, and hazardous situations such as space exploration [3].
A low-complexity and high-accuracy algorithm was presented in [4] for reducing the computational load of the traditional data-fusion algorithm with heterogeneous observations for location tracking. Trogh et al. [5] presented a radio frequency-based location tracking system, which could improve performance by eliminating shadowing. In [6], Liang and Krause proposed a proof-of-concept system based on a sensor fusion approach, built with considerations for lower cost and higher mobility, deployability, and portability, by combining the drift velocities of anchor nodes. The scheme of [7] could estimate the drift velocity of the tracked node by using the spatial correlation of ocean currents. A distributed multi-human location algorithm was researched by Yang et al. [8] for a binary pyroelectric infrared sensor tracking system.
A model-based approach was presented in [9], which predicts the geometric structure of an object using its visual hull. A task-dependent codebook compression framework was proposed to learn a compression function and adapt the codebook compression [10]. Ji et al. [11] proposed a novel compact bag-of-patterns descriptor with an application to low-bit-rate mobile landmark search. In [12], blocks lying in object regions were flagged as compression blocks, and an object tree was added to each coding tree unit to describe the object's shape.
However, guaranteeing high-precision tracking of moving objects in dynamic, complex environments while keeping the complexity of the algorithm manageable remains one of the most difficult problems. Building on the above research, we propose an embedded tracking algorithm based on multi-feature crowd fusion and visual object compression for mobile object tracking.
The rest of the paper is organized as follows. Section 2
describes the location model based on multi-feature crowd fusion. In Section 3, we present the embedded tracking algorithm for visual object compression. We analyze and evaluate the proposed scheme in Section 4. Finally, we conclude the paper in Section 5.

* Correspondence: weyzheng@sina.com
1 The Key Laboratory of Road and Traffic Engineering, Ministry of Education, Tongji University, Shanghai 201804, China
2 School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, Jiangsu, China

DOI 10.1186/s13639-016-0053-7
2 Location model based on multi-feature crowd
fusion
Generally, target location is divided into two stages. In the first stage, the features of the moving target are extracted. Feature extraction is completed with the following steps:

(1) Capture the moving target image frame sequence.
(2) Extract the characteristics of the real-time target image frames.
(3) Among the image frames between the current frame and the still frame of the target, search for the frame whose extracted features are most similar to the target's motion characteristics.
The second stage involves matching the characteristics of the moving target: according to the characteristics of the target, different features are chosen and the best feature-matching scheme is selected. A rough sketch of this two-stage pipeline is given below.
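As an illustration of this two-stage pipeline, the following sketch captures a frame sequence, extracts a feature vector per frame, and searches for the frame most similar to the target. The histogram descriptor, the distance measure, and the synthetic frames are illustrative assumptions; the paper does not prescribe a concrete feature extractor here.

```python
import numpy as np

def extract_features(frame):
    # Placeholder descriptor: a normalized intensity histogram. The paper
    # does not fix a concrete feature extractor at this stage.
    hist, _ = np.histogram(frame, bins=32, range=(0, 256), density=True)
    return hist

def locate_target(frames, target_features):
    """Stage 1: extract features per frame; stage 2: match against the target."""
    best_idx, best_dist = -1, np.inf
    for i, frame in enumerate(frames):
        feats = extract_features(frame)                  # step (2): extract features
        dist = np.linalg.norm(feats - target_features)   # step (3): similarity search
        if dist < best_dist:
            best_idx, best_dist = i, dist
    return best_idx, best_dist

# Step (1): capture the moving-target image frame sequence (synthetic here).
frames = [np.random.randint(0, 256, (64, 64)) for _ in range(10)]
target = extract_features(frames[3])
idx, dist = locate_target(frames, target)
print(f"best-matching frame: {idx} (distance {dist:.4f})")
```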
The above localization scheme has the following defects:
(1) The extracted features are single. Such feature extraction makes it difficult to locate complex moving objects with multiple states.
(2) The features of the moving object change in complex scenes, for example C_def (deformation), C_lgf (light), C_siz (size), and C_col (color), making the single-feature matching success rate SR_FM very low, as shown in formula (1).
$$
\begin{cases}
f(\mathrm{IF}_i = \mathrm{if}) = \sum\limits_{i=1}^{N} G_i\, h\!\left(\mathrm{if},\, h_{i-1},\, \sum\limits_{j=1}^{i} h_j\right), & \text{if } C_{\mathrm{def}},\ C_{\mathrm{lgf}} \\
h\!\left(\mathrm{if},\, h_{i-1},\, \sum\limits_{j=1}^{i} h_j\right) = \mathrm{if}_i(C_{\mathrm{siz}},\, C_{\mathrm{col}})\, \alpha \left| h_{i-1} - \alpha,\ \sum\limits_{j=1}^{i} h_j \right| \\
\mathrm{SR}_{\mathrm{FM}} = \dfrac{\sum\limits_{i=1}^{M} f(\mathrm{if}_i, h_i)\left(1 - \sum\limits_{j=1}^{N} f(\mathrm{if}_j)\right) f(\mathrm{if}_M, h_M)}{N}
\end{cases} \tag{1}
$$
Here, if represents an image frame, IF the image frame sequence, and G the vector representation of the image frame matrix. h is the function used to evaluate image frame characteristics and frame similarity. N is the length of the captured moving-target image frame sequence, and M is the number of feature-matched image frames.
From formula (1), it is found that the upper bound of the matching success rate is f(if_M, h_M)/N, so the success rate is inversely proportional to the number of captured image frames. This shows that the number of captured image frames restricts single-feature matching. A toy numeric check follows.
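To see this restriction numerically, the toy loop below evaluates the reconstructed upper bound f(if_M, h_M)/N for growing sequence lengths; the value 0.9 for f(if_M, h_M) is an arbitrary placeholder.

```python
# Toy evaluation of the reconstructed bound SR_FM <= f(if_M, h_M) / N:
# as the captured sequence length N grows, the single-feature matching
# success rate bound shrinks in proportion.
f_match = 0.9  # placeholder value for f(if_M, h_M)
for N in (10, 100, 1000):
    print(f"N = {N:4d}  ->  SR_FM upper bound = {f_match / N:.4f}")
```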
(3) The accuracy and robustness of real-time target motion estimation are poor, as shown in formula (2). In order to improve the accuracy and robustness, a single feature set can track the feature series, but the complexity of the transition algorithm is too high, as shown in formula (3).
$$
\begin{cases}
A_{\mathrm{TR}} = \dfrac{1}{N} \sum\limits_{i=1}^{M} \sin(\beta\, \alpha_i) + \sum\limits_{i=1}^{M} h_i \sqrt{\alpha} \\
\mathrm{RUS}_{\mathrm{TR}} = \rho\, f(\mathrm{IF}_M)\, \mathrm{SR}_{\mathrm{FM}}
\end{cases} \tag{2}
$$
Here, A_TR indicates the positioning accuracy, RUS_TR indicates the location robustness, β is the included angle between adjacent image frames, and ρ denotes the error vector. A numeric sketch follows.
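A minimal numeric sketch of the reconstructed formula (2), with placeholder values for the frame similarities h_i, the angles, the error-vector magnitude ρ, and the success rate SR_FM; it mirrors only the algebra above, not the authors' implementation.

```python
import numpy as np

# Placeholder inputs mirroring the reconstructed formula (2).
M, N = 20, 100
alpha = 0.5                    # matching parameter (placeholder)
beta = 0.1                     # included angle between adjacent frames
h = np.random.rand(M)          # frame similarity values h_i
alpha_i = np.random.rand(M)    # per-frame angular terms

A_TR = np.sum(np.sin(beta * alpha_i)) / N + np.sum(h * np.sqrt(alpha))

rho, f_IF_M, SR_FM = 0.05, 0.9, 0.8   # placeholders: error vector, f(IF_M), success rate
RUS_TR = rho * f_IF_M * SR_FM
print(f"A_TR = {A_TR:.4f}, RUS_TR = {RUS_TR:.4f}")
```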
$$
\mathrm{CLE}_{\mathrm{TSA}} = \dfrac{\mathrm{IF}_i\, \rho\, \lvert \sin\beta \rvert\, M\, g(h_i, \alpha)}{g(h_i, \alpha)} \cdot \dfrac{\sum\limits_{i=1}^{M} h_i\, \mathrm{if}\!\left(C_{\mathrm{def}}, C_{\mathrm{lgf}}, C_{\mathrm{siz}}, C_{\mathrm{col}}\right)}{M\, \mathrm{if}_M\!\left(C_{\mathrm{def}}, C_{\mathrm{lgf}}, C_{\mathrm{siz}}, C_{\mathrm{col}}\right)} \tag{3}
$$
Here, CLE_TSA represents the complexity of the transition algorithm, and the function g(h_i, α) represents the transition algorithm itself. It can be found that the complexity is proportional to the number of image frames and the degree of feature matching. This shows that more space, time, and computation must be paid in order to match more features across the image frames.
In order to solve the above problems, we propose a multi-feature crowd fusion location model. The model analyzes the dynamic motion of the target, the moving track, and the structure parameters of the image frame. The state characteristics of different targets are captured, and a multi-feature vector is composed as in formula (4). This vector integrates the characteristics of motion state with deformation, light, size, and color, and can effectively improve the low matching success rate of single-feature extraction, as in formula (5).
$$
\begin{cases}
\mathrm{ML}_F = \begin{bmatrix}
v_{\mathrm{mot}}\, \mathrm{if}_{11} & \cdots & v_{\mathrm{mot}}\, \mathrm{if}_{1L} \\
\vdots & \ddots & \vdots \\
v_{\mathrm{mot}}^{K}\, \mathrm{if}_{K1} & \cdots & v_{\mathrm{mot}}^{K}\, \mathrm{if}_{KL}
\end{bmatrix} \\
v_{\mathrm{mot}} = \dfrac{M}{N} \sum\limits_{i=1}^{NM} f(\mathrm{if}_i, h_i)
\end{cases} \tag{4}
$$
Here, v_mot is the target motion trajectory fitting function, K is the number of time-series features, and L is the number of spatial-sequence features.

$$
\mathrm{SR}_{\mathrm{FM}} =
\begin{cases}
\alpha\, \mathrm{rank}\{\mathrm{ML}_F\} \tan\beta, & \beta < \alpha \ \text{or}\ \beta > \rho \\
1, & \alpha \le \beta \le \rho
\end{cases} \tag{5}
$$
Here, rank{ML_F} is the rank of the multi-feature matrix. From formula (5), we can see that a high matching success rate can be guaranteed as long as the multi-feature vectors are solved correctly. A sketch of formulas (4) and (5) is given below.
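The sketch below, under the reconstructed reading of formulas (4) and (5), builds a K x L multi-feature matrix whose rows are scaled by powers of the trajectory term v_mot and evaluates the piecewise success rate from the matrix rank; all feature values and thresholds are synthetic stand-ins.

```python
import numpy as np

K, L = 4, 6                        # time-series and spatial feature counts
v_mot = 0.7                        # placeholder trajectory-fitting value
features = np.random.rand(K, L)    # if_kl: deformation/light/size/color features
# Reconstructed formula (4): rows scaled by powers of v_mot.
ML_F = (v_mot ** np.arange(1, K + 1))[:, None] * features

alpha, rho, beta = 0.2, 1.5, 0.8   # placeholder thresholds and frame angle
r = np.linalg.matrix_rank(ML_F)
# Reconstructed formula (5): saturated branch vs. rank-driven branch.
SR_FM = 1.0 if alpha <= beta <= rho else alpha * r * np.tan(beta)
print(f"rank(ML_F) = {r}, SR_FM = {SR_FM:.3f}")
```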
In order to further reduce the complexity and improve the accuracy and reliability of multi-feature matching, the model incorporates a multi-feature fusion mechanism based on crowd feature analysis. The combination of the multi-feature vector and the target motion curve is shown in Fig. 1. In the crowd analysis, the curve and the arc are each relatively independent of the multiple features. Through crowd analysis, the multi-feature vectors are optimized so that the features in the vector are not mutually exclusive, which improves the performance of multi-feature fusion and reduces the amount of fusion operations, as shown in formula (6).
$$
\begin{cases}
\mathrm{fu}_{\mathrm{comp}} = \dfrac{1}{M^2} \sum\limits_{i=1}^{K} \sum\limits_{j=1}^{L} G(i,j)\, \mathrm{ML}_{F(i,j)} \\
G(i,j) = f(\mathrm{IF}_i)\, f(\mathrm{IF}_j)\, h\!\left(f,\, \mathrm{ML}_{F_i},\, \mathrm{ML}_{F_j}\right)
\end{cases} \tag{6}
$$
In summary, the multi-feature crowd fusion algorithm is shown in Fig. 2, and a minimal sketch of the fusion operation follows.
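A minimal sketch of the fusion operation in the reconstructed formula (6): a pairwise weight G(i, j) built from frame scores is applied to each entry of the multi-feature matrix and averaged over M squared. Folding the similarity term h(.) into the frame scores is a simplifying assumption.

```python
import numpy as np

def fu_comp(ML_F, f_i, f_j, M):
    """Reconstructed formula (6): fu_comp = (1/M^2) * sum_ij G(i,j) * ML_F(i,j).
    The weight G(i,j) = f(IF_i) * f(IF_j) * h(...); the similarity term h(...)
    is folded into the frame scores here for brevity (an assumption)."""
    K, L = ML_F.shape
    total = 0.0
    for i in range(K):
        for j in range(L):
            total += f_i[i] * f_j[j] * ML_F[i, j]
    return total / M ** 2

K, L, M = 4, 6, 20
ML_F = np.random.rand(K, L)
print(f"fu_comp = {fu_comp(ML_F, np.random.rand(K), np.random.rand(L), M):.5f}")
```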
3 Embedded tracking algorithm for visual object
compression
The visual frames of the moving object constitute a multidimensional space of Q dimensions. A visual frame in this space is denoted as x_Q.
Fig. 2 Multi-feature crowd fusion location algorithm and workflow
Fig. 1 Crowd fusion between multiple features

The visual frames of the internal elements are used to form a visual matrix VM. The elements of the visual matrix are subject to interference from the multidimensional space, and their information is easily distorted. In order to solve this
problem, the visual matrix VM can be compressed. The
visual frame of the L-dimensional space tracking system
is shown in Fig. 3. Here, the VM is selected as the object
of the center visual frame VF, perpendicular to the co-
ordinate axes of the L-dimensional space. The angles to the vertical lines are denoted as θ_1, ..., θ_Q. MO represents the moving target. During the object's motion, φ is the angle between the VF point and the motion direction. The VF point can collect visual targets to different degrees, and these targets are used to update the elements of the visual matrix VM. The fusion results of VF and VM are mapped into the MO plane. The compression matrix must satisfy the omni-directional low-dimensional characteristics, as shown in formula (7). After the matrix is compressed, the distribution characteristics of the visual frame must be satisfied, as shown in formula (8).
$$
\begin{cases}
F(\theta, \varphi) = \dfrac{\lVert \mathrm{VM} \rVert_H \sin\theta \cos\varphi}{2d} \\
D_{\mathrm{eff}} = \dfrac{d}{2\pi Q} \sum\limits_{i=1}^{Q} \sin\theta_i \cos\varphi(\theta_i) \;\rightarrow\; \min\limits_{e}
\end{cases} \tag{7}
$$
Here, F(θ, φ) represents the direction function of visual frames in the L-dimensional space, and d represents the distance between the VF point and the origin of the L-dimensional space. D_eff represents the effective dimension of the visual tracking space; its value is clearly less than L, which effectively reduces the dimension and improves the compression efficiency of the visual frame. A numeric sketch of this computation is given below.
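The following sketch evaluates the direction function and effective dimension of the reconstructed formula (7) for synthetic angles; the values of the angles θ_i and φ, the distance d, and the matrix norm are placeholders.

```python
import numpy as np

def direction_function(vm_norm, theta, phi, d):
    """Reconstructed formula (7): F(theta, phi) = ||VM||_H sin(theta)cos(phi) / (2d)."""
    return vm_norm * np.sin(theta) * np.cos(phi) / (2.0 * d)

def effective_dimension(thetas, phis, d):
    """Reconstructed formula (7): D_eff = d/(2*pi*Q) * sum_i sin(theta_i)cos(phi_i)."""
    Q = len(thetas)
    return d / (2.0 * np.pi * Q) * np.sum(np.sin(thetas) * np.cos(phis))

Q, d = 8, 2.0                            # placeholder dimension count and distance
thetas = np.linspace(0.1, np.pi / 2, Q)  # angles to the coordinate axes
phis = np.full(Q, 0.3)                   # angles to the motion direction
print(f"F = {direction_function(3.5, thetas[0], phis[0], d):.4f}")
print(f"D_eff = {effective_dimension(thetas, phis, d):.4f}")
```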
Fig. 4 Embedded tracking algorithm architecture with visual frame target compression
Fig. 3 L-dimensional space tracking system for visual frame

$$
f(\mathrm{VF}) =
\begin{cases}
\dfrac{\lVert \mathrm{VM} \rVert_H}{D_{\mathrm{eff}}}, & \theta_1 < \varphi < \theta_{D_{\mathrm{eff}}} \\
\dfrac{\lVert \mathrm{VM} \rVert_H}{Q}, & \varphi = \theta_1, \ldots, \theta_{D_{\mathrm{eff}}} \\
\{D_{\mathrm{eff}}\}_{i=1,\ldots,\lVert \mathrm{VM} \rVert_H^{\max}}, & \varphi > \theta_{D_{\mathrm{eff}}} \ \text{or}\ \varphi < \theta_1
\end{cases} \tag{8}
$$
Here, f(VF) is the visual frame distribution density function, and {D_eff}_{i=1,...,‖VM‖_H^max} represents the largest spatial dimension in the rank of the visual matrix VM. A direct transcription of this piecewise density is sketched below.
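A direct transcription of the reconstructed piecewise density in formula (8); the thresholds θ_1 and θ_{D_eff}, the matrix norm, and the out-of-band value are synthetic, and the middle branch is simplified to the two band boundaries.

```python
import numpy as np

def frame_density(vm_norm, phi, theta_1, theta_deff, D_eff, Q, dim_max):
    """Reconstructed formula (8): piecewise visual-frame distribution density."""
    if theta_1 < phi < theta_deff:
        return vm_norm / D_eff        # inside the effective angular band
    if np.isclose(phi, theta_1) or np.isclose(phi, theta_deff):
        return vm_norm / Q            # on a band boundary (simplification)
    return dim_max                    # outside: the largest spatial dimension

print(frame_density(vm_norm=3.5, phi=0.4, theta_1=0.2, theta_deff=0.9,
                    D_eff=4.2, Q=8, dim_max=6.0))
```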
After the visual matrix VM is compressed and recon-
structed by the visual frame, the tracking signal is shown
in formula (9).
$$
\mathrm{vf}_{D_{\mathrm{eff}}} = \mathrm{VM}\, \sin\theta \cos\varphi \sum\limits_{i=1}^{\lVert \mathrm{VM} \rVert_H} x_i \tag{9}
$$
The visual object compression method obtains the mapping relationship between the visual frame and the moving object from the L-dimensional or D_eff-dimensional space by choosing {D_eff}_{i=1,...,‖VM‖_H^min}, and realizes target motion prediction. The specific steps of the visual target compression algorithm are as follows (a combined sketch of steps (1) through (4) is given after the tracking matrix M_T below):
(1) Migration of the visual matrix oriented by the core visual frame: the state of the visual frame VF_C, captured from the current moving target, is {θ_C, φ_C, D_eff_C}. The visual frames propagate along the direction of F(θ_C, φ_C). The new state {θ_U, φ_U, D_eff_U} of the visual frame is obtained after spreading over the dimensional space D_eff_C.
(2) The moving object, the current state of the visual frame, and the diffusion of the visual frame form a compressed plane PCV, in which the compressed point set PT is formed. Any two points PT_i and PT_j in the plane form a visual line: the normal vector of PT_i is NV_i = sin θ_C cos φ_U · PT_j, and the normal vector of any point PT_j on the plane is NV_j = sin θ_C cos φ_U · PT_i.
(3) The included angle between the normal plane of the normal vector NV_i and that of the normal vector NV_j: the relation between the plane angle and the direction arc is sin γ = NV_i · NV_j / (‖PT‖ · arctan(|θ_C − θ_U| / |φ_C − φ_U|)). The vector mapping relation between the plane angle and the direction field of the moving object is shown in formula (10).
$$
\begin{cases}
M_\gamma = \begin{bmatrix} M_{PT_i} & M_{PT_j} \\ M_{VF_C} & M_{VF_U} \end{bmatrix} \\
M_{(i,j)} = F(\sin\theta_C,\, \cos\varphi_U)\, \lVert PT_i \rVert\, \lVert PT_j \rVert \sum\limits_{i=1}^{\lVert \mathrm{VM} \rVert_H} x_i
\end{cases} \tag{10}
$$
The mapping matrix M_γ is divided into four submatrices. Each submatrix M_(i,j) is obtained by solving the direction function, the compressed visual frame points, and the signal strength.
(4) The four submatrices of the moving object's multi-feature fusion mapping matrix are obtained by visual frame analysis, and the tracking matrix M_T is obtained through target compression.
$$
M_T = \begin{bmatrix}
\mathrm{ML}_{F(i,j)}\, \mathrm{fu}_{\mathrm{comp}} & \mathrm{SR}_{\mathrm{FM}}\, \mathrm{rank}\{\mathrm{ML}_F\} \tan\gamma \\
M\, \mathrm{if}_M\!\left(C_{\mathrm{def}}, C_{\mathrm{lgf}}, C_{\mathrm{siz}}, C_{\mathrm{col}}\right) & \dfrac{1}{N} \sum\limits_{i=1}^{M} \sin(\gamma\, \alpha_i)
\end{bmatrix}
$$
Fig. 5 Location error (%) versus the number of visual frame samples (200 to 800), comparing the single feature scheme with multiple features fusion
Table 1 Delay with 1000 visual frame samples

Normal plane radian   Location delay with single feature   Location delay with multiple features and fusion
30                    25.7 ms                              1.9 ms
50                    89.4 ms                              2.0 ms
110                   345.2 ms                             1.8 ms

Citations
Journal ArticleDOI
TL;DR: A multi-feature fusion VR panoramic image shadow elimination algorithm, which uses HSV color features and LBP / LSFP texture features to obtain shadow detection results and then obtains the final detection results by fusion, and the final saliency map is obtained through linear fusion.
Abstract: VR panoramic image is an image technology that covers a wide range of scenes. Its imaging range is much larger than that of traditional imaging systems, and it can fully reflect all the information of the imaging space. Although the multi-feature fusion method has been studied for a long time, the methods of multi-feature extraction, fusion and overall optimization have not been widely studied. In view of the shadow problem in VR panoramic images, this paper proposes a multi-feature fusion VR panoramic image shadow elimination algorithm, which uses HSV color features and LBP (Local Binary Pattern) / LSFP (Local Five Similarity Pattern) texture features to obtain shadow detection results and then obtains the final detection results by fusion. The experimental results prove that while ensuring a low missed detection rate, the false detection rate is greatly reduced. The comprehensive evaluation index Avg in this paper is improved by 3.4% compared with the shadow elimination algorithm based on a single feature. This paper proposes an image saliency detection algorithm and image detail enhancement algorithm based on multi-feature fusion. The final saliency map is obtained through linear fusion. Experiments prove that the image detail enhancement algorithm based on multi-feature fusion mentioned in this paper has achieved excellent results. In this paper, the performance of single feature fusion algorithm and multi-feature fusion algorithm are compared. The results show that the accuracy rate of multi-feature fusion algorithm based on HSV, LBP and LSFP is 93.39%, and the effect of multi-feature fusion is better than that of single feature.

8 citations


Cites background from "Embedded tracking algorithm based o..."

  • ...In [23]–[25], to improve positioning accuracy and tracking service quality, the author proposes an embedded tracking algorithm based onmulti-feature fusion and visual object compression....

    [...]

Journal ArticleDOI
TL;DR: The experimental results prove that the proposed target detection and tracking algorithm has good real-time and robustness, and improves the success rate of target detectionand tracking.
Abstract: At this stage of augmented reality, simple feature descriptions are mainly used in camera real-time motion tracking, but this is prone to the problem of unstable camera motion tracking. Aiming at the balance between real-time performance and stability, a new method model of real-time camera motion tracking based on mixed features was proposed. By comprehensively using feature points and feature lines as scene features, feature extraction, optimization, and fusion are used to construct hybrid features, and the hybrid features are unified for real-time camera parameter estimation. An image feature optimization method based on scene structure analysis is proposed to meet the computing constraints of mobile terminals. An iterative feature line-screening method is proposed to calculate a stable feature line set, and based on the scene feature composition and feature geometry, a hybrid feature is adaptively constructed to improve the tracking stability of the camera. Based on improved SIFT feature matching target detection and tracking algorithm, a hybrid feature point detection operator detection algorithm is used to achieve rapid feature point extraction, and the speed of descriptor generation is reduced by reducing the feature descriptor vector dimension. The experimental results prove that the proposed target detection and tracking algorithm has good real-time and robustness, and improves the success rate of target detection and tracking.

3 citations

References
Journal ArticleDOI
22 Dec 2015-Sensors
TL;DR: A model of real-time kinematic decimeter-level positioning with BeiDou Navigation Satellite System triple-frequency signals over medium distances with relatively high accuracy and high fixing rate is developed, displaying significant advantage comparing to traditional carrier-smoothed code differential positioning method.
Abstract: Many applications, such as marine navigation, land vehicles location, etc., require real time precise positioning under medium or long baseline conditions. In this contribution, we develop a model of real-time kinematic decimeter-level positioning with BeiDou Navigation Satellite System (BDS) triple-frequency signals over medium distances. The ambiguities of two extra-wide-lane (EWL) combinations are fixed first, and then a wide lane (WL) combination is reformed based on the two EWL combinations for positioning. Theoretical analysis and empirical analysis is given of the ambiguity fixing rate and the positioning accuracy of the presented method. The results indicate that the ambiguity fixing rate can be up to more than 98% when using BDS medium baseline observations, which is much higher than that of dual-frequency Hatch-Melbourne-Wubbena (HMW) method. As for positioning accuracy, decimeter level accuracy can be achieved with this method, which is comparable to that of carrier-smoothed code differential positioning method. Signal interruption simulation experiment indicates that the proposed method can realize fast high-precision positioning whereas the carrier-smoothed code differential positioning method needs several hundreds of seconds for obtaining high precision results. We can conclude that a relatively high accuracy and high fixing rate can be achieved for triple-frequency WL method with single-epoch observations, displaying significant advantage comparing to traditional carrier-smoothed code differential positioning method.

382 citations

Journal ArticleDOI
TL;DR: This paper proposes to learn a compression function to map an originally high-dimensional codebook into a compact codebook while maintaining its visual discriminability, and introduces a label constraint kernel (LCK) into the compression loss function that can model heterogeneous kinds of supervision.
Abstract: A visual codebook serves as a fundamental component in many state-of-the-art computer vision systems. Most existing codebooks are built based on quantizing local feature descriptors extracted from training images. Subsequently, each image is represented as a high-dimensional bag-of-words histogram. Such highly redundant image description lacks efficiency in both storage and retrieval, in which only a few bins are nonzero and distributed sparsely. Furthermore, most existing codebooks are built based solely on the visual statistics of local descriptors, without considering the supervise labels coming from the subsequent recognition or classification tasks. In this paper, we propose a task-dependent codebook compression framework to handle the above two problems. First, we propose to learn a compression function to map an originally high-dimensional codebook into a compact codebook while maintaining its visual discriminability. This is achieved by a codeword sparse coding scheme with Lasso regression, which minimizes the descriptor distortions of training images after codebook compression. Second, we propose to adapt our codebook compression to the subsequent recognition or classification tasks. This is achieved by introducing a label constraint kernel (LCK) into our compression loss function. In particular, our LCK can model heterogeneous kinds of supervision, i.e., (partial) category labels, correlative semantic annotations, and image query logs. We validated our codebook compression in three computer vision tasks: 1) object recognition in PASCAL Visual Object Class 07; 2) near-duplicate image retrieval in UKBench; and 3) web image search in a collection of 0.5 million Flickr photographs. Our compressed codebook has shown superior performances over several state-of-the-art supervised and unsupervised codebooks.

99 citations


"Embedded tracking algorithm based o..." refers methods in this paper

  • ...The task-dependent codebook compression framework was proposed to learn a compression function and adapt the codebook compression [10]....

    [...]

Journal ArticleDOI
TL;DR: The proposed distributed multi-human location algorithm has improved the locating accuracy of multiple human targets with low computational cost in infrared sensor tracking system.
Abstract: This paper presents a distributed multi-human location algorithm for a binary pyroelectric infrared sensor tracking system. The tracking space of our system is divided into many uniform static sub-regions. A two-level regional location, static partitioning and dynamic partitioning, is proposed. A Naive Bayes classifier is used to simplify the human location in a static sub-region, and we achieve the initial location of human by fusing all the internal measuring points of infrared sensors. Taking this initial location as the center, a new secondary dynamic sub-region is defined and all its internal measuring points of infrared sensors are fused again to get the ultimate human location. The simulation and experimental results demonstrate that the proposed method has improved the locating accuracy of multiple human targets with low computational cost in infrared sensor tracking system.

56 citations


"Embedded tracking algorithm based o..." refers background in this paper

  • ...[8] for a binary piezoelectric infrared sensor tracking system....

    [...]

Journal ArticleDOI
Rongrong Ji, Ling-Yu Duan, Jie Chen, Tiejun Huang, Wen Gao
TL;DR: This paper proposes a novel compact bag-of-patterns (CBoPs) descriptor with an application to low bit rate mobile landmark search, and comes up with compact image description by introducing a CBoPs descriptor.
Abstract: Visual patterns, i.e., high-order combinations of visual words, contributes to a discriminative abstraction of the high-dimensional bag-of-words image representation. However, the existing visual patterns are built upon the 2D photographic concurrences of visual words, which is ill-posed comparing with their real-world 3D concurrences, since the words from different objects or different depth might be incorrectly bound into an identical pattern. On the other hand, designing compact descriptors from the mined patterns is left open. To address both issues, in this paper, we propose a novel compact bag-of-patterns (CBoPs) descriptor with an application to low bit rate mobile landmark search. First, to overcome the ill-posed 2D photographic configuration, we build up a 3D point cloud from the reference images of each landmark, therefore more accurate pattern candidates can be extracted from the 3D concurrences of visual words. A novel gravity distance metric is then proposed to mine discriminative visual patterns. Second, we come up with compact image description by introducing a CBoPs descriptor. CBoP is figured out by sparse coding over the mined visual patterns, which maximally reconstructs the original bag-of-words histogram with a minimum coding length. We developed a low bit rate mobile landmark search prototype, in which CBoP descriptor is directly extracted and sent from the mobile end to reduce the query delivery latency. The CBoP performance is quantized in several large-scale benchmarks with comparisons to the state-of-the-art compact descriptors, topic features, and hashing descriptors. We have reported comparable accuracy to the million-scale bag-of-words histogram over the million scale visual words, with high descriptor compression rate (approximately 100-bits) than the state-of-the-art bag-of-words compression scheme.

52 citations


"Embedded tracking algorithm based o..." refers background in this paper

  • ...al [11] proposed a novel compact bag-of-patterns descriptor with an application to low bit rate mobile landmark search....

    [...]

Journal ArticleDOI
TL;DR: A blind online drift calibration framework based on subspace projection and sparse recovery for sensor networks in general-purpose monitoring and Temporal sparse Bayesian learning is used in the proposed method to estimate the sensor drift from under-sampled observations.
Abstract: The lifetime of wireless sensor networks (WSNs) has been significantly extended, while in long-term large-scale WSN applications, the increasing sensor drift has become a key problem affecting the reliability of sensory data. In this paper, we propose a blind online drift calibration framework based on subspace projection and sparse recovery for sensor networks in general-purpose monitoring. Temporal sparse Bayesian learning is used in the proposed method to estimate the sensor drift from under-sampled observations. The proposed method needs neither dense deployment nor the presence of a prior data model. Both simulated and real-world data set are used to evaluate the proposed method. Experimental results demonstrate that the proposed method can detect and recover the sensor drift when the number of drifted sensors are less than 20%, and when 40% sensors are drifted, the recovery rate is 80%.

43 citations