
6G-Enabled Short-Term Forecasting for Large-Scale Traffic Flow in Massive IoT Based on Time-Aware Locality-Sensitive Hashing


JOURNAL OF LATEX CLASS FILES
Fan Wang, Min Zhu, Maoli Wang*, Mohammad R. Khosravi, Qiang Ni, Shui Yu, and Lianyong Qi
Abstract—With the advent of the Internet of Things (IoT)
and the increasing popularity of the Intelligent Transportation
System, a large number of sensing devices are installed on the
road for monitoring traffic dynamics in real-time. These sensors
can collect streaming traffic data distributed across different
traffic sites, which constitute the main source of big traffic data.
Analyzing and mining such big traffic data in massive IoT can
help traffic administrations make scientific and reasonable
traffic scheduling decisions and thereby avoid future traffic
congestion. However, such traffic decision-making
often requires frequent and massive data transmissions
between distributed sensors and centralized cloud computing
centers, which calls for lightweight data integrations and accurate
data analyses based on large-scale traffic data. In view of this
challenge, a big data-driven and non-parametric model aided by
6G is proposed in this paper to extract similar traffic patterns
over time for accurate and efficient short-term traffic flow
prediction in massive IoT, which is mainly based on time-aware
LSH (Locality-Sensitive Hashing). We design a wide range of
experiments based on a real-world big traffic dataset to validate
the feasibility of our proposal. Experimental results demonstrate
that the prediction accuracy and efficiency of our proposal are
increased by 32.6% and 97.3%, respectively, compared with
two competitive approaches.
Index Terms—Short-term traffic forecasting, Intelligent Trans-
portation System, time-aware LSH, massive Internet of Things,
6G, large-scale traffic management.
I. INTRODUCTION
THE improvement of people's living standards has led to the expansion of data scale [1] and the growth of the number of vehicles. In response, the development of the Internet of Things (IoT) and mobile communication technologies renders real-time traffic management feasible [2] [3]. First, a large number of devices (e.g., sensors) are installed on the road to monitor traffic dynamics in real time. Afterwards, 6G technology enables frequent but stable traffic data transmission between these distributed sensors and the cloud platform. Finally, the large-scale traffic sensing data can be integrated to provide an effective reference for traffic management.

F. Wang is with School of Computer Science, Qufu Normal University, China. (email: fanwang1997@gmail.com)
M. Zhu is with Facility Horticulture Laboratory of Universities in Shandong, WeiFang University of Science and Technology, ShouGuang, China. (email: zhumin@wfust.edu.cn)
M. Wang is with School of Cyber Science and Engineering, Qufu Normal University, China. (email: wangml@qfnu.edu.cn) [corresponding author]
M. R. Khosravi is with Department of Computer Engineering, Persian Gulf University, Bushehr 7516913817, Iran, and Department of Electrical and Electronic Engineering, Shiraz University of Technology, Shiraz 71557-13876, Iran. (email: mohammadkhosravi@acm.org)
Q. Ni is with School of Computing and Communications, Lancaster University, UK. (email: q.ni@lancaster.ac.uk)
S. Yu is with Faculty of Engineering and Information Technology, University of Technology Sydney, Australia. (email: Shui.Yu@uts.edu.au)
L. Qi is with School of Computer Science and Engineering, Qufu Normal University, China. (email: lianyongqi@qfnu.edu.cn)
Copyright (c) 2020 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org.

Nevertheless, congestion and queues occur more and more frequently, which requires traffic managers to develop more effective traffic management strategies based on large-scale sensor data and to anticipate future flow breakdowns, especially during peak hours. A promising way is to forecast traffic conditions accurately and promptly from a short-term perspective, allowing traffic managers to understand potential traffic variations instantly. Therefore, as a decision support tool, a short-term traffic flow forecasting model for large-scale traffic data in massive IoT is expected to contribute substantially to active traffic management.
Given the significance of predicting potential traffic volume in advance, many researchers have devoted themselves to this topic in recent years [4]. Generally, a robust traffic forecasting algorithm requires a short response time and high accuracy. However, the explosive growth of data size makes it difficult to forecast the expected volume in time. Moreover, the prediction is generally based on small samples of the data, which decreases the prediction precision to some extent. As inherent weaknesses of data-driven traffic forecasting approaches, these problems have become a major obstacle to enhancing the effectiveness of large-scale traffic management.
In light of the issues above, we propose a 6G-enabled short-term traffic flow forecasting algorithm for large-scale traffic environments based on time-aware Locality-Sensitive Hashing (LSH), named $TracFore_{time-LSH}$. LSH is a fast nearest-neighbor search technique for massive high-dimensional data, which identifies whether data points are neighbors by mapping them into buckets. Traditional LSH is usually applied to privacy protection issues in service recommendation scenarios [5] [6] [7]. Furthermore, our $TracFore_{time-LSH}$ is a data-driven prediction approach implemented on real historical sensor data, where the traffic pattern of each sensor is aggregated in 15-min intervals and traffic data transmission between distributed sensors and the centralized cloud computing platform is guaranteed by 6G technology. In summary, we make the following contributions in this paper.
(1) We propose a novel short-term traffic flow forecasting model based on time-aware Locality-Sensitive Hashing to pursue more accurate real-time prediction in massive IoT. To the best of our knowledge, this is the first work that incorporates time-aware LSH into large-scale traffic forecasting.
(2) We conduct a wide range of experiments based on a large-scale real-world Intelligent Transportation Systems (ITS) dataset collected from the city of Nanjing, China, to validate the performance of our proposal. The experimental results show that our $TracFore_{time-LSH}$ outperforms two other approaches in terms of response time and forecast accuracy.
The rest of the paper is organized as follows. We first review related work and then present the motivation of our research. Next, we discuss in detail how our $TracFore_{time-LSH}$ works and report the corresponding experimental results. In the last section, we conclude the paper and indicate some potential directions for future work.
II. RELATED WORK
Nowadays, many researchers are devoting themselves to technologies in the context of the Internet of Things (IoT) [8] [9] [10]. Based on IoT, Intelligent Transportation Systems (ITS) have emerged as a novel paradigm to manage urban traffic and bring convenience to the lives of residents [11]. As a vital element of ITS, short-term traffic flow forecasting is a crucial topic that forecasts traffic patterns over horizons of a few seconds to a few hours. As classified in [12], traffic flow forecasting approaches can be divided into three categories: naive, parametric, and non-parametric methods. Considering the diversity of short-term traffic flow prediction conditions, we discuss these three categories in detail according to different traffic contexts.
Naive methods denote traffic forecasting models based on mathematical statistics, e.g., historical average and clustering approaches. Although naive methods are simple and efficient, they cannot reflect the uncertainty and nonlinearity of traffic dynamics.
Parametric methods utilize the overall distribution of data to estimate a set of parameter values and forecast future traffic patterns. Typical methods include ARIMA and its variant SARIMA, which are based on time series analysis [13], and the macroscopic traffic flow analysis model for better accuracy [14], to name just a few. Although methods of this kind achieve high prediction accuracy, they have a complicated parameter estimation process and have been shown to cope poorly with unstable traffic environments.
Most non-parametric methods are data-driven and free of restrictions regarding the data distribution, including neural networks, pattern recognition methods, and so on. In recent years, due to their adaptability and flexibility, neural networks have received extensive attention from scholars [15] [16]. Li et al. [17] utilize Bayesian networks to implement a multiple-measure chaotic time series prediction approach. Besides, the recurrent neural network (RNN) is well suited to processing the time series that correspond to traffic patterns, but it is prone to the vanishing gradient problem. The RNN variants long short-term memory (LSTM) and gated recurrent units (GRU) can alleviate this issue [18]. Dai et al. [19] develop a GRU model based on traffic information to predict short-term traffic flow. Ma et al. [20] propose an LSTM model for predicting travel time in urban areas. However, the above studies only take time series into account instead of more comprehensive contexts. To overcome this drawback, Zhang et al. [21] employ convolutional neural networks (CNN) to combine temporal and spatial information when analyzing traffic flow data. Nevertheless, all the above neural network methods share common shortcomings: they lack interpretability and depend heavily on data scale.
On the other hand, K-nearest neighbor (K-NN) methods conduct short-term traffic flow forecasting by extracting valuable characteristics from the dataset. Thus, the prediction results can be understood from the execution of K-NN. For instance, Lin et al. [22] combine K-NN with a local linear wavelet neural network to predict short-term traffic flow. Zhang et al. [23] propose an improved K-NN for short-term traffic flow prediction. However, data-driven K-NN consumes a lot of time and its precision is not high enough. Although researchers have made different enhancements on the basis of K-NN, the accuracy of the enhanced K-NN has not been greatly improved and the time consumption has continued to increase.
In general, since existing studies are often conducted in various contexts, it is difficult to single out one method as the best. However, a large number of researchers have concluded that non-parametric methods outperform parametric methods because of their powerful self-learning and adaptive capabilities. Thus, we also propose a data-driven non-parametric method, which can achieve more accurate results in a fairly short response time. Our experimental results demonstrate that our proposal can be easily incorporated into an online traffic control system and achieve better performance.
III. MOTIVATION
Fig. 1. Traffic dynamics: an example.
Fig. 2. Graphical representation of our $TracFore_{time-LSH}$.
Fig. 3. The technical architecture of our $TracFore_{time-LSH}$.
We use Fig. 1 to illustrate the motivation of this paper. As shown in Fig. 1, queues and increasingly frequent congestion require more extensive traffic monitoring. Therefore, the corresponding departments have installed a large number of devices on the traffic networks, such as the sensors in Fig. 1. Based on these sensors, all the real-time traffic data are provided to the cloud for processing, during which 6G technology guarantees the efficiency, stability, and integrity of the extensive distributed data transmission. Implementing 6G-enabled short-term traffic flow forecasting based on the integrated data collected from all sensors is therefore a promising way to provide traffic managers with strategies to anticipate future flow breakdowns. However, two issues arise in traditional short-term traffic flow forecasting methods: (1) The continuously operating sensors and the big traffic data they produce make an instant response to variations in traffic conditions infeasible. (2) Only a small portion of sampled data is used for traffic flow prediction, which makes the predictive result insufficiently accurate. Generally, a more effective road capacity management strategy derived from the forecasting algorithm requires a shorter response time and higher accuracy. In light of this situation, we propose an efficient and accurate algorithm named $TracFore_{time-LSH}$, which is introduced in the subsequent section.
IV. TRAFFIC FLOW FORECASTING BASED ON LSH: $TracFore_{time-LSH}$
In this paper, we perform 6G-enabled traffic flow forecasting based on similar flow rate sequences in the historical traffic patterns recorded by sensors, where the historical traffic patterns are the search space from which we obtain valuable information. Our algorithm is based on the hypothesis that if a previous profile is similar to the current profile, then the subsequent values of the previous profile are similar to the future values of the target profile. Hence, as shown graphically in Fig. 2, given an incomplete traffic sequence of the target day (the solid red line in Fig. 2), i.e., the subject profile to be forecast, our algorithm aims to recognize similar neighbors (the solid black and blue lines in Fig. 2) for it accurately and efficiently from a pool of archived datasets. Concretely, we use the sequences in a time window (denoted as the lag duration) to determine similar candidates. Then we aggregate the flow rates of the similar profiles in some form and derive the future traffic volume of the subject profile (the broken green line in Fig. 2).
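The hypothesis above can be illustrated with a minimal brute-force sketch. It is not the paper's method: it compares the lag-duration window against every archived day with cosine similarity instead of LSH, and it aggregates the neighbors with a plain mean, which is only one possible reading of "in some form". The function name `forecast_next_flow`, its parameters, and the assumption that the lag window covers the first slices of each archived day are all hypothetical.

```python
import numpy as np

def forecast_next_flow(history, observed, k=2):
    """Forecast the next time slice of the target day from similar archived days.

    history  : (days, slices) array of complete archived daily profiles
    observed : 1-D array, flows of the target day within the lag duration
               (assumed here to align with the first columns of `history`)
    k        : number of most similar archived days to aggregate
    """
    history = np.asarray(history, dtype=float)
    observed = np.asarray(observed, dtype=float)
    n = observed.size
    windows = history[:, :n]  # lag-duration part of each archived day
    # Cosine similarity between the target window and each archived window
    # (assumes no all-zero windows).
    sims = windows @ observed / (
        np.linalg.norm(windows, axis=1) * np.linalg.norm(observed)
    )
    nearest = np.argsort(sims)[::-1][:k]  # indices of the k most similar days
    # Assumed aggregation: plain mean of the neighbors' next-slice flows.
    return history[nearest, n].mean(), nearest


history = [
    [10, 20, 30, 40, 50],
    [50, 40, 30, 20, 10],
    [11, 21, 29, 41, 52],
]
forecast, neighbors = forecast_next_flow(history, [10, 20, 30, 40], k=2)
```

In this toy input the first and third archived days rise in the same way as the observed window, so they are the ones averaged.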
To facilitate the discussion of our proposed $TracFore_{time-LSH}$, we define several symbols as below:
(1) $S = \{s_1, \ldots, s_m\}$: the set of sensors that record the traffic dynamics.
(2) $D = \{d_1, \ldots, d_p\}$: the set of dates in the archived datasets on which sensors monitor traffic dynamics.
(3) $T = \{t_1, \ldots, t_n\}$: the set of time slices in the lag duration with a fixed time step, where the size of the set is determined by the number of time steps included in the lag duration, e.g., if the lag duration is 1 h and the time step is 15 minutes, then $n = 4$ (1 [h] * 60 [min/h] / 15 [min]).
(4) $f_{i,j,k}$: the traffic flow of sensor $s_i$ ($1 \le i \le m$) in the $j$-th time slice $t_j$ ($1 \le j \le n$) of the $k$-th day $d_k$ ($1 \le k \le p$).
Fig. 4. Traffic flow representation in three-dimensional space.
Next, we introduce our accurate traffic flow forecasting approach with quick response time based on time-aware LSH, named $TracFore_{time-LSH}$. Fig. 3 shows the technical framework of our method, which consists of 4 steps.
A. Step-1: Data formalization and preprocessing
1) Data formalization: As shown in Fig. 4, the archived traffic profile can be visualized as a three-dimensional space consisting of sensor ($i$), time slice ($j$), and date ($k$), where $f_{i,j,k}$ is a point representing the traffic flow at a specific point in space-time. We aggregate the traffic volume of sensor $s_i$ every 15 minutes (i.e., one time slice) and formalize it as the matrix in (1). In this matrix, each row represents the flow of the $n$ time slices observed by $s_i$ on a specific date, and each column represents the flow of a certain time slice observed by $s_i$ over $p$ days. It is worth noting that we only use the traffic flow of the time slices in the lag duration, at 15-min intervals, to construct the matrix in (1) and subsequently to generate the index table and determine similar dates. Hence, the number of columns in (1) is the number of time steps included in the lag duration, i.e., $n$.
$$
F(s_i) =
\begin{bmatrix}
f_{i,1,1} & \cdots & f_{i,n,1} \\
\vdots & \ddots & \vdots \\
f_{i,1,p} & \cdots & f_{i,n,p}
\end{bmatrix}
\tag{1}
$$
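As a rough illustration of how the matrix in (1) could be assembled for one sensor, the sketch below sums per-minute vehicle counts into 15-minute slices over the lag duration. The input layout (one row of 1440 per-minute counts per day), the function name `build_flow_matrix`, and its parameters are assumptions made for this example; the paper does not describe the raw data format.

```python
import numpy as np

def build_flow_matrix(counts_per_min, lag_start_min, lag_minutes=60, step_minutes=15):
    """Build F(s_i): one row per day, one column per time slice in the lag duration.

    counts_per_min : (p, 1440) array of per-minute counts for one sensor
                     over p days (hypothetical raw layout)
    lag_start_min  : minute of the day at which the lag duration starts
    """
    counts_per_min = np.asarray(counts_per_min)
    n = lag_minutes // step_minutes  # number of slices, e.g. 60 / 15 = 4
    window = counts_per_min[:, lag_start_min:lag_start_min + n * step_minutes]
    # Group each day's window into n slices and sum the counts inside each slice.
    return window.reshape(window.shape[0], n, step_minutes).sum(axis=2)


# Three days with one vehicle per minute, lag duration 08:00 to 09:00.
F = build_flow_matrix(np.ones((3, 1440)), lag_start_min=8 * 60)
```

With a 1 h lag duration and a 15-minute step, each row has $n = 4$ entries, matching the worked example in the symbol definitions.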
2) Data preprocessing: Inevitably, the dataset contains some noise that harms similar-profile recognition and thus degrades the prediction. To dampen the effect of noise, we first use boxplots to identify outliers and then apply winsorization to the abnormal data. A boxplot is a statistical chart based on distance measurement that shows the dispersion of a set of data. Both [24] and [25] proposed functional boxplot methodologies as an informative exploratory tool for outlier detection. Inspired by them, we also employ boxplots to provide a global analysis suitable for the whole data and to conduct outlier identification. Fig. 5 introduces the components of the classical boxplot.
Fig. 5. The composition of classical boxplots.
Specifically, we perform outlier processing on each row and column of the matrix in (1). For conciseness, we only describe the processing of one row. Let the $k$-th row of the matrix be $\vec{F}(s_i)_k = (f_{i,1,k}, \ldots, f_{i,n,k})$. We first arrange the values of $\vec{F}(s_i)_k$ in descending order and denote the result as $\vec{c} = (c_{i,1,k}, \ldots, c_{i,n,k})$. As shown in Fig. 5, the upper border of the box (enveloped by a blue line) indicates the upper quartile value $U$, which is the value at the 25% position of the vector $\vec{c}$, i.e., $U = c_{i,\frac{n}{4},k}$. Likewise, the lower border of the box indicates the lower quartile value $L$, which is the value at the 75% position of the vector $\vec{c}$, i.e., $L = c_{i,\frac{3n}{4},k}$. The difference between the upper quartile $U$ and the lower quartile $L$ is defined as the inter-quartile range (IQR), which represents the 50% central region of the curves, i.e., $IQR = U - L = c_{i,\frac{n}{4},k} - c_{i,\frac{3n}{4},k}$. IQR is a robust description of the data because it covers the central 50% of the data and is therefore not affected by outliers. The whiskers of the boxplot are the black vertical lines extending from the edges of the box in Fig. 5; they indicate the maximum range of the data excluding outliers.
To detect outliers, we extend the box by 1.5 times the IQR on each side to obtain the upper and lower bounds of the data. Formally, we define the upper bound as $SUP = U + 1.5\,IQR$ and the lower bound as $INF = L - 1.5\,IQR$. We regard the points outside these two bounds as potential outliers. The coefficient 1.5 is suggested by [24] as well as [25] and can be justified using the standard normal distribution; the reader can refer to these two studies for a more detailed discussion. Afterwards, we add data points larger than the upper bound $SUP$ to the point set $SOt$, where $SOt = \{c_{i,j,k} \mid c_{i,j,k} > SUP\}$, and data points smaller than the lower bound $INF$ to the point set $IOt$, where $IOt = \{c_{i,j,k} \mid c_{i,j,k} < INF\}$. Thus, $SOt \cup IOt$ is the set of all outliers. Finally, we use Eq. (2) to perform winsorization on the identified outliers by replacing the abnormal data points with the closest values in the normal range.
$$
c_{i,j,k} =
\begin{cases}
c_{i,\frac{n}{4},k} + 1.5\,\bigl(c_{i,\frac{n}{4},k} - c_{i,\frac{3n}{4},k}\bigr), & c_{i,j,k} \in SOt \\
c_{i,\frac{3n}{4},k} - 1.5\,\bigl(c_{i,\frac{n}{4},k} - c_{i,\frac{3n}{4},k}\bigr), & c_{i,j,k} \in IOt
\end{cases}
\tag{2}
$$
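The boxplot-based winsorization in Eq. (2) can be sketched for a single row as follows. One deliberate difference is an assumption of this sketch: the quartiles come from `numpy.percentile`, which interpolates linearly between order statistics, whereas the paper reads $U$ and $L$ off fixed positions of the descending-sorted vector. The function name `winsorize_row` and the parameter `whisker` are illustrative.

```python
import numpy as np

def winsorize_row(row, whisker=1.5):
    """Clamp values outside [INF, SUP] to the nearest bound, as in Eq. (2)."""
    row = np.asarray(row, dtype=float)
    # Lower and upper quartiles (interpolated; the paper uses positional quartiles).
    lower_q, upper_q = np.percentile(row, [25, 75])
    iqr = upper_q - lower_q
    sup = upper_q + whisker * iqr  # upper bound SUP = U + 1.5 IQR
    inf = lower_q - whisker * iqr  # lower bound INF = L - 1.5 IQR
    return np.clip(row, inf, sup)


flows = [10, 12, 11, 13, 12, 11, 10, 100]  # 100 is an obvious outlier
cleaned = winsorize_row(flows)
```

To cover a whole matrix, the same function could be applied along rows and then columns, e.g. `np.apply_along_axis(winsorize_row, 1, F)` followed by `np.apply_along_axis(winsorize_row, 0, F)`; the order in which the paper processes rows and columns is not stated.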
B. Step-2: Building a sensor index table.
In this subsection, we focus on how to build a time-aware sensor index table based on LSH by using the matrix in (1). The cosine distance is a natural similarity measure in multi-dimensional spaces. Since the numbers of vehicles passing sensor $s_i$ in different time slices form a multi-dimensional vector, we use the time-aware LSH technique corresponding to the cosine distance to compute the similarity between the traffic profiles of different days. Concretely, for the $k$-th row $\vec{F}(s_i)_k = (f_{i,1,k}, \ldots, f_{i,n,k})$ of matrix $F(s_i)$, we first transform it into a hash value $h(\vec{F}(s_i)_k)$ using the LSH function in (3). Here, $\vec{v} = (v_1, \ldots, v_n)$ is an $n$-dimensional vector in which each $v_j$ is a random value in the range $[-1, 1]$, and the symbol $\cdot$ denotes the dot product of two vectors.
$$
h(\vec{F}(s_i)_k) =
\begin{cases}
1 & \text{if } \vec{F}(s_i)_k \cdot \vec{v} > 0 \\
0 & \text{if } \vec{F}(s_i)_k \cdot \vec{v} \le 0
\end{cases}
\tag{3}
$$
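A minimal sketch of the hash function in (3): draw a random vector with components in $[-1, 1]$ and keep only the sign of its dot product with a daily profile. The helper names `make_hash_vector` and `lsh_bit` are illustrative, and the uniform draw is an assumption (the text only says each $v_j$ is random in $[-1, 1]$).

```python
import numpy as np

def make_hash_vector(n, rng):
    """Draw one n-dimensional vector v with each component in [-1, 1]."""
    return rng.uniform(-1.0, 1.0, size=n)

def lsh_bit(profile, v):
    """Eq. (3): 1 if the dot product is positive, otherwise 0."""
    return 1 if float(np.dot(profile, v)) > 0 else 0


rng = np.random.default_rng(0)
v = make_hash_vector(4, rng)
day = np.array([120.0, 135.0, 150.0, 140.0])
bit = lsh_bit(day, v)
same_shape_bit = lsh_bit(3 * day, v)  # scaled profile, same direction
```

Because only the sign of the dot product is kept, a profile and any positive multiple of it always receive the same bit, which is why this family suits a cosine-based notion of similarity.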
After performing the hash mapping in (3) on the $k$-th row of matrix $F(s_i)$, the row vector representing the traffic flow on the $k$-th day is mapped to a Boolean value. Repeating this process for each row in (1) until all rows are mapped yields the $p$-dimensional Boolean vector $h(F(s_i))$ in (4).
$$
h(F(s_i)) = \bigl(h(\vec{F}(s_i)_1), \ldots, h(\vec{F}(s_i)_p)\bigr)^{T}
\tag{4}
$$
Through the above process, the traffic characteristics of each date in the matrix $F(s_i)$ are transformed into a single Boolean value. However, LSH is a probability-based technique for identifying similar candidates, and hash values produced by only one hash function in (3) cannot guarantee an accurate expression of the traffic characteristics. To address this issue, $r$ hash functions $h_1(\cdot), \ldots, h_r(\cdot)$ randomly generated by (3) are employed to achieve $r$ transformations from $F(s_i)$ in (1) to $h(F(s_i))$ in (4). We thus obtain the $p \times r$ Boolean matrix $H(F(s_i))$ in (5), i.e., the time-aware sensor index reflecting the traffic pattern of $s_i$.
$$
H(F(s_i)) =
\begin{bmatrix}
h_1(\vec{F}(s_i)_1) & \cdots & h_r(\vec{F}(s_i)_1) \\
\vdots & \ddots & \vdots \\
h_1(\vec{F}(s_i)_p) & \cdots & h_r(\vec{F}(s_i)_p)
\end{bmatrix}
\tag{5}
$$
Repeating the above process for each sensor in set $S$ builds its time-aware index matrix $H(F(s_i))$ in (5), and we finally obtain a sensor index table, denoted as $Table_{index}$, which contains the traffic characteristics of all sensors.
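The construction of the index matrix in (5) and of the sensor index table can be sketched in vectorized form. The dictionary layout of the table, the function names, and the choice to share one set of $r$ hash vectors across all sensors are assumptions of this example (the last one follows the loop order of Algorithm 1, where each $h_x$ is generated before iterating over sensors).

```python
import numpy as np

def build_index_matrix(F, V):
    """Eq. (5): p x r 0/1 matrix, entry (k, x) is h_x applied to day k.

    F : (p, n) flow matrix of one sensor, as in Eq. (1)
    V : (r, n) matrix whose rows are the random hash vectors
    """
    return (F @ V.T > 0).astype(np.uint8)

def build_index_table(flow_by_sensor, r, seed=0):
    """Map each sensor id to its index matrix; also return the hash vectors used."""
    rng = np.random.default_rng(seed)
    n = next(iter(flow_by_sensor.values())).shape[1]
    V = rng.uniform(-1.0, 1.0, size=(r, n))
    table = {sensor: build_index_matrix(F, V) for sensor, F in flow_by_sensor.items()}
    return table, V


flows = {
    "s1": np.array([[1.0, 2.0, 3.0, 4.0],
                    [2.0, 4.0, 6.0, 8.0],   # same shape as day 1, doubled
                    [4.0, 3.0, 2.0, 1.0]]),
}
table, V = build_index_table(flows, r=6)
H = table["s1"]
```

In this toy input the first two days are proportional, so their rows of `H` are identical regardless of the random draw.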
Algorithm 1: $TracFore_{time-LSH}$
Input:
    $s_{target}$: the target sensor
    $S = \{s_1, \ldots, s_m\}$: sensor set
    $D = \{d_1, \ldots, d_p\}$: date set
    $T = \{t_1, \ldots, t_n\}$: time slice set
    $f_{i,j,k}$: traffic flow of sensor $s_i$ in time slice $t_j$ of date $d_k$
Output:
    $f_{target,J,k_1}$: traffic flow of sensor $s_{target}$ in desired time slice $t_J$ of date $d_{k_1}$
1  for $x = 1$ to $r$ do
2      for $j = 1$ to $n$ do
3          $v_j$ = random $[-1, 1]$
4      $h_x(\cdot) = (v_1, \ldots, v_n)$
5      for each $s_i \in S$ do
6          generate time matrix $F(s_i)$ in Eq. (1)
7          preprocess data in $F(s_i)$
8          for $k = 1$ to $p$ do
9              $h(\vec{F}(s_i)_k) = \vec{F}(s_i)_k \cdot h_x(\cdot)$
10         $h(F(s_i)) = (h(\vec{F}(s_i)_1), \ldots, h(\vec{F}(s_i)_p))^{T}$
11 for each $s_i \in S$ do
12     generate $H(F(s_i))$ using Eq. (5)
13 generate sensor index table for all sensors
14 for each $s_i \in S$ do
15     for $k = 1$ to $p$ do
16         decimal conversion from $H(F(s_i))_k$ to $A_k(s_i)$
17     if $A_{k_1}(s_i) = A_{k_2}(s_i)$ then
18         $sim_{k_1,k_2} = 1$
19     else
20         $sim_{k_1,k_2} = 0$
21 Generate a hash table $H_{table}$ based on $s_i \rightarrow sim\_matrix(s_i)$ mappings
22 Repeat the above process to generate $L$ hash tables $H_{table_1}, \ldots, H_{table_L}$
23 Set a date $d_{k_1}$ and a time slice $t_J$ in which the traffic flow needs to be predicted
24 Calculate $SIM_M(s_{target})$ using Eq. (8)
25 Select top $K$ similar dates into $List(d_{k_1})$
26 Set a similarity threshold
27 for $k_2 = 1$ to $p$ do
28     if $k_2 \in List(d_{k_1})$, $k_2 \neq k_1$ and $sim_{k_1,k_2}(s_{target}) \ge threshold$ then
29         $f_{target,J,k_1}$ is predicted by Eq. (10)
30 return $f_{target,J,k_1}$
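Lines 14 to 20 of Algorithm 1 (decimal conversion of each hashed row and the 0/1 date similarity) can be sketched as below for a single hash table. How the $L$ tables are combined (Eq. (8)) and how the forecast is aggregated (Eq. (10)) are not covered by the text shown here, so they are deliberately left out. The function names are illustrative, and the integer encoding assumes fewer than 63 hash functions per table.

```python
import numpy as np

def row_signatures(H):
    """Read each r-bit row of H as a binary number to obtain A_k (line 16)."""
    r = H.shape[1]
    weights = 1 << np.arange(r - 1, -1, -1)  # most significant bit first
    return H.astype(np.int64) @ weights

def similarity_matrix(H):
    """sim[k1, k2] = 1 if days k1 and k2 share a signature, else 0 (lines 17-20)."""
    A = row_signatures(H)
    return (A[:, None] == A[None, :]).astype(np.uint8)


H = np.array([[1, 0, 1],
              [1, 0, 1],
              [0, 1, 1]])
A = row_signatures(H)
sim = similarity_matrix(H)
```

Two days are treated as similar within one table only when all $r$ bits agree, i.e., when they fall into the same bucket.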
C. Step-3: Determination of similar dates
In this subsection, we define the similarity between different dates of a sensor based on the sensor index table $Table_{index}$. Although $Table_{index}$ contains the index matrices of all sensors, we only consider the date similarity calculation of sensor $s_i$ as an example. Since each row of matrix $H(F(s_i))$ in

References

Long short-term memory neural network for traffic speed prediction using remote microwave sensor data
TL;DR: A comparison with different topologies of dynamic neural networks as well as other prevailing parametric and nonparametric algorithms suggests that LSTM NN can achieve the best prediction performance in terms of both accuracy and stability.

Modeling and Forecasting Vehicular Traffic Flow as a Seasonal ARIMA Process: Theoretical Basis and Empirical Results
TL;DR: The theoretical basis for modeling univariate traffic condition data streams as seasonal autoregressive integrated moving average processes as well as empirical results using actual intelligent transportation system data are presented and found to be consistent with the theoretical hypothesis.

Adaptive Kalman filter approach for stochastic short-term traffic flow rate prediction and uncertainty quantification
TL;DR: Empirical comparisons using real world traffic flow data aggregated at 15-min interval showed that the adaptive Kalman filter approach can generate workable level forecasts and prediction intervals and demonstrates improved adaptability when traffic is highly volatile.

Short-term traffic flow rate forecasting based on identifying similar traffic patterns
TL;DR: This research provides strong evidence suggesting that the proposed non-parametric and data-driven approach for short-term traffic forecasting provides promising results and can be easily incorporated with real-time traffic control for proactive freeway traffic management.

CNN-RNN Based Intelligent Recommendation for Online Medical Pre-Diagnosis Support
TL;DR: A so-called DP-CRNN algorithm is developed with a newly designed neural network structure, to extract and highlight the combination of semantic and sequential features in terms of patient's inquiries in order to deal with the situation that patients' online inquiries are usually not very long.