
Showing papers on "Network traffic simulation" published in 2019


Proceedings ArticleDOI
10 Jun 2019
TL;DR: GeoSparkSim is a scalable traffic simulator which extends Apache Spark to generate large-scale road network traffic datasets with microscopic traffic simulation and seamlessly integrates with a Spark-based spatial data management system, GeoSpark, to deliver a holistic approach that allows data scientists to simulate, analyze and visualize large-scale urban traffic data.
Abstract: Road network traffic data has been widely studied by researchers and practitioners in different areas such as urban planning, traffic prediction, and spatial-temporal databases. For instance, researchers use such data to evaluate the impact of road network changes. Unfortunately, collecting large-scale high-quality urban traffic data requires tremendous effort because participating vehicles must install GPS receivers and administrators must continuously monitor these devices. A number of urban traffic simulators try to generate such data with different features. However, they suffer from two critical issues: (1) scalability: most of them only offer a single-machine solution, which is not adequate to produce large-scale data. Some simulators can generate traffic in parallel but do not balance the load well among machines in a cluster. (2) granularity: many simulators do not consider microscopic traffic situations including traffic lights, lane changing, and car following. In this paper, we propose GeoSparkSim, a scalable traffic simulator which extends Apache Spark to generate large-scale road network traffic datasets with microscopic traffic simulation. The proposed system seamlessly integrates with a Spark-based spatial data management system, GeoSpark, to deliver a holistic approach that allows data scientists to simulate, analyze and visualize large-scale urban traffic data. To implement microscopic traffic models, GeoSparkSim employs a simulation-aware vehicle partitioning method to partition vehicles among different machines such that each machine has a balanced workload. The experimental analysis shows that GeoSparkSim can simulate the movements of 200 thousand vehicles over a very large road network (250 thousand road junctions and 300 thousand road segments).
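
The abstract does not describe how the simulation-aware vehicle partitioning works. As a rough illustration of the general idea of balancing vehicles across machines, the sketch below uses a generic median split over vehicle coordinates (a k-d style partition). This is an assumed stand-in technique, not GeoSparkSim's method, and the name `balanced_partitions` is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) position of one vehicle


def balanced_partitions(vehicles: List[Point], depth: int, axis: int = 0) -> List[List[Point]]:
    """Split vehicles into up to 2**depth groups of near-equal size.

    Hypothetical helper: at each level the vehicles are sorted along one
    axis and cut at the median, so dense areas get smaller regions and
    every group holds roughly the same number of vehicles.
    """
    if depth == 0 or len(vehicles) <= 1:
        return [vehicles]
    ordered = sorted(vehicles, key=lambda p: p[axis])
    mid = len(ordered) // 2
    next_axis = 1 - axis  # alternate between x and y
    return (balanced_partitions(ordered[:mid], depth - 1, next_axis)
            + balanced_partitions(ordered[mid:], depth - 1, next_axis))


if __name__ == "__main__":
    # Made-up positions for illustration only.
    demo = [(float(i), float(i * 7 % 5)) for i in range(8)]
    for part in balanced_partitions(demo, depth=2):
        print(len(part), part)
```

Splitting by count rather than by equal-area grid cells is the point of the sketch: a uniform grid would overload whichever machine receives the congested city center.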

10 citations


Proceedings ArticleDOI
19 Aug 2019
TL;DR: GeoSparkSim is a scalable traffic simulator which extends Apache Spark to generate large-scale road network traffic datasets with microscopic traffic simulation and seamlessly integrates with a Spark-based spatial data management system, GeoSpark, to deliver a holistic approach that allows data scientists to simulate, analyze and visualize large-scale urban traffic data.
Abstract: Road network traffic data has been widely studied by researchers and practitioners in different areas such as urban planning, traffic prediction and spatial-temporal databases. The existing urban traffic simulators suffer from two critical issues: (1) scalability: most of them only offer single-machine solutions, which are not adequate to produce large-scale data. Some simulators can generate traffic in parallel but do not balance the load well among machines in a cluster. (2) granularity: many simulators do not consider microscopic traffic situations including traffic lights, lane changing, and car following. In this paper, we propose GeoSparkSim, a scalable traffic simulator which extends Apache Spark to generate large-scale road network traffic datasets with microscopic traffic simulation. The proposed system seamlessly integrates with a Spark-based spatial data management system, GeoSpark, to deliver a holistic approach that allows data scientists to simulate, analyze and visualize large-scale urban traffic data. To implement microscopic traffic models, GeoSparkSim employs a simulation-aware vehicle partitioning method to partition vehicles among different machines such that each machine has a balanced workload. A full-fledged prototype of GeoSparkSim is implemented in Apache Spark. In this demonstration, we will show the attendees how to issue GeoSparkSim simulation tasks via the user interface, visualize simulated vehicle movements, and monitor the backend Spark cluster status.

8 citations


Journal ArticleDOI
TL;DR: Keddah, a toolchain for capturing, modelling, and reproducing Hadoop traffic, is presented for use with network simulators to better capture the behaviour of Hadoop, enabling reproducible Hadoop research in more realistic scenarios.
Abstract: As a distributed system, Hadoop heavily relies on the network to complete data-processing jobs. While the traffic generated by Hadoop jobs is critical for job execution performance, the actual behaviour of Hadoop network traffic is still poorly understood. This lack of understanding greatly complicates research relying on Hadoop workloads. In this article, we explore Hadoop traffic through empirical traces. We analyse the generated traffic of multiple types of MapReduce jobs, with varying input sizes, and cluster configuration parameters. We present Keddah, a toolchain for capturing, modelling, and reproducing Hadoop traffic, for use with network simulators to better capture the behaviour of Hadoop. By imitating the Hadoop traffic generation process and considering the YARN resource allocation, Keddah can be used to create Hadoop traffic workloads, enabling reproducible Hadoop research in more realistic scenarios.
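
The capture, model, and reproduce workflow can be pictured with a minimal sketch. It assumes the simplest possible model, resampling from an empirical distribution of captured flow sizes; Keddah's actual models and its handling of YARN resource allocation are not specified in the abstract and are not represented here. The function names and trace values are hypothetical.

```python
import random
from typing import List


def fit_empirical(flow_sizes: List[int]) -> List[int]:
    """'Model' step: keep the captured flow sizes as an empirical distribution."""
    if not flow_sizes:
        raise ValueError("need at least one captured flow")
    return sorted(flow_sizes)


def sample_flows(model: List[int], n: int, seed: int = 0) -> List[int]:
    """'Reproduce' step: draw n synthetic flow sizes from the empirical model.

    A fixed seed makes the generated workload repeatable across runs.
    """
    rng = random.Random(seed)
    return [model[int(rng.random() * len(model))] for _ in range(n)]


if __name__ == "__main__":
    # Made-up flow sizes in bytes, standing in for a captured trace.
    captured = [64_000, 128_000, 128_000, 512_000, 1_048_576]
    model = fit_empirical(captured)
    print(sample_flows(model, n=5, seed=42))
```

Seeding the generator is what makes the synthetic workload reproducible: two simulator runs given the same seed and model see the same sequence of flows.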

1 citation


01 Jan 2019
TL;DR: GeoSparkSim is a scalable traffic simulator which extends Apache Spark to generate large-scale road network traffic datasets with microscopic traffic simulation and seamlessly integrates with a Spark-based spatial data management system, GeoSpark, to deliver a holistic approach that allows data scientists to simulate, analyze and visualize large-scale urban traffic data.
Abstract: Researchers and practitioners have widely studied road network traffic data in different areas such as urban planning, traffic prediction and spatial-temporal databases. For instance, researchers use such data to evaluate the impact of road network changes. Unfortunately, collecting large-scale high-quality urban traffic data requires tremendous effort because participating vehicles must install Global Positioning System (GPS) receivers and administrators must continuously monitor these devices. Some urban traffic simulators try to generate such data with different features. However, they suffer from two critical issues: (1) Scalability: most of them only offer a single-machine solution, which is not adequate to produce large-scale data. Some simulators can generate traffic in parallel but do not balance the load well among machines in a cluster. (2) Granularity: many simulators do not consider microscopic traffic situations including traffic lights, lane changing, and car following. This paper proposes GeoSparkSim, a scalable traffic simulator which extends Apache Spark to generate large-scale road network traffic datasets with microscopic traffic simulation. The proposed system seamlessly integrates with a Spark-based spatial data management system, GeoSpark, to deliver a holistic approach that allows data scientists to simulate, analyze and visualize large-scale urban traffic data. To implement microscopic traffic models, GeoSparkSim employs a simulation-aware vehicle partitioning method to partition vehicles among different machines such that each machine has a balanced workload. The experimental analysis shows that GeoSparkSim can simulate the movements of 200 thousand cars over an extensive road network (250 thousand road junctions and 300 thousand road segments).

1 citation


Patent
Li Wei, Chen Tianba, Hu Shengjie, Di Wang, Yunchun Li 
31 Dec 2019
TL;DR: This patent presents a network simulation system based on the network transmission process of the Spark computing framework. It comprises a load generation module (1), a network topology configuration module (2), a visualization module (3), a scheduling module (4), and a data tracking module (5), with the scheduling module (4) and the data tracking module (5) respectively arranged on the working simulation node and the driving simulation node.
Abstract: The invention discloses a network simulation system based on the network transmission process of the Spark computing framework. The network simulation system comprises a load generation module (1), a network topology configuration module (2), a visualization module (3), a scheduling module (4) and a data tracking module (5). The scheduling module (4) and the data tracking module (5) are respectively arranged on the working simulation node and the driving simulation node. The network transmission process of the Spark computing framework is simulated on the basis of the Spark computing framework in combination with a container virtualization technology and a message driving mechanism. A Spark computing cluster simulation node is established by adopting a container virtualization technology, and a real network data packet is transmitted in the simulation process, so that the effect of an experimental scheme in a real environment can be better reflected.