Journal ArticleDOI

Towards adaptable and tunable cloud-based map-matching strategy for GPS trajectories

29 Dec 2016-Journal of Zhejiang University Science C (Zhejiang University Press)-Vol. 17, Iss: 12, pp 1305-1319
TL;DR: Real-time map-matching (RT-MM), a fully adaptive, cloud-based map-matching strategy, is proposed to address the key challenge of SPQs in the map-matching process for real-time GPS trajectories.
Abstract: Smart cities have given a significant impetus to managing traffic and using transport networks in an intelligent way. For this reason, intelligent transportation systems (ITSs) and location-based services (LBSs) have become an active research area in recent years. Due to the rapid increase of data volume within the transportation domain, the cloud environment is of paramount importance for storing, accessing, handling, and processing such huge amounts of data. A large part of the data within the transportation domain is produced in the form of Global Positioning System (GPS) data. Such data are usually sparse and noisy, so achieving the quality required by real-time GPS-based transport applications is a difficult task. The map-matching process, which is responsible for the accurate alignment of observed GPS positions onto a road network, plays a pivotal role in many ITS applications. Regarding accuracy, the performance of a map-matching strategy depends on the shortest path between two consecutive observed GPS positions. On the other hand, processing shortest path queries (SPQs) incurs a high computational cost. Current map-matching techniques use a fixed number of parameters, i.e., the number of candidate points (NCP) and the error circle radius (ECR), which may lead to uncertainty when identifying road segments and to either low-accuracy results or a large number of SPQs. Moreover, due to sampling error, GPS data with a short sampling period (i.e., less than 10 s) typically contain extraneous points, which incurs an extra number of SPQs. Due to the high computational cost incurred by SPQs, current map-matching strategies are not suitable for real-time processing. In this paper, we propose real-time map-matching (RT-MM), a fully adaptive, cloud-based map-matching strategy that addresses the key challenge of SPQs in the map-matching process for real-time GPS trajectories.
The evaluation of our approach against state-of-the-art approaches is performed through simulations based on both synthetic and real-world datasets.
Citations
Journal ArticleDOI
31 May 2018-Sensors
TL;DR: An enhanced hidden Markov map matching (EHMM) model is proposed by adopting explicit topological expressions, using historical FCD information and introducing traffic rules to address issues of low matching accuracy and complied with traffic regulations better than the reference models.
Abstract: The map matching (MM) model plays an important role in revising the locations of floating car data (FCD) on a digital map. However, most existing MM models have multiple shortcomings, such as low matching accuracy on complex roads, long running times, an inability to take full advantage of historical FCD information, and difficulty maintaining topological adjacency and obeying traffic rules. To address these issues, an enhanced hidden Markov map matching (EHMM) model is proposed by adopting explicit topological expressions, using historical FCD information, and introducing traffic rules. The EHMM model was validated against a real ground-truth dataset at various sampling intervals and compared with the spatial and temporal matching model and the ordinary hidden Markov matching model. The empirical results reveal that the matching accuracy of the EHMM model is significantly higher than that of the reference models on real FCD trajectories at medium and high sampling rates. The running time of the EHMM model was notably shorter than those of the reference models, and its matching results retained topological adjacency and complied with traffic regulations better than those of the reference models.
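Hidden-Markov map matching of the kind the EHMM builds on picks, for each GPS fix, one candidate road segment via the Viterbi algorithm. A minimal sketch under illustrative assumptions (toy log-probabilities and segment names, not the paper's actual emission/transition definitions):

```python
import math

# Toy Viterbi decoder for HMM map matching: states are candidate road
# segments per GPS fix; all log-probabilities below are illustrative.
def viterbi(emis, trans):
    """emis: per-fix dict {segment: log emission prob}.
    trans: per-step dict {(prev_seg, next_seg): log transition prob}.
    Returns the most likely segment sequence."""
    score = dict(emis[0])
    back = []
    for t in range(1, len(emis)):
        new_score, new_back = {}, {}
        for s, e in emis[t].items():
            prev = max(score, key=lambda p: score[p] + trans[t - 1].get((p, s), -math.inf))
            new_score[s] = score[prev] + trans[t - 1].get((prev, s), -math.inf) + e
            new_back[s] = prev
        score, back = new_score, back + [new_back]
    last = max(score, key=score.get)
    path = [last]
    for bp in reversed(back):
        path.append(bp[path[-1]])
    return list(reversed(path))

# Two fixes, two candidate segments each: fix 1 lies nearest to A, fix 2 to B.
emis = [{"A": -1.0, "B": -3.0}, {"A": -2.0, "B": -0.5}]
trans = [{("A", "A"): -0.1, ("A", "B"): -1.0, ("B", "A"): -2.0, ("B", "B"): -0.1}]
print(viterbi(emis, trans))  # ['A', 'B']
```

In a real matcher the emission term would come from the GPS-to-segment distance and the transition term from comparing the route distance with the straight-line distance between fixes.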

13 citations


Cites background or methods from "Towards adaptable and tunable cloud..."

  • ...The second popular model is the STM model, which includes two modules, namely, the spatial analysis and the temporal analysis, as discussed in the literature [13]....


  • ...[15,16] improved the STM model based on the locality of a road network to obtain the locality-based matching (LBM) model....


  • ...The running time of the former was found to be shorter than that of the latter [15,16]....


  • ...The matching results indicate that the LBM model can reduce the total number of shortest path queries against the STM [15,16]....


  • ...However, the improvement in the matching accuracy of the LBM model relative to STM is insignificant....


Proceedings ArticleDOI
01 Jul 2017
TL;DR: A TCD model is designed to characterize both the traffic-flow evolution process in the time domain and the propagation process of TFI through road networks in the space domain; experimental results show that the TCD approach performs best in comparison with its competitors.
Abstract: Traffic congestion is a spatio-temporal state in which traffic exceeds the capacity of the road design, and congestion may propagate through road networks. Characterizing this diffusion process is of great importance both for congestion relief and for traffic condition prediction. Traffic congestion diffusion (TCD) in road networks can be observed, but the literature lacks accurate models for characterizing the process. In this paper, we define the concept of Traffic Flow Influence (TFI) as a basis for congestion diffusion. A TCD model is designed to characterize not only the traffic-flow evolution process in the time domain but also the propagation process of TFI through road networks in the space domain. The model covers the traffic network of a city, which is divided into grids, with each grid modeled by a traffic status of congested or smooth. Unlike other diffusion models, a grid's status depends not only on its current condition but also on the relative traffic flow from and to its neighbors. We use a gradient descent approach to quantify the traffic flow and the TFI intensity of road networks. To the best of our knowledge, this is the first such model at metro-city scale. The TCD model with TFI is able to predict grid status with an accuracy as high as 89%. Experimental results based on real-world taxi trajectory data in a metro-city show that the TCD approach performs best in comparison with its competitors.

8 citations


Cites background from "Towards adaptable and tunable cloud..."

  • ...A GPS point after map matching [10], [11] is denoted as φm = (xm, ym, sm, cm, τm), where m is the taxi ID, an object identifier that uniquely identifies the taxi; the two-tuple (x, y) represents the taxi's current location; c indicates the current passenger-carrying state (1 means the taxi is occupied by a passenger and 0 means the taxi is vacant); s is the current speed; and τ represents the reporting time....

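The matched-point record quoted above can be sketched as a plain data structure (field names are ours, mapped from the tuple φm = (xm, ym, sm, cm, τm); the epoch-seconds timestamp is an assumption):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MatchedPoint:
    taxi_id: str        # m: object identifier uniquely identifying the taxi
    x: float            # x: first coordinate of the current location
    y: float            # y: second coordinate of the current location
    speed: float        # s: current speed
    occupied: bool      # c: True = carrying a passenger, False = vacant
    reported_at: float  # tau: reporting time (epoch seconds, assumed)

p = MatchedPoint("taxi_042", 120.153, 30.287, 11.3, True, 1482969600.0)
print(p.occupied)  # True
```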

Journal ArticleDOI
TL;DR: This article builds a traffic congestion diffusion model to capture traffic flow influence (TFI) spreading over traffic road networks, and proposes an influence-spreading-based method to find the dynamically changing traffic bottlenecks, i.e., those where the influence caused by bottlenecks is maximal.
Abstract: Traffic bottlenecks change dynamically with the variance of traffic demand. Identifying traffic bottlenecks plays an important role in traffic planning and supports decision making. However, traffic bottlenecks are difficult to identify because of the complexity of traffic road networks and many other factors. In this article, we propose an influence-spreading-based method to find the dynamically changing traffic bottlenecks, where the influence caused by the bottlenecks is maximal. We first build a traffic congestion diffusion (TCD) model to capture traffic flow influence (TFI) spreading over traffic road networks. The bottleneck identification problem based on TCD is modeled as an influence maximization problem, that is, selecting the most influential nodes such that the deterioration of the traffic condition is maximal. With a proof of the submodularity of TFI spreading over traffic networks, a provably near-optimal algorithm is used to solve the NP-hard problem. Exploiting unique properties of TFI spread, an approximate influence maximization method for TCD (TCD-AIM) is proposed. To the best of our knowledge, this is the first such model at metro-city scale from the influence perspective. Experimental results show that TCD-AIM finds bottlenecks with up to a 130% congestion density increase in the future.

3 citations

Journal ArticleDOI
TL;DR: A pre-processing technique is introduced: splitting the road network graph and processing the Single Source Shortest Path with synchronized parallel processing in the Hadoop environment, which enables map-matching schemes to align GPS points on digital road networks efficiently.
Abstract: This study concerns the map-matching process, which is used to align Global Positioning System (GPS) locations of vehicles on digital road networks. Today's GPS-enabled vehicles in developed countries generate a large volume of GPS data. At the same time, the development of new roads makes city road networks very complex and vehicles' locations difficult to match. Therefore, pre-processing techniques that can be applied before the map-matching process are a recent concern of the Intelligent Transport System (ITS) research community. In this paper, we introduce such a pre-processing technique: splitting the road network graph and processing the Single Source Shortest Path (SSSP) with synchronized parallel processing in the Hadoop environment. The proposed technique enables map-matching schemes to align GPS points on digital road networks efficiently. In the experimental work, map-matching schemes from the literature, incorporated with our proposed pre-processing technique, show better performance with respect to response time.

2 citations


Cites background or methods from "Towards adaptable and tunable cloud..."

  • ...1 shows an SSSP example for a six-node network graph connected with seven edges [15]....


  • ...In this paper, we used our modified SSSP function approaching BSP parallel computing model proposed in our previous work [15]....


  • ...The complete pseudo-code of the modified SSSP function approaching BSP parallel model is clearly explained in our previous work [15]....


Proceedings ArticleDOI
01 Dec 2019
TL;DR: In this paper, a traffic congestion diffusion (TCD) model with traffic flow influence (TFI) was introduced to capture the traffic dynamics and give a panoramic view for the city by cross domain data fusion.
Abstract: Traffic bottleneck identification plays an important role in traffic planning and provides decision-making support for the prevention of traffic congestion. Although traffic bottlenecks widely exist, they are difficult to predict because of changing traffic conditions and traffic demand. In this paper, we introduce a traffic congestion diffusion (TCD) model with traffic flow influence (TFI) to capture the traffic dynamics and give a panoramic view of the city by cross-domain data fusion. We propose a novel definition of a bottleneck from the perspective of influence spread under TCD. The bottleneck identification problem is modeled as an influence maximization problem, i.e., selecting the top K influential nodes in road networks under certain traffic conditions. We establish the submodularity of influence spread and solve the NP-hard optimal seed selection problem by using an efficient heuristic algorithm (TCD-IM) with provable near-optimal performance guarantees. To the best of our knowledge, this is the first such model at metro-city scale from the influence perspective. The TCD-IM model is able to identify the dynamic traffic bottlenecks.

1 citation

References
Journal ArticleDOI
TL;DR: Presents algorithms for two problems on a graph with given branch lengths: constructing the tree of minimum total length between n nodes (a tree being a graph with one and only one path between every two nodes), and finding the path of minimum total length between two given nodes.
Abstract: We consider n points (nodes), some or all pairs of which are connected by a branch; the length of each branch is given. We restrict ourselves to the case where at least one path exists between any two nodes. We now consider two problems. Problem 1. Construct the tree of minimum total length between the n nodes. (A tree is a graph with one and only one path between every two nodes.) In the course of the construction that we present here, the branches are subdivided into three sets: I. the branches definitely assigned to the tree under construction (they will form a subtree); II. the branches from which the next branch to be added to set I will be selected; III. the remaining branches (rejected or not yet considered). The nodes are subdivided into two sets: A. the nodes connected by the branches of set I, B. the remaining nodes (one and only one branch of set II will lead to each of these nodes). We start the construction by choosing an arbitrary node as the only member of set A, and by placing all branches that end in this node in set II. To start with, set I is empty. From then onwards we perform the following two steps repeatedly. Step 1. The shortest branch of set II is removed from this set and added to
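The construction described in this abstract is what later became known as Prim's (Jarník's) minimum-spanning-tree algorithm; the same 1959 paper also gives the shortest-path algorithm now named after Dijkstra. A compact heap-based sketch of Problem 1 (variable names are ours, not the paper's):

```python
import heapq

# Grow set A from an arbitrary node, repeatedly moving the shortest branch
# of "set II" (the frontier heap) into the tree under construction (set I).
def minimum_tree_length(n, edges):
    """edges: (u, v, length) with nodes 0..n-1. Returns total tree length."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((w, v))
        adj[v].append((w, u))
    in_tree = [False] * n
    in_tree[0] = True                # arbitrary start node (set A)
    frontier = list(adj[0])          # set II: branches leaving the subtree
    heapq.heapify(frontier)
    total, added = 0, 0
    while frontier and added < n - 1:
        w, v = heapq.heappop(frontier)
        if in_tree[v]:
            continue                 # branch rejected (moved to set III)
        in_tree[v] = True            # branch assigned to set I
        total += w
        added += 1
        for branch in adj[v]:
            if not in_tree[branch[1]]:
                heapq.heappush(frontier, branch)
    return total

edges = [(0, 1, 1), (1, 2, 2), (0, 2, 4), (2, 3, 1)]
print(minimum_tree_length(4, edges))  # 1 + 2 + 1 = 4
```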

22,704 citations


"Towards adaptable and tunable cloud..." refers methods in this paper

  • ...Basically, SSSP is a classical function which has been well solved based on the Dijkstra algorithm (Dijkstra, 1959)....



Journal ArticleDOI
Jeffrey Dean1, Sanjay Ghemawat1
06 Dec 2004
TL;DR: Presents MapReduce, a programming model and associated implementation for processing and generating large data sets, which runs on large clusters of commodity machines and is highly scalable.
Abstract: MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real world tasks are expressible in this model, as shown in the paper. Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. The run-time system takes care of the details of partitioning the input data, scheduling the program's execution across a set of machines, handling machine failures, and managing the required inter-machine communication. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system. Our implementation of MapReduce runs on a large cluster of commodity machines and is highly scalable: a typical MapReduce computation processes many terabytes of data on thousands of machines. Programmers find the system easy to use: hundreds of MapReduce programs have been implemented and upwards of one thousand MapReduce jobs are executed on Google's clusters every day.
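The model can be illustrated with the canonical word-count example (a single-process sketch; the real runtime shards the input, schedules workers across a cluster, and re-executes failed tasks, none of which is shown here):

```python
from collections import defaultdict

# User-supplied map(): emits intermediate key/value pairs per input record.
def map_fn(_, line):                    # key: document name (ignored here)
    for word in line.split():
        yield word, 1

# User-supplied reduce(): merges all values sharing one intermediate key.
def reduce_fn(word, counts):
    yield word, sum(counts)

def map_reduce(inputs, map_fn, reduce_fn):
    intermediate = defaultdict(list)    # "shuffle": group values by key
    for key, value in inputs:
        for k, v in map_fn(key, value):
            intermediate[k].append(v)
    out = {}
    for k in sorted(intermediate):      # reduce each key group
        for rk, rv in reduce_fn(k, intermediate[k]):
            out[rk] = rv
    return out

print(map_reduce([("doc1", "to be or not to be")], map_fn, reduce_fn))
# {'be': 2, 'not': 1, 'or': 1, 'to': 2}
```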

20,309 citations

Journal ArticleDOI
Jeffrey Dean1, Sanjay Ghemawat1
TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.
Abstract: MapReduce is a programming model and an associated implementation for processing and generating large datasets that is amenable to a broad variety of real-world tasks. Users specify the computation in terms of a map and a reduce function, and the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks. Programmers find the system easy to use: more than ten thousand distinct MapReduce programs have been implemented internally at Google over the past four years, and an average of one hundred thousand MapReduce jobs are executed on Google's clusters every day, processing a total of more than twenty petabytes of data per day.

17,663 citations

Proceedings ArticleDOI
06 Jun 2010
TL;DR: A model for processing large graphs, designed for efficient, scalable, and fault-tolerant implementation on clusters of thousands of commodity computers, whose implied synchronicity makes reasoning about programs easier.
Abstract: Many practical computing problems concern large graphs. Standard examples include the Web graph and various social networks. The scale of these graphs - in some cases billions of vertices, trillions of edges - poses challenges to their efficient processing. In this paper we present a computational model suitable for this task. Programs are expressed as a sequence of iterations, in each of which a vertex can receive messages sent in the previous iteration, send messages to other vertices, and modify its own state and that of its outgoing edges or mutate graph topology. This vertex-centric approach is flexible enough to express a broad set of algorithms. The model has been designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers, and its implied synchronicity makes reasoning about programs easier. Distribution-related details are hidden behind an abstract API. The result is a framework for processing large graphs that is expressive and easy to program.
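The vertex-centric superstep loop the abstract describes can be sketched for single-source shortest paths, a standard Pregel example (single-process toy; distribution over workers, combiners, and fault tolerance are omitted):

```python
import math

# Each superstep, every vertex that received messages in the previous
# superstep reads them, updates its own state, and sends messages along
# its out-edges; the computation halts when no messages remain.
def pregel_sssp(graph, source):
    """graph: {vertex: [(neighbor, edge_len), ...]}. Returns distances."""
    dist = {v: math.inf for v in graph}
    inbox = {source: [0]}
    while inbox:                         # run supersteps until quiescence
        outbox = {}
        for v, msgs in inbox.items():    # one superstep
            best = min(msgs)
            if best < dist[v]:           # vertex state update
                dist[v] = best
                for u, w in graph[v]:    # messages along out-edges
                    outbox.setdefault(u, []).append(best + w)
        inbox = outbox                   # messages arrive next superstep
    return dist

g = {0: [(1, 4), (2, 1)], 1: [(3, 1)], 2: [(1, 2), (3, 5)], 3: []}
print(pregel_sssp(g, 0))  # {0: 0, 1: 3, 2: 1, 3: 4}
```

Messages are sent only when a vertex improves its distance, so the loop terminates once distances stabilize, mirroring Pregel's vote-to-halt behavior.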

3,840 citations


"Towards adaptable and tunable cloud..." refers background or methods in this paper

  • ...Regarding the offline efforts, the shortest path distance and temporal/speed constraints are computed by following the parallel computing paradigm, i.e., BSP (Malewicz et al., 2010), in a cloud environment to reduce the pre-processing time....


  • ...To compute the shortest path distances and temporal/speed constraint, we propose an extension of the single source shortest path (SSSP) function (Seo et al., 2010) following the bulk synchronous parallel (BSP) paradigm (Malewicz et al., 2010) in the cloud environment....


  • ...Malewicz et al. (2010) in Google Inc. introduced an alternative model, called Pregel, which is based on the BSP parallel paradigm....



Journal ArticleDOI
TL;DR: The concept of urban computing is introduced, discussing its general framework and key challenges from the perspective of computer sciences, and the typical technologies that are needed in urban computing are summarized into four folds.
Abstract: Urbanization's rapid progress has modernized many people's lives but also engendered big issues, such as traffic congestion, energy consumption, and pollution. Urban computing aims to tackle these issues by using the data that has been generated in cities (e.g., traffic flow, human mobility, and geographical data). Urban computing connects urban sensing, data management, data analytics, and service providing into a recurrent process for an unobtrusive and continuous improvement of people's lives, city operation systems, and the environment. Urban computing is an interdisciplinary field where computer sciences meet conventional city-related fields, like transportation, civil engineering, environment, economy, ecology, and sociology in the context of urban spaces. This article first introduces the concept of urban computing, discussing its general framework and key challenges from the perspective of computer sciences. Second, we classify the applications of urban computing into seven categories, consisting of urban planning, transportation, the environment, energy, social, economy, and public safety and security, presenting representative scenarios in each category. Third, we summarize the typical technologies needed in urban computing into four categories: urban sensing, urban data management, knowledge fusion across heterogeneous data, and urban data visualization. Finally, we give an outlook on the future of urban computing, suggesting a few research topics that are somehow missing in the community.

1,290 citations