Journal Article

Arterial Path-Level Travel-Time Estimation Using Machine-Learning Techniques

01 May 2017-Journal of Computing in Civil Engineering (American Society of Civil Engineers (ASCE))-Vol. 31, Iss: 3, pp 04016070



Citations
Journal Article

[...]

TL;DR: The new concept of consensual 3D speed maps extracts the essence of large amounts of link speed observations and reveals a global and previously mostly hidden picture of traffic dynamics at the whole city scale, which may be more regular and predictable than expected.
Abstract: In this paper, we investigate the day-to-day regularity of urban congestion patterns. We first partition link speed data every 10 min into 3D clusters that propose a parsimonious sketch of the congestion pulse. We then gather days with similar patterns and use consensus clustering methods to produce a unique global pattern that fits multiple days, uncovering the day-to-day regularity. We show that the network of Amsterdam over 35 days can be synthesized into only 4 consensual 3D speed maps with 9 clusters. This paves the way for a cutting-edge systematic method for travel time predictions in cities. By matching the current observation to historical consensual 3D speed maps, we design an efficient real-time method that successfully predicts 84% of trip travel times with an error margin below 25%. The new concept of consensual 3D speed maps allows us to extract the essence of large amounts of link speed observations and, as a result, reveals a global and previously mostly hidden picture of traffic dynamics at the whole city scale, which may be more regular and predictable than expected.

202 citations
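
The matching step described in the abstract above (compare the current observation with historical speed patterns, then predict travel time from the closest one) can be illustrated with a much-simplified sketch. Everything here is an assumption for illustration: the pattern names, link ids, speeds, and the mean-absolute-difference distance are hypothetical, and the sketch ignores the time dimension and the clustering that the paper's 3D speed maps rely on.

```python
def match_pattern(observed, patterns):
    """Return the name of the historical pattern closest to the observation.

    observed: dict mapping link id -> speed observed now (km/h).
    patterns: dict mapping pattern name -> dict of link id -> typical speed.
    Closeness is the mean absolute speed difference over shared links.
    """
    def distance(pattern):
        shared = [link for link in observed if link in pattern]
        if not shared:
            return float("inf")
        return sum(abs(observed[l] - pattern[l]) for l in shared) / len(shared)

    return min(patterns, key=lambda name: distance(patterns[name]))


def route_travel_time(route, lengths_km, pattern):
    """Travel time in minutes along a route, using the pattern's link speeds."""
    return sum(60.0 * lengths_km[link] / pattern[link] for link in route)


# Hypothetical historical patterns and link lengths.
patterns = {
    "free_flow": {"a": 50.0, "b": 50.0, "c": 60.0},
    "congested": {"a": 20.0, "b": 15.0, "c": 30.0},
}
lengths_km = {"a": 1.0, "b": 0.5, "c": 2.0}

# Only links "a" and "b" are observed; link "c" is filled in from the match.
observed = {"a": 22.0, "b": 18.0}

best = match_pattern(observed, patterns)
minutes = route_travel_time(["a", "b", "c"], lengths_km, patterns[best])
print(best, round(minutes, 1))
```

The point of the sketch is that a partial observation is enough to select a full historical pattern, which then supplies speeds for links that are not currently observed.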

DOI

[...]

27 Feb 2020
TL;DR: This thesis develops a series of efficient data-driven methods for extracting the mobility patterns of large-scale metropolitan networks and explores some of their applications.
Abstract: Cities are complex, dynamic and ever-evolving. We need to understand how these cities work in order to predict, control or optimize their operations. We have identified some open issues related to network and data complexity that need to be solved to build feasible methods for these purposes. To this end, we first build multiscale graphs automatically to address a problem that is becoming increasingly relevant in the age of big data, where reducing the network complexity could easily determine the viability of the research in real-world applications. Next, we propose different methods from different fields to extract the essence of network dynamics from the vast amount of spatiotemporal traffic data. One such method is a new way of looking at traffic patterns, combining the field of pattern recognition, with a focus on computer vision, with the traffic domain. The inspiration comes from the fact that humans are the most sophisticated pattern recognizers in the world: we use specific visual features to recognize different complex patterns, and we explore whether these features can also be used to recognize traffic patterns. Finally, we explore different applications of such mobility patterns, such as revealing the unknown correlation between supply and demand patterns, evaluating the scalability of the proposed approach by applying the method to the entire Dutch highway network, and evaluating its transferability by building similar network patterns for public transport networks. Thus, this thesis develops a series of efficient data-driven methods for extracting the mobility patterns of large-scale metropolitan networks and explores some of their applications. With the increasing availability of data in the transport domain, the Achilles heel is not data scarcity anymore but rather extracting insights from this massive amount of data. This thesis is a step forward in solving this complex problem by leveraging the increased acceptance of using machine learning as a worthy and effective method for network-wide analysis of traffic patterns.

8 citations


Cites background from "Arterial Path-Level Travel-Time Est..."

  • [...]

Journal Article

[...]

Kejun Long, Wukai Yao, Jian Gu, Wei Wu, Lee D. Han
TL;DR: The Artificial Fish Swarm algorithm is applied to optimize the SVM model parameters, which include the kernel parameter σ, non-sensitive loss function parameter ε, and penalty parameter C, and the results show that the accuracy of the optimized SVM model is 17.27% and 16.44% higher than those of the BP neural network model and the common SVM model, respectively.
Abstract: Freeway travel time is influenced by many factors including traffic volume, adverse weather, accidents, traffic control, and so on. We employ the multiple source data-mining method to analyze freeway travel time. We collected toll data, weather data, traffic accident disposal logs, and other historical data from Freeway G5513 in Hunan Province, China. Using the Support Vector Machine (SVM), we proposed the travel time predicting model founded on these databases. The new SVM model can simulate the nonlinear relationship between travel time and those factors. In order to improve the precision of the SVM model, we applied the Artificial Fish Swarm algorithm to optimize the SVM model parameters, which include the kernel parameter σ, non-sensitive loss function parameter ε, and penalty parameter C. We compared the new optimized SVM model with the Back Propagation (BP) neural network and a common SVM model, using the historical data collected from freeway G5513. The results show that the accuracy of the optimized SVM model is 17.27% and 16.44% higher than those of the BP neural network model and the common SVM model, respectively.

6 citations
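
To make the roles of the three tuned parameters concrete, here is a minimal sketch of where σ, ε, and C typically enter a support vector regression model. It assumes a Gaussian (RBF) kernel parameterized by σ, which the abstract does not state, and the function names and sample numbers are hypothetical. It shows only the kernel and the penalized error term, not the training procedure or the Artificial Fish Swarm search.

```python
import math


def rbf_kernel(x, z, sigma):
    """Gaussian kernel: similarity decays with squared distance, scaled by sigma."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Errors smaller than epsilon cost nothing; larger ones cost the excess."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def penalized_error(y_true, y_pred, epsilon, c):
    """Error term of the SVR objective: C times the summed epsilon-insensitive loss."""
    return c * sum(
        epsilon_insensitive_loss(t, p, epsilon) for t, p in zip(y_true, y_pred)
    )


# Hypothetical travel times in minutes: observed vs. predicted.
observed = [10.0, 12.0, 15.0]
predicted = [10.3, 14.0, 15.0]

print(penalized_error(observed, predicted, epsilon=0.5, c=2.0))
print(rbf_kernel([1.0, 2.0], [2.0, 2.0], sigma=1.0))
```

A larger C penalizes errors outside the ε tube more heavily, a larger ε tolerates bigger errors for free, and σ controls how quickly similarity between two inputs falls off, which is why the three are usually tuned together.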

Journal Article

[...]

TL;DR: The performance of the proposed Advanced Time-Space Discretization (AdTSD) method was evaluated with real field data and compared with existing approaches, and results show that the AdTSD approach performed better, with an advantage of up to 11% over the historical average approach and 5% over the Base Time Space Discretization (BTSD) approach.
Abstract: Travel time is a variable that varies over both time and space. Hence, an ideal formulation should be able to capture its evolution over time and space. A mathematical representation capturing such variations was formulated from first principles, using the concept of conservation of vehicles. The availability of position and speed data obtained from GPS-enabled buses provides motivation to rewrite the conservation equation in terms of speed alone. As the number of vehicles is discrete, the speed-based equation was discretized using the Godunov scheme and used in a prediction scheme based on the Kalman filter. With a limited fleet size having an average headway of 30 min, obtaining travel time data at intervals small enough to satisfy the stability requirement of the numerical solution poses a big challenge. To address this issue, a speed fill matrix that is continuous in space and time was developed with the help of historic data and used in this study. The performance of the proposed Advanced Time-Space Discretization (AdTSD) method was evaluated with real field data and compared with existing approaches. Results show that the AdTSD approach performed better, with an advantage of up to 11% over the historical average approach and 5% over the Base Time Space Discretization (BTSD) approach. The results also show that the maximum deviation in prediction was in the range of 2–3 min when predicting 10 km ahead, and that the error is close to zero when predicting one section ahead, i.e. when the bus is close to a bus stop, indicating that the prediction accuracy achieved is suitable for real field implementation.

1 citation
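
The Kalman filter mentioned in the abstract above can be illustrated with a generic scalar predict/update step. This is a textbook sketch under simplifying assumptions (a random-walk state, a direct noisy measurement, hypothetical noise variances `q` and `r`), not the paper's speed-based, Godunov-discretized formulation.

```python
def kalman_step(x_est, p_est, z, q, r):
    """One predict/update cycle of a scalar Kalman filter.

    x_est, p_est: previous state estimate and its variance.
    z: new measurement of the state.
    q, r: process and measurement noise variances (assumed known).
    """
    # Predict: the state is assumed to persist, and uncertainty grows by q.
    x_pred = x_est
    p_pred = p_est + q

    # Update: blend prediction and measurement according to the Kalman gain.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


# Hypothetical section travel time in minutes: prior estimate 10, new reading 14.
x, p = kalman_step(x_est=10.0, p_est=1.0, z=14.0, q=1.0, r=2.0)
print(x, p)
```

With these numbers the predicted variance equals the measurement variance, so the gain is 0.5 and the updated estimate lands halfway between the prior estimate and the measurement.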

Proceedings Article

[...]

05 Jan 2021
TL;DR: In this paper, a reliable structure for forecasting travel time on Indian urban arterials using data from Wi-Fi/Bluetooth sensors was developed to assist with real-time traffic control strategies.
Abstract: Travel time is one of the elementary traffic stream parameters from the perspective of both users and transport planners. Conventional travel time estimation methods have performed poorly under Indian urban traffic conditions, which are characterized by heterogeneity in transport modes and lack of lane discipline. Robust to these limitations, Media Access Control (MAC) matching is perceived to be a reliable alternative for travel time estimation. To assist with real-time traffic control strategies, this study aims at developing a reliable structure for forecasting travel time on Indian urban arterials using data from Wi-Fi/Bluetooth sensors. The data collected on an urban arterial in Chennai has been used as a case study to explain the value of such data and to explore its applicability in implementing various prediction models. To this end, this study examines and compares three machine learning algorithms, k-Nearest Neighbour (kNN), Random Forest (RDF), and Naive Bayes, along with the Kalman filtering technique, for prediction. The performance of each model is evaluated to understand its suitability.

References
Journal Article

[...]

01 Oct 2001
TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
Abstract: Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International conference, aaa, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.

58,232 citations

[...]

01 Jan 2007
TL;DR: Random forests are proposed, which add an additional layer of randomness to bagging and are robust against overfitting, and the randomForest package provides an R interface to the Fortran programs by Breiman and Cutler.
Abstract: Recently there has been a lot of interest in “ensemble learning” — methods that generate many classifiers and aggregate their results. Two well-known methods are boosting (see, e.g., Schapire et al., 1998) and bagging (Breiman, 1996) of classification trees. In boosting, successive trees give extra weight to points incorrectly predicted by earlier predictors. In the end, a weighted vote is taken for prediction. In bagging, successive trees do not depend on earlier trees — each is independently constructed using a bootstrap sample of the data set. In the end, a simple majority vote is taken for prediction. Breiman (2001) proposed random forests, which add an additional layer of randomness to bagging. In addition to constructing each tree using a different bootstrap sample of the data, random forests change how the classification or regression trees are constructed. In standard trees, each node is split using the best split among all variables. In a random forest, each node is split using the best among a subset of predictors randomly chosen at that node. This somewhat counterintuitive strategy turns out to perform very well compared to many other classifiers, including discriminant analysis, support vector machines and neural networks, and is robust against overfitting (Breiman, 2001). In addition, it is very user-friendly in the sense that it has only two parameters (the number of variables in the random subset at each node and the number of trees in the forest), and is usually not very sensitive to their values. The randomForest package provides an R interface to the Fortran programs by Breiman and Cutler (available at http://www.stat.berkeley.edu/users/breiman/). This article provides a brief introduction to the usage and features of the R functions.

12,765 citations
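
The two sources of randomness described in the abstract above (a bootstrap sample per tree, and a random subset of predictors considered at each split) together with majority voting can be sketched in a few lines. This is a toy illustration, not the randomForest package or Breiman and Cutler's code: each "tree" is a single-split stump, so the per-node feature subset is drawn once per tree, and the data, function names, and parameter values are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature_ids):
    """Best single-threshold split, searching only the given features."""
    best = None  # (errors, feature, threshold, left_label, right_label)
    for f in feature_ids:
        for t in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            # An empty right side falls back to the left side's majority label.
            right_label = Counter(right or left).most_common(1)[0][0]
            errors = sum(y != left_label for y in left)
            errors += sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    return best[1:]


def fit_forest(rows, labels, n_trees=25, mtry=1, seed=0):
    """Grow n_trees stumps, each on a bootstrap sample and mtry random features."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]      # bootstrap sample
        feats = rng.sample(range(n_features), mtry)     # random predictor subset
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        forest.append(fit_stump(sample_rows, sample_labels, feats))
    return forest


def predict(forest, row):
    """Simple majority vote over all stumps."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical two-class data: class 0 near the origin, class 1 near (10, 10).
rows = [[0, 1], [1, 0], [1, 1], [0, 0], [10, 11], [11, 10], [11, 11], [10, 10]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

forest = fit_forest(rows, labels)
print(predict(forest, [0.0, 0.0]), predict(forest, [12.0, 12.0]))
```

The two tuning knobs named in the abstract correspond to `mtry` (variables tried at each split) and `n_trees` in this sketch.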

Journal Article

[...]

TL;DR: In this paper, the authors generalize the results of [4] and modify the algorithm presented there to obtain a better rate of convergence.
Abstract: In this paper we generalize the results of [4] and modify the algorithm presented there to obtain a better rate of convergence.

2,163 citations

Journal Article

[...]

TL;DR: The OpenStreetMap project is a knowledge collective that provides user-generated street maps that follow the peer production model that created Wikipedia; its aim is to create a set of map data that's free to use, editable, and licensed under new copyright schemes.
Abstract: The OpenStreetMap project is a knowledge collective that provides user-generated street maps. OSM follows the peer production model that created Wikipedia; its aim is to create a set of map data that's free to use, editable, and licensed under new copyright schemes. A considerable number of contributors edit the world map collaboratively using the OSM technical infrastructure, and a core group, estimated at approximately 40 volunteers, dedicate their time to creating and improving OSM's infrastructure, including maintaining the server, writing the core software that handles the transactions with the server, and creating cartographical outputs. There's also a growing community of software developers who develop software tools to make OSM data available for further use across different application domains, software platforms, and hardware devices. The OSM project's hub is the main OSM Web site.

2,116 citations

Book

[...]

02 Aug 1995
TL;DR: In this paper, the authors present a review of computer applications of statistics, including the one-sample t statistic, two-way analysis of variance, repeated-measures analysis of variance, and nonparametric tests.
Abstract: Getting started: why study statistics? basic concepts and ideas. Descriptive statistics: frequency distributions and graphs summary measures relative measures and the normal curve linear correlation linear regression. Concepts of inferential statistics: sampling distributions logic of hypothesis testing. Methods of inferential statistics: one-sample t statistic - when a t ratio is not practical two-sample t tests analysis of variance two-way analysis of variance repeated-measures analysis of variance nonparametric tests bringing it all together. Appendices: statistical tables answers to selected review questions computer applications of statistics.

1,667 citations