Journal Article

Predicting Taxi–Passenger Demand Using Streaming Data

TL;DR: This paper introduces a novel methodology for predicting the spatial distribution of taxi-passengers over a short-term time horizon using streaming data, and demonstrates that the proposed framework can provide effective insight into the spatiotemporal distribution of taxi-passenger demand for a 30-min horizon.
Abstract: Informed driving is increasingly becoming a key feature for increasing the sustainability of taxi companies. The sensors that are installed in each vehicle are providing new opportunities for automatically discovering knowledge, which, in return, delivers information for real-time decision making. Intelligent transportation systems for taxi dispatching and for finding time-saving routes are already exploring these sensing data. This paper introduces a novel methodology for predicting the spatial distribution of taxi-passengers for a short-term time horizon using streaming data. First, the information was aggregated into a histogram time series. Then, three time-series forecasting techniques were combined to originate a prediction. Experimental tests were conducted using the online data that are transmitted by 441 vehicles of a fleet running in the city of Porto, Portugal. The results demonstrated that the proposed framework can provide effective insight into the spatiotemporal distribution of taxi-passenger demand for a 30-min horizon.
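The two steps named in the abstract (aggregating events into a histogram time series, then combining several forecasts) can be illustrated with a minimal sketch. The abstract does not say how the three techniques are combined, so the inverse-error weighting below is an assumption, and the function names and sample data are hypothetical.

```python
from collections import Counter


def histogram_series(pickup_times, start, period_s, n_periods):
    """Count events per fixed-length period (a histogram time series).

    pickup_times: event timestamps in seconds (made-up units for this sketch).
    """
    counts = Counter()
    for t in pickup_times:
        idx = int((t - start) // period_s)
        if 0 <= idx < n_periods:  # ignore events outside the window
            counts[idx] += 1
    return [counts[i] for i in range(n_periods)]


def combine_forecasts(forecasts, recent_errors, eps=1e-9):
    """Weighted average of model forecasts.

    Assumption: each model's weight is inversely proportional to its
    recent error; this is one common ensemble rule, not necessarily
    the one used in the paper.
    """
    weights = [1.0 / (e + eps) for e in recent_errors]
    total = sum(weights)
    return sum(w * f for w, f in zip(weights, forecasts)) / total


# Toy data: pickups (seconds since start) aggregated into 30-min periods.
pickups = [10, 500, 1700, 1900, 3500, 3700, 5300]
series = histogram_series(pickups, start=0, period_s=1800, n_periods=3)
```

A model with a smaller recent error pulls the combined forecast towards its own prediction, which is the general idea behind error-weighted ensembles.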


Citations
Journal Article
TL;DR: This paper is one of the first DL studies to forecast the short-term passenger demand of an on-demand ride service platform by examining spatio-temporal correlations; the proposed FCL-Net achieves better predictive performance than traditional approaches.
Abstract: Short-term passenger demand forecasting is of great importance to the on-demand ride service platform, which can incentivize vacant cars moving from over-supply regions to over-demand regions. The spatial dependencies, temporal dependencies, and exogenous dependencies need to be considered simultaneously, however, which makes short-term passenger demand forecasting challenging. We propose a novel deep learning (DL) approach, named the fusion convolutional long short-term memory network (FCL-Net), to address these three dependencies within one end-to-end learning architecture. The model is stacked and fused by multiple convolutional long short-term memory (LSTM) layers, standard LSTM layers, and convolutional layers. The fusion of convolutional techniques and the LSTM network enables the proposed DL approach to better capture the spatio-temporal characteristics and correlations of explanatory variables. A tailored spatially aggregated random forest is employed to rank the importance of the explanatory variables. The ranking is then used for feature selection. The proposed DL approach is applied to the short-term forecasting of passenger demand under an on-demand ride service platform in Hangzhou, China. The experimental results, validated on the real-world data provided by DiDi Chuxing, show that the FCL-Net achieves better predictive performance than traditional approaches including both classical time-series prediction models and state-of-the-art machine learning algorithms (e.g., artificial neural network, XGBoost, LSTM and CNN). Furthermore, the consideration of exogenous variables in addition to the passenger demand itself, such as the travel time rate, time-of-day, day-of-week, and weather conditions, is shown to be promising, since it reduces the root mean squared error (RMSE) by 48.3%. It is also interesting to find that feature selection reduces the training time by 24.4% and leads to only a 1.8% loss in the forecasting accuracy measured by RMSE in the proposed model. This paper is one of the first DL studies to forecast the short-term passenger demand of an on-demand ride service platform by examining the spatio-temporal correlations.

507 citations


Cites background from "Predicting Taxi–Passenger Demand Us..."

  • ...Moreira-Matias et al. (2013) proposed a data stream ensemble framework which incorporated a time-varying Poisson model and ARIMA, to predict the spatial distribution of taxi passenger demand....

    [...]

  • ...However, heterogeneous and exogenous factors in reality, e.g., asymmetric information, short-term fluctuations, may make it difficult to guarantee the spatial distribution of taxis matching the passenger demand all the time (Moreira-Matias et al., 2013)....

    [...]


Journal Article
TL;DR: The structural principle, the characteristics, and some kinds of classic models of deep learning, such as stacked auto encoder, deep belief network, deep Boltzmann machine, and convolutional neural network are described.

408 citations

Proceedings Article
Zhe Xu, Li Zhixin, Guan Qingwen, Zhang Dingshui, Qiang Li, Junxiao Nan, Chunyang Liu, Wei Bian, Jieping Ye
19 Jul 2018
TL;DR: This paper presents a novel order dispatch algorithm for large-scale on-demand ride-hailing platforms, designed to optimize resource utilization and user experience from a global and more farsighted view.
Abstract: We present a novel order dispatch algorithm in large-scale on-demand ride-hailing platforms. While traditional order dispatch approaches usually focus on immediate customer satisfaction, the proposed algorithm is designed to provide a more efficient way to optimize resource utilization and user experience in a global and more farsighted view. In particular, we model order dispatch as a large-scale sequential decision-making problem, where the decision of assigning an order to a driver is determined by a centralized algorithm in a coordinated way. The problem is solved in a learning and planning manner: 1) based on historical data, we first summarize demand and supply patterns into a spatiotemporal quantization, each of which indicates the expected value of a driver being in a particular state; 2) a planning step is conducted in real-time, where each driver-order-pair is valued in consideration of both immediate rewards and future gains, and then dispatch is solved using a combinatorial optimizing algorithm. Through extensive offline experiments and online AB tests, the proposed approach delivers remarkable improvement on the platform's efficiency and has been successfully deployed in the production system of Didi Chuxing.
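The planning step described in this abstract (valuing each driver-order pair by immediate reward plus future gain, then solving a combinatorial matching) can be sketched as follows. The scoring formula, the discount factor `gamma`, and the exhaustive search are all assumptions made for illustration; the abstract does not specify them, and a production system would use a scalable assignment algorithm.

```python
from itertools import permutations


def pair_value(reward, value_now, value_after, gamma=0.9):
    """Score one driver-order pair (assumed form, not taken from the paper).

    reward:      immediate reward of serving the order
    value_now:   expected value of the driver's current state
    value_after: expected value of the state after completing the order
    """
    return reward + gamma * value_after - value_now


def best_assignment(score):
    """Return the driver -> order matching with the highest total score.

    score[d][o] is the value of giving order o to driver d. Assumes an
    equal number of drivers and orders; exhaustive search, toy sizes only.
    """
    n = len(score)
    best = max(
        permutations(range(n)),
        key=lambda p: sum(score[d][p[d]] for d in range(n)),
    )
    return list(best)


# Toy example: two drivers, two orders (made-up scores).
score = [[5.0, 9.0],
         [4.0, 3.0]]
assignment = best_assignment(score)  # assignment[d] is the order for driver d
```

Here the matching that maximizes the total score gives driver 0 the second order even though driver 1 then takes a lower-valued pair, which is the "global view" the abstract contrasts with greedy dispatch.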

311 citations

Journal Article
TL;DR: In this paper, a general framework to describe ridesourcing systems is proposed, which can aid understanding of the interactions between endogenous and exogenous variables, their changes in response to platforms' operational strategies and decisions, multiple system objectives, and market equilibria in a dynamic manner.
Abstract: With the rapid development and popularization of mobile and wireless communication technologies, ridesourcing companies have been able to leverage internet-based platforms to operate e-hailing services in many cities around the world. These companies connect passengers and drivers in real time and are disruptively changing the transportation industry. As pioneers in a general sharing economy context, ridesourcing shared transportation platforms consist of a typical two-sided market. On the demand side, passengers are sensitive to the price and quality of the service. On the supply side, drivers, as freelancers, make working decisions flexibly based on their income from the platform and many other factors. Diverse variables and factors in the system are strongly endogenous and interactively dependent. How to design and operate ridesourcing systems is vital—and challenging—for all stakeholders: passengers/users, drivers/service providers, platforms, policy makers, and the general public. In this paper, we propose a general framework to describe ridesourcing systems. This framework can aid understanding of the interactions between endogenous and exogenous variables, their changes in response to platforms’ operational strategies and decisions, multiple system objectives, and market equilibria in a dynamic manner. Under the proposed general framework, we summarize important research problems and the corresponding methodologies that have been and are being developed and implemented to address these problems. We conduct a comprehensive review of the literature on these problems in different areas from diverse perspectives, including (1) demand and pricing, (2) supply and incentives, (3) platform operations, and (4) competition, impacts, and regulations. The proposed framework and the review also suggest many avenues requiring future research.

303 citations

Posted Content
TL;DR: This paper proposes a Deep Multi-View Spatial-Temporal Network (DMVST-Net) framework that models both spatial and temporal relations for taxi demand prediction; accurate prediction can help a city pre-allocate resources to meet travel demand and reduce the number of empty taxis on streets, which waste energy and worsen traffic congestion.
Abstract: Taxi demand prediction is an important building block to enabling intelligent transportation systems in a smart city. An accurate prediction model can help the city pre-allocate resources to meet travel demand and to reduce empty taxis on streets which waste energy and worsen the traffic congestion. With the increasing popularity of taxi requesting services such as Uber and Didi Chuxing (in China), we are able to collect large-scale taxi demand data continuously. How to utilize such big data to improve the demand prediction is an interesting and critical real-world problem. Traditional demand prediction methods mostly rely on time series forecasting techniques, which fail to model the complex non-linear spatial and temporal relations. Recent advances in deep learning have shown superior performance on traditionally challenging tasks such as image classification by learning the complex features and correlations from large-scale data. This breakthrough has inspired researchers to explore deep learning techniques on traffic prediction problems. However, existing methods on traffic prediction have only considered spatial relation (e.g., using CNN) or temporal relation (e.g., using LSTM) independently. We propose a Deep Multi-View Spatial-Temporal Network (DMVST-Net) framework to model both spatial and temporal relations. Specifically, our proposed model consists of three views: temporal view (modeling correlations between future demand values with near time points via LSTM), spatial view (modeling local spatial correlation via local CNN), and semantic view (modeling correlations among regions sharing similar temporal patterns). Experiments on large-scale real taxi demand data demonstrate effectiveness of our approach over state-of-the-art methods.

302 citations

References
Journal Article
TL;DR: Copyright (©) 1999–2012 R Foundation for Statistical Computing; permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and permission notice are preserved on all copies.
Abstract: Copyright (©) 1999–2012 R Foundation for Statistical Computing. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the R Core Team.

272,030 citations


"Predicting Taxi–Passenger Demand Us..." refers background in this paper

  • ...Finally, the results that are achieved are presented and discussed....

    [...]

Book Chapter
TL;DR: This paper provides a concise overview of time series analysis in the time and frequency domains with lots of references for further reading.
Abstract: Any series of observations ordered along a single dimension, such as time, may be thought of as a time series. The emphasis in time series analysis is on studying the dependence among observations at different points in time. What distinguishes time series analysis from general multivariate analysis is precisely the temporal order imposed on the observations. Many economic variables, such as GNP and its components, price indices, sales, and stock returns are observed over time. In addition to being interested in the contemporaneous relationships among such variables, we are often concerned with relationships between their current and past values, that is, relationships over time.

9,919 citations


"Predicting Taxi–Passenger Demand Us..." refers background or methods in this paper


  • ...To do so, well-known time-series forecasting techniques were used and adapted to this problem, such as the time-varying Poisson model [15] and the autoregressive integrated moving average (ARIMA) [16]....

    [...]

  • ...Third, they presented an improved ARIMA depending both on time and day type....

    [...]

  • ...The ARIMA model (p, d, and q values, and seasonality) was first set (and updated each 24 h) by learning/detecting the underlying model (i.e., autocorrelation and partial autocorrelation analysis) running on the historical time-series curve of each stand during the last two weeks (i.e., period t− 2θ, t)....

    [...]

  • ...Despite their good results, this approach comparatively has the following three weak points to the one presented: 1) it just uses the most immediate historical data, discarding the mid- and long-term memory of the system; 2) in their testbed, the authors use minimum aggregation periods of 60 min over offline historical data (i.e., the next value prediction task on a time series is easier as long as the aggregation period is increased), whereas we use short-term periods of 30 min; and 3) the work does not clearly describe how the authors update both the ARIMA model and the weights that are used by it....

    [...]
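The time-varying Poisson model mentioned in the excerpts above can be illustrated with a minimal sketch. It assumes a weekly periodicity, so that the rate for a period is estimated as the mean of the counts observed in the same weekly slot in the past; this is a simplified assumption and not the exact formulation of the cited work. Function names and data are hypothetical.

```python
import math


def slot_rate(counts, slots_per_week, target_index):
    """Estimate the Poisson rate for period `target_index`.

    Assumption: demand repeats weekly, so the rate is the mean of all
    earlier counts that fall in the same weekly slot.
    """
    slot = target_index % slots_per_week
    history = [
        c for i, c in enumerate(counts[:target_index])
        if i % slots_per_week == slot
    ]
    if not history:
        raise ValueError("no past observations for this weekly slot")
    return sum(history) / len(history)


def poisson_pmf(k, lam):
    """Probability of observing exactly k events when the rate is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Toy example with 4 slots per "week" (30-min periods would give 336).
counts = [2, 5, 1, 0,
          4, 7, 3, 0]
rate = slot_rate(counts, slots_per_week=4, target_index=9)
```

With the estimated rate, `poisson_pmf` gives the probability of any specific demand count in the next period, for example the chance of no pickups at all.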

Journal Article
10 Mar 2008 - Nature
TL;DR: In this article, the authors study the trajectory of 100,000 anonymized mobile phone users whose position is tracked for a six-month period and find that the individual travel patterns collapse into a single spatial probability distribution, indicating that humans follow simple reproducible patterns.
Abstract: The mapping of large-scale human movements is important for urban planning, traffic forecasting and epidemic prevention. Work in animals had suggested that their foraging might be explained in terms of a random walk, a mathematical rendition of a series of random steps, or a Levy flight, a random walk punctuated by occasional larger steps. The role of Levy statistics in animal behaviour is much debated, as explained in an accompanying News Feature, but the idea of extending it to human behaviour was boosted by a report in 2006 of Levy flight-like patterns in human movement tracked via dollar bills. A new human study, based on tracking the trajectory of 100,000 cell-phone users for six months, reveals behaviour close to a Levy pattern, but deviating from it as individual trajectories show a high degree of temporal and spatial regularity: work and other commitments mean we are not as free to roam as a foraging animal. But by correcting the data to accommodate individual variation, simple and predictable patterns in human travel begin to emerge. The cover photo (by Cesar Hidalgo) captures human mobility in New York's Grand Central Station. This study used a sample of 100,000 mobile phone users whose trajectory was tracked for six months to study human mobility patterns. Displacements across all users suggest behaviour close to the Levy-flight-like pattern observed previously based on the motion of marked dollar bills, but with a cutoff in the distribution. The origin of the Levy patterns observed in the aggregate data appears to be population heterogeneity and not Levy patterns at the level of the individual. Despite their importance for urban planning, traffic forecasting and the spread of biological and mobile viruses, our understanding of the basic laws governing human motion remains limited owing to the lack of tools to monitor the time-resolved location of individuals. Here we study the trajectory of 100,000 anonymized mobile phone users whose position is tracked for a six-month period. We find that, in contrast with the random trajectories predicted by the prevailing Levy flight and random walk models, human trajectories show a high degree of temporal and spatial regularity, each individual being characterized by a time-independent characteristic travel distance and a significant probability to return to a few highly frequented locations. After correcting for differences in travel distances and the inherent anisotropy of each trajectory, the individual travel patterns collapse into a single spatial probability distribution, indicating that, despite the diversity of their travel history, humans follow simple reproducible patterns. This inherent similarity in travel patterns could impact all phenomena driven by human mobility, from epidemic prevention to emergency response, urban planning and agent-based modelling.

5,514 citations

Book
TL;DR: This book covers the principles and elementary applications of probability theory (plausible reasoning, the quantitative rules, elementary sampling theory, hypothesis testing, and parameter estimation) as well as advanced applications such as the entropy principle and decision theory.
Abstract: Foreword Preface Part I. Principles and Elementary Applications: 1. Plausible reasoning 2. The quantitative rules 3. Elementary sampling theory 4. Elementary hypothesis testing 5. Queer uses for probability theory 6. Elementary parameter estimation 7. The central, Gaussian or normal distribution 8. Sufficiency, ancillarity, and all that 9. Repetitive experiments, probability and frequency 10. Physics of 'random experiments' Part II. Advanced Applications: 11. Discrete prior probabilities, the entropy principle 12. Ignorance priors and transformation groups 13. Decision theory: historical background 14. Simple applications of decision theory 15. Paradoxes of probability theory 16. Orthodox methods: historical background 17. Principles and pathology of orthodox statistics 18. The Ap distribution and rule of succession 19. Physical measurements 20. Model comparison 21. Outliers and robustness 22. Introduction to communication theory References Appendix A. Other approaches to probability theory Appendix B. Mathematical formalities and style Appendix C. Convolutions and cumulants.

4,641 citations

Journal Article
TL;DR: Experimental results with real data sets indicate that the combined model can be an effective way to improve forecasting accuracy achieved by either of the models used separately.

3,155 citations


Additional excerpts

  • ...A brief presentation of one of the simplest ARIMA models (for nonseasonal stationary time series) is presented next, following the existing description in [30] (however, our framework can...

    [...]
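For orientation, the textbook form of a nonseasonal stationary ARMA(p, q) model is given below. The notation is the standard one and is an assumption here; the truncated excerpt above does not show the symbols used in the paper itself.

```latex
X_t = c + \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}
```

Here $X_t$ is the series value at time $t$, $c$ is a constant, $\phi_i$ are the autoregressive coefficients, $\theta_j$ are the moving-average coefficients, and $\varepsilon_t$ is white noise. An ARIMA(p, d, q) model applies this form to the series after differencing it $d$ times.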