T-Finder: A Recommender System for Finding Passengers and Vacant Taxis

doi:10.1109/TKDE.2012.153

Journal Article•DOI•

Nicholas Jing Yuan¹, Yu Zheng¹, Liuhang Zhang, Xing Xie¹•Institutions (1)

Microsoft¹

01 Oct 2013-IEEE Transactions on Knowledge and Data Engineering (IEEE)-Vol. 25, Iss: 10, pp 2390-2403

TL;DR: A recommender system for both taxi drivers and people expecting to take a taxi, using the knowledge of passengers' mobility patterns and taxi drivers' picking-up/dropping-off behaviors learned from the GPS trajectories of taxicabs to provide taxi drivers with some locations toward which they are more likely to pick up passengers quickly.

Abstract: This paper presents a recommender system for both taxi drivers and people expecting to take a taxi, using the knowledge of 1) passengers' mobility patterns and 2) taxi drivers' picking-up/dropping-off behaviors learned from the GPS trajectories of taxicabs. First, this recommender system provides taxi drivers with some locations and the routes to these locations, toward which they are more likely to pick up passengers quickly (during the routes or in these locations) and maximize the profit of the next trip. Second, it recommends people with some locations (within a walking distance) where they can easily find vacant taxis. In our method, we learn the above-mentioned knowledge (represented by probabilities) from GPS trajectories of taxis. Then, we feed the knowledge into a probabilistic model that estimates the profit of the candidate locations for a particular driver based on where and when the driver requests the recommendation. We build our system using historical trajectories generated by over 12,000 taxis during 110 days and validate the system with extensive evaluations including in-the-field user studies.

Citations

Journal Article•DOI•

Urban Computing: Concepts, Methodologies, and Applications

Yu Zheng¹, Licia Capra², Ouri Wolfson³, Hai Yang⁴•Institutions (4)

Microsoft¹, University College London², University of Illinois at Chicago³, Hong Kong University of Science and Technology⁴

18 Sep 2014-ACM Transactions on Intelligent Systems and Technology

TL;DR: The concept of urban computing is introduced, discussing its general framework and key challenges from the perspective of computer sciences, and the typical technologies that are needed in urban computing are summarized into four folds.

Abstract: Urbanization's rapid progress has modernized many people's lives but also engendered big issues, such as traffic congestion, energy consumption, and pollution. Urban computing aims to tackle these issues by using the data that has been generated in cities (e.g., traffic flow, human mobility, and geographical data). Urban computing connects urban sensing, data management, data analytics, and service providing into a recurrent process for an unobtrusive and continuous improvement of people's lives, city operation systems, and the environment. Urban computing is an interdisciplinary field where computer sciences meet conventional city-related fields, like transportation, civil engineering, environment, economy, ecology, and sociology in the context of urban spaces. This article first introduces the concept of urban computing, discussing its general framework and key challenges from the perspective of computer sciences. Second, we classify the applications of urban computing into seven categories, consisting of urban planning, transportation, the environment, energy, social, economy, and public safety and security, presenting representative scenarios in each category. Third, we summarize the typical technologies that are needed in urban computing into four folds, which are about urban sensing, urban data management, knowledge fusion across heterogeneous data, and urban data visualization. Finally, we give an outlook on the future of urban computing, suggesting a few research topics that are somehow missing in the community.

1,290 citations

Cites background from "T-Finder: A Recommender System for ..."

...2, where Yuan et al. [2012a] inferred the functional regions in a city using road network data, points of interest, and human mobility learned from a large number of taxi trips....
[...]

Journal Article•DOI•

Trajectory Data Mining: An Overview

Yu Zheng¹•Institutions (1)

Microsoft¹

12 May 2015-ACM Transactions on Intelligent Systems and Technology

TL;DR: A systematic survey on the major research into trajectory data mining, providing a panorama of the field as well as the scope of its research topics, and introduces the methods that transform trajectories into other data formats, such as graphs, matrices, and tensors.

Abstract: The advances in location-acquisition and mobile computing techniques have generated massive spatial trajectory data, which represent the mobility of a diversity of moving objects, such as people, vehicles, and animals. Many techniques have been proposed for processing, managing, and mining trajectory data in the past decade, fostering a broad range of applications. In this article, we conduct a systematic survey on the major research into trajectory data mining, providing a panorama of the field as well as the scope of its research topics. Following a road map from the derivation of trajectory data, to trajectory data preprocessing, to trajectory data management, and to a variety of mining tasks (such as trajectory pattern mining, outlier detection, and trajectory classification), the survey explores the connections, correlations, and differences among these existing techniques. This survey also introduces the methods that transform trajectories into other data formats, such as graphs, matrices, and tensors, to which more data mining and machine learning techniques can be applied. Finally, some public trajectory datasets are presented. This survey can help shape the field of trajectory data mining, providing a quick understanding of this field to the community.

1,289 citations

Cites background or methods from "T-Finder: A Recommender System for ..."

...On the other hand, in some applications, for example, estimating the travel time of a path [Wang et al. 2014] and driving direction suggestion [Yuan et al. 2013a], such stay points should be removed from a trajectory during the preprocessing....
[...]
...The noise filtering method, which has been used in T-Drive [Yuan et al. 2010a, 2011a, 2013a] and GeoLife [Zheng et al. 2009a; Zheng et al. 2010] projects, first calculates the travel speed of each point in a trajectory based on the time interval and distance between a point and its successor (we…...
[...]
..., ∆tn−1 → sn , therefore facilitating a diversity of applications, such as travel recommendations [152] [154], destination prediction [116], taxi recommendation [127][129], and gas consumption estimation[132][133]....
[...]
...For example, in a task of travel speed estimation, we should remove the stay points (from a taxi’s trajectory) where a taxi was parked to wait for passengers [Yuan et al. 2013b]....
[...]
...…s2 t2→, . . . , tn−1→ sn, therefore facilitating a diversity of applications, such as travel recommendations [Zheng and Xie 2011b; Zheng et al. 2011c], destination prediction [Ye et al. 2009], taxi recommendation [Yuan et al. 2011b, 2013b], and gas consumption estimation [Zhang et al. 2013, 2015]....
[...]
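
The speed-based noise filtering mentioned in the excerpts above can be sketched as follows. This is a minimal illustration, not the T-Drive or GeoLife implementation: the excerpt is truncated, so the 50 m/s threshold, the function names, and the choice to compare each point against the last kept point (rather than strictly against its successor) are all assumptions made for the example. The first point is assumed to be valid.

```python
import math
from typing import List, Tuple

# One GPS fix: (timestamp in seconds, latitude in degrees, longitude in degrees).
Point = Tuple[float, float, float]


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude pairs."""
    radius_m = 6_371_000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def filter_by_speed(traj: List[Point], max_speed_mps: float = 50.0) -> List[Point]:
    """Drop points whose implied speed from the last kept point is implausible.

    max_speed_mps is a hypothetical threshold chosen for illustration.
    """
    if not traj:
        return []
    kept = [traj[0]]  # assumption: the first fix is trustworthy
    for t, lat, lon in traj[1:]:
        t0, lat0, lon0 = kept[-1]
        dt = t - t0
        if dt <= 0:
            continue  # skip duplicated or out-of-order timestamps
        speed = haversine_m(lat0, lon0, lat, lon) / dt
        if speed <= max_speed_mps:
            kept.append((t, lat, lon))
    return kept


trajectory = [
    (0.0, 39.9000, 116.4000),
    (10.0, 39.9001, 116.4001),
    (20.0, 40.5000, 117.0000),  # jump of tens of kilometres in 10 s: treated as noise
    (30.0, 39.9002, 116.4002),
]
cleaned = filter_by_speed(trajectory)
```

Comparing against the last kept point means a single outlier does not also cause the following valid point to be discarded.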

Journal Article•DOI•

Mobile Crowd Sensing and Computing: The Review of an Emerging Human-Powered Sensing Paradigm

Bin Guo¹, Zhu Wang¹, Zhiwen Yu¹, Yu Wang², Neil Y. Yen³, Runhe Huang⁴, Xingshe Zhou¹•Institutions (4)

Northwestern Polytechnical University¹, University of North Carolina at Charlotte², University of Aizu³, Hosei University⁴

10 Aug 2015-ACM Computing Surveys

TL;DR: The unique features and novel application areas of MCSC are characterized, and a reference framework for building human-in-the-loop MCSC systems is proposed; the article also clarifies the complementary nature of human and machine intelligence and envisions the potential of deep-fused human-machine systems.

Abstract: With the surging of smartphone sensing, wireless networking, and mobile social networking techniques, Mobile Crowd Sensing and Computing (MCSC) has become a promising paradigm for cross-space and large-scale sensing. MCSC extends the vision of participatory sensing by leveraging both participatory sensory data from mobile devices (offline) and user-contributed data from mobile social networking services (online). Further, it explores the complementary roles and presents the fusion/collaboration of machine and human intelligence in the crowd sensing and computing processes. This article characterizes the unique features and novel application areas of MCSC and proposes a reference framework for building human-in-the-loop MCSC systems. We further clarify the complementary nature of human and machine intelligence and envision the potential of deep-fused human-machine systems. We conclude by discussing the limitations, open issues, and research opportunities of MCSC.

650 citations

Cites background from "T-Finder: A Recommender System for ..."

...Service/activity recommendation T-Finder [Yuan et al. 2013] (GPS from taxis)...
[...]
...For example, T-Finder [Yuan et al. 2013] was a recommending system that could guide taxi drivers to the places where passengers could more easily be picked up....
[...]

Proceedings Article•DOI•

Travel time estimation of a path using sparse trajectories

Yilun Wang¹, Yu Zheng¹, Yexiang Xue²•Institutions (2)

Microsoft¹, Cornell University²

24 Aug 2014

TL;DR: A citywide and real-time model for estimating the travel time of any path (represented as a sequence of connected road segments) in real time in a city, based on the GPS trajectories of vehicles received in current time slots and over a period of history as well as map data sources is proposed.

Abstract: In this paper, we propose a citywide and real-time model for estimating the travel time of any path (represented as a sequence of connected road segments) in real time in a city, based on the GPS trajectories of vehicles received in current time slots and over a period of history as well as map data sources. Though this is a strategically important task in many traffic monitoring and routing systems, the problem has not been well solved yet given the following three challenges. The first is the data sparsity problem, i.e., many road segments may not be traveled by any GPS-equipped vehicles in the present time slot. In most cases, we cannot find a trajectory exactly traversing a query path either. Second, for the fragment of a path with trajectories, there are multiple ways of using (or combining) the trajectories to estimate the corresponding travel time. Finding an optimal combination is a challenging problem, subject to a tradeoff between the length of a path and the number of trajectories traversing the path (i.e., support). Third, we need to instantly answer users' queries which may occur in any part of a given city. This calls for an efficient, scalable and effective solution that can enable a citywide and real-time travel time estimation. To address these challenges, we model different drivers' travel times on different road segments in different time slots with a three-dimensional tensor. Combined with geospatial, temporal and historical contexts learned from trajectories and map data, we fill in the tensor's missing values through a context-aware tensor decomposition approach. We then devise and prove an objective function to model the aforementioned tradeoff, with which we find the optimal concatenation of trajectories for an estimate through a dynamic programming solution. In addition, we propose using frequent trajectory patterns (mined from historical trajectories) to scale down the candidates of concatenation and a suffix-tree-based index to manage the trajectories received in the present time slot. We evaluate our method based on extensive experiments, using GPS trajectories generated by more than 32,000 taxis over a period of two months. The results demonstrate the effectiveness, efficiency and scalability of our method beyond baseline approaches.

488 citations

Cites background from "T-Finder: A Recommender System for ..."

...INTRODUCTION Real-time estimation of the travel time of a path, which is represented by a sequence of connected road segments, is of great importance for traffic monitoring [1], finding driving directions [20], ridesharing [13] and taxi dispatching [22]....
[...]

Proceedings Article•DOI•

T-share: A large-scale dynamic taxi ridesharing service

Shuo Ma¹, Yu Zheng², Ouri Wolfson¹•Institutions (2)

University of Illinois at Chicago¹, Microsoft²

08 Apr 2013

TL;DR: The dynamic ridesharing problem is formally defined, and a large-scale taxi ridesharing service is proposed that efficiently serves real-time requests sent by taxi users and generates ridesharing schedules that reduce the total travel distance significantly.

Abstract: Taxi ridesharing can be of significant social and environmental benefit, e.g. by saving energy consumption and satisfying people's commute needs. Despite the great potential, taxi ridesharing, especially with dynamic queries, is not well studied. In this paper, we formally define the dynamic ridesharing problem and propose a large-scale taxi ridesharing service. It efficiently serves real-time requests sent by taxi users and generates ridesharing schedules that reduce the total travel distance significantly. In our method, we first propose a taxi searching algorithm using a spatio-temporal index to quickly retrieve candidate taxis that are likely to satisfy a user query. A scheduling algorithm is then proposed. It checks each candidate taxi and inserts the query's trip into the schedule of the taxi which satisfies the query with minimum additional incurred travel distance. To tackle the heavy computational load, a lazy shortest path calculation strategy is devised to speed up the scheduling algorithm. We evaluated our service using a GPS trajectory dataset generated by over 33,000 taxis during a period of 3 months. By learning the spatio-temporal distributions of real user queries from this dataset, we built an experimental platform that simulates user real behaviours in taking a taxi. Tested on this platform with extensive experiments, our approach demonstrated its efficiency, effectiveness, and scalability. For example, our proposed service serves 25% additional taxi users while saving 13% travel distance compared with no-ridesharing (when the ratio of the number of queries to that of taxis is 6).
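
The scheduling step described in this abstract (check each candidate taxi and insert the trip where it adds the least travel distance) might look roughly like the sketch below. It is only an illustration under strong assumptions: straight-line distance stands in for road-network distance, the time-window checks that decide whether a taxi "satisfies the query" are omitted, and the names `best_insertion` and `assign_query` are hypothetical rather than taken from T-share.

```python
import math
from typing import Dict, List, Tuple

Loc = Tuple[float, float]


def dist(a: Loc, b: Loc) -> float:
    """Straight-line distance; an assumed stand-in for road-network distance."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def route_length(start: Loc, stops: List[Loc]) -> float:
    """Total length of visiting the stops in order, beginning at start."""
    total, prev = 0.0, start
    for stop in stops:
        total += dist(prev, stop)
        prev = stop
    return total


def best_insertion(start: Loc, stops: List[Loc], pickup: Loc, dropoff: Loc):
    """Try every position pair (pickup before dropoff) and keep the cheapest."""
    base = route_length(start, stops)
    best = None
    for i in range(len(stops) + 1):
        for j in range(i, len(stops) + 1):
            candidate = stops[:i] + [pickup] + stops[i:j] + [dropoff] + stops[j:]
            added = route_length(start, candidate) - base
            if best is None or added < best[0]:
                best = (added, candidate)
    return best


def assign_query(taxis: Dict[str, Tuple[Loc, List[Loc]]], pickup: Loc, dropoff: Loc):
    """Return (taxi_id, added_distance, new_schedule) with the smallest detour."""
    best = None
    for taxi_id, (position, stops) in taxis.items():
        added, schedule = best_insertion(position, stops, pickup, dropoff)
        if best is None or added < best[1]:
            best = (taxi_id, added, schedule)
    return best


candidate_taxis = {
    "A": ((0.0, 0.0), [(10.0, 0.0)]),  # already heading east along the x-axis
    "B": ((0.0, 5.0), []),             # idle taxi, further from the pickup
}
chosen = assign_query(candidate_taxis, pickup=(2.0, 0.0), dropoff=(8.0, 0.0))
```

In this toy case taxi A is chosen because the new trip lies on its existing route, so the insertion adds no distance. The exhaustive search over insertion positions is quadratic in schedule length, which is acceptable for a sketch but is exactly the kind of cost the abstract's lazy shortest-path strategy is meant to reduce.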

487 citations

References

Journal Article•DOI•

Bagging predictors

Leo Breiman

01 Aug 1996

TL;DR: Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy.

Abstract: Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making bootstrap replicates of the learning set and using these as new learning sets. Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy. The vital element is the instability of the prediction method. If perturbing the learning set can cause significant changes in the predictor constructed, then bagging can improve accuracy.
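
As a rough illustration of the procedure this abstract describes, the sketch below draws bootstrap replicates of a learning set, fits one predictor per replicate, and averages their numerical outputs. The 1-nearest-neighbour base learner, the function names, and the default of 25 replicates are assumptions chosen to keep the example small; only the regression (averaging) case is shown, not the plurality vote used for classification.

```python
import random
from statistics import mean
from typing import Callable, List, Tuple

Sample = Tuple[float, float]           # (x, y)
Predictor = Callable[[float], float]


def fit_1nn(train: List[Sample]) -> Predictor:
    """Illustrative base learner: predict the y of the closest training x."""
    def predict(x: float) -> float:
        return min(train, key=lambda s: abs(s[0] - x))[1]
    return predict


def bagged_regressor(
    train: List[Sample],
    fit: Callable[[List[Sample]], Predictor],
    n_models: int = 25,
    seed: int = 0,
) -> Predictor:
    """Fit one model per bootstrap replicate and average their predictions."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap replicate: sample with replacement, same size as the learning set.
        replicate = [rng.choice(train) for _ in train]
        models.append(fit(replicate))
    return lambda x: mean(model(x) for model in models)


learning_set = [(float(x), 2.0 * x) for x in range(10)]
bagged = bagged_regressor(learning_set, fit_1nn)
prediction = bagged(4.2)
```

Because each replicate omits some points, the individual 1-NN models disagree near the gaps, and the average smooths over that disagreement. This is the instability the abstract names as the vital element.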

16,118 citations

Book•

Stochastic Processes

Sheldon M. Ross

06 Dec 1982

6,033 citations

"T-Finder: A Recommender System for ..." refers background in this paper

...The arrival of the vacant taxis on a given road segment can be modeled using a nonhomogeneous Poisson process [13] (which is a Poisson process with a time-dependent arriving rate function) with arriving rate μ(t)....
[...]
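
For a nonhomogeneous Poisson process with rate μ(t), the number of arrivals in a window [t, t + w] is Poisson distributed with mean equal to the integral of μ over that window, so the probability of at least one arrival is 1 − exp(−∫μ). The sketch below computes this with a trapezoidal rule. The rate functions and names used here are made up for illustration and are not taken from T-Finder.

```python
import math
from typing import Callable


def expected_arrivals(mu: Callable[[float], float], t_start: float,
                      t_end: float, steps: int = 1000) -> float:
    """Trapezoidal approximation of the integral of mu over [t_start, t_end]."""
    if t_end <= t_start:
        return 0.0
    h = (t_end - t_start) / steps
    total = 0.5 * (mu(t_start) + mu(t_end))
    for i in range(1, steps):
        total += mu(t_start + i * h)
    return total * h


def prob_at_least_one(mu: Callable[[float], float], t_start: float, wait: float) -> float:
    """P(at least one arrival in [t_start, t_start + wait]) under an NHPP."""
    return 1.0 - math.exp(-expected_arrivals(mu, t_start, t_start + wait))


# Hypothetical rates in arrivals per minute: one constant, one growing with time.
p_constant = prob_at_least_one(lambda t: 0.5, t_start=0.0, wait=2.0)
p_linear = prob_at_least_one(lambda t: t, t_start=0.0, wait=2.0)
```

With a constant rate the result reduces to the familiar homogeneous case, 1 − exp(−0.5 · 2); the time-dependent rate only changes the integral.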

Journal Article•DOI•

OPTICS: ordering points to identify the clustering structure

Mihael Ankerst¹, Markus M. Breunig¹, Hans-Peter Kriegel¹, Jörg Sander¹•Institutions (1)

Ludwig Maximilian University of Munich¹

01 Jun 1999

TL;DR: A new algorithm is introduced for the purpose of cluster analysis which does not produce a clustering of a data set explicitly; but instead creates an augmented ordering of the database representing its density-based clustering structure.

Abstract: Cluster analysis is a primary method for database mining. It is either used as a stand-alone tool to get insight into the distribution of a data set, e.g. to focus further analysis and data processing, or as a preprocessing step for other algorithms operating on the detected clusters. Almost all of the well-known clustering algorithms require input parameters which are hard to determine but have a significant influence on the clustering result. Furthermore, for many real-data sets there does not even exist a global parameter setting for which the result of the clustering algorithm describes the intrinsic clustering structure accurately. We introduce a new algorithm for the purpose of cluster analysis which does not produce a clustering of a data set explicitly; but instead creates an augmented ordering of the database representing its density-based clustering structure. This cluster-ordering contains information which is equivalent to the density-based clusterings corresponding to a broad range of parameter settings. It is a versatile basis for both automatic and interactive cluster analysis. We show how to automatically and efficiently extract not only 'traditional' clustering information (e.g. representative points, arbitrary shaped clusters), but also the intrinsic clustering structure. For medium sized data sets, the cluster-ordering can be represented graphically and for very large data sets, we introduce an appropriate visualization technique. Both are suitable for interactive exploration of the intrinsic clustering structure offering additional insights into the distribution and correlation of the data.

4,020 citations

Additional excerpts

...As depicted in Fig....
[...]

Proceedings Article•DOI•

Driving with knowledge from the physical world

Jing Yuan¹, Yu Zheng², Xing Xie², Guangzhong Sun¹•Institutions (2)

University of Science and Technology of China¹, Microsoft²

21 Aug 2011

TL;DR: A Cloud-based system computing customized and practically fast driving routes for an end user using (historical and real-time) traffic conditions and driver behavior, which accurately estimates the travel time of a route for a user; hence finding the fastest route customized for the user.

Abstract: This paper presents a Cloud-based system computing customized and practically fast driving routes for an end user using (historical and real-time) traffic conditions and driver behavior. In this system, GPS-equipped taxicabs are employed as mobile sensors constantly probing the traffic rhythm of a city and taxi drivers' intelligence in choosing driving directions in the physical world. Meanwhile, a Cloud aggregates and mines the information from these taxis and other sources from the Internet, like Web maps and weather forecast. The Cloud builds a model incorporating day of the week, time of day, weather conditions, and individual driving strategies (both of the taxi drivers and of the end user for whom the route is being computed). Using this model, our system predicts the traffic conditions of a future time (when the computed route is actually driven) and performs a self-adaptive driving direction service for a particular user. This service gradually learns a user's driving behavior from the user's GPS logs and customizes the fastest route for the user with the help of the Cloud. We evaluate our service using a real-world dataset generated by over 33,000 taxis over a period of 3 months in Beijing. As a result, our service accurately estimates the travel time of a route for a user; hence finding the fastest route customized for the user.

758 citations

Additional excerpts

...Then, we feed the knowledge into a probabilistic model that estimates the profit of the candidate locations for a particular driver based on where and when the driver requests the recommendation....
[...]

Journal Article•DOI•

Interpreting TF-IDF term weights as making relevance decisions

H. C. Wu¹, Robert W. P. Luk¹, Kam-Fai Wong², Kui Lam Kwok³•Institutions (3)

Hong Kong Polytechnic University¹, The Chinese University of Hong Kong², Queens College³

20 Jun 2008-ACM Transactions on Information Systems

TL;DR: A novel probabilistic retrieval model forms a basis to interpret the TF-IDF term weights as making relevance decisions, and it is shown that the term-frequency factor of the ranking formula can be rendered into different term-frequency factors of existing retrieval systems.

Abstract: A novel probabilistic retrieval model is presented. It forms a basis to interpret the TF-IDF term weights as making relevance decisions. It simulates the local relevance decision-making for every location of a document, and combines all of these “local” relevance decisions as the “document-wide” relevance decision for the document. The significance of interpreting TF-IDF in this way is the potential to: (1) establish a unifying perspective about information retrieval as relevance decision-making; and (2) develop advanced TF-IDF-related term weights for future elaborate retrieval models. Our novel retrieval model is simplified to a basic ranking formula that directly corresponds to the TF-IDF term weights. In general, we show that the term-frequency factor of the ranking formula can be rendered into different term-frequency factors of existing retrieval systems. In the basic ranking formula, the remaining quantity −log p(r̄ | t ∈ d) is interpreted as the probability of randomly picking a nonrelevant usage (denoted by r̄) of term t. Mathematically, we show that this quantity can be approximated by the inverse document-frequency (IDF). Empirically, we show that this quantity is related to IDF, using four reference TREC ad hoc retrieval data collections.
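
For readers unfamiliar with the weights being interpreted, the sketch below computes a plain textbook TF-IDF (relative term frequency times log of inverse document frequency). It is not the probabilistic retrieval model proposed in this paper; the function name and the toy documents are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term at least once.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


corpus = [
    ["taxi", "route", "taxi"],
    ["taxi", "passenger"],
]
scores = tf_idf(corpus)
```

A term that occurs in every document (here "taxi") gets an IDF of log(1) = 0 and therefore zero weight, while rarer terms such as "route" are weighted up.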

752 citations

"T-Finder: A Recommender System for ..." refers background in this paper

...This section details the process for detecting parked status from a nonoccupied trip and accordingly finding out the parking places in the urban area of a city based on a collection of taxi trajectories....
[...]