Author

Jianfeng Liu

Bio: Jianfeng Liu is an academic researcher. The author has contributed to research in topics: DBSCAN & Public transport. The author has an h-index of 4 and has co-authored 5 publications receiving 548 citations.

Topics: DBSCAN, Public transport, Cluster analysis, Smart card, Transaction data ...

Papers

Journal Article

Mining smart card data for transit riders’ travel patterns

Xiaolei Ma¹, Yao Jan Wu², Yinhai Wang¹, Feng Chen, Jianfeng Liu

University of Washington¹, University of Arizona²

01 Nov 2013, Transportation Research Part C: Emerging Technologies

TL;DR: This paper proposes an efficient and effective data-mining procedure that models the travel patterns of transit riders in Beijing, China, and identifies trip chains based on the temporal and spatial characteristics of their smart card transaction data.

Abstract: To mitigate the congestion caused by the ever increasing number of privately owned automobiles, public transit is highly promoted by transportation agencies worldwide. A better understanding of travel patterns and regularity at the “magnitude” level will enable transit authorities to evaluate the services they offer, adjust marketing strategies, retain loyal customers and improve overall transit performance. However, it is fairly challenging to identify travel patterns for individual transit riders in a large dataset. This paper proposes an efficient and effective data-mining procedure that models the travel patterns of transit riders in Beijing, China. Transit riders’ trip chains are identified based on the temporal and spatial characteristics of their smart card transaction data. The Density-based Spatial Clustering of Applications with Noise (DBSCAN) algorithm then analyzes the identified trip chains to detect transit riders’ historical travel patterns and the K-Means++ clustering algorithm and the rough-set theory are jointly applied to cluster and classify travel pattern regularities. The performance of the rough-set-based algorithm is compared with those of other prevailing classification algorithms. The results indicate that the proposed rough-set-based algorithm outperforms other commonly used data-mining algorithms in terms of accuracy and efficiency.
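
The density-based clustering step named in the abstract can be illustrated with a minimal DBSCAN sketch over one rider's boarding times. The feature (minutes since midnight), the function name `dbscan_1d`, and the parameter values are assumptions made for illustration; the abstract does not state which trip-chain features or parameters the authors used.

```python
NOISE = -1  # label for points that belong to no dense cluster


def dbscan_1d(values, eps, min_pts):
    """Minimal DBSCAN on 1-D values; returns one cluster label per value."""
    n = len(values)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # All points within eps of point i (the point itself included).
        return [j for j in range(n) if abs(values[i] - values[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nb = neighbors(i)
        if len(nb) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nb if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb_j = neighbors(j)
            if len(nb_j) >= min_pts:
                queue.extend(nb_j)  # j is a core point, so keep expanding
    return labels


# Hypothetical boarding times (minutes since midnight) for one rider.
boarding_minutes = [480, 485, 490, 1020, 1025, 1030, 780]
print(dbscan_1d(boarding_minutes, eps=10, min_pts=3))
# → [0, 0, 0, 1, 1, 1, -1]
```

In this toy input the two dense groups correspond to a recurring morning and evening trip, while the isolated midday boarding is labelled as noise.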

510 citations

Mining Smart Card Data for Transit Riders’ Travel Patterns

Xiaolei Ma¹, Yao Jan Wu², Yinhai Wang¹, Feng Chen, Jianfeng Liu

University of Washington¹, University of Arizona²

01 Jan 2013

TL;DR: This paper proposes an efficient and effective data-mining procedure that models the travel patterns of transit riders in Beijing, China, and indicates that the proposed rough-set-based algorithm outperforms other commonly used data-mining algorithms in terms of accuracy and efficiency.

Abstract: To mitigate congestion caused by the increasing number of privately owned automobiles, public transit is highly promoted by transportation agencies worldwide. With a better understanding of the travel patterns and regularity (the “magnitude” level of travel pattern) of transit riders, transit authorities can evaluate the current transit services to adjust marketing strategies, keep loyal customers and improve transit performance. However, it is fairly challenging to identify travel pattern for each individual transit rider in a large dataset. Therefore, this paper proposes an efficient and effective data-mining approach that models the travel patterns of transit riders using the smart card data collected in Beijing, China. Transit riders’ trip chains are identified based on the temporal and spatial characteristics of smart card transaction data. Based on the identified trip chains, the Density-based Spatial Clustering of Applications with Noise (DBSCAN) algorithm is used to detect each transit rider’s historical travel patterns. The K-Means++ clustering algorithm and the rough-set theory are jointly applied to clustering and classifying the travel pattern regularities. The rough-set-based algorithm is compared with other classification algorithms, including Naive Bayes Classifier, C4.5 Decision Tree, K-Nearest Neighbor (KNN) and three-hidden-layers Neural Network. The results indicate that the proposed rough-set-based algorithm outperforms other prevailing data-mining algorithms in terms of accuracy and efficiency.

90 citations

Journal Article

Transit smart card data mining for passenger origin information extraction

Xiaolei Ma¹, Yinhai Wang¹, Feng Chen, Jianfeng Liu

University of Washington¹

10 Oct 2012, Journal of Zhejiang University Science C

TL;DR: To extract passengers’ origin data from recorded SC transaction information, a Markov chain based Bayesian decision tree algorithm is developed in this study and verified with transit vehicles equipped with global positioning system (GPS) data loggers.

Abstract: The automated fare collection (AFC) system, also known as the transit smart card (SC) system, has gained more and more popularity among transit agencies worldwide. Compared with the conventional manual fare collection system, an AFC system has its inherent advantages in low labor cost and high efficiency for fare collection and transaction data archival. Although it is possible to collect highly valuable data from transit SC transactions, substantial efforts and methodologies are needed for extracting such data because most AFC systems are not initially designed for data collection. This is true especially for the Beijing AFC system, where a passenger’s boarding stop (origin) on a flat-rate bus is not recorded on the check-in scan. To extract passengers’ origin data from recorded SC transaction information, a Markov chain based Bayesian decision tree algorithm is developed in this study. Using the time invariance property of the Markov chain, the algorithm is further optimized and simplified to have a linear computational complexity. This algorithm is verified with transit vehicles equipped with global positioning system (GPS) data loggers. Our verification results demonstrated that the proposed algorithm is effective in extracting transit passengers’ origin information from SC transactions with a relatively high accuracy. Such transit origin data are highly valuable for transit system planning and route optimization.

89 citations

Patent

Method for reckoning getting-on stops on basis of data of one-ticket public-transport integrated circuit (IC) card

Feng Chen, Huimin Wen, Jifu Guo, Yinhai Wang, Jianfeng Liu

04 May 2011

TL;DR: This patent presents a method, based on a Bayesian decision tree, for reckoning (inferring) passengers' getting-on stops from the data of a one-ticket public-transport integrated circuit (IC) card.

Abstract: The invention discloses a method for reckoning getting-on stops on the basis of data of a one-ticket public-transport integrated circuit (IC) card, comprising the following steps: a, clustering the card-swiping records of a target bus according to their times, taking time-adjacent card-swiping records as a cluster, where each cluster corresponds to one stop, and forming a stop sequence to be recognized; b, determining transfer information according to the adjacent card-swiping records of each IC card; c, according to the transfer information, the crosspoint (intersection) data between bus routes and the stop information of the bus routes, reckoning the actual stops corresponding to the clusters of the IC card and forming a recognized stop sequence; d, according to the recognized stop sequence, reckoning the actual stop corresponding to each cluster still to be recognized: when the recognized stop quantity is less than 2, reckoning by a Bayesian decision tree method featuring a moving step, and when the recognized cluster quantity is more than or equal to 2, reckoning by a pattern recognition method. The method places low requirements on the data and achieves higher accuracy.
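
Step a of the abstract groups time-adjacent card-swiping records into clusters, one per stop. A minimal sketch of that idea follows, assuming timestamps expressed in seconds and a hypothetical gap threshold `max_gap_s`; the abstract does not say how "time-adjacent" is decided, so the threshold rule and the function name are illustrative assumptions only.

```python
def cluster_by_time_gap(timestamps, max_gap_s):
    """Group swipe times so that consecutive swipes in a cluster are
    at most max_gap_s seconds apart. Returns a list of clusters."""
    clusters = []
    for t in sorted(timestamps):
        if clusters and t - clusters[-1][-1] <= max_gap_s:
            clusters[-1].append(t)  # close enough to the previous swipe
        else:
            clusters.append([t])    # large gap: treat as a new stop
    return clusters


# Hypothetical swipe times on one bus run (seconds from departure).
swipes = [0, 5, 12, 200, 204, 500]
for index, group in enumerate(cluster_by_time_gap(swipes, max_gap_s=60)):
    print(index, group)
```

Each resulting cluster stands for one unnamed stop in the "stop sequence to be recognized"; mapping clusters to actual stops is the job of steps b to d and is not sketched here.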

11 citations

Proceedings Article

Temporal Distribution Analysis of Beijing’s Subway Ridership

Jianfeng Liu, Xiaolei Ma, Congcong Liu, Yinhai Wang, Jing Wang

29 Jun 2016

TL;DR: This paper categorizes Beijing subway ridership characteristics into seven groups based on their temporal distributions and corresponding land use types by analyzing Beijing subway smart card data, and analyzes the heterogeneity among stop-level, line-level and network-level ridership temporal distributions.

Abstract: Similar to other metropolitan cities in China, urban rail transportation has been highly emphasized in Beijing in the past decades. However, the growing subway system is seriously challenged by several critical issues, one of which is that the ridership must be restricted at certain stations during peak hours due to crowded conditions. This suppresses the transport demand and pushes passengers to other modes of transport. Therefore, the characteristics of subway ridership should be carefully studied to increase the appeal of subways and maximize their potential. This paper categorizes Beijing subway ridership characteristics into seven different groups based on their temporal distributions and corresponding land use types by analyzing Beijing’s subway smart card data. In addition, the heterogeneity among stop-level, line-level, and network-level ridership temporal distributions is analyzed. Temporal distribution characteristics should be incorporated into ridership prediction and subway network optimization, and can thereby improve subway demand forecasting, planning, and design.

2 citations

Cited by

Journal Article

Machine Learning for Internet of Things Data Analysis: A Survey

Mohammad Saeid Mahdavinejad¹,², Mohammadreza Rezvan¹,², Mohammadamin Barekatain³, Peyman Adibi¹, Payam Barnaghi⁴, Amit P. Sheth²

University of Isfahan¹, Wright State University², Technische Universität München³, University of Surrey⁴

12 Oct 2017, Digital Communications and Networks

TL;DR: This article assesses the different machine learning methods that deal with the challenges in IoT data by considering smart cities as the main use case and presents a taxonomy of machine learning algorithms explaining how different techniques are applied to the data in order to extract higher level information.

690 citations

Journal Article

The promises of big data and small data for travel behavior (aka human mobility) analysis.

Cynthia Chen¹, Jingtao Ma, Yusak O. Susilo², Yu Liu³, Menglin Wang⁴

University of Washington¹, Royal Institute of Technology², Peking University³, Cambridge Systematics⁴

01 Jul 2016, Transportation Research Part C: Emerging Technologies

TL;DR: The purpose of this paper is to introduce datasets, concepts, knowledge and methods used in these two fields, and most importantly raise cross-discipline ideas for conversations and collaborations between the two.

Abstract: The last decade has witnessed very active development in two broad but separate fields, both involving understanding and modeling of how individuals move in time and space (hereafter called "travel behavior analysis" or "human mobility analysis"). One field comprises transportation researchers who have been working in the field for decades, and the other involves newcomers from a wide range of disciplines, primarily computer scientists and physicists. Researchers in these two fields work with different datasets, apply different methodologies, and answer different but overlapping questions. It is our view that there is much hidden synergy between the two fields that needs to be brought out. It is thus the purpose of this paper to introduce datasets, concepts, knowledge and methods used in these two fields, and most importantly to raise cross-discipline ideas for conversations and collaborations between the two. It is our hope that this paper will stimulate many future cross-cutting studies that involve researchers from both fields.

425 citations

Journal Article

Machine learning for Internet of Things data analysis: A survey

Mohammad Saeid Mahdavinejad¹,², Mohammadreza Rezvan¹,², Mohammadamin Barekatain³, Peyman Adibi², Payam Barnaghi⁴, Amit P. Sheth¹

Wright State University¹, University of Isfahan², Technische Universität München³, University of Surrey⁴

17 Feb 2018, arXiv: Learning

TL;DR: In this article, the authors present a taxonomy of machine learning algorithms that can be applied to the data in order to extract higher level information, and a use case of applying Support Vector Machine (SVM) on Aarhus Smart City traffic data is presented for more detailed exploration.

Abstract: Rapid developments in hardware, software, and communication technologies have allowed the emergence of Internet-connected sensory devices that provide observation and data measurement from the physical world. By 2020, it is estimated that the total number of Internet-connected devices being used will be between 25 and 50 billion. As the numbers grow and technologies become more mature, the volume of data published will increase. Internet-connected devices technology, referred to as Internet of Things (IoT), continues to extend the current Internet by providing connectivity and interaction between the physical and cyber worlds. In addition to increased volume, the IoT generates Big Data characterized by velocity in terms of time and location dependency, with a variety of multiple modalities and varying data quality. Intelligent processing and analysis of this Big Data is the key to developing smart IoT applications. This article assesses the different machine learning methods that deal with the challenges in IoT data by considering smart cities as the main use case. The key contribution of this study is presentation of a taxonomy of machine learning algorithms explaining how different techniques are applied to the data in order to extract higher level information. The potential and challenges of machine learning for IoT data analytics will also be discussed. A use case of applying Support Vector Machine (SVM) on Aarhus Smart City traffic data is presented for a more detailed exploration.

375 citations

An essay towards solving a problem in the doctrine of chances. [Facsimil]

Thomas Bayes

01 Jan 2001

TL;DR: The probability of any event is the ratio between the value at which an expectation depending on the happening of the event ought to be computed, and the value of the thing expected upon its happening.

Abstract: Problem. Given the number of times in which an unknown event has happened and failed: Required the chance that the probability of its happening in a single trial lies somewhere between any two degrees of probability that can be named. Section 1. Definitions. 1. Several events are inconsistent, when if one of them happens, none of the rest can. 2. Two events are contrary when one, or other of them must; and both together cannot happen. 3. An event is said to fail, when it cannot happen; or, which comes to the same thing, when its contrary has happened. 4. An event is said to be determined when it has either happened or failed. 5. The probability of any event is the ratio between the value at which an expectation depending on the happening of the event ought to be computed, and the value of the thing expected upon its happening.
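
The problem stated above has a closed form under one common assumption: a uniform prior on the unknown probability, which yields a Beta(s + 1, f + 1) posterior after s happenings and f failures. The sketch below is an illustration under that assumption, with hypothetical function names; it relies on the standard identity that the Beta CDF with integer parameters equals a binomial tail sum.

```python
from math import comb


def beta_cdf_int(x, a, b):
    """CDF at x of a Beta(a, b) distribution for positive integers a, b,
    computed as a binomial tail sum with n = a + b - 1 trials."""
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))


def prob_between(successes, failures, lo, hi):
    """Posterior chance that the single-trial probability lies in [lo, hi],
    assuming a uniform prior (an assumption of this sketch)."""
    a, b = successes + 1, failures + 1
    return beta_cdf_int(hi, a, b) - beta_cdf_int(lo, a, b)


# With no observations the posterior is uniform, so the chance equals hi - lo.
print(prob_between(0, 0, 0.2, 0.7))
# After one happening and no failure the posterior CDF is x**2.
print(prob_between(1, 0, 0.0, 0.5))
```

The "two degrees of probability that can be named" in the problem statement correspond to `lo` and `hi`.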

368 citations
