
Exploring the capacity of social media data for modelling travel behaviour: Opportunities and challenges

Abstract
In the past few years, the social science literature has paid significant attention to extracting information from social media to track and analyse human movements. In this paper, the transportation aspect of social media is investigated and reviewed. A detailed discussion is provided about how social media data from different sources can be used to extract, indirectly and at minimal cost, travel attributes such as trip purpose, mode of transport, activity duration and destination choice, as well as land use variables such as home, job and school location and socio-demographic attributes including gender, age and income. The evolution of the field of transport and travel behaviour around applications of social media over the last few years is studied. Further, this paper presents results of a qualitative survey of travel demand modelling experts around the world on the applicability of social media data for modelling daily travel behaviour. The results of the survey reveal a positive view among the experts about the usefulness of such data sources.


Taha H. Rashidi (a,*), Alireza Abbasi (b), Mojtaba Maghrebi (d), Samiul Hasan (c), Travis S. Waller (a)
(a) School of Civil and Environmental Engineering, UNSW, Australia
(b) School of Engineering and Information Technology, UNSW, Australia
(c) Department of Civil, Environment, and Construction Engineering, University of Central Florida, United States
(d) Department of Civil Engineering, Ferdowsi University of Mashhad, Mashhad, Khorasan Razavi, Iran
Article info
Article history: Received 30 May 2016; received in revised form 12 December 2016; accepted 15 December 2016
Keywords: Travel diary survey; Social media; Travel demand modelling; Mobility behaviour
© 2016 Elsevier Ltd. All rights reserved.
1. Introduction
The digital age has accelerated the evolution of online social networks. Social media has become an emerging industry with massive cash inflows and outflows, and this market has in turn created massive data sources. Harnessing such big data has become a topic of interest for researchers, scientists, practitioners and governments. Fields such as computer science, mathematics, social sciences, economics and management have invested considerable effort in developing an understanding of various aspects of social networks and media data. Only recently have transport engineers, urban planners and travel demand modellers noticed the richness of such big data and started exploring the capacity of such data sources for planning, management and operation purposes.
Initial movements towards understanding social media and their impact on the transport system started with descriptive analysis of mobility using location-based social networks (Onnela et al., 2011). As the potential of such data sources was further explored, transport modellers pushed the frontiers of applications of social media data for modelling transport-related issues (Hasan and Ukkusuri, 2015; Hasan et al., 2016). Nonetheless, such efforts are still in their infancy and the community is not yet convinced about the full potential of such cheaply available but costly to prepare datasets (Wu et al., 2014), where the privacy of users must be maintained through aggregate or anonymized parsing analysis (Smith et al., 2012).
This article belongs to the Virtual Special Issue on "Social Network".
* Corresponding author.
E-mail addresses: rashidi@unsw.edu.au (T.H. Rashidi), a.abbasi@unsw.edu.au (A. Abbasi), Mojtabamaghrebi@um.ac.ir (M. Maghrebi), samiul.hasan@ucf.edu (S. Hasan), s.waller@unsw.edu.au (T.S. Waller).
Transportation Research Part C 75 (2017) 197–211. Journal homepage: www.elsevier.com/locate/trc
http://dx.doi.org/10.1016/j.trc.2016.12.008
0968-090X/© 2016 Elsevier Ltd. All rights reserved.
There are several complications associated with using social media data, especially if analysing the content of such big data is important in understanding the observations. For example, Twitter¹ data (tweets) typically contain normal text, hash-tag(s) and/or check-in data. Check-in data include the location of tweets, which associates them with activities happening at that location (e.g., all tweets linked to a stadium by users who provided check-in data are more likely to be related to recreational activities). Similarly, hash-tag (#) messages are associated with an activity, event, location, etc. Therefore, it is relatively easier to work with check-in and hash-tag data as they are already associated with an event or location (Katakis et al., 2008). In particular, when check-in data is used for analysis of the destination/origin of the activity, determining trip purpose is relatively easy (Cheng et al., 2011). More information about applications of Twitter data can be found in a review paper by Steiger et al. (2015), from which transport is completely excluded. If check-in or hash-tag data is not of interest and more general information is used, extracting meaningful information can be challenging. More importantly, several biases and issues have been highlighted that affect research on human mobility behaviour in different ways, for some of which solutions have been proposed in fields such as epidemiology, statistics and machine learning (Ruths and Pfeffer, 2014).
This study presents an overview of transport-related studies that used social media for transportation planning and management. A special focus is given to the application of social media data in travel demand modelling studies. Relevant studies focusing on applications of social media in the following categories related to transport research are discussed in Section 2: (i) travel demand modelling, (ii) mobility behaviour, (iii) individuals’ activity patterns, (iv) assessing public transport, (v) traffic conditions and (vi) incidents and natural disasters. Section 3 presents a discussion about the evolution of social media use for transportation applications. This section is followed by a more detailed discussion about the capacity of social media data through the results of an online survey in which travel demand modelling experts declared their opinions about the usefulness of different social media data sources for planning, management and operation purposes. Finally, a summary of the discussion and recommendations for future directions of using social media data in the field are presented.
2. Use of social media in transport research
2.1. Travel demand modelling studies
The history of planning transport system infrastructure goes back to the invention of the wheel, followed by the construction of the first paved road in Sumer in 500 BCE. At the same time, around 500 BCE, Darius I the Great started construction of an extensive road system for Persia, including the famous Royal Road, which was one of the first highways. About the same time, Roman roads were constructed with advanced technologies: they were stone-paved and metalled, cambered for drainage, and flanked by footpaths, bridleways and drainage ditches. The same road structure was later used by Great Britain in the 18th century to establish the first toll system, which included 250 miles of road and 40 bridges. All of these early transport system planning and network design efforts inspired transport engineers of the 20th century to develop a systematic procedure for policy appraisal and network design purposes. It was in the 1950s that the first prototypes of the conventional four-step models were developed in Chicago and Detroit in the USA. Since then, many metropolitan areas have adopted a similar structure to evaluate the short-, medium- and long-term consequences of different designs and policies. The 4-step modelling paradigm, which is a trip-based approach, led to the tour-based scheme, in which individual-level travel information is considered for modelling purposes. Tour-based models later evolved into activity-based models, in which individual/household-level data is used to model individual/household-level travel attributes (Rashidi and Kanaroglou, 2013).
Travel demand modelling techniques target the mobility (movement) of people and vehicles (including passenger and commercial vehicles) in cities in order to understand their (mainly short-distance) travel behaviour. Models developed from individual-level data sources, in which the behaviour of travellers is reflected, have been argued to dominate aggregate-level models in terms of policy appraisal (Rashidi and Kanaroglou, 2013).
The evolution of travel demand modelling techniques created the need for high-resolution databases in which socio-demographic and economic attributes of people are used to model their day-to-day travel behaviour. Such data sources encompass the travel diaries of a sample of people representing the population. Having access to such individual-level travel diaries is crucial for developing several components of advanced behavioural modelling frameworks such as tour-based and activity-based models. The most important travel attributes considered in these modelling frameworks are: (a) trip purpose, (b) departure time, (c) mode of transport, (d) activity duration, (e) activity location, (f) travel route, (g) party composition and (h) traffic condition.
Other than travel data, information about long-term household decisions should be collected and modelled to be used as an important input to travel demand models. The major household decisions for which data is commonly collected and models are developed are residential location, job location and vehicle ownership. Among these three, vehicle ownership has been modelled most often in travel demand frameworks. Housing and job search behaviour have mainly been treated exogenously in travel modelling structures (Rashidi et al., 2012).
¹ www.twitter.com.

Data is generally a valuable product which consumes a large portion of the financial resources provided for planning and operating the transport system. As a result, not all metropolitan areas can afford to collect data on a monthly or yearly basis. This has resulted in the emergence of innovative approaches to temporally and/or spatially transferring data and models (Rashidi and Mohammadian, 2011) or indirectly imputing the required data from other readily accessible data sources (Miller et al., 2014).
Data for demand modelling has been collected using two major methods: (i) revealed preference (RP) surveys and (ii) stated preference (SP) surveys. These two methods are used to collect data about (a) household/individual travel diaries (Rashidi et al., 2010), (b) attitudes or opinions of people about the system and service (Beirão and Cabral, 2007), and (c) counts of agents (people or vehicles) using the transport system (Francis et al., 2003). Conventional data collection techniques for (a) and (b) include face-to-face, telephone, mail-out-mail-back, web-based and on-board (on transit, for example) surveying methods. Count data (c) has traditionally been collected using roadside, GPS, on-board and smart card techniques. The cost of the data collection methods for data types (a) and (b) is significant: the average cost of one complete household travel survey is more than $200 (Zhang and Mohammadian, 2010). As a result, technology has been employed to collect household travel survey data (or even count data) in a cost-effective manner. For example, the capacity of web-based surveys (apps), social networking sites or applications, smart phones (accelerometers) and personal health sensors has been explored (Wilde et al., 2015). Nonetheless, the practical capacity of these emerging technology-based methods is yet to be fully explored.
The capacity of social media platforms such as Facebook,² Twitter, LinkedIn,³ Instagram,⁴ Foursquare⁵ and Yelp⁶ to provide information on household daily travel has been minimally examined (Golder and Macy, 2014; Yin et al., 2015). Tasse and Hong (2014) presented a wide range of possible ways of using geotagged social media to develop an understanding of urban areas instead of using traditional means of data collection. They categorized the opportunities as being (i) for city planners (such as understanding mobility patterns and average distance travelled), (ii) for small business owners (such as understanding customers’ demographics and customers’ activities before and after a visit) and (iii) for individuals (such as understanding socially constructed places and social flows in cities).
Social media platforms have a feature known as location-based services, which enables people to share their activity-related choices (check-ins) in their virtual social networks. Through location-based services, users can share their activity locations when they visit restaurants, shopping malls, movie theatres and so on. Location-based data has received increasing attention for travel demand modelling, as the data can provide further knowledge about travel behaviour. However, the amount of check-in information from such services is smaller than the amount of geo-tagged ‘text’ data available in people’s posts on social media platforms such as Twitter. The main challenge in using such rich data is the significant noise in them, which requires advanced text mining, natural language processing and data mining techniques to extract useful information that can be related to people’s travel behaviour (Cramer et al., 2011; Maghrebi et al., 2015).
2.2. Aggregate mobility behaviour
Several studies have investigated how social media data can be used for understanding the mobility behaviour of a large number of people. These studies discovered universal laws for the mobility behaviour of people at aggregate levels across different geographical scales (Noulas et al., 2012; Cheng et al., 2011; Jurdak et al., 2015). For instance, Cheng et al. (2011) analysed 22 million check-ins and observed Lévy flight patterns and periodic behaviours in the mobility of social media users. Cho et al. (2011) investigated the relationship between human mobility and social relationships using location-based check-in data. They found that social relationships can explain up to 30% of all human movements, while periodic behaviour explains 50–70%. Hasan et al. (2013) used a dataset of Foursquare check-ins to analyse urban human mobility and activity patterns. They determined the spatial distributions of visits to different places for various activity purposes by counting the number of purpose-specific visits within each cell and computing the proportion of visits to each cell for each activity category. Zhu et al. (2014) discussed an alternative to household travel surveys using location-based social networks (LBSNs). They attempted to predict the Puget Sound Travel Survey (PSRC) using geotagged Foursquare data. To do so, they extracted (i) demographic features (age and work status), (ii) temporal features (proportion of a day) and (iii) spatial features (using the Foursquare API) from social media data. They analysed 13 million geotagged tweets over a period of 1 year to investigate (spatio-temporal) crowd movement patterns in New York (Manhattan).
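As an illustration of the cell-based computation described for Hasan et al. (2013), the following minimal sketch counts purpose-specific check-ins per grid cell and converts the counts to proportions. It is not the authors’ implementation: the grid size `cell_deg`, the function names and the sample records are assumptions made for this example only.

```python
import math
from collections import Counter, defaultdict


def cell_of(lat, lon, cell_deg=0.01):
    """Map a coordinate to an integer (row, col) index on a regular lat/lon grid."""
    return math.floor(lat / cell_deg), math.floor(lon / cell_deg)


def visit_proportions(checkins, cell_deg=0.01):
    """Return {purpose: {cell: share of that purpose's visits falling in the cell}}.

    checkins: iterable of (lat, lon, purpose) tuples (a hypothetical record layout).
    """
    counts = defaultdict(Counter)
    for lat, lon, purpose in checkins:
        counts[purpose][cell_of(lat, lon, cell_deg)] += 1

    shares = {}
    for purpose, cells in counts.items():
        total = sum(cells.values())
        shares[purpose] = {cell: n / total for cell, n in cells.items()}
    return shares


# Made-up sample records, for illustration only.
sample_checkins = [
    (40.7128, -74.0060, "eating"),
    (40.7129, -74.0061, "eating"),
    (40.7306, -73.9866, "eating"),
    (40.7580, -73.9855, "shopping"),
]
sample_shares = visit_proportions(sample_checkins)
```

The shares for each activity category sum to one, so maps for different purposes can be compared even when the categories have very different numbers of check-ins.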
Several studies have investigated whether aggregate patterns suitable for transportation planning can be obtained from social media data. In particular, social media data has been used to estimate origin-destination (OD) matrices. Cebelak (2013) and Jin et al. (2014) investigated the feasibility of using location-based social media data to estimate travel demand using a doubly-constrained gravity model. They evaluated their results against the OD matrix generated by an existing singly-constrained gravity model and a reference matrix from the local metropolitan planning organization. They found a significant improvement in reducing estimation errors caused by sampling bias relative to the OD estimation method based on the singly-constrained gravity model. In another study, Lee et al. (2015, 2016) used geo-tagged Twitter data to understand its relationship with a traditional travel demand model. Based on the greater Los Angeles metropolitan area, they compared the Twitter-based OD matrix with a recent OD matrix obtained from the output of a 4-step model and estimated regression models to measure the correlations between the ODs provided by the traditional travel demand model and the Twitter-based method. Their preliminary results show the added value of large-scale location-based social media data for modelling travel demand.
² www.facebook.com.
³ www.linkedin.com.
⁴ www.instagram.com.
⁵ www.foursquare.com.
⁶ www.yelp.com.
Although the above studies show the potential of social media data for modelling aggregate travel behaviour, they have limited scope and hence further research is needed to utilize the full potential of this kind of data. Most of the studies related to discovering mobility patterns actually analysed the visiting patterns of users to different places in a city. While such information is valuable, for modelling purposes we also need the origins and destinations of the movements and modal preferences. Methods to estimate OD matrices can help to resolve the problem of identifying the origins and destinations of movements. However, it is not clear how much error is introduced in the aggregate patterns due to the lack of sample representativeness and the biases present in the data. More comparative analysis between traditional survey-based and social media data is needed to measure and correct the biases.
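The doubly-constrained gravity model mentioned in this section can be sketched as follows. This is a generic textbook formulation and not the specification used by Cebelak (2013) or Jin et al. (2014): the exponential deterrence function, the parameter `beta`, the function name and the toy inputs are all assumptions. Trips are written as T_ij = A_i O_i B_j D_j exp(-beta c_ij), where O_i and D_j are zonal productions and attractions (with equal totals), c_ij is travel cost, and the balancing factors A_i and B_j are found by iteration.

```python
import math


def doubly_constrained_gravity(origins, destinations, cost, beta=0.1,
                               iters=200, tol=1e-9):
    """Estimate an OD matrix whose row sums match origins and column sums match destinations.

    Assumes sum(origins) == sum(destinations) and an exponential deterrence function.
    """
    n, m = len(origins), len(destinations)
    f = [[math.exp(-beta * cost[i][j]) for j in range(m)] for i in range(n)]
    a = [1.0] * n  # balancing factors A_i
    b = [1.0] * m  # balancing factors B_j

    for _ in range(iters):
        # Update A_i given the current B_j, then B_j given the new A_i.
        a_new = [1.0 / sum(b[j] * destinations[j] * f[i][j] for j in range(m))
                 for i in range(n)]
        b_new = [1.0 / sum(a_new[i] * origins[i] * f[i][j] for i in range(n))
                 for j in range(m)]
        change = max(abs(x - y) for x, y in zip(a + b, a_new + b_new))
        a, b = a_new, b_new
        if change < tol:
            break

    return [[a[i] * origins[i] * b[j] * destinations[j] * f[i][j]
             for j in range(m)] for i in range(n)]


# Toy two-zone example with made-up productions, attractions and costs.
toy_trips = doubly_constrained_gravity(
    origins=[100.0, 200.0],
    destinations=[150.0, 150.0],
    cost=[[1.0, 2.0], [2.0, 1.0]],
    beta=0.5,
)
```

A singly-constrained model would fix only one set of margins (typically the origins). Balancing both sides is what allows an estimate built from a biased sample of trips to be rescaled to known zonal totals.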
2.3. Individual-based activity behaviour
Geo-tagged social media data, and particularly check-in data, has been utilized to infer activity purposes. Using venue category information from check-in data, studies from social science, computer science and transportation science have used innovative ways to extract meaningful activity behaviour patterns and model behaviours with diverse applications. These studies include activity recognition (Lian and Xie, 2011), activity choice patterns (Pianese et al., 2013; Coffey and Pozdnoukhov, 2013; Hasan and Ukkusuri, 2014), predicting the next place to check in and friendship (Chang and Sun, 2011) and inferring life-style behaviour from activity-location choice patterns (Hasan and Ukkusuri, 2015).
Lian and Xie (2011) developed a conditional random fields model which predicts user activities given the location, time, identification and check-in history of the user. Chang and Sun (2011) analysed Facebook check-in data to predict the next check-in place using a logistic regression model. Coffey and Pozdnoukhov (2013) compared Foursquare data with a Capital Bikeshare report in Washington DC to predict the behaviour of people who use bike sharing facilities. Using probabilistic topic models, they analysed bikeshare user movement and looked for relationships between bikeshare users’ activities before, during and after using bike sharing facilities. Hasan and Ukkusuri (2014) analysed Foursquare check-in data to extract individual weekly activity patterns using probabilistic topic models. Lee et al. (2016) used geo-tagged tweets to create individual activity spaces based on minimum bounding geometry (convex hull). By creating density maps of activity space, they found clear differences between weekday and weekend activity spaces. They used a clustering model to classify activity patterns. However, although social media data contains rich information on activity types, this study could not differentiate activity types as several earlier studies had done. Davis and Goulias (2015) presented an ordered probit model to explain the attractiveness and opportunities of places as perceived by the residents of Santa Barbara, California. They combined information from a place perception survey, geo-tagged tweets from Twitter and business establishment data from Yelp. This study found improved explanatory power for models because of social media data, showing a promising direction towards developing better activity-travel behaviour models.
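The convex-hull activity space attributed above to Lee et al. (2016) can be illustrated with a short sketch. It assumes that a user’s geo-tagged points have already been projected to planar coordinates (for example, metres), and it uses Andrew’s monotone chain algorithm as one common way to build a hull; the source does not state which algorithm or software the authors used, and the function names and sample points are hypothetical.

```python
def convex_hull(points):
    """Return the convex hull of 2D points in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive means a left turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon from its ordered vertices (shoelace formula)."""
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Made-up projected coordinates of one user's tweets, for illustration only.
sample_points = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2)]
sample_hull = convex_hull(sample_points)
sample_area = polygon_area(sample_hull)
```

Computing the hull separately for weekday and weekend points of the same user would give two areas that can be compared directly, which is the kind of contrast the study reports.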
Check-in data from social media has enormous potential to improve our knowledge of activity participation behaviour. The approaches used so far to understand activity participation from social media mainly come from the machine learning and data mining fields. Probabilistic models such as conditional random fields, logistic regression and probit models have been used to predict various aspects of activity participation. Different classification techniques such as probabilistic topic and k-nearest neighbour models have been used to classify activity choice patterns and cluster users based on their activity patterns. However, the full potential of check-in data for activity-based modelling is yet to be realized. It is not clear how the derived activity patterns can be explained, since very few socio-demographic variables are available from social media data. Researchers have also identified the challenges of modelling activity generation and sequences/scheduling using social media data due to missing activities (Hasan, 2013). Complex probabilistic models accounting for missing observations and inferring socio-demographic characteristics will be needed.
2.4. Public transportation assessment
There are a few papers in this area, which mostly used sentiment analysis and keyword search for assessing public transportation (Schweitzer, 2014). Public transport has benefited from a solid review paper by Pender et al. (2014) discussing applications of social media data in domains related to public transport, which is a standalone exercise of its kind with a special focus on transit. Collins et al. (2013) used Twitter data to evaluate transit rider satisfaction on Chicago train lines. They proposed a two-sided assessment model by considering people’s opinions along with metrics that are typically measured by authorities. This paper is recognised as one of the pioneers of using social media data in public transport analysis. Similarly, Luong and Houston (2015) studied public opinions and attitudes about light rail transit service in Los Angeles by looking at Twitter data instead of traditional surveys and interviews. Nik Bakht et al. (2015) used Twitter data and only news sources to assess public involvement in transportation planning. They picked the Eglinton Crosstown transit project in Toronto as a case study because this project was mostly re-designed after public consultations. Steiger et al. (2014) assessed public transportation flows using geotagged social media (from Twitter, Foursquare, Instagram and Flickr) and validated the results using real data obtained from OpenStreetMap. They applied density-based spatial clustering (DBSCAN) to LDA to cluster the topics related to “Train”. They then attempted to segment the geotagged social media data to railways.
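To make the density-based clustering step concrete, the sketch below implements a small version of DBSCAN over planar point coordinates. It covers only the spatial clustering idea: the LDA topic-modelling stage used by Steiger et al. (2014) is omitted, the input is assumed to be posts already filtered to a topic of interest and projected to planar units, and the parameter values and sample points are invented for illustration.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each 2D point with a cluster id (0, 1, ...) or -1 for noise."""
    unseen, noise = None, -1
    labels = [unseen] * len(points)

    def neighbours(i):
        # Brute-force radius query; the point itself is included.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unseen:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = noise  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # border point reachable from a core point
            if labels[j] is not unseen:
                continue
            labels[j] = cluster
            reach = neighbours(j)
            if len(reach) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(reach)
    return labels


# Made-up coordinates: two dense groups and one isolated post.
sample_posts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
sample_labels = dbscan(sample_posts, eps=1.5, min_pts=3)
```

The brute-force neighbour search is quadratic in the number of points, so a spatial index or an established library implementation would be preferable for datasets of realistic size.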
For agencies and city planners, knowing people’s opinions about public transport projects and services is crucial. As discussed in the literature, social media can be used as a possible source of information to obtain public opinion about the system. This approach can be used at low cost whenever information about public opinion is required.
2.5. Traffic conditions
There have been some recent efforts to extract traffic condition data from social media, which are mainly useful for network operation and management purposes. Tian et al. (2016) assessed the validity of traffic incidents reported in social media by comparing field camera data from Austin, Texas with social media posts. The study found that citizens tweet more often about true incidents than about false incidents, and more often about major severe incidents than about minor incidents such as traffic hazards and stalled vehicles. However, they also found that social media incident reports have low quality, as around half of the verifiable incidents in their sample turned out to have limited information for the travelling public. Steur (2015) showed that on a particular highway in the Netherlands there is a meaningful correlation between the number of accidents and the frequency of tweets near that area. Wanichayapong et al. (2011) proposed a rule-based content analysis for extracting traffic-related information from tweets related to either points or links in the Bangkok urban network. This paper has been significantly cited in the literature as it was one of the first papers introducing social media as a means for early accident identification for traffic management purposes. They used an approach to detect tweets including place and traffic-related information about accidents. Ribeiro et al. (2012) illustrated that there is a meaningful correlation between real traffic conditions and tweets talking about traffic conditions in Belo Horizonte (Brazil). They searched the content of tweets for a predefined list of words reflecting traffic conditions, such as movement (e.g. “slow”) or traffic status (e.g. “accident”). Then, to match the proper location, a gazetteer was used to find the street and neighbourhood names mentioned in the content. Kosala and Adi (2012) developed a method for monitoring the traffic condition of roads in Jakarta through real-time analysis of tweets. They also performed a keyword search on the tweets’ contents to extract the traffic conditions. Their results were enhanced with a confidence level for the traffic information. Gao et al. (2012) investigated how social media data can be used to facilitate and enhance transportation management. Later, Gao et al. (2013) used a similar approach to propose a location-based recommendation system based on the temporal properties of user movement tracked using the same “check-in” data. Such approaches facilitate a variety of services such as traffic forecasting, advertisement and disaster relief.
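The keyword-plus-gazetteer approach described for Ribeiro et al. (2012) can be sketched in a few lines. The word list, the gazetteer entries and the function name below are hypothetical placeholders rather than the lists used in that study; the sketch only shows the general pattern of flagging a post when it contains both a traffic term and a known place name.

```python
import re

# Hypothetical traffic vocabulary, grouped as in the text into
# "movement" words and "status" words.
TRAFFIC_TERMS = {
    "slow": "movement",
    "stopped": "movement",
    "accident": "status",
    "congestion": "status",
}

# Hypothetical gazetteer mapping lower-cased place names to a place type.
GAZETTEER = {
    "main street": "street",
    "river road": "street",
    "old town": "neighbourhood",
}


def extract_traffic_report(text, terms=TRAFFIC_TERMS, gazetteer=GAZETTEER):
    """Return matched traffic terms and place names, or None if either is missing."""
    lowered = text.lower()
    tokens = set(re.findall(r"[a-z]+", lowered))
    matched_terms = sorted(t for t in terms if t in tokens)
    # Whole-phrase match so that a name is not found inside a longer word.
    matched_places = sorted(
        name for name in gazetteer
        if re.search(r"\b" + re.escape(name) + r"\b", lowered)
    )
    if not matched_terms or not matched_places:
        return None
    return {"terms": matched_terms, "places": matched_places}


sample_report = extract_traffic_report(
    "Traffic is SLOW on Main Street after an accident"
)
```

Requiring both a term and a place keeps posts such as "slow morning at home" out of the results, at the cost of missing reports that describe a location informally.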
The literature has shown that social media content can be used for traffic monitoring. This approach might be considered a supplement to the ever-growing transport monitoring platforms. However, the reliability of social media data can be questioned due to the low response rate for specific modes of transport at different times of day. Nonetheless, when a sufficiently large amount of social media data is in hand, it can be considered a supplementary source of data for extracting information about traffic conditions.
2.6. Interventions: Incidents and natural disasters
Pender et al. (2014) reviewed the literature on unplanned transit network disruptions with a focus on social media applications. This paper discussed how social media can be used to inform people and collect data during disruptions when other types of media are not necessarily accessible. Lindsay (2011) addressed the potential advantages of using social media in the case of natural disasters. Hasan and Ukkusuri (2013) considered social network influences on evacuation decisions. Ukkusuri et al. (2014) studied the potential influences of social media during natural disasters to more effectively understand people’s behaviour when a crisis happens. In particular, they applied sentiment analysis to Twitter data posted about the tornado in Moore, Oklahoma. Similarly, Kaigo (2012) studied the role of social networks, and particularly Twitter, during the 2011 earthquake in Tsukuba, Japan, where a power outage immediately after the earthquake limited users’ access to media. In this situation, social networks accessed via smartphones became the primary way of accessing media. Sakaki et al. (2010) focused on tweets related to earthquakes and typhoons to extract real-time information about a disaster and construct an earthquake reporting system in Japan. They developed a platform that can notify the public much faster than authorities and agencies.
Another application of social media which has also received attention in the literature is acquiring real-time information about incidents. These papers discuss the use of social media for managing traffic incidents more effectively. Fu et al. (2015) studied the feasibility of detecting traffic incidents from tweets. They also proposed a way to manage incidents more effectively based on extra information that can be obtained from related Twitter data. They focused only on tweets that contain incident-related keywords and evaluated their results by comparing them with real-world incident data. It was shown that tweets are useful for early incident detection and can be used as an additional source of information for incident management. A similar approach was taken by Mai and Hranac (2013), who compared incidents recorded by the California Highway Patrol with related tweets by visualizing how the densities of incidents and tweets coincide near the same locations. Steur (2015) took a similar approach for highways in the Netherlands.
In short, in emergency situations in large cities, particularly when natural disasters happen, having real-time information about the current network is crucial for managing the system efficiently. Social media, given its popularity during disasters, can supplement other data sources to reflect the situation in a more holistic way. The main drawback is
