
Showing papers in "ACM Transactions on Intelligent Systems and Technology in 2015"


Journal ArticleDOI
Yu Zheng
TL;DR: A systematic survey on the major research into trajectory data mining, providing a panorama of the field as well as the scope of its research topics, and introduces the methods that transform trajectories into other data formats, such as graphs, matrices, and tensors.
Abstract: The advances in location-acquisition and mobile computing techniques have generated massive spatial trajectory data, which represent the mobility of a diversity of moving objects, such as people, vehicles, and animals. Many techniques have been proposed for processing, managing, and mining trajectory data in the past decade, fostering a broad range of applications. In this article, we conduct a systematic survey on the major research into trajectory data mining, providing a panorama of the field as well as the scope of its research topics. Following a road map from the derivation of trajectory data, to trajectory data preprocessing, to trajectory data management, and to a variety of mining tasks (such as trajectory pattern mining, outlier detection, and trajectory classification), the survey explores the connections, correlations, and differences among these existing techniques. This survey also introduces the methods that transform trajectories into other data formats, such as graphs, matrices, and tensors, to which more data mining and machine learning techniques can be applied. Finally, some public trajectory datasets are presented. This survey can help shape the field of trajectory data mining, providing a quick understanding of this field to the community.

1,289 citations


Journal ArticleDOI
TL;DR: A comprehensive survey on geo-social multimedia computing, recognition, mining, and analytics, covering recent advances in recognition and mining of geographical-aware social multimedia.
Abstract: With the popularity of multimedia sharing platforms such as Facebook and Flickr, recent years have witnessed an explosive growth of geographical tags on social multimedia content. This trend enables a wide variety of emerging applications, for example, mobile location search, landmark recognition, scene reconstruction, and touristic recommendation, which range from pure research prototypes to commercial systems. In this article, we give a comprehensive survey on these applications, covering recent advances in recognition and mining of geographical-aware social multimedia. We review related work in the past decade regarding location recognition, scene summarization, tourism suggestion, 3D building modeling, mobile visual search, and city navigation. At the end, we further discuss potential challenges, future topics, as well as open issues related to geo-social multimedia computing, recognition, mining, and analytics.

190 citations


Journal ArticleDOI
TL;DR: This article proposes a two-stage HPR system for Sign Language Recognition using a Kinect sensor and applies deep neural networks (DNNs) to automatically learn features from hand posture images that are insensitive to movement, scaling, and rotation.
Abstract: Hand posture recognition (HPR) is quite a challenging task, due to both the difficulty in detecting and tracking hands with normal cameras and the limitations of traditional manually selected features. In this article, we propose a two-stage HPR system for Sign Language Recognition using a Kinect sensor. In the first stage, we propose an effective algorithm to implement hand detection and tracking. The algorithm incorporates both color and depth information, without specific requirements on uniform-colored or stable background. It can handle the situations in which hands are very close to other parts of the body or hands are not the nearest objects to the camera and allows for occlusion of hands caused by faces or other hands. In the second stage, we apply deep neural networks (DNNs) to automatically learn features from hand posture images that are insensitive to movement, scaling, and rotation. Experiments verify that the proposed system works quickly and accurately and achieves a recognition accuracy as high as 98.12%.

116 citations


Journal ArticleDOI
TL;DR: A fast parallel SG method, FPSG, for shared memory systems is developed by dramatically reducing the cache-miss rate and carefully addressing the load balance of threads, which is more efficient than state-of-the-art parallel algorithms for matrix factorization.
Abstract: Matrix factorization is known to be an effective method for recommender systems that are given only the ratings from users to items. Currently, the stochastic gradient (SG) method is one of the most popular algorithms for matrix factorization. However, as a sequential approach, SG is difficult to parallelize for handling web-scale problems. In this article, we develop a fast parallel SG method, FPSG, for shared memory systems. By dramatically reducing the cache-miss rate and carefully addressing the load balance of threads, FPSG is more efficient than state-of-the-art parallel algorithms for matrix factorization.
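
For readers unfamiliar with the baseline that FPSG parallelizes, the sketch below shows the plain sequential stochastic gradient update for rating-matrix factorization. It is an illustrative sketch with placeholder hyperparameters, not the FPSG implementation itself.

```python
# Illustrative sketch of the sequential SGD update that FPSG parallelizes;
# not the authors' FPSG code. Hyperparameters are placeholders.
import random
import numpy as np

def sgd_mf(ratings, n_users, n_items, k=16, lr=0.01, reg=0.05, epochs=20):
    """ratings: list of (user_index, item_index, rating) triples (shuffled in place)."""
    P = 0.1 * np.random.randn(n_users, k)   # user latent factors
    Q = 0.1 * np.random.randn(n_items, k)   # item latent factors
    for _ in range(epochs):
        random.shuffle(ratings)
        for u, i, r in ratings:
            pu, qi = P[u].copy(), Q[i].copy()
            err = r - pu @ qi                # error on one observed rating
            P[u] += lr * (err * qi - reg * pu)
            Q[i] += lr * (err * pu - reg * qi)
    return P, Q

# The predicted rating for user u and item i is then P[u] @ Q[i].
```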

98 citations


Journal ArticleDOI
TL;DR: A novel framework for the real-time capture, assessment, and visualization of ballet dance movements as performed by a student in an instructional, virtual reality (VR) setting is proposed.
Abstract: This article proposes a novel framework for the real-time capture, assessment, and visualization of ballet dance movements as performed by a student in an instructional, virtual reality (VR) setting. The acquisition of human movement data is facilitated by skeletal joint tracking captured using the popular Microsoft (MS) Kinect camera system, while instruction and performance evaluation are provided in the form of 3D visualizations and feedback through a CAVE virtual environment, in which the student is fully immersed. The proposed framework is based on the unsupervised parsing of ballet dance movement into a structured posture space using the spherical self-organizing map (SSOM). A unique feature descriptor is proposed to more appropriately reflect the subtleties of ballet dance movements, which are represented as gesture trajectories through posture space on the SSOM. This recognition subsystem is used to identify the category of movement the student is attempting when prompted (by a virtual instructor) to perform a particular dance sequence. The dance sequence is then segmented and cross-referenced against a library of gestural components performed by the teacher. This facilitates alignment and score-based assessment of individual movements within the context of the dance sequence. An immersive interface enables the student to review his or her performance from a number of vantage points, each providing a unique perspective and spatial context suggestive of how the student might make improvements in training. An evaluation of the recognition and virtual feedback systems is presented.

96 citations


Journal ArticleDOI
TL;DR: An approach to leverage citizen observations of various city systems and services, such as traffic, public transport, water supply, weather, sewage, and public safety, as a source of city events to be extracted from annotated text is proposed.
Abstract: Cities are composed of complex systems with physical, cyber, and social components. Current works on extracting and understanding city events mainly rely on technology-enabled infrastructure to observe and record events. In this work, we propose an approach to leverage citizen observations of various city systems and services, such as traffic, public transport, water supply, weather, sewage, and public safety, as a source of city events. We investigate the feasibility of using such textual streams for extracting city events from annotated text. We formalize the problem of annotating social streams such as microblogs as a sequence labeling problem. We present a novel training data creation process for training sequence labeling models. Our automatic training data creation process utilizes instance-level domain knowledge (e.g., locations in a city, possible event terms). We compare this automated annotation process to a state-of-the-art tool that needs manually created training data and show that it has comparable performance in annotation tasks. An aggregation algorithm is then presented for event extraction from annotated text. We carry out a comprehensive evaluation of the event annotation and event extraction on a real-world dataset consisting of event reports and tweets collected over 4 months from the San Francisco Bay Area. The evaluation results are promising and provide insights into the utility of social streams for extracting city events.

90 citations


Journal ArticleDOI
TL;DR: This article proposes considering the labeling noise in the process of learning to rank and using a two-step approach to extend existing algorithms to handle noisy training data and shows that the proposed approach can effectively distinguish noisy documents from clean ones, and the extended learning-to-rank algorithms can achieve better performances than baselines.
Abstract: Learning to rank, which learns the ranking function from training data, has become an emerging research area in information retrieval and machine learning. Most existing work on learning to rank assumes that the training data is clean, which is not always true, however. The ambiguity of query intent, the lack of domain knowledge, and the vague definition of relevance levels all make it difficult for common annotators to give reliable relevance labels to some documents. As a result, the relevance labels in the training data of learning to rank usually contain noise. If we ignore this fact, the performance of learning-to-rank algorithms will be damaged. In this article, we propose considering the labeling noise in the process of learning to rank and using a two-step approach to extend existing algorithms to handle noisy training data. In the first step, we estimate the degree of labeling noise for a training document. To this end, we assume that the majority of the relevance labels in the training data are reliable and we use a graphical model to describe the generative process of a training query, the feature vectors of its associated documents, and the relevance labels of these documents. The parameters in the graphical model are learned by means of maximum likelihood estimation. Then the conditional probability of the relevance label given the feature vector of a document is computed. If the probability is large, we regard the degree of labeling noise for this document as small; otherwise, we regard the degree as large. In the second step, we extend existing learning-to-rank algorithms by incorporating the estimated degree of labeling noise into their loss functions. Specifically, we give larger weights to those training documents with smaller degrees of labeling noise and smaller weights to those with larger degrees of labeling noise. As examples, we demonstrate the extensions for McRank, RankSVM, RankBoost, and RankNet. Empirical results on benchmark datasets show that the proposed approach can effectively distinguish noisy documents from clean ones, and the extended learning-to-rank algorithms can achieve better performance than the baselines.
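
As a concrete, simplified illustration of the second step, the sketch below down-weights document pairs in a RankNet-style pairwise loss according to an estimated per-document label reliability. The specific weighting scheme (the product of the two documents' reliabilities) is our assumption, not necessarily the authors' exact formulation.

```python
# Simplified sketch of the second step above: a RankNet-style pairwise loss in
# which pairs involving likely-noisy labels are down-weighted. The product
# weighting is an illustrative assumption.
import numpy as np

def weighted_pairwise_loss(scores, labels, reliability):
    """All arguments are 1-D arrays over the documents of one query.
    reliability[i] is close to 1 when the label of document i looks clean."""
    loss = 0.0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:               # document i should rank above j
                w = reliability[i] * reliability[j]
                loss += w * np.log1p(np.exp(scores[j] - scores[i]))
    return loss
```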

88 citations


Journal ArticleDOI
TL;DR: A new gravity model for location recommendations, called LORE, is proposed, to exploit the spatiotemporal sequential influence on location recommendations and achieves significantly superior location recommendations compared to other state-of-the-art location recommendation techniques.
Abstract: Recommending personalized locations to users is an important feature of Location-Based Social Networks (LBSNs), which benefits users who wish to explore new places and businesses to discover potential customers. In LBSNs, social and geographical influences have been intensively used in location recommendations. However, human movement also exhibits spatiotemporal sequential patterns, but only a few current studies consider the spatiotemporal sequential influence of locations on users’ check-in behaviors. In this article, we propose a new gravity model for location recommendations, called LORE, to exploit the spatiotemporal sequential influence on location recommendations. First, LORE extracts sequential patterns from historical check-in location sequences of all users as a Location-Location Transition Graph (L2TG), and utilizes the L2TG to predict the probability of a user visiting a new location through the developed additive Markov chain that considers the effect of all visited locations in the check-in history of the user on the new location. Furthermore, LORE applies our contrived gravity model to weigh the effect of each visited location on the new location derived from the personalized attractive force (i.e., the weight) between the visited location and the new location. The gravity model effectively integrates the spatiotemporal, social, and popularity influences by estimating a power-law distribution based on (i) the spatial distance and temporal difference between two consecutive check-in locations of the same user, (ii) the check-in frequency of social friends, and (iii) the popularity of locations from all users. Finally, we conduct a comprehensive performance evaluation for LORE using three large-scale real-world datasets collected from Foursquare, Gowalla, and Brightkite. Experimental results show that LORE achieves significantly superior location recommendations compared to other state-of-the-art location recommendation techniques.
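
To make the gravity idea concrete, here is a toy sketch of power-law weighting of visited locations when scoring a candidate location. The exponents and the additive combination over the check-in history are illustrative assumptions rather than LORE's exact formulation.

```python
# Toy sketch only: power-law influence of previously visited locations on a
# candidate location, summed over the check-in history in the spirit of an
# additive Markov chain. Exponents a and b are illustrative assumptions.
def candidate_score(history, a=1.5, b=1.2):
    """history: list of (distance_km_to_candidate, hours_since_checkin) pairs."""
    score = 0.0
    for dist_km, hours in history:
        score += (1.0 + dist_km) ** (-a) * (1.0 + hours) ** (-b)
    return score
```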

84 citations


Journal ArticleDOI
Haoyi Xiong, Daqing Zhang, Leye Wang, J. Paul Gibson, Jie Zhu
TL;DR: Evaluations with a large-scale real-world phone call dataset show that the proposed EEMC framework outperforms the baseline approaches, and it can reduce overall energy consumption in data transfer by 54%-66% when compared to the 3G-based solution.
Abstract: Mobile Crowdsensing (MCS) requires users to be motivated to participate. However, concerns regarding energy consumption and privacy—among other things—may compromise their willingness to join such a crowd. Our preliminary observations and analysis of common MCS applications have shown that the data transfer in MCS applications may incur significant energy consumption due to the 3G connection setup. However, if data are transferred in parallel with a traditional phone call, then such transfer can be done almost “for free”: with only an insignificant additional amount of energy required to piggy-back the data—usually incoming task assignments and outgoing sensor results—on top of the call. Here, we present an Energy-Efficient Mobile Crowdsensing (EEMC) framework where task assignments and sensing results are transferred in parallel with phone calls. The main objective, and the principal contribution of this article, is an MCS task assignment scheme that guarantees that a minimum number of anonymous participants return sensor results within a specified time frame, while also minimizing the waste of energy due to redundant task assignments and considering privacy concerns of participants. Evaluations with a large-scale real-world phone call dataset show that our proposed EEMC framework outperforms the baseline approaches, and it can reduce overall energy consumption in data transfer by 54%-66% when compared to the 3G-based solution.

79 citations


Journal ArticleDOI
TL;DR: This paper introduces a novel heterogeneous transfer learning technique, Feature-Space Remapping (FSR), which transfers knowledge between domains with different feature spaces without requiring typical feature-feature, feature-instance, or instance-instance co-occurrence data.
Abstract: Transfer learning aims to improve performance on a target task by utilizing previous knowledge learned from source tasks. In this paper we introduce a novel heterogeneous transfer learning technique, Feature-Space Remapping (FSR), which transfers knowledge between domains with different feature spaces. This is accomplished without requiring typical feature-feature, feature-instance, or instance-instance co-occurrence data. Instead we relate features in different feature-spaces through the construction of metafeatures. We show how these techniques can utilize multiple source datasets to construct an ensemble learner which further improves performance. We apply FSR to an activity recognition problem and a document classification problem. The ensemble technique is able to outperform all other baselines and even performs better than a classifier trained using a large amount of labeled data in the target domain. These problems are especially difficult because, in addition to having different feature-spaces, the marginal probability distributions and the class labels are also different. This work extends the state of the art in transfer learning by considering large transfer across dramatically different spaces.

76 citations


Journal ArticleDOI
TL;DR: A Collaborative Exploration and Periodically Returning model, based on a novel problem, Exploration Prediction (EP), which forecasts whether people will seek unvisited locations to visit, and improves performance by as much as 30% compared to the traditional location prediction algorithms.
Abstract: With the growing popularity of location-based social networks, numerous location visiting records (e.g., check-ins) continue to accumulate over time. The more these records are collected, the better we can understand users’ mobility patterns and the more accurately we can predict their future locations. However, due to the personality trait of neophilia, people also show propensities of novelty seeking in human mobility, that is, exploring unvisited but tailored locations for them to visit. As such, the existing prediction algorithms, mainly relying on regular mobility patterns, face severe challenges because such behavior is beyond the reach of regularity. As a matter of fact, the prediction of this behavior not only relies on the forecast of novelty-seeking tendency but also depends on how to determine unvisited candidate locations. To this end, we put forward a Collaborative Exploration and Periodically Returning model (CEPR), based on a novel problem, Exploration Prediction (EP), which forecasts whether people will next seek unvisited locations to visit. When people are predicted to do exploration, a state-of-the-art recommendation algorithm, armed with collaborative social knowledge and assisted by geographical influence, will be applied for seeking the suitable candidates; otherwise, a traditional prediction algorithm, incorporating both regularity and the Markov model, will be put into use for figuring out the most possible locations to visit. We then perform case studies on check-ins and evaluate them on two large-scale check-in datasets with 6M and 36M records, respectively. The evaluation results show that EP achieves a roughly 20% classification error rate on both datasets, greatly outperforming the baselines, and that CEPR improves performance by as much as 30% compared to the traditional location prediction algorithms.

Journal ArticleDOI
TL;DR: UFSM can be considered as a sparse high-dimensional factor model where the previous preferences of each user are incorporated within his or her latent representation and combines the merits of item similarity models that capture local relations among items and factor models that learn global preference patterns.
Abstract: Recommending new items for suitable users is an important yet challenging problem due to the lack of preference history for the new items. Noncollaborative user modeling techniques that rely on the item features can be used to recommend new items. However, they only use the past preferences of each user to provide recommendations for that user. They do not utilize information from the past preferences of other users, potentially ignoring useful information. More recent factor models transfer knowledge across users using their preference information in order to provide more accurate recommendations. These methods learn a low-rank approximation for the preference matrix, which can lead to loss of information. Moreover, they might not be able to learn useful patterns given very sparse datasets. In this work, we present UFSM, a method for top-n recommendation of new items given binary user preferences. UFSM learns User-specific Feature-based item-Similarity Models, and its strength lies in combining two points: (1) exploiting preference information across all users to learn multiple global item similarity functions and (2) learning user-specific weights that determine the contribution of each global similarity function in generating recommendations for each user. UFSM can be considered as a sparse high-dimensional factor model where the previous preferences of each user are incorporated within his or her latent representation. This way, UFSM combines the merits of item similarity models that capture local relations among items and factor models that learn global preference patterns. A comprehensive set of experiments was conducted to compare UFSM against state-of-the-art collaborative factor models and noncollaborative user modeling techniques. Results show that UFSM outperforms other techniques in terms of recommendation quality. UFSM manages to yield better recommendations even with very sparse datasets. Results also show that UFSM can efficiently handle high-dimensional as well as low-dimensional item feature spaces.
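
One plausible reading of this scoring scheme, with notation of our own rather than the paper's, is the following:

```latex
% Plausible form of the UFSM score (notation ours): \hat{r}(u,i) is the
% estimated preference of user u for new item i, R_u^+ the items u liked,
% x_i the feature vector of item i, f_l the l-th global similarity function,
% and m_{ul} the user-specific weight on that function.
\hat{r}(u, i) = \sum_{j \in R_u^{+}} \mathrm{sim}_u(i, j),
\qquad
\mathrm{sim}_u(i, j) = \sum_{l=1}^{L} m_{ul}\, f_l(x_i, x_j)
```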

Journal ArticleDOI
TL;DR: This article provides a framework for the unsupervised learning of this perceptual causal structure from video, and takes action and object status detections as input and uses heuristics suggested by cognitive science research to produce the causal links perceived between them.
Abstract: Perceptual causality is the perception of causal relationships from observation. Humans, even as infants, form such models from observation of the world around them [Saxe and Carey 2006]. For a deeper understanding, the computer must make similar models through the analogous form of observation: video. In this article, we provide a framework for the unsupervised learning of this perceptual causal structure from video. Our method takes action and object status detections as input and uses heuristics suggested by cognitive science research to produce the causal links perceived between them. We greedily modify an initial distribution featuring independence between potential causes and effects by adding dependencies that maximize information gain. We compile the learned causal relationships into a Causal And-Or Graph, a probabilistic and-or representation of causality that adds a prior to causality. Validated against human perception, experiments show that our method correctly learns causal relations, attributing status changes of objects to causing actions amid irrelevant actions. Our method outperforms Hellinger’s χ²-statistic by considering hierarchical action selection, and outperforms the treatment effect by discounting coincidental relationships.

Journal ArticleDOI
TL;DR: This article shows that for any acyclic functional causal model, minimizing the mutual information between the hypothetical cause and the noise term is equivalent to maximizing the data likelihood with a flexible model for the distribution of the noise term, and proposes a Bayesian nonparametric approach based on mutual information minimization.
Abstract: Compared to constraint-based causal discovery, causal discovery based on functional causal models is able to identify the whole causal model under appropriate assumptions [Shimizu et al. 2006; Hoyer et al. 2009; Zhang and Hyvarinen 2009b]. Functional causal models represent the effect as a function of the direct causes together with an independent noise term. Examples include the linear non-Gaussian acyclic model (LiNGAM), nonlinear additive noise model, and post-nonlinear (PNL) model. Currently, there are two ways to estimate the parameters in the models: dependence minimization and maximum likelihood. In this article, we show that for any acyclic functional causal model, minimizing the mutual information between the hypothetical cause and the noise term is equivalent to maximizing the data likelihood with a flexible model for the distribution of the noise term. We then focus on estimation of the PNL causal model and propose to estimate it with the warped Gaussian process with the noise modeled by the mixture of Gaussians. As a Bayesian nonparametric approach, it outperforms the previous one based on mutual information minimization with nonlinear functions represented by multilayer perceptrons; we also show that unlike the ordinary regression, estimation results of the PNL causal model are sensitive to the assumption on the noise distribution. Experimental results on both synthetic and real data support our theoretical claims.
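
For reference, the post-nonlinear (PNL) causal model discussed above takes the form below, where the noise e is independent of the cause x and the outer distortion f_2 is invertible:

```latex
% Post-nonlinear (PNL) causal model: cause x, effect y, independent noise e.
y = f_2\big(f_1(x) + e\big), \qquad e \perp\!\!\!\perp x, \quad f_2 \ \text{invertible}
```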

Journal ArticleDOI
TL;DR: This article proposes a complete data-driven system that pushes towards real-time sensing of individual refueling behavior and citywide petrol consumption, and proposes context-aware tensor factorization (CATF), a factorization model that considers a variety of contextual factors that affect consumers’ refueling decision.
Abstract: Urban transportation is an important factor in energy consumption and pollution, and is of increasing concern due to its complexity and economic significance. Its importance will only increase as urbanization continues around the world. In this article, we explore drivers’ refueling behavior in urban areas. Compared to questionnaire-based methods of the past, we propose a complete data-driven system that pushes towards real-time sensing of individual refueling behavior and citywide petrol consumption. Our system provides the following: detection of individual refueling events (REs) from which refueling preference can be analyzed; estimates of gas station wait times from which recommendations can be made; an indication of overall fuel demand from which macroscale economic decisions can be made; and a spatial, temporal, and economic view of urban refueling characteristics. For individual behavior, we use reported trajectories from a fleet of GPS-equipped taxicabs to detect gas station visits. For time spent estimates, to solve the sparsity issue along time and stations, we propose context-aware tensor factorization (CATF), a factorization model that considers a variety of contextual factors (e.g., price, brand, and weather condition) that affect consumers’ refueling decision. For fuel demand estimates, we apply a queue model to calculate the overall visits based on the time spent inside the station. We evaluated our system on large-scale and real-world datasets, which contain 4-month trajectories of 32,476 taxicabs, 689 gas stations, and the self-reported refueling details of 8,326 online users. The results show that our system can determine REs with an accuracy of more than 90%, estimate time spent with less than 2 minutes of error, and measure overall visits on the same order of magnitude as the records in the field study.

Journal ArticleDOI
TL;DR: The flexibility, modularity, and extensibility of ohmage in supporting diverse deployment settings are presented through three distinct case studies in education, health, and clinical research.
Abstract: Participatory sensing (PS) is a distributed data collection and analysis approach where individuals, acting alone or in groups, use their personal mobile devices to systematically explore interesting aspects of their lives and communities [Burke et al. 2006]. These mobile devices can be used to capture diverse spatiotemporal data through both intermittent self-report and continuous recording from on-board sensors and applications. Ohmage (http://ohmage.org) is a modular and extensible open-source, mobile to Web PS platform that records, stores, analyzes, and visualizes data from both prompted self-report and continuous data streams. These data streams are authorable and can dynamically be deployed in diverse settings. Feedback from hundreds of behavioral and technology researchers, focus group participants, and end users has been integrated into ohmage through an iterative participatory design process. Ohmage has been used as an enabling platform in more than 20 independent projects in many disciplines. We summarize the PS requirements, challenges and key design objectives learned through our design process, and ohmage system architecture to achieve those objectives. The flexibility, modularity, and extensibility of ohmage in supporting diverse deployment settings are presented through three distinct case studies in education, health, and clinical research.

Journal ArticleDOI
TL;DR: A parameterless mixture model-based approach that is capable of addressing the three aforementioned issues in a single framework is proposed, based on the multivariate beta mixtures, in order to model the estimated set of feature vectors.
Abstract: Several approaches have been proposed for the problem of identifying authoritative actors in online communities. However, the majority of existing methods suffer from one or more of the following limitations: (1) There is a lack of an automatic mechanism to formally discriminate between authoritative and nonauthoritative users. In fact, a common approach to authoritative user identification is to provide a ranked list of users expecting authorities to come first. A major problem of such an approach is the question of where to stop reading the ranked list of users. How many users should be chosen as authoritative? (2) Supervised learning approaches for authoritative user identification suffer from their dependency on the training data. The problem here is that labeled samples are more difficult, expensive, and time consuming to obtain than unlabeled ones. (3) Several approaches rely on some user parameters to estimate an authority score. Detection accuracy of authoritative users can be seriously affected if incorrect values are used. In this article, we propose a parameterless mixture model-based approach that is capable of addressing the three aforementioned issues in a single framework. In our approach, we first represent each user with a feature vector composed of information related to its social behavior and activity in an online community. Next, we propose a statistical framework, based on the multivariate beta mixtures, in order to model the estimated set of feature vectors. The probability density function is therefore estimated and the beta component that corresponds to the most authoritative users is identified. The suitability of the proposed approach is illustrated on real data extracted from the Stack Exchange question-answering network and Twitter.

Journal ArticleDOI
Tao Qin, Wei Chen, Tie-Yan Liu
TL;DR: A comprehensive review of sponsored search auctions is provided in hopes of helping both industry practitioners and academic researchers to become familiar with this field, to know the state of the art, and to identify future research topics.
Abstract: Sponsored search has been proven to be a successful business model, and sponsored search auctions have become a hot research direction. There have been many exciting advances in this field, especially in recent years, while at the same time, there are also many open problems waiting for us to resolve. In this article, we provide a comprehensive review of sponsored search auctions in hopes of helping both industry practitioners and academic researchers to become familiar with this field, to know the state of the art, and to identify future research topics. Specifically, we organize the article into two parts. In the first part, we review research works on sponsored search auctions with basic settings, where fully rational advertisers without budget constraints, preknown click-through rates (CTRs) without interdependence, and exact match between queries and keywords are assumed. Under these assumptions, we first introduce the generalized second price (GSP) auction, which is the most popularly used auction mechanism in the industry. Then we give the definitions of several well-studied equilibria and review the latest results on GSP’s efficiency and revenue in these equilibria. In the second part, we introduce some advanced topics on sponsored search auctions. In these advanced topics, one or more assumptions made in the basic settings are relaxed. For example, the CTR of an ad could be unknown and dependent on other ads; keywords could be broadly matched to queries before auctions are executed; and advertisers are not necessarily fully rational, could have budget constraints, and may prefer rich bidding languages. Given that the research on these advanced topics is still immature, in each section of the second part, we provide our opinions on how to make further advances, in addition to describing what has been done by researchers in the corresponding direction.
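
A minimal sketch of the GSP pricing rule under the basic settings described above (known, independent CTRs and exact match): ads are ranked by bid times CTR, and each winner pays, per click, the smallest amount that would keep its slot. Reserve prices and tie-breaking are ignored for brevity.

```python
# Minimal GSP sketch under the basic settings above: rank by bid * CTR and
# charge each slot winner the next ad's bid * CTR divided by the winner's CTR.
# Reserve prices and tie-breaking are omitted.
def gsp(bids, ctrs, n_slots):
    """bids, ctrs: dicts mapping advertiser -> bid / predicted CTR."""
    ranked = sorted(bids, key=lambda a: bids[a] * ctrs[a], reverse=True)
    allocation = []
    for pos, ad in enumerate(ranked[:n_slots]):
        if pos + 1 < len(ranked):
            nxt = ranked[pos + 1]
            price = bids[nxt] * ctrs[nxt] / ctrs[ad]   # per-click payment
        else:
            price = 0.0                                # no lower bidder (no reserve modeled)
        allocation.append((ad, price))
    return allocation

# Example: three advertisers competing for two slots.
print(gsp({"A": 3.0, "B": 2.0, "C": 1.0}, {"A": 0.10, "B": 0.10, "C": 0.10}, 2))
# -> [('A', 2.0), ('B', 1.0)]
```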

Journal ArticleDOI
TL;DR: This article uses human flow dynamics, which reflects the social activeness of a region, to detect social events and measure their impacts, and proposes a method that can not only discover the happening time and venue of events from abnormal social activeness, but also measure the scale of events through changes in such activeness.
Abstract: A social event is an occurrence that involves lots of people and is accompanied by an obvious rise in human flow. Analysis of social events has real-world importance because events bring about impacts on many aspects of city life. Traditionally, detection and impact measurement of social events rely on social investigation, which involves considerable human effort. Recently, by analyzing messages in social networks, researchers can also detect and evaluate country-scale events. Nevertheless, the analysis of city-scale events has not been explored. In this article, we use human flow dynamics, which reflect the social activeness of a region, to detect social events and measure their impacts. We first extract human flow dynamics from taxi traces. Second, we propose a method that can not only discover the happening time and venue of events from abnormal social activeness, but also measure the scale of events through changes in such activeness. Third, we extract traffic congestion information from traces and use its change during social events to measure their impact. The results of experiments validate the effectiveness of both the event detection and impact measurement methods.

Journal ArticleDOI
TL;DR: A novel algorithm to model and recognize sign language performed in front of a Microsoft Kinect sensor is proposed that first assigns a binary latent variable to each frame in training videos for indicating its discriminative capability, then develops a latent support vector machine model to classify the signs.
Abstract: Vision-based sign language recognition has attracted more and more interest from researchers in the computer vision field. In this article, we propose a novel algorithm to model and recognize sign language performed in front of a Microsoft Kinect sensor. Under the assumption that some frames are expected to be both discriminative and representative in a sign language video, we first assign a binary latent variable to each frame in training videos for indicating its discriminative capability, then develop a latent support vector machine model to classify the signs, as well as localize the discriminative and representative frames in each video. In addition, we utilize the depth map together with the color image captured by the Kinect sensor to obtain a more effective and accurate feature to enhance the recognition accuracy. To evaluate our approach, we conducted experiments on both word-level sign language and sentence-level sign language. An American Sign Language dataset including approximately 2,000 word-level sign language phrases and 2,000 sentence-level sign language phrases was collected using the Kinect sensor, and each phrase contains color, depth, and skeleton information. Experiments on our dataset demonstrate the effectiveness of the proposed method for sign language recognition.

Journal ArticleDOI
Kyumin Lee, Jalal Mahmud, Jilin Chen, Michelle Zhou, Jeffrey Nichols
TL;DR: A recommender system that predicts the likelihood of a stranger to retweet information when asked, within a specific time window, and recommends the top-N qualified strangers to engage with.
Abstract: There has been much effort on studying how social media sites, such as Twitter, help propagate information in different situations, including spreading alerts and SOS messages in an emergency. However, existing work has not addressed how to actively identify and engage the right strangers at the right time on social media to help effectively propagate intended information within a desired time frame. To address this problem, we have developed three models: (1) a feature-based model that leverages people's exhibited social behavior, including the content of their tweets and social interactions, to characterize their willingness and readiness to propagate information on Twitter via the act of retweeting; (2) a wait-time model based on a user's previous retweeting wait times to predict his or her next retweeting time when asked; and (3) a subset selection model that automatically selects a subset of people from a set of available people using probabilities predicted by the feature-based model and maximizes retweeting rate. Based on these three models, we build a recommender system that predicts the likelihood of a stranger to retweet information when asked, within a specific time window, and recommends the top-N qualified strangers to engage with. Our experiments, including live studies in the real world, demonstrate the effectiveness of our work.
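
A simplified sketch of how the three models could be combined at recommendation time follows; the filter-then-rank logic is our simplification of the subset selection step, not the paper's exact procedure.

```python
# Simplified sketch of combining the three models above: keep strangers whose
# predicted retweeting wait time fits the desired window, then return the N
# with the highest predicted retweet probability. This is our simplification
# of the subset selection model.
def recommend_strangers(candidates, window_hours, n=10):
    """candidates: list of dicts with keys 'user', 'p_retweet', 'expected_wait_h'."""
    eligible = [c for c in candidates if c["expected_wait_h"] <= window_hours]
    eligible.sort(key=lambda c: c["p_retweet"], reverse=True)
    return [c["user"] for c in eligible[:n]]
```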

Journal ArticleDOI
TL;DR: This work proposes a novel visibility restoration approach that is based on Bi-Histogram modification, and which integrates a haze density estimation module and a haze formation removal module for effective and accurate estimation of haze density in the transmission map.
Abstract: Visibility restoration techniques are widely used for information recovery of hazy images in many computer vision applications. Estimation of haze density is an essential task of visibility restoration techniques. However, conventional visibility restoration techniques often suffer from either the generation of serious artifacts or the loss of object information in the restored images due to uneven haze density, which usually means that the images contain heavy haze formation within their background regions and little haze formation within their foreground regions. This frequently occurs when the images feature real-world scenes with a deep depth of field. How to effectively and accurately estimate the haze density in the transmission map for these images is the most challenging aspect of the traditional state-of-the-art techniques. In response to this problem, this work proposes a novel visibility restoration approach that is based on Bi-Histogram modification, and which integrates a haze density estimation module and a haze formation removal module for effective and accurate estimation of haze density in the transmission map. As our experimental results demonstrate, the proposed approach achieves superior visibility restoration efficacy in comparison with the other state-of-the-art approaches based on both qualitative and quantitative evaluations. The proposed approach proves effective and accurate in terms of both background and foreground restoration of various hazy scenarios.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed the concept of causal rules (CRs) and developed an algorithm for mining CRs in large datasets and used the idea of retrospective cohort studies to detect CRs.
Abstract: Randomised controlled trials (RCTs) are the most effective approach to causal discovery, but in many circumstances it is impossible to conduct RCTs. Therefore, observational studies based on passively observed data are widely accepted as an alternative to RCTs. However, in observational studies, prior knowledge is required to generate the hypotheses about the cause-effect relationships to be tested, and hence they can only be applied to problems with available domain knowledge and a handful of variables. In practice, many datasets are of high dimensionality, which leaves observational studies out of the opportunities for causal discovery from such a wealth of data sources. In another direction, many efficient data mining methods have been developed to identify associations among variables in large datasets. The problem is that causal relationships imply associations, but the reverse is not always true. However, we can see the synergy between the two paradigms here. Specifically, association rule mining can be used to deal with the high-dimensionality problem, whereas observational studies can be utilised to eliminate noncausal associations. In this article, we propose the concept of causal rules (CRs) and develop an algorithm for mining CRs in large datasets. We use the idea of retrospective cohort studies to detect CRs based on the results of association rule mining. Experiments with both synthetic and real-world datasets have demonstrated the effectiveness and efficiency of CR mining. In comparison with the commonly used causal discovery methods, the proposed approach generally is faster and has better or competitive performance in finding correct or sensible causes. It is also capable of finding a cause consisting of multiple variables—a feature that other causal discovery methods do not possess.
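
As a much-simplified illustration of combining association rules with cohort-style reasoning, the sketch below scores a candidate rule "exposure → outcome" by its odds ratio from a 2x2 table; the paper's actual procedure additionally matches the exposed and non-exposed groups on other variables.

```python
# Much-simplified illustration: scoring a candidate rule (exposure -> outcome)
# by the odds ratio of a 2x2 table. The article's procedure additionally
# matches exposed and non-exposed records on other variables, in the spirit of
# a retrospective cohort study.
def odds_ratio(records, exposure, outcome):
    """records: list of dicts of binary variables, e.g. {'drugX': 1, 'reliefY': 0, ...}."""
    a = b = c = d = 0
    for r in records:
        if r[exposure] and r[outcome]:
            a += 1        # exposed, outcome present
        elif r[exposure]:
            b += 1        # exposed, outcome absent
        elif r[outcome]:
            c += 1        # unexposed, outcome present
        else:
            d += 1        # unexposed, outcome absent
    if b == 0 or c == 0:
        return float("inf")
    return (a * d) / (b * c)   # values well above 1 flag the rule as a causal candidate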

Journal ArticleDOI
TL;DR: It is shown how properly accounting for spatial and temporal variation can lead to more reasonable causal graphs and how highly structured data, like images and text, can be used in a causal inference framework using a novel structured input/output Gaussian process formulation.
Abstract: In applied fields, practitioners hoping to apply causal structure learning or causal orientation algorithms face an important question: which independence test is appropriate for my data? In the case of real-valued iid data, linear dependencies, and Gaussian error terms, partial correlation is sufficient. But once any of these assumptions is modified, the situation becomes more complex. Kernel-based tests of independence have gained popularity to deal with nonlinear dependencies in recent years, but testing for conditional independence remains a challenging problem. We highlight the important issue of non-iid observations: when data are observed in space, time, or on a network, “nearby” observations are likely to be similar. This fact biases estimates of dependence between variables. Inspired by the success of Gaussian process regression for handling non-iid observations in a wide variety of areas and by the usefulness of the Hilbert-Schmidt Independence Criterion (HSIC), a kernel-based independence test, we propose a simple framework to address all of these issues: first, use Gaussian process regression to control for certain variables and to obtain residuals. Second, use HSIC to test for independence. We illustrate this on two classic datasets, one spatial, the other temporal, that are usually treated as iid. We show how properly accounting for spatial and temporal variation can lead to more reasonable causal graphs. We also show how highly structured data, like images and text, can be used in a causal inference framework using a novel structured input/output Gaussian process formulation. We demonstrate this idea on a dataset of translated sentences, trying to predict the source language.
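
The two-step recipe above can be prototyped in a few lines; the sketch below uses scikit-learn's Gaussian process regressor to obtain residuals and a biased HSIC estimate on RBF kernels. Kernel choices are defaults or placeholders, and the permutation test needed to turn the statistic into a p-value is omitted.

```python
# Prototype of the two-step test above: regress out the conditioning variables
# with a Gaussian process, then compute a (biased) HSIC statistic on the
# residuals. Kernel widths are illustrative; the permutation test that turns
# the statistic into a p-value is omitted.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.metrics.pairwise import rbf_kernel

def hsic(x, y, gamma=1.0):
    """Biased HSIC estimate for two 1-D arrays of residuals."""
    n = len(x)
    K = rbf_kernel(x.reshape(-1, 1), gamma=gamma)
    L = rbf_kernel(y.reshape(-1, 1), gamma=gamma)
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def residual_hsic(Z, x, y):
    """Dependence of x and y after controlling for conditioning variables Z (n x d array)."""
    rx = x - GaussianProcessRegressor().fit(Z, x).predict(Z)
    ry = y - GaussianProcessRegressor().fit(Z, y).predict(Z)
    return hsic(rx, ry)
```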

Journal ArticleDOI
TL;DR: Gestures à Go Go (g3) is a web service plus an accompanying web application for bootstrapping stroke gesture samples based on the kinematic theory of rapid human movements; synthesized gestures are shown to perform comparably to gestures generated by human users.
Abstract: Training a high-quality gesture recognizer requires providing a large number of examples to enable good performance on unseen, future data. However, recruiting participants, data collection, labeling, and the other steps necessary for achieving this goal are usually time-consuming and expensive. Thus, it is important to investigate how to empower developers to quickly collect gesture samples for improving UI usage and user experience. In response to this need, we introduce Gestures à Go Go (g3), a web service plus an accompanying web application for bootstrapping stroke gesture samples based on the kinematic theory of rapid human movements. The user only has to provide a gesture example once, and g3 will create a model of that gesture. Then, by introducing local and global perturbations to the model parameters, g3 generates from tens to thousands of synthetic human-like samples. Through a comprehensive evaluation, we show that synthesized gestures perform comparably to gestures generated by human users. Ultimately, this work informs our understanding of designing better user interfaces that are driven by gestures.

Journal ArticleDOI
TL;DR: Experimental results show that the four different features investigated in the article can complement each other, and appropriate fusion methods can improve the recognition accuracies significantly over each individual feature.
Abstract: Human action recognition is a very active research topic in computer vision and pattern recognition. Recently, human action recognition using the three-dimensional (3D) depth data captured by the emerging RGB-D sensors has shown great potential. Several features and/or algorithms have been proposed for depth-based action recognition. A question is raised: Can we find some complementary features and combine them to improve the accuracy significantly for depth-based action recognition? To address the question and have a better understanding of the problem, we study the fusion of different features for depth-based action recognition. Although data fusion has shown great success in other areas, it has not been well studied yet on 3D action recognition. Some issues need to be addressed, for example, whether the fusion is helpful or not for depth-based action recognition, and how to do the fusion properly. In this article, we study different fusion schemes comprehensively, using diverse features for action characterization in depth videos. Two different levels of fusion schemes are investigated, that is, feature level and decision level. Various methods are explored at each fusion level. Four different features are considered to characterize the depth action patterns from different aspects. The experiments are conducted on four challenging depth action databases, in order to evaluate and find the best fusion methods generally. Our experimental results show that the four different features investigated in the article can complement each other, and appropriate fusion methods can improve the recognition accuracies significantly over each individual feature. More importantly, our fusion-based action recognition outperforms the state-of-the-art approaches on these challenging databases.
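
The two fusion levels compared in the article can be summarized by the sketch below: feature-level fusion concatenates descriptors before one classifier, while decision-level fusion averages per-feature classifier scores. The SVM classifier and plain score averaging are placeholder choices, not necessarily those used in the article.

```python
# Sketch of the two fusion levels compared above. The SVM and simple score
# averaging are placeholder choices for illustration.
import numpy as np
from sklearn.svm import SVC

def feature_level_fusion(train_feats, y_train, test_feats):
    """train_feats/test_feats: lists of (n_samples x d_k) arrays, one per feature type."""
    clf = SVC(probability=True).fit(np.hstack(train_feats), y_train)
    return clf.predict(np.hstack(test_feats))

def decision_level_fusion(train_feats, y_train, test_feats):
    clfs = [SVC(probability=True).fit(Xtr, y_train) for Xtr in train_feats]
    probs = [clf.predict_proba(Xte) for clf, Xte in zip(clfs, test_feats)]
    avg = np.mean(probs, axis=0)                  # average the per-feature class scores
    return clfs[0].classes_[np.argmax(avg, axis=1)]
```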

Journal ArticleDOI
TL;DR: A novel framework that combines interactive image segmentation with multifeature fusion to achieve improved MLR with high accuracy and an effective vector binarization method to reduce the memory usage of image descriptors extracted on-device, which maintains comparable recognition accuracy to the original descriptors.
Abstract: Along with the exponential growth of high-performance mobile devices, on-device Mobile Landmark Recognition (MLR) has recently attracted increasing research attention. However, the latency and accuracy of automatic recognition remain as bottlenecks against its real-world usage. In this article, we introduce a novel framework that combines interactive image segmentation with multifeature fusion to achieve improved MLR with high accuracy. First, we propose an effective vector binarization method to reduce the memory usage of image descriptors extracted on-device, which maintains comparable recognition accuracy to the original descriptors. Second, we design a location-aware fusion algorithm that can fuse multiple visual features into a compact yet discriminative image descriptor to improve on-device efficiency. Third, a user-friendly interaction scheme is developed that enables interactive foreground/background segmentation to largely improve recognition accuracy. Experimental results demonstrate the effectiveness of the proposed algorithms for on-device MLR applications.
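
One simple way to realize the descriptor binarization idea (thresholding each dimension, bit-packing, and matching by Hamming distance) is sketched below. The paper's actual binarization method may differ, but the memory argument is the same: one bit per dimension instead of a 32-bit float.

```python
# Illustrative descriptor binarization for on-device matching: threshold each
# dimension at its training-set median, pack to bits, compare by Hamming
# distance. The paper's actual binarization method may differ.
import numpy as np

def binarize(descriptors, thresholds=None):
    """descriptors: (n x d) float array -> (bit-packed uint8 array, thresholds)."""
    if thresholds is None:
        thresholds = np.median(descriptors, axis=0)
    bits = (descriptors > thresholds).astype(np.uint8)
    return np.packbits(bits, axis=1), thresholds

def hamming(a, b):
    """Hamming distance between two packed binary descriptors."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())
```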

Journal ArticleDOI
TL;DR: This article presents a point-of-interest (POI) category-transition-based approach, with a goal of estimating the visiting probability of a series of successive POIs conditioned on current user context and sensor context, and presents a mobile local recommendation demo application.
Abstract: While on the go, people are using their phones as a personal concierge discovering what is around and deciding what to do. The mobile phone has become a recommendation terminal customized for individuals, capable of recommending activities and simplifying the accomplishment of related tasks. In this article, we conduct usage mining on the check-in data, with summarized statistics identifying the local recommendation challenges of huge solution space, sparse available data, and complicated user intent, and discovered observations to motivate the hierarchical, contextual, and sequential solution. We present a point-of-interest (POI) category-transition-based approach, with a goal of estimating the visiting probability of a series of successive POIs conditioned on current user context and sensor context. A mobile local recommendation demo application is deployed. The objective and subjective evaluations validate the effectiveness in providing mobile users both accurate recommendation and favorable user experience.

Journal ArticleDOI
TL;DR: In this paper, the authors show that the number of topics is a key factor that can significantly boost the utility of topic-modeling systems and develop a novel distributed system called Peacock to learn big LDA models from big data.
Abstract: Latent Dirichlet allocation (LDA) is a popular topic modeling technique in academia but less so in industry, especially in large-scale applications involving search engine and online advertising systems. A main underlying reason is that the topic models used have been too small in scale to be useful; for example, some of the largest LDA models reported in the literature have up to 10^3 topics, which can hardly cover the long-tail semantic word sets. In this article, we show that the number of topics is a key factor that can significantly boost the utility of topic-modeling systems. In particular, we show that a “big” LDA model with at least 10^5 topics inferred from 10^9 search queries can achieve a significant improvement on industrial search engine and online advertising systems, both of which serve hundreds of millions of users. We develop a novel distributed system called Peacock to learn big LDA models from big data. The main features of Peacock include hierarchical distributed architecture, real-time prediction, and topic de-duplication. We empirically demonstrate that the Peacock system is capable of providing significant benefits via highly scalable LDA topic models for several industrial applications.
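
For orientation, the snippet below trains a tiny LDA model with the open-source gensim library; it only illustrates what an LDA topic model is, not Peacock's distributed architecture or its 10^5-topic scale.

```python
# Tiny LDA example with gensim, for orientation only; it illustrates what an
# LDA topic model is, not Peacock's distributed architecture or scale.
from gensim import corpora
from gensim.models import LdaModel

docs = [["cheap", "flight", "tickets"], ["flight", "hotel", "deal"],
        ["python", "machine", "learning"], ["deep", "learning", "python"]]
dictionary = corpora.Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]     # bag-of-words counts
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2, passes=10, random_state=0)
for topic_id, words in lda.print_topics():
    print(topic_id, words)
```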

Journal ArticleDOI
TL;DR: The experiment results show that health-related social media is a promising source for ADR detection, and the proposed techniques are effective in identifying early ADR signals.
Abstract: Since adverse drug reactions (ADRs) represent a significant health problem all over the world, ADR detection has become an important research topic in drug safety surveillance. As many potential ADRs cannot be detected through premarketing review, drug safety currently depends heavily on postmarketing surveillance. Particularly, current postmarketing surveillance in the United States primarily relies on the FDA Adverse Event Reporting System (FAERS). However, the effectiveness of such spontaneous reporting systems for ADR detection is not as good as expected because of the extremely high underreporting ratio of ADRs. Moreover, it often takes the FDA years to complete the whole process of collecting reports, investigating cases, and releasing alerts. Given the prosperity of social media, many online health communities are publicly available for health consumers to share and discuss any healthcare experience such as ADRs they are suffering from. Such health-consumer-contributed content is timely and informative, but this data source still remains untapped for postmarketing drug safety surveillance. In this study, we propose to use (1) association mining to identify the relations between a drug and an ADR and (2) temporal analysis to detect drug safety signals at the early stage. We collect data from MedHelp and use the FDA's alerts and information on drug labeling revisions as the gold standard to evaluate the effectiveness of our approach. The experiment results show that health-related social media is a promising source for ADR detection, and our proposed techniques are effective in identifying early ADR signals.
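
As a minimal sketch of the association-mining component (the temporal analysis step is not reproduced here), a drug/symptom pair can be scored by lift over the collected posts:

```python
# Minimal sketch of the association-mining component above: score a
# (drug, symptom) pair by lift over the collected posts. Thresholds and the
# temporal analysis step from the article are not reproduced here.
def lift(posts, drug, symptom):
    """posts: list of sets of terms mentioned in each post."""
    n = len(posts)
    n_drug = sum(drug in p for p in posts)
    n_symptom = sum(symptom in p for p in posts)
    n_both = sum(drug in p and symptom in p for p in posts)
    if n == 0 or n_drug == 0 or n_symptom == 0:
        return 0.0
    return (n_both / n) / ((n_drug / n) * (n_symptom / n))  # > 1: co-occur more than by chance
```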