Showing papers by "Gao Cong" published in 2019

Proceedings Article

Computing Trajectory Similarity in Linear Time: A Generic Seed-Guided Neural Metric Learning Approach

Di Yao¹, Gao Cong², Chao Zhang³, Jingping Bi¹

Chinese Academy of Sciences¹, Nanyang Technological University², University of Illinois at Urbana–Champaign³

08 Apr 2019

TL;DR: NeuTraj is generic enough to accommodate any existing trajectory measure and computes the similarity of a given trajectory pair in linear time. It obtains a 50x-1000x speedup over brute-force methods and a 3x-500x speedup over existing approximate algorithms, while yielding more accurate approximations of the similarity functions.

Abstract: Trajectory similarity computation is a fundamental problem for various applications in trajectory data analysis. However, the high computation cost of existing trajectory similarity measures has become the key bottleneck for trajectory analysis at scale. While there have been many research efforts for reducing the complexity, they are specific to one similarity measure and often yield limited speedups. We propose NeuTraj to accelerate trajectory similarity computation. NeuTraj is generic to accommodate any existing trajectory measure and fast to compute the similarity of a given trajectory pair in linear time. Furthermore, NeuTraj is elastic to collaborate with all spatial-based trajectory indexing methods to reduce the search space. NeuTraj samples a number of seed trajectories from the given database, and then uses their pair-wise similarities as guidance to approximate the similarity function with a neural metric learning framework. NeuTraj features two novel modules to achieve accurate approximation of the similarity function: (1) a spatial attention memory module that augments existing recurrent neural networks for trajectory encoding; and (2) a distance-weighted ranking loss that effectively transcribes information from the seed-based guidance. With these two modules, NeuTraj can yield high accuracies and fast convergence rates even if the training data is small. Our experiments on two real-life datasets show that NeuTraj achieves over 80% accuracy on Fréchet, Hausdorff, ERP and DTW measures, which outperforms state-of-the-art baselines consistently and significantly. It obtains 50x-1000x speedup over brute-force methods and 3x-500x speedup over existing approximate algorithms, while yielding more accurate approximations of the similarity functions.
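
To illustrate the computation cost the abstract refers to, here is a minimal sketch of exact DTW, one of the measures named above, computed by the standard dynamic program. It is not NeuTraj and not the authors' code; the function name `dtw_distance` and the representation of a trajectory as a list of (x, y) tuples are assumptions made for this example. The nested loops visit every pair of points, so the cost is quadratic in trajectory length, which is the cost a linear-time learned approximation aims to avoid.

```python
import math

def dtw_distance(a, b):
    """Exact DTW distance between two trajectories given as lists of (x, y) points.

    Hypothetical helper for illustration; runs in O(len(a) * len(b)) time.
    """
    n, m = len(a), len(b)
    # prev[j] holds the best alignment cost of a[:i-1] against b[:j].
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = math.dist(a[i - 1], b[j - 1])  # Euclidean point distance
            # Extend the cheapest of the three admissible predecessor alignments.
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]

# Usage: compare a three-point trajectory with a two-point one.
d = dtw_distance([(0, 0), (1, 0), (2, 0)], [(0, 0), (2, 0)])
```

Only two rows of the table are kept at a time, which reduces memory but not the quadratic running time.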

74 citations

Proceedings Article

Learning Travel Time Distributions with Deep Generative Model

Xiucheng Li¹, Gao Cong¹, Aixin Sun¹, Yun Cheng

Nanyang Technological University¹

13 May 2019

TL;DR: A deep generative model to learn the travel time distribution for any route by conditioning on the real-time traffic, which produces substantially better results than state-of-the-art alternatives in two tasks: travel time estimation and route recovery from sparse trajectory data.

Abstract: Travel time estimation of a given route with respect to real-time traffic condition is extremely useful for many applications like route planning. We argue that it is even more useful to estimate the travel time distribution, from which we can derive the expected travel time as well as the uncertainty. In this paper, we develop a deep generative model - DeepGTT - to learn the travel time distribution for any route by conditioning on the real-time traffic. DeepGTT interprets the generation of travel time using a three-layer hierarchical probabilistic model. In the first layer, we present two techniques, amortization and spatial smoothness embeddings, to share statistical strength among different road segments; a convolutional neural net based representation learning component is also proposed to capture the dynamically changing real-time traffic condition. In the middle layer, a nonlinear factorization model is developed to generate an auxiliary random variable, i.e., speed. The introduction of this middle layer separates the static spatial features from the dynamically changing real-time traffic conditions, allowing us to incorporate the heterogeneous influencing factors into a single model. In the last layer, an attention mechanism based function is proposed to collectively generate the observed travel time. DeepGTT describes the generation process in a reasonable manner, and thus it not only produces more accurate results but also is more efficient. On a real-world large-scale data set, we show that DeepGTT produces substantially better results than state-of-the-art alternatives in two tasks: travel time estimation and route recovery from sparse trajectory data.

50 citations

Proceedings Article

Interact and Decide: Medley of Sub-Attention Networks for Effective Group Recommendation

Lucas Vinh Tran¹, Tuan-Anh Nguyen Pham¹, Yi Tay¹, Yiding Liu¹, Gao Cong¹, Xiaoli Li²

Nanyang Technological University¹, Institute for Infocomm Research Singapore²

18 Jul 2019

TL;DR: This paper proposes Medley of Sub-Attention Networks (MoSAN), a novel neural architecture for the group recommendation task that not only achieves state-of-the-art performance but also improves standard baselines by a considerable margin.

Abstract: This paper proposes Medley of Sub-Attention Networks (MoSAN), a novel neural architecture for the group recommendation task. Group-level recommendation is known to be a challenging task, in which intricate group dynamics have to be considered. As such, this is to be contrasted with the standard recommendation problem where recommendations are personalized with respect to a single user. Our proposed approach hinges upon the key intuition that the decision making process (in groups) is generally dynamic, i.e., a user's decision is highly dependent on the other group members. All in all, our key motivation manifests in the form of an attentive neural model that captures fine-grained interactions between group members. In our MoSAN model, each sub-attention module is representative of a single member, which models a user's preference with respect to all other group members. Subsequently, a Medley of Sub-Attention modules is used to collectively make the group's final decision. Overall, our proposed model is both expressive and effective. Via a series of extensive experiments, we show that MoSAN not only achieves state-of-the-art performance but also improves standard baselines by a considerable margin.

46 citations

Proceedings Article

Effective and Efficient Sports Play Retrieval with Deep Representation Learning

Zheng Wang¹, Cheng Long¹, Gao Cong¹, Ce Ju²

Nanyang Technological University¹, Baidu²

25 Jul 2019

TL;DR: A deep learning approach to learn the representations of sports plays, called play2vec, is proposed, which is robust against noise and takes only linear time to compute the similarity between two sports plays.

Abstract: With the proliferation of commercial tracking systems, sports data is being generated at an unprecedented speed and the interest in sports play retrieval has grown dramatically as well. However, it is challenging to design an effective, efficient and robust similarity measure for sports play retrieval. To this end, we propose a deep learning approach to learn the representations of sports plays, called play2vec, which is robust against noise and takes only linear time to compute the similarity between two sports plays. We conduct experiments on real-world soccer match data, and the results show that our solution performs more effectively and efficiently compared with the state-of-the-art methods.

33 citations

Journal Article

Finding attribute-aware similar regions for data analysis

Kaiyu Feng¹, Gao Cong¹, Christian S. Jensen², Tao Guo³

Nanyang Technological University¹, Aalborg University², Google³

01 Jul 2019

TL;DR: This work formalizes and studies a new problem called the attribute-aware similar region search (ASRS) problem, and proposes a novel algorithm called DS-Search to find the most similar region of the same size.

Abstract: With the proliferation of mobile devices and location-based services, increasingly massive volumes of geo-tagged data are becoming available. This data typically also contains non-location information. We study how to use such information to characterize a region and then how to find a region of the same size and with the most similar characteristics. This functionality enables a user to identify regions that share characteristics with a user-supplied region that the user is familiar with and likes. More specifically, we formalize and study a new problem called the attribute-aware similar region search (ASRS) problem. We first define so-called composite aggregators that are able to express aspects of interest in terms of the information associated with a user-supplied region. When applied to a region, an aggregator captures the region's relevant characteristics. Next, given a query region and a composite aggregator, we propose a novel algorithm called DS-Search to find the most similar region of the same size. Unlike any previous work on region search, DS-Search repeatedly discretizes and splits regions until a split region either satisfies a drop condition or is guaranteed not to contribute to the result. In addition, we extend DS-Search to solve the ASRS problem approximately. Finally, we report on extensive empirical studies that offer insight into the efficiency and effectiveness of the paper's proposals.

13 citations

Journal Article

Evaluating pattern matching queries for spatial databases

Yixiang Fang¹, Yun Li², Reynold Cheng¹, Nikos Mamoulis³, Gao Cong⁴

University of Hong Kong¹, Nanjing University², University of Ioannina³, Nanyang Technological University⁴

01 Oct 2019

TL;DR: This work proves that answering spatial pattern matching (SPM) queries is computationally intractable, proposes two efficient algorithms for their evaluation, and shows that the proposed solutions to two related SPM problems are highly effective and efficient.

Abstract: In this paper, we study the spatial pattern matching (SPM) query. Given a set D of spatial objects (e.g., houses and shops), each with a textual description, we aim at finding all combinations of objects from D that match a user-defined spatial pattern P. A pattern P is a graph whose vertices represent spatial objects, and edges denote distance relationships between them. The SPM query returns the instances that satisfy P. An example of P can be “a house within 10-min walk from a school, which is at least 2 km away from a hospital.” The SPM query can benefit users such as house buyers, urban planners, and archeologists. We prove that answering such queries is computationally intractable and propose two efficient algorithms for their evaluation. Moreover, we study efficient solutions to address two related problems of the SPM: (1) find top-k matches that are close to a query location and (2) return partial matches for a query pattern. Experiments and case studies on real datasets show that our proposed solutions are highly effective and efficient.
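
The query definition above can be made concrete with a naive enumeration. This is only a brute-force sketch under stated assumptions, not either of the paper's algorithms: the function name `spm_brute_force`, the encoding of an object as an (x, y, keyword) tuple, and the encoding of a pattern edge as (i, j, lo, hi), meaning the distance between the objects matched to vertices i and j must lie in [lo, hi], are all hypothetical choices for this example.

```python
import math
from itertools import product

def spm_brute_force(objects, vertices, edges):
    """Enumerate every combination of objects that satisfies a spatial pattern.

    objects:  list of (x, y, keyword) tuples (assumed encoding)
    vertices: list of keywords, one per pattern vertex
    edges:    list of (i, j, lo, hi) distance constraints between vertices i and j
    """
    # Candidate objects for each pattern vertex, selected by keyword.
    candidates = [[o for o in objects if o[2] == kw] for kw in vertices]
    matches = []
    # Try every assignment of one candidate per vertex (exponential in pattern size).
    for combo in product(*candidates):
        if all(lo <= math.dist(combo[i][:2], combo[j][:2]) <= hi
               for i, j, lo, hi in edges):
            matches.append(combo)
    return matches

# Usage: a house near a school that is far from any matched hospital (units arbitrary).
objects = [(0, 0, "house"), (1, 0, "school"), (10, 0, "hospital"), (1.5, 0, "hospital")]
pattern_vertices = ["house", "school", "hospital"]
pattern_edges = [(0, 1, 0, 2), (1, 2, 5, math.inf)]
result = spm_brute_force(objects, pattern_vertices, pattern_edges)
```

The number of combinations grows as the product of the candidate-list sizes, which gives some intuition for why the abstract calls for more efficient evaluation algorithms.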

9 citations

Journal Article

Exploring market competition over topics in spatio-temporal document collections

Kaiqi Zhao¹, Gao Cong¹, Jin Yao Chin¹, Rong Wen

Nanyang Technological University¹

01 Feb 2019

TL;DR: A novel framework is proposed that is equipped with a generative model for mining topics and market competition, an Octree-based off-line pre-training method for the model, and an efficient algorithm for combining pre-trained models to return the topics and market competition on each topic within a user-specified pair of region and time span.

Abstract: With the prominence of location-based services and social networks in recent years, huge amounts of spatio-temporal document collections (e.g., geo-tagged tweets) have been generated. These data collections often imply users' ideas on different products and thus are helpful for business owners to explore hot topics of their brands and the competition relation to other brands in different spatial regions during different periods. In this work, we aim to mine the topics and the market competition of different brands over each topic for a category of business (e.g., coffeehouses) from spatio-temporal documents within a user-specified region and time period. To support such spatio-temporal search online in an exploratory manner, we propose a novel framework equipped with (1) a generative model for mining topics and market competition, (2) an Octree-based off-line pre-training method for the model and (3) an efficient algorithm for combining pre-trained models to return the topics and market competition on each topic within a user-specified pair of region and time span. Extensive experiments show that our framework is able to improve the runtime by up to an order of magnitude compared with baselines while achieving similar model quality in terms of training log-likelihood.

4 citations

Book Chapter

Spatio-Textual Data

Gao Cong¹, Christian S. Jensen

Nanyang Technological University¹

01 Jan 2019