Author

Wang-Chien Lee

Bio: Wang-Chien Lee is an academic researcher from Pennsylvania State University. The author has contributed to research in topics: Wireless sensor network & Nearest neighbor search. The author has an h-index of 60 and has co-authored 366 publications receiving 14123 citations. Previous affiliations of Wang-Chien Lee include Ohio State University & Verizon Communications.


Papers
Journal ArticleDOI
TL;DR: This paper evaluates, both analytically and experimentally, how the pruning efficiency of the Spitfire algorithm plays a pivotal role in reducing communication and response time up to an order of magnitude, compared to three other state-of-the-art distributed AkNN algorithms executed in distributed main-memory.
Abstract: A wide spectrum of Internet-scale mobile applications, ranging from social networking, gaming and entertainment to emergency response and crisis management, all require efficient and scalable All k Nearest Neighbor (AkNN) computations over millions of moving objects every few seconds to be operational. Most traditional techniques for computing AkNN queries are centralized, lacking both scalability and efficiency. Only recently, distributed techniques for shared-nothing cloud infrastructures have been proposed to achieve scalability for large datasets. These batch-oriented algorithms are sub-optimal due to inefficient data space partitioning and data replication among processing units. In this paper, we present Spitfire, a distributed algorithm that provides a scalable and high-performance AkNN processing framework. Our proposed algorithm deploys a fast load-balanced partitioning scheme along with an efficient replication-set selection algorithm, to provide fast main-memory computations of the exact AkNN results in a batch-oriented manner. We evaluate, both analytically and experimentally, how the pruning efficiency of the Spitfire algorithm plays a pivotal role in reducing communication and response time by up to an order of magnitude, compared to three other state-of-the-art distributed AkNN algorithms executed in distributed main-memory.

27 citations
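
For readers unfamiliar with the query type named in the abstract above, the following is a minimal sketch of what an All k Nearest Neighbor (AkNN) result looks like. It is a centralized brute-force baseline written for illustration only, not the Spitfire algorithm; the function name `all_knn` and the sample points are assumptions made for this example.

```python
from math import dist

def all_knn(points, k):
    """Hypothetical brute-force AkNN: for every point, list its k nearest other points.

    This is the quadratic, centralized baseline that distributed approaches
    aim to avoid. It is NOT the partitioning/replication scheme of the paper.
    """
    result = {}
    for i, p in enumerate(points):
        # Distance from p to every other point, paired with that point's index.
        others = [(dist(p, q), j) for j, q in enumerate(points) if j != i]
        others.sort()
        result[i] = [j for _, j in others[:k]]
    return result

# Illustrative data: two well-separated pairs on a line.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
neighbors = all_knn(points, k=1)
print(neighbors)
```

Every point needs a scan over all others here, which is why the abstract stresses partitioning and pruning when the input is millions of moving objects.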

Journal ArticleDOI
TL;DR: In this article, a pattern-aware trajectory search (PATS) framework is proposed to retrieve the top K trajectories passing through popular ROIs by considering travel behavior exploration and trajectory search.
Abstract: With the popularity of positioning devices, Web 2.0 technology, and trip sharing services, many users are willing to log and share their trips on the Web. Thus, trip planning Web sites are able to provide some new services by inferring Regions-Of-Interest (ROIs) and recommending popular travel routes from trip trajectories. We argue that simply providing some travel routes consisting of popular ROIs to users is not sufficient. To tour around a wide geographical area, for example, a city, some users may prefer a trip to visit as many ROIs as possible, while others may like to stop by only a few ROIs for an in-depth visit. We refer to a trip fitting the former user group as an in-breadth trip and a trip suitable for the latter user group as an in-depth trip. Prior studies on trip planning have focused on mining ROIs and travel routes without considering these different preferences. In this article, given a spatial range and a user preference of depth/breadth specified by a user, we develop a Pattern-Aware Trajectory Search (PATS) framework to retrieve the top K trajectories passing through popular ROIs. PATS is novel because the returned travel trajectories, discovered from travel patterns hidden in trip trajectories, may represent the most valuable travel experiences of other travelers fitting the user's trip preference in terms of depth or breadth. The PATS framework comprises two components: travel behavior exploration and trajectory search. The travel behavior exploration component determines a set of ROIs along with their attractive scores by considering not only the popularity of the ROIs but also the travel sequential relationships among the ROIs. To capture the travel sequential relationships among ROIs and to derive their attractive scores, a user movement graph is constructed. 
For the trajectory search component of PATS, we formulate two trajectory score functions, the depth-trip score function and the breadth-trip score function, by taking into account the number of ROIs in a trajectory and their attractive scores. Accordingly, we propose an algorithm, namely, Bounded Trajectory Search (BTS), to efficiently retrieve the top K trajectories based on the two trajectory scores. The PATS framework is evaluated by experiments and user studies using a real dataset. The experimental results demonstrate the effectiveness and the efficiency of the proposed PATS framework.

27 citations

Proceedings ArticleDOI
05 Jun 2006
TL;DR: It is argued that query brokering and access control are not two orthogonal issues because access control deployment strategies can have a significant impact on the "whole" system's end-to-end performance.
Abstract: An XML brokerage system is a distributed XML database system that comprises data sources and brokers which, respectively, hold XML documents and document distribution information. However, all existing information brokerage systems view or handle query brokering and access control as two orthogonal issues: query brokering is a system issue that concerns costs and performance, while access control is a security issue that concerns information confidentiality. As a result, access control deployment strategies (in terms of where and when to do access control) and the impact of such strategies on end-to-end system performance are neglected by existing information brokerage systems. In addition, data source side access control deployment is taken-for-granted as the "right" thing to do. In this paper, we challenge this traditional, taken-for-granted access control deployment methodology, and argue that query brokering and access control are not two orthogonal issues because access control deployment strategies can have a significant impact on the "whole" system's end-to-end performance. We propose the first in-broker access control deployment strategy where access control is "pushed" from the boundary into the "heart" of the information brokerage system.

26 citations

Proceedings ArticleDOI
27 Oct 2013
TL;DR: This work proposes a heterogeneous patent citation-bibliographic network that combines patent citations (reflecting value relation) and bibliographic information (reflecting similarity relation) together and proposes a two-stage framework for patent citation recommendation.
Abstract: Patent citation recommendation and prior patent search, critical for patent filing and patent examination, have become increasingly difficult due to the rapidly growing number of patents. Unlike paper citations that focus on reference comprehensiveness, patent citations tend to be more parsimonious and refer only to those prior patents bearing significant technological and/or economic value, as they define the scope of the citing patent and thus have significant legal and economic implications. Based on the insight that patent citations are important information reflecting the value of cited patents to the citing patent, we propose a heterogeneous patent citation-bibliographic network that combines patent citations (reflecting value relation) and bibliographic information (reflecting similarity relation) together. From this network, we extract various features that reflect the value of a prior patent to a query patent with regard to the context of the query patent such as its assignee, classifications, etc. We then propose a two-stage framework for patent citation recommendation. Our idea is that by exploiting those context-specific value measures of candidate patents to the query patent, the proposed framework is able to make effective patent citation recommendations. We evaluate the proposed context-guided value-driven framework using a collection of 1.8M U.S. patents. Experimental results validate our ideas and show that those value-driven features are very effective and significantly outperform two state-of-the-art methods in terms of both the precision and recall rates.

26 citations

Posted Content
TL;DR: Wang et al. as discussed by the authors investigated the seed selection problem for viral marketing that considers both effects of social influence and item inference (for product recommendation) and developed a new model, Social Item Graph (SIG), that captures both effects in the form of hyperedges.
Abstract: Research issues and data mining techniques for product recommendation and viral marketing have been widely studied. Existing works on seed selection in social networks do not take into account the effect of product recommendations in e-commerce stores. In this paper, we investigate the seed selection problem for viral marketing that considers both effects of social influence and item inference (for product recommendation). We develop a new model, Social Item Graph (SIG), that captures both effects in the form of hyperedges. Accordingly, we formulate a seed selection problem, called Social Item Maximization Problem (SIMP), and prove the hardness of SIMP. We design an efficient algorithm with performance guarantee, called Hyperedge-Aware Greedy (HAG), for SIMP and develop a new index structure, called SIG-index, to accelerate the computation of diffusion process in HAG. Moreover, to construct realistic SIG models for SIMP, we develop a statistical inference based framework to learn the weights of hyperedges from data. Finally, we perform a comprehensive evaluation on our proposals with various baselines. Experimental results validate our ideas and demonstrate the effectiveness and efficiency of the proposed model and algorithms over baselines.

26 citations


Cited by
01 Jan 2002

9,314 citations

Journal ArticleDOI

6,278 citations

Proceedings ArticleDOI
21 Aug 2011
TL;DR: A model of human mobility that combines periodic short range movements with travel due to the social network structure is developed and it is shown that this model reliably predicts the locations and dynamics of future human movement and gives an order of magnitude better performance.
Abstract: Even though human movement and mobility patterns have a high degree of freedom and variation, they also exhibit structural patterns due to geographic and social constraints. Using cell phone location data, as well as data from two online location-based social networks, we aim to understand what basic laws govern human motion and dynamics. We find that humans experience a combination of periodic movement that is geographically limited and seemingly random jumps correlated with their social networks. Short-ranged travel is periodic both spatially and temporally and not affected by the social network structure, while long-distance travel is more influenced by social network ties. We show that social relationships can explain about 10% to 30% of all human movement, while periodic behavior explains 50% to 70%. Based on our findings, we develop a model of human mobility that combines periodic short range movements with travel due to the social network structure. We show that our model reliably predicts the locations and dynamics of future human movement and gives an order of magnitude better performance than present models of human mobility.

2,922 citations

01 Nov 2008

2,686 citations

Journal ArticleDOI
TL;DR: This review presents the emergent field of temporal networks, and discusses methods for analyzing topological and temporal structure and models for elucidating their relation to the behavior of dynamical systems.
Abstract: A great variety of systems in nature, society and technology -- from the web of sexual contacts to the Internet, from the nervous system to power grids -- can be modeled as graphs of vertices coupled by edges. The network structure, describing how the graph is wired, helps us understand, predict and optimize the behavior of dynamical systems. In many cases, however, the edges are not continuously active. As an example, in networks of communication via email, text messages, or phone calls, edges represent sequences of instantaneous or practically instantaneous contacts. In some cases, edges are active for non-negligible periods of time: e.g., the proximity patterns of inpatients at hospitals can be represented by a graph where an edge between two individuals is on throughout the time they are at the same ward. Like network topology, the temporal structure of edge activations can affect dynamics of systems interacting through the network, from disease contagion on the network of patients to information diffusion over an e-mail network. In this review, we present the emergent field of temporal networks, and discuss methods for analyzing topological and temporal structure and models for elucidating their relation to the behavior of dynamical systems. In the light of traditional network theory, one can see this framework as moving the information of when things happen from the dynamical system on the network, to the network itself. Since fundamental properties, such as the transitivity of edges, do not necessarily hold in temporal networks, many of these methods need to be quite different from those for static networks.

2,452 citations
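
The temporal-networks abstract above notes that transitivity of edges need not hold once contact times matter. A minimal sketch of that point follows, assuming undirected, instantaneous contacts written as `(u, v, t)` triples and assuming a path must use strictly increasing contact times; the function name `reachable` and the toy contact list are made up for this illustration and do not come from the review.

```python
def reachable(contacts, source):
    """Hypothetical helper: nodes reachable from `source` via time-respecting paths.

    Assumption: each contact (u, v, t) is undirected and instantaneous, and a
    contact at time t can only be used by a node that was reached before t.
    """
    arrival = {source: float("-inf")}  # the source is available from the start
    for u, v, t in sorted(contacts, key=lambda c: c[2]):
        for a, b in ((u, v), (v, u)):
            # Processing in time order means the first arrival is the earliest.
            if a in arrival and arrival[a] < t and b not in arrival:
                arrival[b] = t
    return set(arrival)

# Statically, A-B and B-C are both edges, so A and C look connected.
# Temporally, B meets C at t=1 and only later meets A at t=2.
contacts = [("A", "B", 2), ("B", "C", 1)]
from_a = reachable(contacts, "A")
from_c = reachable(contacts, "C")
print(sorted(from_a), sorted(from_c))
```

In this toy example C can reach A through B, but A cannot reach C, because the B-C contact has already happened by the time anything from A arrives at B.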