Showing papers by "Wang-Chien Lee" published in 2012

Proceedings Article•DOI•

Event-based social networks: linking the online and offline social worlds

Xingjie Liu¹, Qi He², Yuanyuan Tian², Wang-Chien Lee¹, John McPherson², Jiawei Han³•Institutions (3)

Pennsylvania State University¹, IBM², University of Illinois at Urbana–Champaign³

12 Aug 2012

TL;DR: This paper is the first research to study EBSNs at scale and paves the way for future studies on this new type of social network.

Abstract: Newly emerged event-based online social services, such as Meetup and Plancast, have experienced increased popularity and rapid growth. From these services, we observed a new type of social network - event-based social network (EBSN). An EBSN not only contains online social interactions as in other conventional online social networks, but also includes valuable offline social interactions captured in offline activities. By analyzing real data collected from Meetup, we investigated EBSN properties and discovered many unique and interesting characteristics, such as heavy-tailed degree distributions and strong locality of social interactions. We subsequently studied the heterogeneous nature (co-existence of both online and offline social interactions) of EBSNs on two challenging problems: community detection and information flow. We found that communities detected in EBSNs are more cohesive than those in other types of social networks (e.g. location-based social networks). In the context of information flow, we studied the event recommendation problem. By experimenting with various information diffusion patterns, we found that a community-based diffusion model that takes into account both online and offline interactions provides the best prediction power. This paper is the first research to study EBSNs at scale and paves the way for future studies on this new type of social network. A sample dataset of this study can be downloaded from http://www.largenetwork.org/ebsn.

307 citations

Proceedings Article•DOI•

Exploring social influence for recommendation: a generative model approach

Mao Ye¹, Xingjie Liu¹, Wang-Chien Lee¹•Institutions (1)

Pennsylvania State University¹

12 Aug 2012

TL;DR: Experimental results show that social influence captured based on the proposed probabilistic generative model, called social influenced selection (SIS), is effective for enhancing both item recommendation and group recommendation, essential for viral marketing, and useful for various user analyses.

Abstract: Social friendship has been shown to be beneficial for item recommendation for years. However, existing approaches mostly incorporate social friendship into recommender systems by heuristics. In this paper, we argue that social influence between friends can be captured quantitatively and propose a probabilistic generative model, called social influenced selection (SIS), to model the decision making of item selection (e.g., what book to buy or where to dine). Based on SIS, we mine the social influence between linked friends and the personal preferences of users through statistical inference. To address the challenges arising from multiple layers of hidden factors in SIS, we develop a new parameter learning algorithm based on expectation maximization (EM). Moreover, we show that the mined social influence and user preferences are valuable for group recommendation and viral marketing. Finally, we conduct a comprehensive performance evaluation using real datasets crawled from last.fm and whrrl.com to validate our proposal. Experimental results show that social influence captured based on our SIS model is effective for enhancing both item recommendation and group recommendation, essential for viral marketing, and useful for various user analyses.

284 citations

Proceedings Article•DOI•

Exploring personal impact for group recommendation

Xingjie Liu¹, Yuan Tian¹, Mao Ye¹, Wang-Chien Lee¹•Institutions (1)

Pennsylvania State University¹

29 Oct 2012

TL;DR: This paper analyzes the decision making process in a group to propose a personal impact topic (PIT) model for group recommendations, which effectively identifies the group preference profile for a given group by considering the personal preferences and personal impacts of group members.

Abstract: Group activities are essential ingredients of people's social life. The rapid growth of online social networking services has greatly boosted group activities by providing a convenient platform for users to organize and participate in such activities. Therefore, recommender systems, as a critical component in social networking services, now face new challenges in supporting group activities. In this paper, we study the group recommendation problem, i.e., making recommendations to a group of people in social networking services. We analyze the decision making process in a group to propose a personal impact topic (PIT) model for group recommendations. The PIT model effectively identifies the group preference profile for a given group by considering the personal preferences and personal impacts of group members. Moreover, we enhance the discovery of personal impact with social network information to obtain an extended personal impact topic (E-PIT) model. We have conducted comprehensive data analysis and evaluations on three real datasets. The results show that our proposed group recommendation techniques outperform baseline approaches.

127 citations

Proceedings Article•DOI•

On socio-spatial group query for location-based social networks

De-Nian Yang¹, Chih-Ya Shen², Wang-Chien Lee³, Ming-Syan Chen²•Institutions (3)

Academia Sinica¹, National Taiwan University², Pennsylvania State University³

12 Aug 2012

TL;DR: This paper designs an efficient algorithm SSGSelect, which includes effective pruning techniques to reduce the running time for finding the optimal solution, and proposes a new index structure, Social R-Tree, to further improve the efficiency.

Abstract: Challenges faced in organizing impromptu activities are the requirements of making timely invitations in accordance with the locations of candidate attendees and the social relationships among them. It is desirable to find a group of attendees close to a rally point and ensure that the selected attendees have a good social relationship to create a good atmosphere in the activity. Therefore, this paper proposes Socio-Spatial Group Query (SSGQ) to select a group of nearby attendees with tight social relations. Efficient processing of SSGQ is very challenging due to the tradeoff between the spatial and social domains. We prove that the problem is NP-hard and design an efficient algorithm SSGSelect, which includes effective pruning techniques to reduce the running time for finding the optimal solution. We also propose a new index structure, Social R-Tree, to further improve the efficiency. User study and experimental results demonstrate that SSGSelect significantly outperforms manual coordination in both solution quality and efficiency.
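
To make the query concrete, the following is a minimal brute-force sketch of a socio-spatial group query. It is not SSGSelect and uses no pruning or Social R-Tree. The abstract does not define "tight social relation", so the sketch assumes a hypothetical rule: each selected attendee may be unacquainted with at most `max_strangers` other members. The function name, the parameters, and the use of total Euclidean distance to the rally point as the spatial cost are all illustrative assumptions.

```python
from itertools import combinations
from math import dist


def ssgq_brute_force(locations, friends, rally_point, group_size, max_strangers):
    """Return (total_distance, group) for the closest socially tight group, or None.

    locations: dict mapping a person to an (x, y) position.
    friends: dict mapping a person to the set of people they know.
    max_strangers: assumed social constraint (not taken from the abstract).
    """
    best = None
    # Exhaustive search: try every group of the requested size.
    for group in combinations(sorted(locations), group_size):
        tight = True
        for member in group:
            known = friends.get(member, set())
            strangers = sum(1 for other in group if other != member and other not in known)
            if strangers > max_strangers:
                tight = False
                break
        if not tight:
            continue
        # Spatial cost: total distance of the members to the rally point.
        total = sum(dist(locations[m], rally_point) for m in group)
        if best is None or total < best[0]:
            best = (total, group)
    return best


locations = {"a": (0, 0), "b": (1, 0), "c": (0, 1), "d": (5, 5)}
friends = {"a": {"b", "d"}, "b": {"a", "d"}, "c": set(), "d": {"a", "b"}}
strict = ssgq_brute_force(locations, friends, (0, 0), 3, 0)
relaxed = ssgq_brute_force(locations, friends, (0, 0), 3, 2)
```

With the strict setting only the mutually acquainted group qualifies, even though it is farther away; relaxing the social constraint lets the spatially closest group win. The exhaustive enumeration grows combinatorially with the number of candidates, which is the cost that pruning and indexing are meant to reduce.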

115 citations

Journal Article•DOI•

ROAD: A New Spatial Object Search Framework for Road Networks

Ken C. K. Lee¹, Wang-Chien Lee², Baihua Zheng, Yuan Tian²•Institutions (2)

University of Massachusetts Dartmouth¹, Pennsylvania State University²

01 Mar 2012-IEEE Transactions on Knowledge and Data Engineering

TL;DR: The analysis and experiment results show the superiority of ROAD over the state-of-the-art approaches.

Abstract: In this paper, we present a new system framework called ROAD for spatial object search on road networks. ROAD is extensible to diverse object types and efficient for processing various location-dependent spatial queries (LDSQs), as it maintains objects separately from a given network and adopts an effective search space pruning technique. Based on our analysis of the two essential operations for LDSQ processing, namely, network traversal and object lookup, ROAD organizes a large road network as a hierarchy of interconnected regional subnetworks (called Rnets). Each Rnet is augmented with 1) shortcuts and 2) object abstracts to accelerate network traversals and provide quick object lookups, respectively. To manage those shortcuts and object abstracts, two cooperating indices, namely, Route Overlay and Association Directory, are devised. In detail, we present 1) the Rnet hierarchy and several properties useful in constructing and maintaining the Rnet hierarchy, 2) the design and implementation of the ROAD framework, and 3) a suite of efficient search algorithms for single-source LDSQs and multisource LDSQs. We conduct a theoretical performance analysis and carry out a comprehensive empirical study to evaluate ROAD. The analysis and experiment results show the superiority of ROAD over the state-of-the-art approaches.

104 citations

Journal Article•DOI•

A Framework for Personal Mobile Commerce Pattern Mining and Prediction

E. H-C Lu¹, Wang-Chien Lee², V. S-M Tseng¹•Institutions (2)

National Cheng Kung University¹, Pennsylvania State University²

01 May 2012-IEEE Transactions on Knowledge and Data Engineering

TL;DR: This work proposes a novel framework, called Mobile Commerce Explorer (MCE), for mining and prediction of mobile users' movements and purchase transactions under the context of mobile commerce, and is believed to be the first work that facilitates mining and predictions of users' commerce behaviors in order to recommend stores and items previously unknown to a user.

Abstract: Due to a wide range of potential applications, research on mobile commerce has received a lot of interest from both industry and academia. One of the active topic areas is the mining and prediction of users' mobile commerce behaviors such as their movements and purchase transactions. In this paper, we propose a novel framework, called Mobile Commerce Explorer (MCE), for mining and prediction of mobile users' movements and purchase transactions under the context of mobile commerce. The MCE framework consists of three major components: 1) Similarity Inference Model (SIM) for measuring the similarities among stores and items, which are two basic mobile commerce entities considered in this paper; 2) Personal Mobile Commerce Pattern Mine (PMCP-Mine) algorithm for efficient discovery of mobile users' Personal Mobile Commerce Patterns (PMCPs); and 3) Mobile Commerce Behavior Predictor (MCBP) for prediction of possible mobile user behaviors. To the best of our knowledge, this is the first work that facilitates mining and prediction of mobile users' commerce behaviors in order to recommend stores and items previously unknown to a user. We perform an extensive experimental evaluation by simulation and show that our proposals produce excellent results.

62 citations

Proceedings Article•DOI•

HTTP: a new framework for bus travel time prediction based on historical trajectories

Wang-Chien Lee¹, Weiping Si¹, Ling-Jyh Chen², Meng Chang Chen²•Institutions (2)

Pennsylvania State University¹, Academia Sinica²

06 Nov 2012

TL;DR: Experimental results show that the proposed prediction schemes significantly outperform the state-of-the-art and baseline techniques.

Abstract: In this paper, we develop a new bus travel time prediction framework, called Historical Trajectory based Travel/Arrival Time Prediction (HTTP), for real-time prediction of travel time over future segments (and thus the arrival time at stops) of an on-going bus journey. The basic idea behind HTTP is to use a collection of historical trajectories "similar" to the current bus trajectory to predict the future segments. Specifically, the HTTP framework (1) samples a set of similar trajectories as the basis for travel time estimation instead of relying on only one historical trajectory best matching the on-going bus journey; and (2) explores different prediction schemes, namely, passed segments, temporal features, and hybrid methods, to identify the sample set of similar trajectories. We conduct comprehensive empirical experiments using real bus trajectory data collected from Taipei City, Taiwan to validate our ideas and to evaluate the proposed schemes. Experimental results show that the proposed prediction schemes significantly outperform the state-of-the-art and baseline techniques.
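
The basic idea of predicting from a sample of similar historical trajectories can be sketched as follows. This is an illustration of the "passed segments" idea only, under assumptions not stated in the abstract: each trajectory is a list of per-segment travel times in seconds, similarity is the sum of absolute differences over the segments already traversed, and the prediction is the plain average over the `k` most similar trajectories. The function name and the choice of `k` are hypothetical.

```python
def predict_future_segments(history, observed, k=3):
    """Estimate travel times for the segments the bus has not yet traversed.

    history: non-empty list of complete trajectories (lists of segment travel times).
    observed: travel times of the segments already passed on the current trip.
    """
    passed = len(observed)

    def gap(trajectory):
        # Assumed similarity measure: smaller gap means more similar.
        return sum(abs(h - o) for h, o in zip(trajectory[:passed], observed))

    # Sample the k most similar historical trajectories instead of a single best match.
    sample = sorted(history, key=gap)[:k]
    n_segments = len(history[0])
    return [sum(t[i] for t in sample) / len(sample) for i in range(passed, n_segments)]


history = [
    [60, 70, 80, 90],
    [62, 68, 100, 110],
    [120, 130, 140, 150],
]
prediction = predict_future_segments(history, observed=[61, 69], k=2)
```

Here the third trajectory is far from the observed trip and is excluded, so the estimate for each remaining segment is the mean of the first two trajectories. Summing the predicted segment times from the current position gives an arrival-time estimate for each downstream stop.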

54 citations

Proceedings Article•DOI•

A straw shows which way the wind blows: ranking potentially popular items from early votes

Peifeng Yin¹, Ping Luo², Min Wang², Wang-Chien Lee¹•Institutions (2)

Pennsylvania State University¹, Hewlett-Packard²

08 Feb 2012

TL;DR: This work argues that the former personality prompts a user to cast her vote conforming to the majority of the service community while on the contrary the latter personality makes her vote different from the community and proposes a Conformer-Maverick (CM) model to simulate the voting process and use it to rank top-k potentially popular items based on the early votes they received.

Abstract: Prediction of popular items in online content sharing systems has recently attracted a lot of attention due to the tremendous need of users and its commercial value. Different from previous works that make predictions by fitting a popularity growth model, we tackle this problem by exploiting the latent conforming and maverick personalities of those who vote to assess the quality of on-line items. We argue that the former personality prompts a user to cast her vote conforming to the majority of the service community while on the contrary the latter personality makes her vote different from the community. We thus propose a Conformer-Maverick (CM) model to simulate the voting process and use it to rank top-k potentially popular items based on the early votes they received. Through an extensive experimental evaluation, we validate our ideas and find that our proposed CM model achieves better performance than baseline solutions, especially for smaller k.

41 citations

Journal Article•DOI•

Energy-Aware Set-Covering Approaches for Approximate Data Collection in Wireless Sensor Networks

Chih-Chieh Hung¹, Wen-Chih Peng¹, Wang-Chien Lee²•Institutions (2)

National Chiao Tung University¹, Pennsylvania State University²

01 Nov 2012-IEEE Transactions on Knowledge and Data Engineering

TL;DR: A centralized algorithm to determine a set of representative nodes with high energy levels and wide data coverage ranges is proposed, and maintenance mechanisms are proposed to dynamically select alternative representative nodes when the original representative nodes run low on energy, or cannot capture spatial correlation within their respective data coverage ranges.

Abstract: To conserve energy, sensor nodes with similar readings can be grouped such that readings from only the representative nodes within the groups need to be reported. However, efficiently identifying sensor groups and their representative nodes is a very challenging task. In this paper, we propose a centralized algorithm to determine a set of representative nodes with high energy levels and wide data coverage ranges. Here, the data coverage range of a sensor node is considered to be the set of sensor nodes that have reading behaviors very close to the particular sensor node. To further reduce the extra cost incurred in messages for selection of representative nodes, a distributed algorithm is developed. Furthermore, maintenance mechanisms are proposed to dynamically select alternative representative nodes when the original representative nodes run low on energy, or cannot capture spatial correlation within their respective data coverage ranges. Using experimental studies on both synthetic and real data sets, our proposed algorithms are shown to effectively and efficiently provide approximate data collection while prolonging the network lifetime.
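
Selecting representative nodes whose data coverage ranges jointly cover the network is a set-covering problem. The sketch below uses the classic greedy set-cover heuristic with remaining energy as a tie-breaker. It is a stand-in for illustration, not the centralized or distributed algorithm of the paper, and the names `pick_representatives`, `coverage`, and `energy` are hypothetical.

```python
def pick_representatives(coverage, energy):
    """Greedily choose representative nodes until every node is covered.

    coverage: dict mapping a node to its data coverage range (a set of nodes
              with similar readings, assumed to include the node itself).
    energy: dict mapping a node to its remaining energy level.
    """
    uncovered = set(coverage)
    chosen = []
    while uncovered:
        # Prefer the node covering the most uncovered nodes; break ties by energy.
        best = max(coverage, key=lambda n: (len(coverage[n] & uncovered), energy[n]))
        gained = coverage[best] & uncovered
        if not gained:
            break  # the remaining nodes appear in no coverage range
        chosen.append(best)
        uncovered -= gained
    return chosen


coverage = {
    "s1": {"s1", "s2", "s3"},
    "s2": {"s1", "s2"},
    "s3": {"s1", "s3"},
    "s4": {"s4"},
    "s5": {"s4", "s5"},
}
energy = {"s1": 0.9, "s2": 0.5, "s3": 0.4, "s4": 0.2, "s5": 0.8}
representatives = pick_representatives(coverage, energy)
```

Only the chosen representatives would report readings; the other nodes are approximated by the representative that covers them. Re-running the selection with updated energy levels is one simple way to picture the maintenance step, though the paper's own mechanisms may differ.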

37 citations

Proceedings Article•DOI•

Key Formulation Schemes for Spatial Index in Cloud Data Managements

Ya-Ting Hsu¹, Yi-Chin Pan¹, Ling-Yin Wei¹, Wen-Chih Peng¹, Wang-Chien Lee²•Institutions (2)

National Chiao Tung University¹, Pennsylvania State University²

23 Jul 2012

TL;DR: A novel key formulation scheme based on R+-tree (abbreviated as KR+-index) is proposed, and two spatial queries, k-NN query and range query, are designed on top of it; the scheme outperforms other existing key formulations and MD-HBase.

Abstract: Due to its flexibility and scalability, cloud computing nowadays plays an important role in handling large-scale data analysis. For data processing operations, several cloud data managements (CDMs), such as HBase and Cassandra, have been developed. Such CDMs usually provide key-value storage, where each key is used to access its corresponding value. Both HBase and Cassandra provide some basic operations (e.g., Get, Scan) to retrieve the values via keys specified by users. The existing CDMs fully inherit the characteristics of cloud computing (i.e., high scalability and availability). With these characteristics, CDMs are widely employed for Web data, especially for search engines. However, with the proliferation of smart phones and location-based services, data with spatial information, referred to as spatial data, are dramatically increasing. Consequently, how to formulate keys for spatial data in the existing CDMs is a challenging issue. In this paper, we develop several key formulation schemes. In particular, we propose a novel Key formulation scheme based on R+-tree (abbreviated as KR+-index). With our design for keys of spatial data, the existing CDMs are able to efficiently retrieve spatial data. In light of KR+-tree, two spatial queries, k-NN query and range query, are designed. Moreover, we implement the proposed key formulation schemes on HBase and Cassandra, and import real spatial data for spatial queries. The experimental results demonstrate that KR+-tree outperforms other existing key formulations and MD-HBase.

28 citations

Proceedings Article•DOI•

Continuous All k-Nearest-Neighbor Querying in Smartphone Networks

Georgios Chatzimilioudis¹, Demetrios Zeinalipour-Yazti¹, Wang-Chien Lee², Marios D. Dikaiakos¹•Institutions (2)

University of Cyprus¹, Pennsylvania State University²

23 Jul 2012

TL;DR: An algorithm, coined Proximity, which answers CAkNN queries in O(n(k + λ)) time, where n denotes the number of users and λ a network-specific parameter, and its efficiency is mainly attributed to a smart search space sharing technique it introduces.

Abstract: Consider a centralized query operator that identifies to every smart phone user its k geographically nearest neighbors at all times, a query we coin Continuous All k-Nearest Neighbor (CAkNN). Such an operator could be utilized to enhance public emergency services, allowing users to send SOS beacons out to the closest rescuers and allowing gamers or social networking users to establish ad-hoc overlay communication infrastructures, in order to carry out complex interactions. In this paper, we study the problem of efficiently processing a CAkNN query in a cellular or WiFi network, both of which are ubiquitous. We introduce an algorithm, coined Proximity, which answers CAkNN queries in O(n(k+lambda)) time, where n denotes the number of users and lambda a network-specific parameter (lambda

Proceedings Article•DOI•

From face-to-face gathering to social structure

Chunyan Wang¹, Mao Ye², Wang-Chien Lee³•Institutions (3)

Stanford University¹, Hewlett-Packard², Pennsylvania State University³

29 Oct 2012

TL;DR: This paper proposes a dynamic model for group gathering based on the process of friend invitation to interpret how an f2f group is formed on-line, and demonstrates that using such group information can effectively improve the accuracies of social tie inference and friend recommendation.

Abstract: The rapid development of on-line social networking sites has dramatically changed the way people live and communicate. One particularly interesting phenomenon that came along with this development is the prominent role that various on-line networking portals play in scheduling and organizing off-line group events and activities. In this paper, we focus on studying the face-to-face (f2f) groups formed through, or facilitated by, on-line portals. We first show the distinct characteristics of such f2f groups by analyzing datasets collected from Whrrl and Meetup. Next, we propose a dynamic model for group gathering based on the process of friend invitation to interpret how an f2f group is formed on-line. The results of our model are confirmed by empirical observations. Finally, we demonstrate that using such group information can effectively improve the accuracies of social tie inference and friend recommendation.

Proceedings Article•DOI•

On bundle configuration for viral marketing in social networks

De-Nian Yang¹, Wang-Chien Lee², Nai-Hui Chia¹, Mao Ye², Hui-Ju Hung¹•Institutions (2)

Academia Sinica¹, Pennsylvania State University²

29 Oct 2012

TL;DR: Experimental results show that ABC significantly outperforms its counterpart and two baseline approaches in terms of both computational overhead and bundle quality.

Abstract: Prior research on viral marketing mostly focuses on promoting one single product item. In this work, we explore the idea of bundling multiple items for viral marketing and formulate a new research problem, called Bundle Configuration for SpreAd Maximization (BCSAM). Efficiently obtaining an optimal product bundle under the setting of BCSAM is very challenging. Aiming to strike a balance between the quality of solution and the computational overhead, we systematically explore various heuristics to develop a suite of algorithms, including κ-Bundle Configuration and Aggregated Bundle Configuration. Moreover, we integrate all the proposed ideas into one efficient algorithm, called Aggregated Bundle Configuration (ABC). Finally, we conduct an extensive performance evaluation on our proposals. Experimental results show that ABC significantly outperforms its counterpart and two baseline approaches in terms of both computational overhead and bundle quality.

Journal Article•DOI•

Querying Uncertain Minimum in Wireless Sensor Networks

Mao Ye¹, Ken C. K. Lee², Wang-Chien Lee¹, Xingjie Liu¹, Meng Chang Chen•Institutions (2)

Pennsylvania State University¹, University of Massachusetts Dartmouth²

01 Dec 2012-IEEE Transactions on Knowledge and Data Engineering

TL;DR: To answer PMVQs and PMNQs energy-efficiently, two suites of in-network algorithms are devised and extended to answer PMNQ variants, and all the proposed approaches are evaluated through cost analysis and simulations.

Abstract: In this paper, we introduce two types of probabilistic aggregation queries, namely, Probabilistic Minimum Value Queries (PMVQs) and Probabilistic Minimum Node Queries (PMNQs). A PMVQ determines possible minimum values among all imprecise sensed data, while a PMNQ identifies sensor nodes that possibly provide minimum values. However, centralized approaches consume a lot of energy from battery-powered sensor nodes, and well-studied in-network aggregation techniques that presume precise sensed data are not practical for inherently imprecise sensed data. Thus, to answer PMVQs and PMNQs energy-efficiently, we devised suites of in-network algorithms. For PMVQs, our in-network minimum value screening algorithm (MVS) filters candidate minimum values, and our in-network minimum value aggregation algorithm (MVA) conducts in-network probability calculation. PMNQs require possible minimum values to be determined a priori, inevitably consuming more energy to evaluate than PMVQs. Accordingly, our one-phase and two-phase in-network algorithms are devised. We also extend the algorithms to answer PMNQ variants. We evaluate all our proposed approaches through cost analysis and simulations.
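
What the two query types return can be illustrated with a small centralized computation. This is exactly the kind of centralized evaluation the paper seeks to avoid, so it only clarifies the query semantics and is not MVS, MVA, or the one-phase and two-phase algorithms. It assumes each node's imprecise reading is an independent discrete distribution, and that on ties every node attaining the minimum counts as providing it; the function name and data layout are hypothetical.

```python
from collections import defaultdict
from itertools import product


def probabilistic_minimum(readings):
    """Return (value_prob, node_prob) for independent discrete readings.

    readings: dict mapping a node to a dict {possible value: probability}.
    value_prob[v]: probability that v is the minimum (PMVQ-style answer).
    node_prob[n]: probability that node n provides the minimum (PMNQ-style answer).
    """
    value_prob = defaultdict(float)
    node_prob = defaultdict(float)
    nodes = list(readings)
    # Enumerate every joint outcome of the nodes' possible readings.
    for outcome in product(*(readings[n].items() for n in nodes)):
        p = 1.0
        for _, prob in outcome:
            p *= prob
        lowest = min(value for value, _ in outcome)
        value_prob[lowest] += p
        for node, (value, _) in zip(nodes, outcome):
            if value == lowest:
                node_prob[node] += p
    return dict(value_prob), dict(node_prob)


readings = {"a": {10: 0.5, 20: 0.5}, "b": {15: 1.0}}
value_prob, node_prob = probabilistic_minimum(readings)
```

In this example node "a" reads 10 or 20 with equal probability while "b" always reads 15, so the minimum is 10 or 15 with probability 0.5 each, and each node provides the minimum half of the time. The enumeration is exponential in the number of nodes, which is one reason screening candidate minimum values early is attractive.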

Proceedings Article•DOI•

Efficient Time Series Disaggregation for Non-intrusive Appliance Load Monitoring

Yao-Chung Fan¹, Xingjie Liu², Wang-Chien Lee², Arbee L. P. Chen³•Institutions (3)

National Chung Hsing University¹, Pennsylvania State University², National Chengchi University³

04 Sep 2012

TL;DR: Aiming at achieving high estimation accuracy and alleviating excessive computation, a time-series disaggregation algorithm is developed which incorporates two novel techniques, namely, DE-pruning and monotonic enumeration, for search space pruning.

Abstract: The growing concerns about urgent environmental and economic issues, such as global warming and rising energy cost, have motivated research studies on various green computing technologies. For example, Non-Intrusive Appliance Load Monitor (NIALM) techniques, aiming at energy monitoring, load forecasting and improved control of residential electrical appliances, have been developed by monitoring one electrical circuit that contains a number of electrical appliances without using separate sub-meters. By employing pattern recognition algorithms, the NIALM techniques estimate the consumption of individual appliances. While the basic ideas behind the NIALM techniques are valid, existing proposals suffer from the issue of poor estimation accuracy. In this paper, we model the process of load separation in NIALM as a time series disaggregation problem. Aiming at achieving high estimation accuracy and alleviating excessive computation, we develop a time-series disaggregation algorithm which incorporates two novel techniques, namely, DE-pruning and monotonic enumeration, for search space pruning. A comprehensive set of experiments is conducted to validate our proposals and to evaluate the effectiveness and the efficiency of the proposed methods. The results show that our proposal is effective and efficient.
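
The disaggregation problem itself can be pictured with a naive exhaustive search. The sketch below is not the paper's algorithm and implements neither DE-pruning nor monotonic enumeration. It assumes, purely for illustration, that each appliance is either off or drawing a fixed rated power, and that each aggregate reading is explained independently of its neighbours in time. The appliance names and wattages are made up.

```python
from itertools import product


def disaggregate(aggregate, rated_power):
    """Estimate per-appliance consumption for each aggregate meter reading.

    aggregate: list of total power readings from the single monitored circuit.
    rated_power: dict mapping an appliance name to its assumed fixed draw when on.
    """
    names = list(rated_power)
    estimates = []
    for reading in aggregate:
        best_state, best_error = None, float("inf")
        # Enumerate all 2^n on/off combinations; this is the search space
        # that pruning techniques aim to shrink.
        for state in product((0, 1), repeat=len(names)):
            total = sum(on * rated_power[n] for on, n in zip(state, names))
            error = abs(reading - total)
            if error < best_error:
                best_state, best_error = state, error
        estimates.append({n: on * rated_power[n] for on, n in zip(best_state, names)})
    return estimates


rated_power = {"fridge": 150, "heater": 1000, "lamp": 60}
estimates = disaggregate([0, 210, 1150], rated_power)
```

The search doubles with every added appliance, which shows why search space pruning matters for this formulation.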

Journal Article•DOI•

The Design and Evaluation of Task Assignment Algorithms for GWAP-based Geospatial Tagging Systems

Ling-Jyh Chen, Yu-Song Syu, Hung-Chia Chen, Wang-Chien Lee¹•Institutions (1)

Pennsylvania State University¹

01 Jun 2012-Mobile Networks and Applications

TL;DR: This study designs three metrics to evaluate the system performance, develops five task assignment algorithms for GWAP-based geotagging systems, and finds that the Least-Throughput-First Assignment algorithm (LTFA) is the most effective approach because it can achieve competitive system utility, while its computational complexity remains moderate.

Abstract: Geospatial tagging (geotagging) is an emerging and very promising application that can help users find a wide variety of location-specific information, and thereby facilitate the development of advanced location-based services. Conventional geotagging systems share some limitations, such as the use of a two-phase operating model and the tendency to tag popular objects with simple contexts. To address these problems, a number of geotagging systems based on the concept of 'Games with a Purpose' (GWAP) have been developed recently. In this study, we use analysis to investigate these new systems. Based on our analysis results, we design three metrics to evaluate the system performance, and develop five task assignment algorithms for GWAP-based systems. Using a comprehensive set of simulations under both synthetic and realistic mobility scenarios, we find that the Least-Throughput-First Assignment algorithm (LTFA) is the most effective approach because it can achieve competitive system utility, while its computational complexity remains moderate. We also find that, to improve the system utility, it is better to assign as many tasks as possible in each round. However, because players may feel annoyed if too many tasks are assigned at the same time, it is recommended that multiple tasks be assigned one by one in each round in order to achieve higher system utility.
