Showing papers by "Wang-Chien Lee" published in 2013


Proceedings ArticleDOI
27 Oct 2013
TL;DR: This paper proposes a collaborative recommendation framework, called User Preference, Proximity and Social-Based Collaborative Filtering (UPS-CF), to make location recommendations for mobile users in LBSNs and finds that preference derived from similar users is important for in-town users while social influence becomes more important for out-of-town users.
Abstract: Most previous research on location recommendation services in location-based social networks (LBSNs) makes recommendations without considering where the targeted user is currently located. Such services may recommend a place near her hometown even if the user is traveling out of town. In this paper, we study the issues in making location recommendations for out-of-town users by taking into account user preference, social influence and geographical proximity. Accordingly, we propose a collaborative recommendation framework, called User Preference, Proximity and Social-Based Collaborative Filtering (UPS-CF), to make location recommendations for mobile users in LBSNs. We validate our ideas by comprehensive experiments using real datasets collected from Foursquare and Gowalla. By comparing against baseline algorithms and the conventional collaborative filtering approach (and its variants), we show that UPS-CF exhibits the best performance. Additionally, we find that preference derived from similar users is important for in-town users while social influence becomes more important for out-of-town users.

133 citations


Proceedings ArticleDOI
04 Feb 2013
TL;DR: This work proposes an Actual-Tempting model that captures factors that invoke a user to replace an old app with a new app and shows that the AT model performs significantly better than the conventional recommendation techniques such as collaborative filtering and content-based recommendation.
Abstract: Due to the huge and still rapidly growing number of mobile applications (apps), it becomes necessary to provide users an app recommendation service. Different from conventional item recommendation where the user interest is the primary factor, app recommendation also needs to consider factors that invoke a user to replace an old app (if she already has one) with a new app. In this work we propose an Actual-Tempting model that captures such factors in the decision process of mobile app adoption. The model assumes that each owned app has an actual satisfactory value and a new app under consideration has a tempting value. The former stands for the real satisfactory value the owned app brings to the user while the latter represents the estimated value the new app may seemingly have. We argue that the process of app adoption therefore is a contest between the owned apps' actual values and the candidate app's tempting value. Through extensive experiments we show that the AT model performs significantly better than the conventional recommendation techniques such as collaborative filtering and content-based recommendation. Furthermore, the best recommendation performance is achieved when the AT model is combined with them.

98 citations


Proceedings ArticleDOI
11 Aug 2013
TL;DR: This work aims to enrich the user-vote matrix by converting the dwell time on items into users' "pseudo votes" and then help improve recommendation performance, and shows that the traditional rate-based recommendation's performance is greatly improved with the support of the VV model.
Abstract: Social media is a platform for people to share and vote on content. From the analysis of the social media data we found that users are quite inactive in rating/voting. For example, a user on average only votes on 2 out of 100 accessed items. Traditional recommendation methods are mostly based on users' votes and thus cannot cope with this situation. Based on the observation that the dwell time on an item may reflect the opinion of a user, we aim to enrich the user-vote matrix by converting the dwell time on items into users' "pseudo votes" and then help improve recommendation performance. However, it is challenging to correctly interpret the dwell time since many subjective human factors, e.g. user expectation, sensitivity to various item qualities, reading speed, are involved in the casual behavior of online reading. In psychology, it is assumed that people have a choice threshold in decision making. The time spent on making a decision reflects the decision maker's threshold. This idea inspires us to develop a View-Voting model, which can estimate how much the user likes the viewed item according to her dwell time, and thus make recommendations even if there is no voting data available. Finally, our experimental evaluation shows that the traditional rate-based recommendation's performance is greatly improved with the support of the VV model.

62 citations


Proceedings ArticleDOI
11 Aug 2013
TL;DR: This paper advocates recommendation support for active friending, where a user actively specifies a friending target, formulates a new optimization problem, namely, Acceptance Probability Maximization (APM), and develops a polynomial time algorithm, called Selective Invitation with Tree and In-Node Aggregation (SITINA), to find the optimal solution.
Abstract: Friending recommendation has successfully contributed to the explosive growth of online social networks. Most friending recommendation services today aim to support passive friending, where a user passively selects friending targets from the recommended candidates. In this paper, we advocate recommendation support for active friending, where a user actively specifies a friending target. To the best of our knowledge, a recommendation designed to provide guidance for a user to systematically approach his friending target has not been explored for existing online social networking services. To maximize the probability that the friending target would accept an invitation from the user, we formulate a new optimization problem, namely, Acceptance Probability Maximization (APM), and develop a polynomial time algorithm, called Selective Invitation with Tree and In-Node Aggregation (SITINA), to find the optimal solution. We implement an active friending service with SITINA on Facebook to validate our idea. Our user study and experimental results reveal that SITINA outperforms manual selection and the baseline approach in solution quality while remaining efficient.

59 citations


Journal ArticleDOI
TL;DR: A new uncertain skyline query, called U-Skyline query, that searches for a set of tuples that has the highest probability (aggregated from all possible scenarios) as the skyline answer, and proposes a number of optimization techniques for query processing.
Abstract: The skyline query, aiming at identifying a set of skyline tuples that are not dominated by any other tuple, is particularly useful for multicriteria data analysis and decision making. For uncertain databases, a probabilistic skyline query, called P-Skyline, has been developed to return skyline tuples by specifying a probability threshold. However, the answer obtained via a P-Skyline query usually includes skyline tuples undesirably dominating each other when a small threshold is specified; or it may contain much fewer skyline tuples if a larger threshold is employed. To address this concern, we propose a new uncertain skyline query, called U-Skyline query, in this paper. Instead of setting a probabilistic threshold to qualify each skyline tuple independently, the U-Skyline query searches for a set of tuples that has the highest probability (aggregated from all possible scenarios) as the skyline answer. In order to answer U-Skyline queries efficiently, we propose a number of optimization techniques for query processing, including 1) computational simplification of U-Skyline probability, 2) pruning of unqualified candidate skylines and early termination of query processing, 3) reduction of the input data set, and 4) partition and conquest of the reduced data set. We perform a comprehensive performance evaluation on our algorithm and an alternative approach that formulates the U-Skyline processing problem by integer programming. Experimental results demonstrate that our algorithm is 10-100 times faster than using CPLEX, a parallel integer programming solver, to answer the U-Skyline query.

50 citations
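
The dominance relation that the skyline abstract above relies on can be illustrated with a minimal sketch of the conventional skyline over certain (non-probabilistic) data. This is not the U-Skyline algorithm; it only shows what "not dominated by any other tuple" means. The convention that smaller values are better and the `hotels` data (price, distance) are assumptions made for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every dimension and
    strictly better in at least one (assumption: smaller is better)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(tuples: List[Sequence[float]]) -> List[Sequence[float]]:
    """Return the tuples that no other tuple dominates (naive O(n^2) scan)."""
    return [t for t in tuples if not any(dominates(o, t) for o in tuples)]


# Hypothetical data: (price, distance to the beach) for five hotels.
hotels = [(50, 8), (70, 3), (60, 9), (90, 1), (80, 4)]
result = skyline(hotels)
```

Here `(60, 9)` is dominated by `(50, 8)` and `(80, 4)` by `(70, 3)`, so only the other three hotels remain in `result`. With uncertain data each tuple exists only with some probability, which is where the P-Skyline and U-Skyline formulations described in the abstract come in.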


Posted Content
TL;DR: This paper advocates a recommendation support for active friending, where a user actively specifies a friending target, and develops a polynomial time algorithm, called Selective Invitation with Tree and In-Node Aggregation (SITINA), to find the optimal solution.
Abstract: Friending recommendation has successfully contributed to the explosive growth of on-line social networks. Most friending recommendation services today aim to support passive friending, where a user passively selects friending targets from the recommended candidates. In this paper, we advocate recommendation support for active friending, where a user actively specifies a friending target. To the best of our knowledge, a recommendation designed to provide guidance for a user to systematically approach his friending target has not been explored in existing on-line social networking services. To maximize the probability that the friending target would accept an invitation from the user, we formulate a new optimization problem, namely, Acceptance Probability Maximization (APM), and develop a polynomial time algorithm, called Selective Invitation with Tree and In-Node Aggregation (SITINA), to find the optimal solution. We implement an active friending service with SITINA in Facebook to validate our idea. Our user study and experimental results manifest that SITINA outperforms manual selection and the baseline approach in solution quality efficiently.

48 citations


Journal ArticleDOI
TL;DR: A personalized mobile search engine that captures the users' preferences in the form of concepts by mining their clickthrough data and addresses the privacy issue by restricting the information in the user profile exposed to the PMSE server with two privacy parameters is proposed.
Abstract: We propose a personalized mobile search engine (PMSE) that captures the users' preferences in the form of concepts by mining their clickthrough data. Due to the importance of location information in mobile search, PMSE classifies these concepts into content concepts and location concepts. In addition, users' locations (positioned by GPS) are used to supplement the location concepts in PMSE. The user preferences are organized in an ontology-based, multifacet user profile, which is used to adapt a personalized ranking function for rank adaptation of future search results. To characterize the diversity of the concepts associated with a query and their relevance to the user's need, four entropies are introduced to balance the weights between the content and location facets. Based on the client-server model, we also present a detailed architecture and design for implementation of PMSE. In our design, the client collects and stores locally the clickthrough data to protect privacy, whereas heavy tasks such as concept extraction, training, and reranking are performed at the PMSE server. Moreover, we address the privacy issue by restricting the information in the user profile exposed to the PMSE server with two privacy parameters. We prototype PMSE on the Google Android platform. Experimental results show that PMSE significantly improves the precision compared to the baseline.

43 citations


Proceedings ArticleDOI
25 Aug 2013
TL;DR: The analysis confirms that the social networks of OSS communities follow power-law degree distributions and exhibit small-world characteristics; however, the degree mixing pattern shows that high degree nodes tend to connect more with low degree nodes, suggesting collaborations between experts and newbie developers.
Abstract: We conduct a statistical analysis on the social networks of contributors in Open Source Software (OSS) communities using datasets collected from two of the fastest-growing OSS social interaction sites, Github.com and Ohloh.net. Our goal is to analyze the connectivity structure of the social networks of contributors and to investigate the effect of the different social tie structures on developers' overall productivity in OSS projects. We first analyze the general structure of the social networks, e.g., graph distances and the degree distribution of the social networks. Our analysis confirms that the social networks of OSS communities follow power-law degree distributions and exhibit small-world characteristics. However, the degree mixing pattern shows that high degree nodes tend to connect more with low degree nodes, suggesting collaborations between experts and newbie developers. Second, we study the correlation between graph degrees and the productivity of the contributors in terms of the amount of contribution and commitment to OSS projects. The analysis demonstrates an evident influence of the social ties on the developers' overall productivity.

30 citations
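
The degree mixing pattern mentioned in the abstract above can be made concrete with a small sketch. One common way to quantify it is the degree assortativity coefficient, i.e., the Pearson correlation between the degrees found at the two ends of each edge; a negative value means high degree nodes tend to link to low degree nodes. Whether the paper uses exactly this statistic is an assumption, and the function name and the star-shaped example graph are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def degree_assortativity(edges: Iterable[Tuple[str, str]]) -> float:
    """Pearson correlation of the degrees at both ends of each undirected edge.

    Returns 0.0 when all end-point degrees are equal (correlation undefined).
    """
    edges = list(edges)
    if not edges:
        raise ValueError("at least one edge is required")

    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    # Count every edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]

    mean = sum(xs) / len(xs)  # xs and ys hold the same values, so one mean suffices
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys))
    var = sum((x - mean) ** 2 for x in xs)
    return cov / var if var > 0 else 0.0


# Hypothetical example: one "expert" hub collaborating with four newcomers.
star = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d")]
r_star = degree_assortativity(star)
```

For the star graph every edge joins the degree-4 hub to a degree-1 leaf, so `r_star` is at the disassortative extreme of -1, the kind of expert-to-newcomer mixing the abstract describes.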


Journal ArticleDOI
TL;DR: In this article, a pattern-aware trajectory search (PATS) framework is proposed to retrieve the top K trajectories passing through popular ROIs by considering travel behavior exploration and trajectory search.
Abstract: With the popularity of positioning devices, Web 2.0 technology, and trip sharing services, many users are willing to log and share their trips on the Web. Thus, trip planning Web sites are able to provide some new services by inferring Regions-Of-Interest (ROIs) and recommending popular travel routes from trip trajectories. We argue that simply providing some travel routes consisting of popular ROIs to users is not sufficient. To tour around a wide geographical area, for example, a city, some users may prefer a trip to visit as many ROIs as possible, while others may like to stop by only a few ROIs for an in-depth visit. We refer to a trip fitting the former user group as an in-breadth trip and a trip suitable for the latter user group as an in-depth trip. Prior studies on trip planning have focused on mining ROIs and travel routes without considering these different preferences. In this article, given a spatial range and a user preference of depth/breadth specified by a user, we develop a Pattern-Aware Trajectory Search (PATS) framework to retrieve the top K trajectories passing through popular ROIs. PATS is novel because the returned travel trajectories, discovered from travel patterns hidden in trip trajectories, may represent the most valuable travel experiences of other travelers fitting the user's trip preference in terms of depth or breadth. The PATS framework comprises two components: travel behavior exploration and trajectory search. The travel behavior exploration component determines a set of ROIs along with their attractive scores by considering not only the popularity of the ROIs but also the travel sequential relationships among the ROIs. To capture the travel sequential relationships among ROIs and to derive their attractive scores, a user movement graph is constructed. 
For the trajectory search component of PATS, we formulate two trajectory score functions, the depth-trip score function and the breadth-trip score function, by taking into account the number of ROIs in a trajectory and their attractive scores. Accordingly, we propose an algorithm, namely, Bounded Trajectory Search (BTS), to efficiently retrieve the top K trajectories based on the two trajectory scores. The PATS framework is evaluated by experiments and user studies using a real dataset. The experimental results demonstrate the effectiveness and the efficiency of the proposed PATS framework.

27 citations


Proceedings ArticleDOI
27 Oct 2013
TL;DR: This work proposes a heterogeneous patent citation-bibliographic network that combines patent citations (reflecting value relation) and bibliographic information (reflecting similarity relation) together and proposes a two-stage framework for patent citation recommendation.
Abstract: Patent citation recommendation and prior patent search, critical for patent filing and patent examination, have become increasingly difficult due to the rapidly growing number of patents. Unlike paper citations that focus on reference comprehensiveness, patent citations tend to be more parsimonious and refer only to those prior patents bearing significant technological and/or economic value, as they define the scope of the citing patent and thus have significant legal and economic implications. Based on the insight that patent citations are important information reflecting the value of cited patents to the citing patent, we propose a heterogeneous patent citation-bibliographic network that combines patent citations (reflecting value relation) and bibliographic information (reflecting similarity relation) together. From this network, we extract various features that reflect the value of a prior patent to a query patent with regard to the context of the query patent such as its assignee, classifications, etc. We then propose a two-stage framework for patent citation recommendation. Our idea is that by exploiting those context-specific value measures of candidate patents to the query patent, the proposed framework is able to make effective patent citation recommendations. We evaluate the proposed context-guided value-driven framework using a collection of 1.8M U.S. patents. Experimental results validate our ideas and show that those value-driven features are very effective and significantly outperform two state-of-the-art methods in terms of both the precision and recall rates.

26 citations


Journal ArticleDOI
TL;DR: The notion of sufficient set and necessary set for distributed processing of probabilistic top-k queries in cluster-based wireless sensor networks and an adaptive algorithm that dynamically switches among the three proposed algorithms to minimize the transmission cost are introduced.
Abstract: In this paper, we introduce the notion of sufficient set and necessary set for distributed processing of probabilistic top-k queries in cluster-based wireless sensor networks. These two concepts have very nice properties that can facilitate localized data pruning in clusters. Accordingly, we develop a suite of algorithms, namely, sufficient set-based (SSB), necessary set-based (NSB), and boundary-based (BB), for intercluster query processing with bounded rounds of communications. Moreover, in responding to dynamic changes of data distribution in the network, we develop an adaptive algorithm that dynamically switches among the three proposed algorithms to minimize the transmission cost. We show the applicability of sufficient set and necessary set to wireless sensor networks with both two-tier hierarchical and tree-structured network topologies. Experimental results show that the proposed algorithms reduce data transmissions significantly and incur only small constant rounds of data communications. The experimental results also demonstrate the superiority of the adaptive algorithm, which achieves a near-optimal performance under various conditions.

Book ChapterDOI
14 Apr 2013
TL;DR: Two types of usage patterns which capture the representative usage behaviors of appliances in a smart home environment and the corresponding algorithms for discovering usage patterns efficiently are introduced and applied on a real-world dataset to show the practicability of usage pattern mining.
Abstract: Nowadays, owing to advances in sensor technology, the data of all appliances in a house can be collected easily. However, with a huge amount of appliance usage log data, it is not an easy task for residents to visualize how the appliances are used. Mining algorithms are necessary to discover appliance usage patterns that capture representative usage behavior of appliances. If some of our representative patterns of appliance electricity usage are available, we may be able to adapt our usage behaviors to conserve energy easily. In this paper, we introduce (i) two types of usage patterns which capture the representative usage behaviors of appliances in a smart home environment and (ii) the corresponding algorithms for discovering usage patterns efficiently. Finally, we apply our algorithms on a real-world dataset to show the practicability of usage pattern mining.

Proceedings ArticleDOI
03 Jun 2013
TL;DR: This paper proposes two new VANET routing protocols, namely, Routing Protocol with Beacon Control (RPBC) and Routing Protocol with BeaconLess (RPBL), to alleviate packet losses and proposes the idea of virtual beacons, which can be used for routing without heavily relying on beacons.
Abstract: Vehicular ad hoc networks (VANETs) have been attracting increasing research interest for the past decade. To address the routing problem, many protocols have been proposed in the past several years. Routing protocols for VANETs, mostly based on the ideas of "Geographical Routing" (or geo-routing for short), typically have nodes periodically broadcast one-hop beacon messages to reveal their positions to neighbors. Nevertheless, packet loss and thus deterioration of routing performance in these protocols are anticipated in urban areas due to the high density of vehicles in the network. In this paper, we propose two new VANET routing protocols, namely, Routing Protocol with Beacon Control (RPBC) and Routing Protocol with BeaconLess (RPBL), to alleviate packet losses. In RPBC, each vehicle determines whether to transmit a beacon message based on a new beacon control scheme proposed in this paper, which by minimizing redundant beacon messages reduces transmission overhead significantly. On the other hand, RPBL is a beaconless protocol where a node broadcasts a packet to its neighboring nodes and transmits the packet via multiple paths to achieve a high delivery ratio. Moreover, as packets in geo-routing protocols include the location of the sender, this location information can be used for routing without heavily relying on beacons. Accordingly, we propose the idea of virtual beacons and use it to further improve our proposed protocols. We conduct comprehensive experiments by simulation to validate our ideas and evaluate the proposed protocols. The simulation results show that our proposals can achieve high delivery ratios, short delays, and small overhead.

Proceedings ArticleDOI
27 Oct 2013
TL;DR: This paper investigates the problem of searching for the k Diverse-Near Neighbors in spatial space that is based upon the spatial diversity and proximity of candidate locations to the query point, and proposes two heuristic algorithms, namely, Distance-based Browsing and Diversity-based Browsing, that provide high effectiveness while being efficient by exploring the search space prioritized upon the proximity to the query point and spatial diversity.
Abstract: To many location-based service applications that prefer diverse results, finding locations that are spatially diverse and close in proximity to a query point (e.g., the current location of a user) can be more useful than finding the k nearest neighbors/locations. In this paper, we investigate the problem of searching for the k Diverse-Near Neighbors (kDNNs) in spatial space that is based upon the spatial diversity and proximity of candidate locations to the query point. While employing a conventional distance measure for proximity, we develop a new and intuitive diversity metric based upon the variance of the angles among the candidate locations with respect to the query point. Accordingly, we create a dynamic programming algorithm that finds the optimal kDNNs. Unfortunately, the dynamic programming algorithm, with a time complexity of O(kn^3), incurs excessive computational cost. Therefore, we further propose two heuristic algorithms, namely, Distance-based Browsing (DistBrow) and Diversity-based Browsing (DivBrow), that provide high effectiveness while being efficient by exploring the search space prioritized upon the proximity to the query point and spatial diversity, respectively. Using real and synthetic datasets, we conduct a comprehensive performance evaluation. The results show that DistBrow and DivBrow have superior effectiveness compared to state-of-the-art algorithms while maintaining high efficiency.
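
The angle-based diversity idea in the abstract above can be sketched as follows. This is one plausible reading of "the variance of the angles among the candidate locations with respect to the query point", not the paper's exact metric: the candidates are sorted by polar angle around the query point, and the variance of the gaps between angular neighbors is computed, so that a lower value means the set surrounds the query more evenly. The function name and the sample points are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def angular_gap_variance(query: Point, candidates: List[Point]) -> float:
    """Variance of the angular gaps between candidates as seen from the query.

    0.0 means the candidates are spread perfectly evenly around the query;
    larger values mean they are bunched in a few directions.
    """
    k = len(candidates)
    if k < 2:
        raise ValueError("need at least two candidate locations")

    # Polar angle of each candidate relative to the query point, in (-pi, pi].
    angles = sorted(math.atan2(y - query[1], x - query[0]) for x, y in candidates)

    # Gaps between angular neighbors, plus the wrap-around gap closing the circle.
    gaps = [angles[i + 1] - angles[i] for i in range(k - 1)]
    gaps.append(2 * math.pi - (angles[-1] - angles[0]))

    mean_gap = 2 * math.pi / k  # the gaps always sum to a full circle
    return sum((g - mean_gap) ** 2 for g in gaps) / k


q = (0.0, 0.0)
spread = [(1, 0), (0, 1), (-1, 0), (0, -1)]       # one candidate in each direction
bunched = [(1, 0.0), (1, 0.1), (1, 0.2), (1, 0.3)]  # all candidates to the east
spread_score = angular_gap_variance(q, spread)
bunched_score = angular_gap_variance(q, bunched)
```

In this sketch `spread_score` is essentially zero while `bunched_score` is much larger. A kDNN-style search would trade such a diversity score off against the distance of each candidate to the query point; how the paper combines the two is not stated in the abstract and is not modeled here.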

Proceedings ArticleDOI
07 Dec 2013
TL;DR: A novel system, namely, Correlation Pattern Mining System (CPMS), is developed to capture the usage patterns and correlations among appliances and is applied on a real-world dataset to show the practicability of correlation pattern mining.
Abstract: Owing to advances in sensor technology, the usage data of appliances in a house can be logged and collected easily today. However, it is a challenge for the residents to visualize how these appliances are used. Thus, mining algorithms are much needed to discover appliance usage patterns. Most previous studies on usage pattern discovery focus on analyzing the patterns of a single appliance rather than mining the usage correlation among appliances. In this paper, a novel system, namely, Correlation Pattern Mining System (CPMS), is developed to capture the usage patterns and correlations among appliances. With several new optimization techniques, CPMS can reduce the search space effectively and efficiently. Furthermore, the proposed algorithm is applied on a real-world dataset to show the practicability of correlation pattern mining.

Book ChapterDOI
22 Apr 2013
TL;DR: The problem defined is NP-hard and three algorithms are presented, namely, PSTA, STA and NFA, to solve the problem, and the results show the effectiveness of the proposed algorithms in finding well-acquainted teams satisfying a given query.
Abstract: We consider the team formation problem in open collaborative projects existing in large community settings such as the Open Source Software (OSS) community. Given a query specifying a set of required skills for an open project and an upper bound of team size, the goal is to find a team that maximizes the Degree of Acquaintance (DoA) and covers all the required skills in the query. We define the DoA in terms of the team graph connectivity and edge weights, corresponding to the local Clustering Coefficient for each team member and the strength of social ties between the team members, respectively. We perform a statistical analysis on historical data to show the importance of the connectivity and social tie strength to the overall productivity of the teams in open projects. We show that the problem defined is NP-hard and present three algorithms, namely, PSTA, STA and NFA, to solve the problem. We evaluate the algorithms on a dataset from the OSS community. The results show the effectiveness of the proposed algorithms in finding well-acquainted teams satisfying a given query.
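
The local Clustering Coefficient that the abstract above uses as its connectivity ingredient has a standard definition for unweighted, undirected graphs: the fraction of pairs of a node's neighbors that are themselves connected. The sketch below computes only that standard quantity; how the paper combines it with tie strength into the Degree of Acquaintance is not given in the abstract and is not attempted here. The adjacency-dictionary representation and the team members' names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Set


def local_clustering(adj: Dict[str, Set[str]], node: str) -> float:
    """Fraction of pairs of `node`'s neighbors that are directly connected.

    `adj` maps each node to the set of its neighbors (undirected graph).
    Nodes with fewer than two neighbors get 0.0 by convention.
    """
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    # Count edges that exist among the neighbors themselves.
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


# Hypothetical team graph: alice, bob and carol all know each other,
# while dave only knows alice.
team = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "carol"},
    "carol": {"alice", "bob"},
    "dave": {"alice"},
}
scores = {member: local_clustering(team, member) for member in team}
```

Here bob and carol each score 1.0 because their two acquaintances know each other, alice scores 1/3 because only one of her three neighbor pairs is connected, and dave scores 0.0.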