Showing papers by "Wang-Chien Lee" published in 2009

Proceedings Article

Ken C. K. Lee¹, Wang-Chien Lee¹, Baihua Zheng²•Institutions (2)

Pennsylvania State University¹, Singapore Management University²

24 Mar 2009

TL;DR: This paper presents the Rnet hierarchy and several properties useful for constructing it; experimental results show the superiority of ROAD over the state-of-the-art approaches.

Abstract: In this paper, we present ROAD, a general framework to evaluate Location-Dependent Spatial Queries (LDSQs) that search for spatial objects on road networks. By exploiting a search space pruning technique and providing a dynamic object mapping mechanism, ROAD is very efficient and flexible for various types of queries, namely, range search and nearest neighbor search, on objects over large-scale networks. ROAD is named after its two components, namely, Route Overlay and Association Directory, designed to address the network traversal and object access aspects of the framework. In ROAD, a large road network is organized as a hierarchy of interconnected regional sub-networks (called Rnets) augmented with 1) shortcuts for accelerating network traversals; and 2) object abstracts for guiding traversals. In this paper, we present (i) the Rnet hierarchy and several properties useful for constructing it, (ii) the design and implementation of the ROAD framework, (iii) efficient object search algorithms for various queries, and (iv) incremental update techniques for framework maintenance in the presence of object and network changes. We conducted extensive experiments with real road networks to evaluate ROAD. The experimental results show the superiority of ROAD over the state-of-the-art approaches.

53 citations

Journal Article

A distributed spatial index for error-prone wireless data broadcast

Baihua Zheng¹, Wang-Chien Lee², Ken C. Lee², Dik Lun Lee³, Min Shao²•Institutions (3)

Singapore Management University¹, Pennsylvania State University², Hong Kong University of Science and Technology³

01 Aug 2009

TL;DR: DSI is very resilient to the error-prone wireless communication environment because interrupted search operations based on DSI can be resumed easily, and it supports search algorithms for classical location-based queries such as window queries and kNN queries in both snapshot and continuous query modes.

Abstract: Information is valuable to users when it is available not only at the right time but also at the right place. To support efficient location-based data access in wireless data broadcast systems, a distributed spatial index (called DSI) is presented in this paper. DSI is highly efficient because it has a linear yet fully distributed structure that naturally shares links in different search paths. DSI is very resilient to the error-prone wireless communication environment because interrupted search operations based on DSI can be resumed easily. It supports search algorithms for classical location-based queries such as window queries and kNN queries in both snapshot and continuous query modes. In-depth analysis and simulation-based evaluation have been conducted. The results show that DSI significantly outperforms a variant of R-trees tailored for wireless data broadcast environments.

52 citations

Journal Article

Visible Reverse k-Nearest Neighbor Query Processing in Spatial Databases

Yunjun Gao, Baihua Zheng, Gencai Chen¹, Wang-Chien Lee², Ken C. K. Lee², Qing Li³•Institutions (3)

Zhejiang University¹, Pennsylvania State University², City University of Hong Kong³

01 Sep 2009-IEEE Transactions on Knowledge and Data Engineering

TL;DR: This paper proposes an efficient algorithm for VRNN query processing, and extends the solution to several variations of VRNN queries, including visible reverse k-nearest neighbor (VRkNN) search, which finds the points in P that have q as one of their k visible nearest neighbors.

Abstract: Reverse nearest neighbor (RNN) queries have a broad application base such as decision support, profile-based marketing, resource allocation, etc. Previous work on RNN search does not take obstacles into consideration. In the real world, however, there are many physical obstacles (e.g., buildings) and their presence may affect the visibility between objects. In this paper, we introduce a novel variant of RNN queries, namely, visible reverse nearest neighbor (VRNN) search, which considers the impact of obstacles on the visibility of objects. Given a data set P, an obstacle set O, and a query point q in a 2D space, a VRNN query retrieves the points in P that have q as their visible nearest neighbor. We propose an efficient algorithm for VRNN query processing, assuming that P and O are indexed by R-trees. Our techniques do not require any preprocessing and employ half-plane property and visibility check to prune the search space. In addition, we extend our solution to several variations of VRNN queries, including: 1) visible reverse k-nearest neighbor (VRkNN) search, which finds the points in P that have q as one of their k visible nearest neighbors; 2) delta-VRkNN search, which handles VRkNN retrieval with the maximum visible distance delta constraint; and 3) constrained VRkNN (CVRkNN) search, which tackles the VRkNN query with region constraint. Extensive experiments on both real and synthetic data sets have been conducted to demonstrate the efficiency and effectiveness of our proposed algorithms under various experimental settings.
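
The VRNN definition in the abstract above can be illustrated with a brute-force sketch. This is not the R-tree-based algorithm the paper proposes; it only restates the definition as code. Modelling obstacles as line segments, and the names `visible` and `vrnn`, are assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]  # assumption: each obstacle is a single wall segment


def _orient(a: Point, b: Point, c: Point) -> float:
    """Signed area test: > 0 if c lies to the left of the directed line a -> b."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _crosses(p1: Point, p2: Point, q1: Point, q2: Point) -> bool:
    """True if segment p1-p2 strictly crosses q1-q2 (touching is ignored)."""
    return (_orient(q1, q2, p1) * _orient(q1, q2, p2) < 0
            and _orient(p1, p2, q1) * _orient(p1, p2, q2) < 0)


def visible(a: Point, b: Point, obstacles: List[Segment]) -> bool:
    """a and b see each other if no obstacle crosses the segment a-b."""
    return not any(_crosses(a, b, o1, o2) for o1, o2 in obstacles)


def vrnn(points: Dict[str, Point], q: Point, obstacles: List[Segment]) -> List[str]:
    """Names of points whose nearest *visible* neighbor among P and q is q."""
    result = []
    for name, p in points.items():
        if not visible(p, q, obstacles):
            continue  # q cannot be the visible NN of a point that cannot see it
        d_q = math.dist(p, q)
        rivals = (o for other, o in points.items()
                  if other != name and visible(p, o, obstacles))
        if all(math.dist(p, o) >= d_q for o in rivals):
            result.append(name)
    return result


P = {"a": (-2.0, 0.0), "b": (3.0, 0.0), "c": (4.0, 0.0)}
wall = [((3.5, -1.0), (3.5, 1.0))]  # hides c from b, and q from c
print(vrnn(P, (0.0, 0.0), wall))    # with the wall
print(vrnn(P, (0.0, 0.0), []))      # without obstacles: plain RNN
```

In this toy data set the wall changes the answer: without it, b's nearest neighbor is c, but once c is hidden from b, the query point becomes b's visible nearest neighbor.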

50 citations

Proceedings Article

Continuous visible nearest neighbor queries

Yunjun Gao¹, Baihua Zheng¹, Wang-Chien Lee², Gencai Chen³•Institutions (3)

Singapore Management University¹, Pennsylvania State University², Zhejiang University³

24 Mar 2009

TL;DR: This paper formulates the problem and proposes efficient algorithms for CVNN query processing, assuming that both P and O are indexed by R-trees, and extends the techniques to several variations of the CVNN query.

Abstract: In this paper, we identify and solve a new type of spatial query, called continuous visible nearest neighbor (CVNN) search. Given a data set P, an obstacle set O, and a query line segment q, a CVNN query returns a set of (p, R) tuples such that p ∈ P is the nearest neighbor (NN) to every point r along the interval R ⊆ q and p is visible to r. Note that p may be NULL, meaning that all points in P are invisible to all points in R, due to the obstruction of some obstacles in O. In this paper, we formulate the problem and propose efficient algorithms for CVNN query processing, assuming that both P and O are indexed by R-trees. In addition, we extend our techniques to several variations of the CVNN query. Extensive experiments verify the efficiency and effectiveness of our proposed algorithms using both real and synthetic datasets.

45 citations

Journal Article

111-Gb/s Transmission Over 1040-km Field-Deployed Fiber With 10G/40G Neighbors

M. S. Alfiad¹, Maxim Kuschnerov, T. Wuth², T. J. Xia³, Glenn A. Wellbrock³, E.-D. Schmidt⁴, D. van den Borne⁴, Bernhard Spinnler⁴, C.-J. Weiske⁴, E. de Man⁴, Antonio Napoli⁴, M. Finkenzeller⁴, Stefan Spaelter⁴, M. Rehman⁴, J. Behel⁴, M. Chbat⁴, J. Stachowiak⁴, D. Peterson³, Wang-Chien Lee³, M. Pollock³, B. Basch³, David Z. Chen³, M. Freiberger³, Berthold Lankl, H. de Waardt¹•Institutions (4)

Eindhoven University of Technology¹, Siemens², Verizon Communications³, Nokia Networks⁴

15 May 2009-IEEE Photonics Technology Letters

TL;DR: In this paper, the authors demonstrate transmission of a 111-Gb/s coherent polarization-multiplexed return-to-zero differential quadrature phase-shift keying signal over 1040-km field-deployed fiber together with different types of neighboring channels, and with a cascade of 50-GHz reconfigurable optical add-drop multiplexers.

Abstract: We demonstrate transmission of a 111-Gb/s coherent polarization-multiplexed return-to-zero differential quadrature phase-shift keying signal over 1040-km field-deployed fiber together with different types of neighboring channels, and with a cascade of 50-GHz reconfigurable optical add-drop multiplexers. Our transmission experiment proves the feasibility of transmitting a 111-Gb/s phase-modulated channel with 10 × 10.7-Gb/s on-off keying neighboring channels on a 50-GHz grid, despite the presence of strong cross-phase modulation.

44 citations

Proceedings Article

Finding skyline paths in road networks

Yuan Tian¹, Ken C. K. Lee¹, Wang-Chien Lee¹•Institutions (1)

Pennsylvania State University¹

04 Nov 2009

TL;DR: To narrow down the search scope for result skyline paths, partial dominance test and full path dominance test are devised as two components of SkyPath.

Abstract: This paper presents a research study on skyline path queries. Given a source s and a destination d on a road network and multiple path search criteria (e.g., short distance and short travel time), a skyline query returns a set of non-dominated paths from s to d. These non-dominated paths are called skyline paths. Efficient computation of skyline path queries is very challenging due to expensive network traversals and extensive path comparisons in dominance tests. In this paper, we explore the characteristics of skyline paths, based on which a novel skyline path search algorithm called SkyPath is proposed. To narrow down the search scope for result skyline paths, partial dominance test and full path dominance test are devised as two components of SkyPath. Evaluation results show the superiority of the SkyPath algorithm over the state-of-the-art approaches.
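
The notion of non-dominated paths used in the abstract above can be sketched as a plain Pareto filter over path cost vectors. This is a brute-force illustration of dominance only, not the SkyPath algorithm or its partial and full path dominance tests; the path names and cost values are hypothetical.

```python
from typing import Dict, List, Tuple

Cost = Tuple[float, ...]  # one value per criterion, e.g. (distance_km, travel_time_min)


def dominates(a: Cost, b: Cost) -> bool:
    """a dominates b if it is no worse on every criterion and better on at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(paths: Dict[str, Cost]) -> List[str]:
    """Names of the paths that no other candidate path dominates."""
    return [
        name
        for name, cost in paths.items()
        if not any(dominates(other, cost) for other in paths.values())
    ]


# Hypothetical candidate paths from s to d.
candidates = {
    "highway": (12.0, 10.0),   # longer but faster
    "downtown": (8.0, 18.0),   # shorter but slower
    "detour": (13.0, 19.0),    # worse than "highway" on both criteria
}
print(skyline(candidates))  # ['highway', 'downtown']
```

Comparing every pair of complete paths like this is exactly the cost the abstract calls challenging, which is why pruning during network traversal matters.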

38 citations

Proceedings Article

Processing probabilistic spatio-temporal range queries over moving objects with uncertainty

Bruce S. E. Chung¹, Wang-Chien Lee², Arbee L. P. Chen³•Institutions (3)

National Tsing Hua University¹, Pennsylvania State University², National Chengchi University³

24 Mar 2009

TL;DR: This paper studies the problem of answering probabilistic range queries on moving objects based on an uncertainty model, which captures the possible movements of objects with probabilities and maps the uncertain movements of all objects to a dual space for indexing.

Abstract: Range queries for querying the current and future positions of moving objects have received growing interest in the research community. Existing methods, however, assume that an object only moves along an anticipated path. In this paper, we study the problem of answering probabilistic range queries on moving objects based on an uncertainty model, which captures the possible movements of objects with probabilities. Evaluation of probabilistic queries is challenging due to large object volumes and costly computation. We map the uncertain movements of all objects to a dual space for indexing. By querying the index, we quickly eliminate unqualified objects and employ an approximate approach to examine the remaining candidates for the final answer. We conduct a comprehensive performance study, which shows that our proposal significantly reduces the number of object examinations and the overall cost of the query evaluation.

37 citations

Proceedings Article

Visible Reverse k-Nearest Neighbor Queries

Yunjun Gao¹, Baihua Zheng¹, Gencai Chen², Wang-Chien Lee³, Ken C. K. Lee³, Qing Li⁴•Institutions (4)

Singapore Management University¹, Zhejiang University², Pennsylvania State University³, City University of Hong Kong⁴

29 Mar 2009

TL;DR: This paper introduces a novel variant of RNN queries, namely visible reverse nearest neighbor (VRNN) search, which considers the obstacle influence on the visibility of objects and proposes an efficient algorithm for VRNN query processing.

Abstract: Reverse nearest neighbor (RNN) queries have a broad application base such as decision support, profile-based marketing, resource allocation, data mining, etc. Previous work on RNN search does not take obstacles into consideration. In the real world, however, there are many physical obstacles (e.g., buildings, blindages, etc.), and their presence may affect the visibility/distance between two objects. In this paper, we introduce a novel variant of RNN queries, namely visible reverse nearest neighbor (VRNN) search, which considers the obstacle influence on the visibility of objects. Given a data set P, an obstacle set O, and a query point q, a VRNN query retrieves the points in P that have q as their nearest neighbor and are visible to q. We propose an efficient algorithm for VRNN query processing, assuming that both P and O are indexed by R-trees. Our method does not require any pre-processing, and employs half-plane property and visibility check to prune the search space.

34 citations

Proceedings Article

A probabilistic topic-based ranking framework for location-sensitive domain information retrieval

Huajing Li¹, Zhisheng Li, Wang-Chien Lee¹, Dik Lun Lee²•Institutions (2)

Pennsylvania State University¹, Hong Kong University of Science and Technology²

19 Jul 2009

TL;DR: The proposed method recognizes the geographical distribution of topic influence in the process of ranking documents and models it accurately using probabilistic Gaussian Process classifiers and works significantly better than other popular location-aware information retrieval techniques in ranking quality.

Abstract: It has been observed that many queries submitted to search engines are location-sensitive. Traditional search techniques fail to interpret the significance of such geographical clues and as such are unable to return highly relevant search results. Although there have been efforts in the literature to support location-aware information retrieval, critical challenges still remain in terms of search result quality and data scalability. In this paper, we propose an innovative probabilistic ranking framework for domain information retrieval where users are interested in a set of location-sensitive topics. Our proposed method recognizes the geographical distribution of topic influence in the process of ranking documents and models it accurately using probabilistic Gaussian Process classifiers. Additionally, we demonstrate the effectiveness of the proposed ranking framework by implementing it in a Web search service for NBA news. Extensive performance evaluation is performed on real Web document collections, which confirms that our proposed mechanism works significantly better (by around 29.7% on average under the DCG20 measure) than other popular location-aware information retrieval techniques in ranking quality.
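
The DCG20 measure mentioned in the abstract above is discounted cumulative gain over the top 20 results. The abstract does not say which DCG variant the paper uses, so the sketch below assumes one common formulation, the sum of rel_i / log2(i + 1) over ranks i = 1..k. The relevance grades in the example are made up.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int = 20) -> float:
    """DCG over the top-k results, with ranks starting at 1.

    Assumed variant: rel_i / log2(i + 1); other variants use 2**rel_i - 1 as gain.
    """
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


# Hypothetical graded relevance of two rankings of the same three documents.
baseline = [1, 0, 2]   # the most relevant document is ranked last
reordered = [2, 1, 0]  # the most relevant document is ranked first
print(dcg_at_k(baseline), dcg_at_k(reordered))
```

Because the discount grows with rank, moving relevant documents toward the top raises the score even when the set of returned documents is unchanged.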

32 citations

Proceedings Article

Navigational path privacy protection

Ken C. K. Lee¹, Wang-Chien Lee¹, Hong Va Leong², Baihua Zheng³•Institutions (3)

Pennsylvania State University¹, Hong Kong Polytechnic University², Singapore Management University³

02 Nov 2009

TL;DR: An obfuscator framework is presented that reduces the likelihood of path queries being revealed, while supporting different user privacy protection needs and retaining query evaluation efficiency, and enhancing privacy protection against collusion attacks.

Abstract: Navigational path query, one of the most popular location-based services (LBSs), determines a route from a source to a destination on a road network. However, issuing path queries to some non-trustworthy service providers may pose privacy threats to the users. For instance, given a query requesting a path from a residential address to a psychiatrist, some adversaries may deduce "who is related to what disease". In this paper, we present an obfuscator framework that reduces the likelihood of path queries being revealed, while supporting different user privacy protection needs and retaining query evaluation efficiency. The framework consists of two major components, namely, an obfuscator and an obfuscated path query processor. The former formulates obfuscated path queries by intermixing true and fake sources and destinations, and the latter facilitates efficient evaluation of the obfuscated path queries in an LBS server. The framework supports three types of obfuscated path queries, namely, independent obfuscated path query, shared obfuscated path query, and anti-collusion obfuscated path query. Our proposal strikes a balance between privacy protection strength and query processing overheads, while enhancing privacy protection against collusion attacks. Finally, we validate the proposed ideas and evaluate the performance of our framework based on an extensive set of empirical experiments.
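
The idea of intermixing true and fake endpoints, described in the abstract above, can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's obfuscator or query processor: how fake endpoints are chosen, the all-pairs evaluation on the server, and the names `obfuscate`, `evaluate` and `toy_shortest_path` are all hypothetical.

```python
import random
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

Path = List[str]


def obfuscate(source: str, destination: str,
              fake_sources: Sequence[str], fake_destinations: Sequence[str],
              rng: random.Random) -> Tuple[List[str], List[str]]:
    """Client side: hide the true endpoints among fake ones."""
    sources = [source, *fake_sources]
    destinations = [destination, *fake_destinations]
    rng.shuffle(sources)       # position must not reveal which endpoint is real
    rng.shuffle(destinations)
    return sources, destinations


def evaluate(sources: Sequence[str], destinations: Sequence[str],
             shortest_path: Callable[[str, str], Path]) -> Dict[Tuple[str, str], Path]:
    """Server side (assumed behavior): answer every source/destination pair."""
    return {(s, d): shortest_path(s, d) for s, d in product(sources, destinations)}


def toy_shortest_path(s: str, d: str) -> Path:
    """Placeholder for a real road-network search."""
    return [s, d]


rng = random.Random(0)
S, T = obfuscate("home", "clinic", ["mall", "school"], ["library", "gym"], rng)
answers = evaluate(S, T, toy_shortest_path)
true_path = answers[("home", "clinic")]  # the client keeps only the pair it wanted
print(len(answers), true_path)
```

With three sources and three destinations the server sees nine candidate pairs, which illustrates the trade-off the abstract mentions: more fake endpoints give stronger protection but more query processing work.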

30 citations

Journal Article

KTR: An Efficient Key Management Scheme for Secure Data Access Control in Wireless Broadcast Services

Qijun Gu¹, Peng Liu², Wang-Chien Lee, Chao-Hsien Chu²•Institutions (2)

Texas State University¹, Penn State College of Information Sciences and Technology²

01 Jul 2009-IEEE Transactions on Dependable and Secure Computing

TL;DR: This paper proposes an efficient key management scheme, namely, key tree reuse (KTR), to handle key distribution with regard to complex subscription options and user activities and shows that KTR can save about 45 percent of communication overhead in the broadcast channel and about 50 percent of decryption cost for each user compared with logical-key-hierarchy-based approaches.

Abstract: Wireless broadcast is an effective approach for disseminating data to a number of users. To provide secure access to data in wireless broadcast services, symmetric-key-based encryption is used to ensure that only users who own the valid keys can decrypt the data. With regard to various subscriptions, an efficient key management for distributing and changing keys is in great demand for access control in broadcast services. In this paper, we propose an efficient key management scheme, namely, key tree reuse (KTR), to handle key distribution with regard to complex subscription options and user activities. KTR has the following advantages. First, it supports all subscription activities in wireless broadcast services. Second, in KTR, a user only needs to hold one set of keys for all subscribed programs instead of separate sets of keys for each program. Third, KTR identifies the minimum set of keys that must be changed to ensure broadcast security and minimize the rekey cost. Our simulations show that KTR can save about 45 percent of communication overhead in the broadcast channel and about 50 percent of decryption cost for each user compared with logical-key-hierarchy-based approaches.

Journal Article

Tuning On-Air Signatures for Balancing Performance and Confidentiality

Baihua Zheng, Wang-Chien Lee¹, Peng Liu¹, Dik Lun Lee², Xuhua Ding•Institutions (2)

Pennsylvania State University¹, Hong Kong University of Science and Technology²

01 Dec 2009-IEEE Transactions on Knowledge and Data Engineering

TL;DR: An analysis of the trade-off between performance and confidentiality in signature-based air indexing schemes for wireless data broadcast reveals that false drop probability and false guess probability share a similar trend as the tuning parameters of a signature scheme change, and that it is impossible to achieve a low false drop probability and a high false guess probability simultaneously.

Abstract: In this paper, we investigate the trade-off between performance and confidentiality in signature-based air indexing schemes for wireless data broadcast. Two metrics, namely, false drop probability and false guess probability, are defined to quantify the filtering efficiency and confidentiality loss of a signature scheme. Our analysis reveals that false drop probability and false guess probability share a similar trend as the tuning parameters of a signature scheme change and it is impossible to achieve a low false drop probability and a high false guess probability simultaneously. In order to balance the performance and confidentiality, we perform an analysis to provide guidance for parameter settings of the signature schemes to meet different system requirements. In addition, we propose the jump pointer technique and the XOR signature scheme to further improve the performance and confidentiality. A comprehensive simulation has been conducted to validate our findings.

Proceedings Article

m-LIGHT: Indexing Multi-Dimensional Data over DHTs

Yuzhe Tang¹, Jianliang Xu², Shuigeng Zhou¹, Wang-Chien Lee³•Institutions (3)

Fudan University¹, Hong Kong Baptist University², Pennsylvania State University³

22 Jun 2009

TL;DR: Compared to the state-of-the-art indexing schemes, m-LIGHT substantially saves the index maintenance overhead, achieves a more balanced load distribution, and improves the range query performance in both bandwidth consumption and response latency.

Abstract: In this paper, we study the problem of indexing multidimensional data in P2P networks based on distributed hash tables (DHTs). We identify several design issues and propose a novel over-DHT indexing scheme called m-LIGHT. To preserve data locality, m-LIGHT employs a clever naming mechanism that gracefully maps the index tree into the underlying DHT so that it achieves efficient index maintenance and query processing. Moreover, m-LIGHT leverages a new data-aware index splitting strategy to achieve optimal load balance among peer nodes. We conduct an extensive performance evaluation for m-LIGHT. Compared to the state-of-the-art indexing schemes, m-LIGHT substantially saves the index maintenance overhead, achieves a more balanced load distribution, and improves the range query performance in both bandwidth consumption and response latency.

Proceedings Article

Monitoring minimum cost paths on road networks

Yuan Tian¹, Ken C. K. Lee¹, Wang-Chien Lee¹•Institutions (1)

Pennsylvania State University¹

04 Nov 2009

TL;DR: The notion of query scope is introduced, based on which a query scope index (QSI) is developed to identify affected path queries and a partial path computation algorithm (PPCA) is devised to quickly recompute the updated paths.

Abstract: On a road network, the minimum cost path (or min-cost path for short) from a source location to a destination is a path with the smallest travel cost among all possible paths. Although min-cost path queries on static networks have been well studied, the problem of monitoring min-cost paths on a road network in the presence of updates is not fully explored. In this paper, we present PathMon, an efficient system for monitoring min-cost paths in dynamic road networks. PathMon addresses two important issues of the min-cost path monitoring problem, namely, (i) path invalidation that identifies min-cost paths returned to path queries affected by network changes, and (ii) path update that replaces invalid paths with new ones for those affected path queries. For (i), we introduce the notion of query scope, based on which a query scope index (QSI) is developed to identify affected path queries. For (ii), we devise a partial path computation algorithm (PPCA) to quickly recompute the updated paths. Through a comprehensive performance evaluation by simulation, QSI and PPCA are demonstrated to be effective on the path invalidation and path update issues.

Proceedings Article

92-Gb/s field trial with ultra-high PMD tolerance of 107-ps DGD

T. J. Xia¹, Glenn A. Wellbrock¹, Michael D. Pollock¹, Wang-Chien Lee¹, Daniel L. Peterson¹, D. Doucet², John Sitch², K. Ghazian², P. Bryan², P. Rochon²•Institutions (2)

Verizon Communications¹, Nortel²

22 Mar 2009

TL;DR: In this paper, a field trial shows that a 92-Gb/s DS-DP-QPSK channel has much higher PMD tolerance than an ordinary 10.7-Gb/s OOK channel, allowing 100G transport over fiber that could not be used even for 10G.

Abstract: A field trial shows a 92-Gb/s DS-DP-QPSK channel has much higher PMD tolerance than an ordinary 10.7-Gb/s OOK channel, allowing 100G transport over fiber that could not be used even for 10G.

Proceedings Article

OPAQUE: Protecting Path Privacy in Directions Search

Ken C. K. Lee¹, Wang-Chien Lee¹, Hong Va Leong, Baihua Zheng²•Institutions (2)

Pennsylvania State University¹, Singapore Management University²

29 Mar 2009

TL;DR: To protect user privacy when accessing directions search services, the OPAQUE system is introduced, which consists of an obfuscator that formulates obfuscated path queries by mixing true and fake sources/destinations and an obfuscated path query processor installed in the server for obfuscated path query processing.

Abstract: Directions search returns the shortest path from a source to a destination on a road network. However, the search interests of users may be exposed to the service providers, thus raising privacy concerns. For instance, a path query that finds a path from a residential address to a clinic may lead to a deduction about "who is related to what disease". To protect user privacy when accessing directions search services, we introduce the OPAQUE system, which consists of two major components: (1) an obfuscator that formulates obfuscated path queries by mixing true and fake sources/destinations; and (2) an obfuscated path query processor installed in the server for obfuscated path query processing. OPAQUE reduces the likelihood of path queries being revealed and allows retrieval of requested paths. We propose two types of obfuscated path queries, namely, independently obfuscated path query and shared obfuscated path query to strike a balance between privacy protection strength and query processing overhead, and to enhance privacy protection against collusion attacks.

Proceedings Article

Efficient valid scope computation for location-dependent spatial queries in mobile and wireless environments

Ken C. K. Lee¹, Wang-Chien Lee², Hong Va Leong¹, Brandon Unger², Baihua Zheng³•Institutions (3)

Hong Kong Polytechnic University¹, Pennsylvania State University², Singapore Management University³

15 Feb 2009

TL;DR: This paper designs efficient algorithms to compute the valid scope for common types of LDSQs, including nearest neighbor queries and range queries, and shows how contention on wireless channel and client energy consumed for data transmission can be considerably reduced.

Abstract: In mobile and wireless environments, mobile clients can access information with respect to their locations by submitting Location-Dependent Spatial Queries (LDSQs) to Location-Based Service (LBS) servers. Owing to scarce wireless channel bandwidth and limited client battery life, frequent LDSQ submission from clients must be avoided. Observing that LDSQs issued from similar client positions would normally return the same results, we explore the idea of valid scope, which represents a spatial area in which a set of LDSQs will retrieve exactly the same query results. With a valid scope derived and an LDSQ result cached at the client side, a client can assert whether new LDSQs can be answered with the maintained LDSQ result, thus eliminating the LDSQs sent to the server. As such, contention on the wireless channel and client energy consumed for data transmission can be considerably reduced. In this paper, we design efficient algorithms to compute the valid scope for common types of LDSQs, including nearest neighbor queries and range queries. Through an extensive set of experiments, our proposed valid scope computation algorithms are shown to significantly outperform existing approaches.
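
The client-side use of a valid scope described in the abstract above can be sketched as a cache check. The abstract does not say how a valid scope is represented, so the sketch assumes a convex polygon with vertices in counter-clockwise order. `CachedResult`, `in_scope` and `answer_locally` are hypothetical names, and the scope computation on the server, which is the paper's actual contribution, is not shown.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class CachedResult:
    result: List[str]    # objects returned by the last LDSQ
    scope: List[Point]   # assumed: convex polygon, counter-clockwise vertices


def in_scope(p: Point, scope: List[Point]) -> bool:
    """True if p lies inside or on the boundary of the convex polygon."""
    n = len(scope)
    for i in range(n):
        (x1, y1), (x2, y2) = scope[i], scope[(i + 1) % n]
        cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
        if cross < 0:    # p is to the right of this edge, so it is outside
            return False
    return True


def answer_locally(p: Point, cache: CachedResult) -> Optional[List[str]]:
    """Reuse the cached result inside the valid scope; None means ask the server."""
    return cache.result if in_scope(p, cache.scope) else None


cache = CachedResult(result=["cafe_17"],
                     scope=[(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)])
print(answer_locally((1.0, 1.0), cache))  # still inside the scope
print(answer_locally((5.0, 1.0), cache))  # outside: a new LDSQ is needed
```

Only the second position would trigger a query over the wireless channel, which is the saving in bandwidth and client energy that the abstract describes.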

Proceedings Article

An analytical study of GWAP-based geospatial tagging systems

Ling-Jyh Chen¹, Yu-Song Syu¹, Bo-Chun Wang¹, Wang-Chien Lee²•Institutions (2)

Academia Sinica¹, Pennsylvania State University²

28 Dec 2009

TL;DR: This study designs three metrics to evaluate the system performance, develops five task assignment algorithms for a GWAP-based geotagging system, and finds that the Least-Throughput-First Assignment algorithm (LTFA) is the most effective approach because it can achieve competitive system utility, while its computational complexity remains moderate.

Abstract: Geospatial tagging (geotagging) is an emerging and very promising application that can help users find a wide variety of location-specific information, and facilitate the development of future location-based services. Conventional geotagging systems share some limitations, such as the use of a two-phase operating model and the tendency to tag popular objects with simple contexts. To address these problems, geotagging systems based on the concept of ‘Games with a Purpose’ (GWAP) have been developed recently. In this study, we use analysis to investigate these new systems. Based on our analysis results, we design three metrics to evaluate the system performance, and develop five task assignment algorithms for a GWAP-based system. Using a comprehensive set of simulations under both synthetic and realistic mobility scenarios, we find that the Least-Throughput-First Assignment algorithm (LTFA) is the most effective approach because it can achieve competitive system utility, while its computational complexity remains moderate. We also find that, to improve the system utility, it is better to assign as many tasks as possible in each round. However, because players may feel annoyed if too many tasks are assigned at the same time, it is recommended that multiple tasks be assigned one by one in each round in order to achieve higher system utility.

Journal Article

Editorial: Special Section: Scalable information systems

Wang-Chien Lee¹, Jianliang Xu², Jianzhong Li³, Fabrizio Silvestri•Institutions (3)

Pennsylvania State University¹, Hong Kong Baptist University², Harbin Institute of Technology³

01 Jan 2009-Future Generation Computer Systems

Proceedings Article

SolutionFinder: Intelligent Knowledge Integration and Dissemination for Solution Retrieval in IT Support Services

Huajing Li¹, Maja Vukovic², Gopal Pingali², Wang-Chien Lee¹•Institutions (2)

Pennsylvania State University¹, IBM²

21 Sep 2009

TL;DR: SolutionFinder is presented, an autonomous framework, which dynamically integrates online resources to enrich the knowledge base for IT support systems and provides context-aware search support to remove the textual ambiguity embedded in user queries.

Abstract: Online support centers are emerging as a cost-effective and innovative solution designed to enable end-users to resolve technical problems more effectively without relying on live support from contact center agents. However, the capacity limitation of corporate knowledge bases prevents online support centers from effectively resolving user problems. In addition, traditional textual search techniques employed by most online support centers fall short from accurately interpreting user queries due to the ambiguity of user requests and the heterogeneity of technical problems. In this paper, we present SolutionFinder, an autonomous framework, which dynamically integrates online resources to enrich the knowledge base for IT support systems. SolutionFinder provides context-aware search support to remove the textual ambiguity embedded in user queries. Furthermore, SolutionFinder transforms solution documents into solution paths to analyze their similarity to provide high-quality solution recommendations. Evaluation results suggest by leveraging our proposed algorithms, a support service can accurately locate Web solution resources and provide high-quality services.

Book Chapter•DOI•

Clustering Data in Peer-to-Peer Systems

Mei Li¹, Wang-Chien Lee²•Institutions (2)

Microsoft¹, Pennsylvania State University²

01 Jan 2009

TL;DR: This chapter focuses on clustering, one of the most important data mining tasks, in P2P systems, and outlines the challenges and reviews the state of the art in this area.

Abstract: With the advances in network communication, many large-scale network systems have emerged. Peer-to-peer (P2P) systems, where a large number of nodes self-form into a dynamic information sharing system, are good examples. It is extremely valuable for many P2P applications, such as market analysis, scientific exploration, and smart query answering, to discover the knowledge hidden in this distributed data repository. In this chapter, we focus on clustering, one of the most important data mining tasks, in P2P systems. We outline the challenges and review the state of the art in this area. Clustering is a data mining technique that groups a set of data objects into classes of similar data objects. Data objects within the same class are similar to each other, while data objects across classes are considered dissimilar. Clustering has a wide range of applications, e.g., pattern recognition, spatial data analysis, customer/market analysis, document classification, and access pattern discovery on the WWW. The data mining community has been intensively studying clustering techniques for the last decade. As a result, various clustering algorithms have been proposed. The majority of these algorithms are designed for traditional centralized systems where all data to be clustered resides in (or is transferred to) a central site. However, it is not desirable to transfer all the data from widely spread data sources to a centralized server for clustering in P2P systems, for three reasons: 1) there is no central control in P2P systems; 2) transferring all data objects to a central site would incur excessive communication overhead; and 3) participants of P2P systems reside in a collaborating yet competing environment, and thus may prefer to expose as little information as possible to other peers. In addition, these existing algorithms are designed to minimize disk access cost, whereas in P2P systems the communication cost is the dominating factor. Therefore, we need to reexamine the problem of clustering in P2P systems. A general idea for clustering in P2P systems is to first cluster the local data objects at each peer and then combine the local clustering results to form a global clustering result. Based on this general idea, clustering in P2P systems essentially consists of two steps, i.e., local clustering and cluster assembly. While local clustering can be done by employing existing clustering techniques, cluster assembly is a nontrivial issue, which concerns the representation model (what should be communicated among peers) and the communication model (how peers communicate with each other). In this chapter, we review three representation models (two approximate representation models and an exact representation model) and three communication models (a flooding-based communication model, a centralized communication model, and a hierarchical communication model). The rest of this chapter is organized as follows. In the next section, we provide some background knowledge on P2P systems and clustering techniques. The details of the representation models and communication models are presented in Section 3. We discuss future trends and draw conclusions in Section 4 and Section 5, respectively.
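
The two steps named in the abstract, local clustering and cluster assembly, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: each peer runs a simple greedy "leader" clustering with a hypothetical `radius` threshold, the representation communicated between peers is an approximate summary (a centroid plus a point count, the hypothetical `ClusterSummary`), and a single coordinator merges the summaries, which loosely corresponds to a centralized communication model.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


@dataclass
class ClusterSummary:
    """Hypothetical approximate representation: centroid plus point count."""
    centroid: Point
    count: int


def _merge_into(summaries: List[ClusterSummary], centroid: Point,
                count: int, radius: float) -> None:
    """Fold a weighted centroid into the first summary within `radius`,
    or start a new summary if none is close enough."""
    for s in summaries:
        if math.dist(s.centroid, centroid) <= radius:
            total = s.count + count
            # Count-weighted average keeps the centroid consistent
            # with all points absorbed so far.
            s.centroid = (
                (s.centroid[0] * s.count + centroid[0] * count) / total,
                (s.centroid[1] * s.count + centroid[1] * count) / total,
            )
            s.count = total
            return
    summaries.append(ClusterSummary(centroid, count))


def local_clustering(points: Sequence[Point], radius: float) -> List[ClusterSummary]:
    """Step 1, run at each peer: cluster only the peer's own points."""
    summaries: List[ClusterSummary] = []
    for p in points:
        _merge_into(summaries, p, 1, radius)
    return summaries


def cluster_assembly(per_peer: Sequence[Sequence[ClusterSummary]],
                     radius: float) -> List[ClusterSummary]:
    """Step 2, run at a coordinator: combine the summaries from all peers.
    Only summaries are exchanged, never the raw data objects."""
    merged: List[ClusterSummary] = []
    for summaries in per_peer:
        for s in summaries:
            _merge_into(merged, s.centroid, s.count, radius)
    return merged


# Two peers, each holding part of two well-separated groups of points.
peer_a = [(0.0, 0.0), (0.2, 0.0), (10.0, 10.0)]
peer_b = [(0.1, 0.1), (10.2, 10.0)]
local_results = [local_clustering(peer, radius=1.0) for peer in (peer_a, peer_b)]
global_clusters = cluster_assembly(local_results, radius=1.0)
```

In this sketch each peer sends one small summary per local cluster instead of its raw points, which reflects the abstract's concern with communication cost and limited information exposure. The result depends on the order in which points and summaries are processed, so it is only an approximation of clustering the pooled data.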
