
Showing papers presented at "International Conference on Peer-to-Peer Computing in 2011"


Proceedings ArticleDOI
10 Oct 2011
TL;DR: This paper tackles the research question: is it possible to build a decentralized OSN over a social overlay, i.e., an overlay network whose links among nodes mirror the social network relationships among the nodes' owners?
Abstract: Online social networks (OSN) have attracted millions of users worldwide. This enormous success is not without problems; the centralized architectures of OSNs, storing the users' personal data, provide ample opportunity for privacy violation — a fact that has raised the demand for open, decentralized alternatives. We tackle the research question: is it possible to build a decentralized OSN over a social overlay, i.e., an overlay network whose links among nodes mirror the social network relationships among the nodes' owners? This paper provides a stepping stone to the answer, by focusing on the key OSN functionality of disseminating profile updates. Our approach relies on gossip protocols. We show that mainstream gossip protocols are inefficient, due to the properties that characterize social networks. We then leverage these very same properties towards our goal, by appropriately modifying gossip forwarding rules. Our evaluation, performed in simulation over a crawled real-world social network, shows that our protocols provide acceptable latency, foster load balancing across nodes, and tolerate churn.

72 citations
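The abstract stops at the level of "modified gossip forwarding rules"; as a concrete picture of what push gossip over a social overlay looks like, here is a minimal Python sketch in which updates travel only along friendship links and each peer forwards to a bounded random subset of friends. The Peer/Update classes, the fanout value, and the do-not-return-to-sender rule are illustrative assumptions, not the paper's protocol.

```python
import random
from collections import namedtuple

Update = namedtuple("Update", ["id", "payload"])   # a profile update

class Peer:
    def __init__(self, name):
        self.name = name
        self.friends = []    # links of the social overlay (friend peers)
        self.seen = set()    # ids of updates already received

    def receive(self, update, sender=None, fanout=3):
        """Accept an update once, then push it on to a few social neighbors."""
        if update.id in self.seen:
            return
        self.seen.add(update.id)
        # Forward only along social links, never back to the peer we got it from.
        candidates = [f for f in self.friends if f is not sender]
        for target in random.sample(candidates, min(fanout, len(candidates))):
            target.receive(update, sender=self)
```

Calling alice.receive(Update(1, "new photo")) then spreads the update hop by hop over friendship edges only.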


Proceedings ArticleDOI
10 Oct 2011
TL;DR: It is shown that the problem of obtaining maximal availability while minimizing redundancy is NP-complete; in addition, an exploratory study on data placement strategies is performed, investigating their performance in terms of the redundancy needed and the availability obtained.
Abstract: Friend-to-friend networks, i.e. peer-to-peer networks where data are exchanged and stored solely through nodes owned by trusted users, can guarantee dependability, privacy and uncensorability by exploiting social trust. However, the limitation of storing data only on friends can be detrimental to data availability: if no friends are online, then data stored in the system will not be accessible. In this work, we explore the tradeoffs between redundancy (i.e., how many copies of data are stored on friends), data placement (the choice of which friend nodes to store data on) and data availability (the probability of finding data online). We show that the problem of obtaining maximal availability while minimizing redundancy is NP-complete; in addition, we perform an exploratory study on data placement strategies, and we investigate their performance in terms of redundancy needed and availability obtained. By performing a trace-based evaluation, we show that nodes with as few as 10 friends can already obtain good availability levels.

56 citations
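To make the redundancy/placement/availability trade-off concrete, the sketch below estimates the probability that at least one replica is reachable when copies are placed on friends with independent online probabilities, and greedily adds friends until a target availability is met. The independence assumption and the greedy rule are simplifications for illustration, not the placement strategies studied in the paper.

```python
def availability(probs):
    """P(at least one chosen friend is online), assuming independence."""
    p_all_offline = 1.0
    for p in probs:
        p_all_offline *= (1.0 - p)
    return 1.0 - p_all_offline

def greedy_placement(friend_online_prob, target=0.99):
    """Pick friends in decreasing order of availability until `target` is reached.

    `friend_online_prob` maps friend id -> estimated online probability.
    This greedy rule is only a baseline; the paper studies richer strategies.
    """
    chosen, probs = [], []
    for friend, p in sorted(friend_online_prob.items(), key=lambda kv: -kv[1]):
        chosen.append(friend)
        probs.append(p)
        if availability(probs) >= target:
            break
    return chosen, availability(probs)

# Example: 10 friends each online 40% of the time already give ~99.4% availability.
print(greedy_placement({f"friend{i}": 0.4 for i in range(10)}, target=0.99))
```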


Proceedings ArticleDOI
10 Oct 2011
TL;DR: In this article, the resilience of generalized regenerating codes (supporting multi-repairs, using collaboration among newcomers) is studied in the presence of two classes of Byzantine nodes: relatively benign selfish (non-cooperating) nodes, as well as more active, malicious polluting nodes.
Abstract: Recent years have witnessed a slew of coding techniques custom designed for networked storage systems. Network coding inspired regenerating codes are the most prolifically studied among these new age storage centric codes. A lot of effort has been invested in understanding the fundamental achievable trade-offs of storage and bandwidth usage to maintain redundancy in the presence of different models of failures, showcasing the efficacy of regenerating codes with respect to traditional erasure coding techniques. For practical usability in open and adversarial environments, as is typical in peer-to-peer systems, we however need resilience not only against erasures, but also against (adversarial) errors. In this paper, we study the resilience of generalized regenerating codes (supporting multi-repairs, using collaboration among newcomers) in the presence of two classes of Byzantine nodes: relatively benign selfish (non-cooperating) nodes, as well as more active, malicious polluting nodes. We give upper bounds on the resilience capacity of regenerating codes, and show that the advantages of collaborative repair can turn out to be detrimental in the presence of Byzantine nodes. We further show that system mechanisms can be combined with regenerating codes to mitigate the effect of rogue nodes.

48 citations
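For background on the storage/bandwidth trade-off the abstract refers to, the classical cut-set bound for a single-failure, non-adversarial (n, k, d) regenerating code (per-node storage α, per-helper repair download β, file size B) is the following; the collaborative and Byzantine settings studied in the paper modify this baseline, so the bound is quoted here only as the starting point.

```latex
% Cut-set bound for an (n,k,d) regenerating code storing a file of size B,
% with per-node storage \alpha and per-helper repair download \beta:
B \le \sum_{i=0}^{k-1} \min\{\alpha,\, (d-i)\beta\}
```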


Proceedings ArticleDOI
10 Oct 2011
TL;DR: This paper demonstrates a privacy-aware decentralized OSN called My3, where users can exercise full access control on their data; the system exploits trust relationships in the social network to provide the necessary decentralized storage infrastructure.
Abstract: As online social networks (OSNs), such as Facebook, witness explosive growth, the privacy challenges gain critical consideration from governmental and law agencies due to the concentration of vast amounts of personal information within a single administrative domain. In this paper, we demonstrate a privacy-aware decentralized OSN called My3, where users can exercise full access control on their data. Our system exploits trust relationships in the social network for providing the necessary decentralized storage infrastructure. By taking users' geographical locations and online time statistics into account, it also addresses availability and storage performance issues.

47 citations


Proceedings ArticleDOI
10 Oct 2011
TL;DR: The architecture of PeerfactSim.KOM, a simulator aimed at evaluating interdependencies in multi-layered p2p systems, is described, along with the workflow, selected experiences, and lessons learned.
Abstract: Research on peer-to-peer (p2p) and distributed systems needs evaluation tools to predict and observe the behavior of protocols and mechanisms in large scale networks. PeerfactSim.KOM [1] is a simulator for large scale distributed/p2p systems aiming at the evaluation of interdependencies in multi-layered p2p systems. The simulator is written in Java, is event-based, and is mainly used in p2p research projects. The main development of PeerfactSim.KOM started in 2005 and has been driven since 2006 by the project "QuaP2P", which aims at the systematic improvement and benchmarking of p2p systems. Further users of the simulator are working in the project "On-the-fly Computing", aiming at researching p2p-based service oriented architectures. Both projects impose strict requirements on the evaluation of multi-layered and large-scale distributed systems. We describe the architecture of PeerfactSim.KOM supporting these requirements in Section II, present the workflow, selected experiences, and lessons learned in Section III, and conclude the overview in Section IV.

39 citations


Proceedings ArticleDOI
10 Oct 2011
TL;DR: This paper proposes a novel approach that further reduces the economic cost of cloud computing by exploiting a passive storage service like Amazon S3 not only to distribute content to clients, but also to build and manage the P2P network linking them.
Abstract: Peer-to-peer (P2P) and cloud computing, two of the Internet trends of the last decade, hold similar promises: the (virtually) infinite availability of computing and storage resources. But there are important differences: the cloud provides highly-available resources, but at a cost; P2P resources are free, but their availability is shaky. Several academic and commercial projects have explored the possibility of mixing the two, creating a large number of peer-assisted applications, particularly in the field of content distribution, where the cloud provides a highly-available and persistent service, while P2P resources are exploited for free whenever possible to reduce the economic cost. While executing active servers on elastic computing facilities like Amazon EC2 and pairing them with user-provided peers is definitely one way to go, this paper proposes a novel approach that further reduces the economic cost. Here, a passive storage service like Amazon S3 is exploited not only to distribute content to clients, but also to build and manage the P2P network linking them. An effort is made to guarantee that the read/write load imposed on the storage remains constant, regardless of the number of peers/clients. These two choices allow us to keep the monetary cost of the cloud under control, whether there is just one peer or a million of them. We show the feasibility of our approach by discussing two case studies for content distribution: the Dilbert comic strips and the hourly News Update podcast from CNN.

39 citations
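As a hedged sketch of how a passive store can double as overlay membership management, the snippet below keeps the peer view under a small fixed set of S3 keys, so the read/write load on the storage stays bounded regardless of how many peers join. The bucket name, key layout, slot count, and random-slot policy are assumptions made for illustration, not the mechanism the paper actually uses.

```python
import json
import random
import boto3

s3 = boto3.client("s3")
BUCKET = "example-overlay-bucket"   # hypothetical bucket name
NUM_SLOTS = 16                      # fixed number of keys -> bounded load on S3

def register(peer_addr):
    """Overwrite one randomly chosen slot with our address (one PUT per join)."""
    slot = random.randrange(NUM_SLOTS)
    s3.put_object(Bucket=BUCKET,
                  Key=f"peers/slot-{slot}.json",
                  Body=json.dumps({"addr": peer_addr}).encode())

def discover(sample=4):
    """Read a few random slots (a bounded number of GETs) to find neighbors."""
    neighbors = []
    for slot in random.sample(range(NUM_SLOTS), sample):
        try:
            obj = s3.get_object(Bucket=BUCKET, Key=f"peers/slot-{slot}.json")
            neighbors.append(json.loads(obj["Body"].read()))
        except s3.exceptions.NoSuchKey:
            pass   # slot not written yet
    return neighbors
```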


Proceedings ArticleDOI
10 Oct 2011
TL;DR: It is found that in the Mainline BitTorrent DHT (MDHT), probably the largest DHT overlay on the Internet, many lookups already yield results in less than a second, albeit not consistently; backwards-compatible modifications are shown not only to reduce median latencies to between 100 and 200 ms, but also to consistently achieve sub-second lookups.
Abstract: Previous studies of large-scale (multimillion node) Kademlia-based DHTs have shown poor performance, measured in seconds, in contrast to the far more optimistic results from theoretical analysis, simulations and testbeds. In this paper, we unexpectedly find that in the Mainline BitTorrent DHT (MDHT), probably the largest DHT overlay on the Internet, many lookups already yield results in less than a second, albeit not consistently. With our backwards-compatible modifications, we show that not only can we reduce median latencies to between 100 and 200 ms, but also consistently achieve sub-second lookups. These results suggest that it is possible to deploy latency-sensitive applications on top of large-scale DHT overlays on the Internet, contrary to what some might have concluded based on previous results reported in the literature.

37 citations
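The latency gains the paper reports come mostly from tuning how an iterative Kademlia lookup is driven (parallelism, and how quickly unresponsive nodes are abandoned). The generic lookup loop below shows where those knobs live; the ALPHA and timeout values, the node representation, and the send_find_node helper are assumptions, and the loop queries nodes sequentially for brevity where a real client would query them in parallel.

```python
ALPHA = 3            # in-flight queries per round (standard Kademlia parameter)
QUERY_TIMEOUT = 0.2  # seconds; an aggressively low value, illustrative only

def iterative_lookup(target_id, bootstrap_nodes, send_find_node):
    """Generic iterative lookup: repeatedly query the closest known nodes.

    `send_find_node(node, target_id, timeout)` is an assumed helper that
    returns the node's closest contacts, or None on timeout.
    Nodes are dicts with an integer "id" field.
    """
    def dist(node_id):
        return node_id ^ target_id   # XOR metric

    shortlist = {n["id"]: n for n in bootstrap_nodes}
    queried = set()
    while True:
        candidates = sorted((n for n in shortlist.values() if n["id"] not in queried),
                            key=lambda n: dist(n["id"]))[:ALPHA]
        if not candidates:
            # No unqueried nodes left: return the closest contacts found.
            return sorted(shortlist.values(), key=lambda n: dist(n["id"]))[:8]
        for node in candidates:
            queried.add(node["id"])
            contacts = send_find_node(node, target_id, timeout=QUERY_TIMEOUT)
            for c in contacts or []:
                shortlist.setdefault(c["id"], c)
```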


Proceedings ArticleDOI
10 Oct 2011
TL;DR: It is found that BitTorrent flashcrowds occur in only very small fractions of the swarms but can affect over ten million users, and an algorithm is developed that identifies BitTorrent flashcrowds.
Abstract: Flashcrowds — sudden surges of user arrivals — do occur in BitTorrent, and they can lead to severe service deprivation. However, very little is known about their occurrence patterns and their characteristics in real-world deployments, and many basic questions about BitTorrent flashcrowds, such as How often do they occur? and How long do they last?, remain unanswered. In this paper, we address these questions by studying three datasets that cover millions of swarms from two of the largest BitTorrent trackers. We first propose a model for BitTorrent flashcrowds and a procedure for identifying, analyzing, and modeling BitTorrent flashcrowds. Then we evaluate quantitatively the impact of flashcrowds on BitTorrent users, and we develop an algorithm that identifies BitTorrent flashcrowds. Finally, we study statistically the properties of BitTorrent flashcrowds identified from our datasets, such as their arrival time, duration, and magnitude, and we investigate the relationship between flashcrowds and swarm growth, and the arrival rate of flashcrowds in BitTorrent trackers. In particular, we find that BitTorrent flashcrowds only occur in very small fractions (0.3–2%) of the swarms but that they can affect over ten million users.

35 citations
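A minimal way to operationalize a "sudden surge of user arrivals" is to compare the arrival count in a sliding window with the swarm's long-run average rate; the window length and thresholds below are arbitrary placeholders, not the model or identification algorithm proposed in the paper.

```python
from collections import deque

def detect_flashcrowd(arrival_times, window=3600.0, factor=10.0, min_arrivals=100):
    """Return the times at which a flashcrowd is flagged.

    A flashcrowd is flagged when the number of arrivals in the last `window`
    seconds exceeds `factor` times the long-run average rate and is at least
    `min_arrivals`. All three thresholds are illustrative assumptions.
    """
    flagged, recent = [], deque()
    if not arrival_times:
        return flagged
    span = max(arrival_times[-1] - arrival_times[0], 1.0)
    baseline = len(arrival_times) / span            # average arrivals per second
    for t in arrival_times:
        recent.append(t)
        while recent and recent[0] < t - window:
            recent.popleft()
        if len(recent) >= min_arrivals and len(recent) > factor * baseline * window:
            flagged.append(t)
    return flagged
```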


Proceedings ArticleDOI
10 Oct 2011
TL;DR: Using measurement data collected over a week by instrumenting Spotify clients, the authors analyze general network properties such as the correspondence between individual user accounts and the number of IP addresses they connect from, and the prevalence of Network Address Translation devices (NATs).
Abstract: Spotify is a streaming service offering low-latency access to a large library of music. Streaming is performed by a combination of client-server access and a peer-to-peer protocol. The service currently has a user base of over 10 million and is available in seven European countries. We provide a background on the Spotify protocol with emphasis on the formation of the peer-to-peer overlay. Using measurement data collected over a week by instrumenting Spotify clients, we analyze general network properties such as the correspondence between individual user accounts and the number of IP addresses they connect from and the prevalence of Network Address Translation devices (NATs). We also discuss the performance of one of the two peer discovery mechanisms used by Spotify.

31 citations


Proceedings ArticleDOI
10 Oct 2011
TL;DR: A fluid model is proposed to study the effects of oversupply under SRE, which predicts the average downloading speed, the average seeding time, and the average upload capacity utilization for users in communities that employ SRE.
Abstract: Many private BitTorrent communities employ Sharing Ratio Enforcement (SRE) schemes to incentivize users to contribute their upload resources. It has been demonstrated that communities that use SRE are greatly oversupplied, i.e., they have much higher seeder-to-leecher ratios than communities in which SRE is not employed. The first-order effect of oversupply under SRE is an increase in the average downloading speed. However, users are forced to seed for extremely long times to maintain adequate sharing ratios to be able to start new downloads. In this paper, we propose a fluid model to study the effects of oversupply under SRE, which predicts the average downloading speed, the average seeding time, and the average upload capacity utilization for users in communities that employ SRE. We notice that the phenomenon of oversupply has two negative effects: a) peers are forced to seed for long times, even though their seeding efforts are often not very productive (in terms of low upload capacity utilization); and b) SRE discriminates against peers with low bandwidth capacities and forces them to seed for longer durations than peers with high capacities. To alleviate these problems, we propose four different strategies for SRE, which have been inspired by ideas in social sciences and economics. We evaluate these strategies through simulations. Our results indicate that these new strategies release users from needlessly long seeding durations, while also being fair towards peers with low capacities and maintaining high system-wide downloading speeds.

26 citations
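As a back-of-envelope illustration of why oversupply depresses upload utilization and stretches seeding times (not the paper's fluid model), the snippet below caps the aggregate useful upload at what the leechers can absorb and derives each seeder's utilization and the time needed to upload a given sharing ratio; all parameter values are made up.

```python
def oversupply_example(seeders, leechers, up_capacity, down_capacity,
                       filesize, ratio=1.0):
    """Rough steady-state estimate under oversupply; all inputs are hypothetical.

    Leechers can absorb at most leechers * down_capacity in aggregate, so each
    seeder's effective upload rate (and hence its upload utilization) drops as
    the seeder-to-leecher ratio grows.
    """
    demand = leechers * down_capacity
    per_seeder_rate = min(up_capacity, demand / seeders)
    utilization = per_seeder_rate / up_capacity
    seed_time = (ratio * filesize) / per_seeder_rate   # time to upload ratio*filesize
    return utilization, seed_time

# 10 seeders per leecher, symmetric 1 MB/s links, 1 GB file:
print(oversupply_example(seeders=1000, leechers=100, up_capacity=1.0,
                         down_capacity=1.0, filesize=1024.0))
# -> 10% upload utilization and roughly 10240 s (~3 h) to reach a ratio of 1.
```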


Proceedings ArticleDOI
10 Oct 2011
TL;DR: It is shown that allowing even a small flexibility in choosing nodes after the peer selection step results in large improvements in the time to complete transfers, and that even simple informed scheduling policies can significantly reduce transfer time overhead.
Abstract: In Peer-to-Peer storage and backup applications, large amounts of data have to be transferred between nodes. In general, recipients of data transfers are not chosen randomly from the whole set of nodes in the Peer-to-Peer network, but according to peer selection rules that impose several criteria, such as resource contributions, position in DHTs, or trust between nodes. Imposing too stringent restrictions on the choice of nodes that are eligible to receive data can have a negative impact on the amount of time needed to complete data transfers, and scheduling choices influence this result as well. We formalize the problem of data transfer scheduling, and devise means for calculating (knowing a posteriori the availability patterns of nodes) optimal scheduling choices; we then propose and evaluate realistic scheduling policies, and evaluate their overheads in transfer times with respect to the optimum. We show that allowing even a small flexibility in choosing nodes after the peer selection step results in large improvements in the time to complete transfers, and that even simple informed scheduling policies can significantly reduce transfer time overhead.
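To give a feel for what an "informed scheduling policy" can look like, the toy rule below prefers, among the eligible recipients that are currently online, the one with the highest historical availability. The availability estimate and the eligibility set are assumed inputs, and this is not one of the policies evaluated in the paper.

```python
def pick_recipient(eligible, online, avail_history):
    """Choose the next node to transfer a data block to.

    `eligible` are nodes allowed by the peer selection rules, `online` is the
    set of nodes currently online, and `avail_history` maps node -> fraction
    of time it was observed online (an assumed, externally supplied estimate).
    """
    candidates = [n for n in eligible if n in online]
    if not candidates:
        return None                     # wait until someone eligible comes online
    return max(candidates, key=lambda n: avail_history.get(n, 0.0))
```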

Proceedings ArticleDOI
10 Oct 2011
TL;DR: FRT-Chord, an FRT-based distributed hash table, is presented and proved to achieve O(log N)-hop lookups; experiments with its implementation show that the routing table refining process proceeds as designed.
Abstract: This paper presents Flexible Routing Tables (FRT), a method for designing routing algorithms for overlay networks. FRT facilitates extending routing algorithms to reflect factors other than node identifiers. An FRT-based algorithm defines a total order on the set of all patterns of a routing table, and performs identifier-based routing according to that order. The algorithm gradually refines its routing table along the order by three operations: guarantee of reachability, entry learning, and entry filtering. This paper presents FRT-Chord, an FRT-based distributed hash table, and proves that it achieves O(log N)-hop lookups. Experiments with its implementation show that the routing table refining process proceeds as designed. Grouped FRT (GFRT), which introduces node groups into FRT, is also presented to demonstrate FRT's flexibility. GFRT-Chord results in a smaller number of routing hops between node groups than both Chord and FRT-Chord.

Proceedings ArticleDOI
10 Oct 2011
TL;DR: It is demonstrated that current BitTorrent swarms are experiencing a marked locality phenomenon at the overlay construction level (or connectivity graph), which suggests that an important portion of the BitTorrent traffic is currently confined within the ISPs.
Abstract: BitTorrent is one of the most popular applications in the current Internet. However, we still have little knowledge about the topology of real BitTorrent swarms and how the traffic is actually exchanged among peers. This paper addresses fundamental questions regarding the topology of live BitTorrent swarms. For this purpose we have collected the evolution of the graph topology of 250 real torrents from their birth over a period of 15 days. Using this dataset, we first demonstrate that real BitTorrent swarms are neither random graphs nor small-world networks. Furthermore, we show how some factors, such as torrent popularity, affect the swarm topology. Secondly, the paper proposes a novel methodology to infer the clustered peers in real BitTorrent swarms, something that was not possible so far. Finally, we dedicate special effort to demonstrating that current BitTorrent swarms experience a marked locality phenomenon at the overlay construction level (or connectivity graph). This locality effect is even more pronounced when we consider the traffic exchange relationships between peers. This suggests that an important portion of the BitTorrent traffic is currently confined within ISPs. This opens a discussion regarding the relative gain of the locality solutions proposed so far.

Proceedings ArticleDOI
10 Oct 2011
TL;DR: Results show that the time required to back up data in the network is comparable to a server-assisted approach, while substantially improving the time to restore data, which drops from a few days to a few hours.
Abstract: The availability of end devices in peer-to-peer storage and backup systems has been shown to be critical for usability and for system reliability in practice. This has led to the adoption of hybrid architectures composed of both peers and servers. Such architectures mask the instability of peers, thus approaching the performance of client-server systems while providing scalability at a low cost. In this paper, we advocate the replacement of such servers by a cloud of residential gateways, as they are already present in users' homes, thus pushing the required stable components to the edge of the network. In our gateway-assisted system, gateways act as buffers between peers, compensating for their intrinsic instability. This makes it possible to offload backup tasks quickly from the user's machine to the gateway, while significantly lowering the retrieval time of backed-up data. We evaluate our proposal using real-world traces, including existing traces from Skype and Jabber, a trace of residential gateways for availability, and a residential broadband trace for bandwidth. Results show that the time required to back up data in the network is comparable to a server-assisted approach, while substantially improving the time to restore data, which drops from a few days to a few hours. As gateways are becoming increasingly powerful in order to enable new services, we expect such a proposal to be leveraged in the short term.

Proceedings ArticleDOI
10 Oct 2011
TL;DR: A new way to detect content pollution in the KAD network is proposed: all filenames linked to a content are analyzed with a metric based on the Tversky index, which gives very low error rates.
Abstract: Content pollution is one of the major issues affecting P2P file sharing networks. However, since early studies on FastTrack and Overnet, no recent investigation has reported its impact on current P2P networks. In this paper, we present a method and the supporting architecture to quantify the pollution of contents in the KAD network. We first collect information on many popular files shared in this network. Then, we propose a new way to detect content pollution by analyzing all filenames linked to a content with a metric based on the Tversky index, which gives very low error rates. By analyzing a large number of popular files, we show that 2/3 of the contents are polluted, partly by index poisoning but mostly by a new, more dangerous form of pollution that we call index falsification.
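For reference, the Tversky index compares two sets X and Y as |X ∩ Y| / (|X ∩ Y| + α|X − Y| + β|Y − X|); the sketch below applies it to filename token sets. The tokenizer and the α = β = 0.5 weights (which reduce the index to the Dice coefficient) are choices made for illustration, not the parameters used in the paper.

```python
import re

def tversky(x, y, alpha=0.5, beta=0.5):
    """Tversky index between two sets (alpha = beta = 0.5 reduces it to Dice)."""
    x, y = set(x), set(y)
    inter = len(x & y)
    denom = inter + alpha * len(x - y) + beta * len(y - x)
    return inter / denom if denom else 0.0

def filename_tokens(name):
    """Crude tokenizer: lowercase alphanumeric chunks (illustrative only)."""
    return set(re.findall(r"[a-z0-9]+", name.lower()))

# Two plausible names for the same content score high; an unrelated name scores low.
a = filename_tokens("Ubuntu-11.04-desktop-i386.iso")
b = filename_tokens("ubuntu 11.04 desktop i386")
c = filename_tokens("Holiday_photos_2011.zip")
print(tversky(a, b), tversky(a, c))
```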

Proceedings ArticleDOI
10 Oct 2011
TL;DR: An improved measurement method is proposed for BitTorrent swarms that contain many unreachable peers; it increases the number of unique contacted peers by 112% compared to the conventional method and increases the total volume of downloaded pieces by 66%.
Abstract: BitTorrent is one of the most popular P2P file sharing applications in the world. Each BitTorrent network is called a swarm, and millions of peers may join multiple swarms. However, there are many unreachable peers (NATed, firewalled, or inactive at the time of the measurement) in each swarm. Due to this unreachable-peers problem, existing work can measure only a part of the peers in a swarm. In this paper, we propose an improved measurement method for BitTorrent swarms that contain many unreachable peers. In a nutshell, our crawler obtains peers behind NATs and firewalls by letting them connect to our crawlers, through actively advertising our crawlers' addresses to them. The evaluation results show that our proposed method increases the number of unique contacted peers by 112% compared to the conventional method. The proposed method also increases the total volume of downloaded pieces by 66%. We then investigate the sampling bias between our proposed method and conventional methods, and find that different measurement methods can lead to significantly different measurement results.

Proceedings ArticleDOI
10 Oct 2011
TL;DR: A novel hybrid neighbor selection strategy is proposed, with the flexibility to select neighbors based on either type of network awareness with different probabilities; it is found that network awareness in terms of both capacity and locality potentially degrades system QoS as a whole, and that capacity awareness faces effort-based unfairness but enables contribution-based fairness.
Abstract: P2P content providers are motivated to localize traffic within Autonomous Systems and thereby alleviate the tension with ISPs stemming from the costly inter-AS traffic generated by geographically distributed P2P users. In this paper, we first present a new three-tier framework to conduct a thorough study of the impact of various capacity-aware or locality-aware neighbor selection and chunk scheduling strategies. Specifically, we propose a novel hybrid neighbor selection strategy with the flexibility to select neighbors based on either type of network awareness with different probabilities. We find that network awareness in terms of both capacity and locality potentially degrades system QoS as a whole, and that capacity awareness faces effort-based unfairness but enables contribution-based fairness. Extensive simulations show that hybrid neighbor selection can not only promote traffic locality but also lift streaming quality, and that the crux of traffic locality promotion is active overlay construction. Based on this observation, we then propose a fully decentralized network awareness protocol, equipped with hybrid neighbor selection. In realistic simulation environments, this protocol can reduce inter-AS traffic from 95% to 38%, a locality performance comparable with tracker-side strategies (35%), while maintaining high streaming quality. Our performance evaluation results provide valuable insights for both theoretical studies of selfish topologies and the design of deployed systems.
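The hybrid strategy can be pictured as a per-slot coin flip between a locality-aware and a capacity-aware choice; the sketch below is one illustrative reading of that idea, with the probability value, the same-AS locality test, and the capacity ranking all being assumptions rather than the paper's protocol.

```python
import random

def pick_neighbor(candidates, my_as, p_locality=0.7):
    """Pick one neighbor: with probability p_locality prefer same-AS peers,
    otherwise prefer high-capacity peers. `candidates` is a non-empty list of
    dicts with 'asn' and 'capacity' fields (a hypothetical representation)."""
    if random.random() < p_locality:
        local = [c for c in candidates if c["asn"] == my_as]
        pool = local or candidates            # fall back if no local peer exists
        return random.choice(pool)
    return max(candidates, key=lambda c: c["capacity"])

def build_neighborhood(candidates, my_as, size=8, p_locality=0.7):
    """Fill `size` neighbor slots independently with the hybrid rule."""
    chosen = []
    remaining = list(candidates)
    while remaining and len(chosen) < size:
        n = pick_neighbor(remaining, my_as, p_locality)
        chosen.append(n)
        remaining.remove(n)
    return chosen
```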

Proceedings ArticleDOI
10 Oct 2011
TL;DR: This paper uses data from two BitTorrent communities and presents results from trace-based simulations to investigate whether currently prevalent inter-swarm resource allocation mechanisms perform acceptably or call for improvements.
Abstract: A considerable body of research shows that BitTorrent provides very efficient resource allocation inside single swarms. Many BitTorrent clients also allow users to participate in multiple swarms simultaneously, and implement inter-swarm resource-allocation mechanisms that are used by millions of people. However, resource allocation across multiple swarms in BitTorrent has received much less attention. In this paper, we investigate whether currently prevalent inter-swarm resource allocation mechanisms perform acceptably or call for improvements. We use data from two BitTorrent communities and present results from trace-based simulations. Two use-cases for allocation mechanisms drive our evaluation: (1) file-sharing communities, whose objective is maximizing throughput, and (2) video-streaming communities, whose objective is maximizing the number of users receiving sufficient resources for uninterrupted streaming. To put the results from the analyzed mechanisms into perspective, we devise theoretical efficiency bounds for inter-swarm resource allocation, for which we map the resource allocation problem to a graph-theoretical flow network problem. In this formalism, the goal of the file-sharing use-case, throughput maximization, is equivalent to maximizing the flow in the network. The goal of the video-streaming use-case translates into finding a max-min fair allocation for BitTorrent downloading sessions, a problem for which we devise a new algorithm.
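The flow-network mapping can be sketched with a standard max-flow construction: a source feeds the seeders with their upload capacities, leechers drain into a sink through their download capacities, and the maximum flow upper-bounds aggregate throughput. The single-swarm toy below (using networkx, with assumed capacities and an any-seeder-can-serve-any-leecher simplification) only illustrates that idea; it is not the paper's multi-swarm model.

```python
import networkx as nx

def throughput_upper_bound(seeders, leechers):
    """Max aggregate download rate given per-peer upload/download capacities.

    `seeders` maps seeder id -> upload capacity; `leechers` maps leecher id ->
    download capacity. Every seeder may serve every leecher (an assumption).
    """
    g = nx.DiGraph()
    for s, up in seeders.items():
        g.add_edge("SRC", s, capacity=up)
        for l in leechers:
            g.add_edge(s, l, capacity=up)      # the overlay link is not the bottleneck
    for l, down in leechers.items():
        g.add_edge(l, "SINK", capacity=down)
    value, _ = nx.maximum_flow(g, "SRC", "SINK")
    return value

# Two seeders (1.0 and 0.5 units of upload) and three leechers (1.0 each) -> 1.5:
print(throughput_upper_bound({"s1": 1.0, "s2": 0.5}, {"l1": 1.0, "l2": 1.0, "l3": 1.0}))
```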

Proceedings ArticleDOI
10 Oct 2011
TL;DR: The understanding of uTP is refined by gauging its impact on the primary BitTorrent user-centric metric, namely the torrent download time, by means of packet-level simulation; results show that when uTP clients fully substitute TCP clients, no performance difference arises.
Abstract: BitTorrent, one of the most widespread file sharing P2P applications, recently introduced uTP, an application-level congestion control protocol which aims to efficiently use the available link capacity while avoiding interference with the rest of the user's traffic (e.g., Web, VoIP and gaming) sharing the same access bottleneck. Research on uTP has so far focused on the investigation of the congestion control behavior in rather simple settings (i.e., single bottleneck, few backlogged flows, etc.) that are fairly far from the P2P settings in which the protocol is deployed. Moreover, prior work typically addressed questions, such as fairness and efficiency, that are natural from a congestion control perspective but are not directly related to the performance of the overall P2P system. In this work, we refine the understanding of uTP by gauging its impact on the primary BitTorrent user-centric metric, namely the torrent download time, by means of packet-level simulation. Results of our initial investigations show that: (i) when uTP clients fully substitute TCP clients, no performance difference arises; (ii) in heterogeneous swarms comprising peers using uTP and TCP congestion control, the completion time of uTP peers can benefit from lower uplink queuing delays, as signaling traffic (e.g., chunk requests) is not slowed down by long waits in the ADSL buffers.
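uTP's congestion controller is LEDBAT-style: it estimates one-way queuing delay against a base delay and nudges the window toward a fixed delay target instead of filling the buffer like TCP. The fragment below paraphrases that update rule; the target and gain follow the LEDBAT specification, but the state handling and constants are an illustrative reading, not BitTorrent's actual uTP implementation.

```python
TARGET = 0.100   # seconds of acceptable queuing delay (LEDBAT target)
GAIN = 1.0
MSS = 1452       # bytes; a typical uTP/UDP payload size (assumption)

def on_ack(state, bytes_newly_acked, one_way_delay):
    """Delay-based window update in the spirit of LEDBAT/uTP.

    `state` keeps `base_delay` (minimum observed one-way delay) and `cwnd`
    in bytes. Queuing delay above TARGET shrinks the window, below grows it.
    Initialize with state = {"cwnd": float(MSS)} and call once per ACK.
    """
    state["base_delay"] = min(state.get("base_delay", one_way_delay), one_way_delay)
    queuing_delay = one_way_delay - state["base_delay"]
    off_target = (TARGET - queuing_delay) / TARGET
    state["cwnd"] += GAIN * off_target * bytes_newly_acked * MSS / state["cwnd"]
    state["cwnd"] = max(state["cwnd"], MSS)        # never drop below one segment
    return state["cwnd"]
```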

Proceedings ArticleDOI
10 Oct 2011
TL;DR: An adaptive load balancing mechanism is designed and evaluated that solves problems related to saturated peers, which entail a significant drop in the diversity of references to objects, and that, coupled with an enhanced content search procedure, allows a fairer and more efficient usage of peer resources at a reasonable cost.
Abstract: The endeavor of this work is to study the impact of content popularity in a large-scale Peer-to-Peer network, namely KAD. Armed with the insights gained from an extensive measurement campaign, which pinpoints several deficiencies of the present KAD design in handling popular objects, we set off to design and evaluate an adaptive load balancing mechanism. Our mechanism is backward compatible with KAD, as it only modifies its inner algorithms, and presents several desirable properties: (i) it drives the process that selects the number and location of peers responsible for storing references to objects, based on their popularity; (ii) it solves problems related to saturated peers, which entail a significant drop in the diversity of references to objects; and (iii) coupled with an enhanced content search procedure, it allows a fairer and more efficient usage of peer resources, at a reasonable cost. Our evaluation uses a trace-driven simulator that features realistic peer churn and a precise implementation of the inner components of KAD.

Proceedings ArticleDOI
10 Oct 2011
TL;DR: An upper bound is derived for the number of peers that can be admitted to the system over time, and it is found that there is a trade-off between having the seeders minimize the upload of recently injected pieces and maintaining high peer QoS.
Abstract: The efficiency of BitTorrent in content distribution has inspired a number of peer-to-peer (P2P) protocols for on-demand video (VoD) streaming systems (henceforth BitTorrent-like VoD systems). However, the fundamental quality-of-service (QoS) requirements of VoD (i.e., providing peers with smooth playback continuity and a short startup delay) make the design of these systems more challenging than that of normal file-sharing systems. In particular, the bandwidth allocation strategy is an important aspect in the design of BitTorrent-like VoD systems, which becomes even more crucial in a scenario where a large number of peers join in a short period of time, a phenomenon known as a flashcrowd. In fact, the newly joining peers all demand content while having few or no pieces to offer in return yet. An unwise allocation of the limited bandwidth actually available during this phase may cause peers to experience poor QoS. In this work, we analyze the effects of a flashcrowd on the scalability of a BitTorrent-like VoD system and propose a number of mechanisms to make the bandwidth allocation in this phase more effective. In particular, we derive an upper bound for the number of peers that can be admitted to the system over time, and we find that there is a trade-off between having the seeders minimize the upload of recently injected pieces and maintaining high peer QoS. Based on the insights gained from our analysis, we devise flashcrowd-handling algorithms for the allocation of peer bandwidth to improve peer QoS during flashcrowds. We validate the effectiveness of our proposals by means of extensive simulations.

Proceedings ArticleDOI
10 Oct 2011
TL;DR: This paper presents an approach to estimate the available bandwidth of peers in a P2P system, based on a combination of traffic observation and the strategic injection of traffic into the system, and shows that it is accurate and responsive in settings with variable bandwidth while resulting in limited interference with the system.
Abstract: Many peer-to-peer (P2P) systems require accurate information about their peer's available bandwidth, e.g., for load balancing. Determining this information is difficult, as a suitable approach must address two challenges. First, it must be able to deal with fluctuating bandwidth. Second, it must incur low overhead to prevent interference with the operation of the P2P system. In this paper we present an approach to estimate the available bandwidth of peers in a P2P system, based on a combination of traffic observation and the strategic injection of traffic into the system. We evaluate our approach and show that it is accurate and responsive in settings with variable bandwidth while resulting in limited interference with the system.

Proceedings ArticleDOI
10 Oct 2011
TL;DR: This paper designs a novel incentive mechanism that leads peers to select storage partners uniformly at random and to establish asymmetric exchange relationships with them, reducing the overall amount of contributed resources as well as the resources contributed by each peer individually.
Abstract: In P2P storage systems peers need to contribute some local storage resources in order to obtain a certain online and reliable storage capacity. To guarantee that the storage service works, P2P storage systems have to meet two main requirements. First, the storage system needs to maintain fairness among peers by ensuring that peers consuming more online storage capacity contribute more local storage resources. Second, to reduce redundancy costs and improve reliability, the storage system must incentivize low-availability peers to improve their online availability. Traditionally, P2P storage systems achieved these two requirements by (i) using symmetric reciprocal exchanges between peers, and (ii) allowing peers to selfishly select their set of storage partners. However, in this paper we show that these two mechanisms are suboptimal in terms of the overall storage resources contributed by all peers. To minimize this amount of contributed resources, we design a novel incentive mechanism based on asymmetric reciprocal exchanges between peers. Our mechanism incentivizes peers to select storage partners uniformly at random, and to establish asymmetric exchange relationships with them. These asymmetric exchange relationships allow low-availability peers to compensate high-availability peers for the increase in redundancy by giving them more storage capacity. We show that our solution reduces the overall amount of contributed resources as well as the resources contributed by each peer individually. Using real P2P availability traces, we show that our incentive mechanism achieves overall savings of up to 60%, and individual savings from 2% up to 75%, depending on peers' availabilities.

Proceedings ArticleDOI
10 Oct 2011
TL;DR: Hose Rate Control (HRC) is proposed, a novel scheme to control the speed at which peers offer chunks to other peers, ultimately controlling peer uplink capacity utilization; it consistently outperforms non-adaptive schemes in terms of Quality of Experience.
Abstract: In this paper we consider mesh-based P2P streaming systems, focusing on the problem of regulating peer upload rate to match the system demand while not overloading each peer's upload link capacity. We propose Hose Rate Control (HRC), a novel scheme to control the speed at which peers offer chunks to other peers, ultimately controlling peer uplink capacity utilization. This is of critical importance for heterogeneous scenarios like the one faced in the Internet, where peer upload capacity is unknown and varies widely. HRC nicely adapts to the actual peer available upload bandwidth and system demand, so that users' Quality of Experience is greatly enhanced. Both simulations and actual experiments involving up to 1000 peers are presented to assess performance in real scenarios. Results show that HRC consistently delivers better Quality of Experience than non-adaptive schemes.

Proceedings ArticleDOI
10 Oct 2011
TL;DR: The model of a projection graph, the result of mapping a social graph onto a peer-to-peer network, is introduced, and the relation between metrics in the social graph and in the projection graph is analyzed; experiments show that when mapping communities of 50–150 users onto a peer, there is an optimal organization of the projection graph.
Abstract: Social applications implemented on a peer-to-peer (P2P) architecture mine the social graph of their users for improved performance in search, recommendations, resource sharing and others. In such applications, the social graph that connects their users is distributed on the peer-to-peer system: the traversal of the social graph translates to a socially-informed routing in the peer-to-peer layer. In this work we introduce the model of a projection graph that is the result of mapping a social graph onto a peer-to-peer network. We analytically formulate the relation between metrics in the social graph and in the projection graph. We focus on three such graph metrics: degree centrality, node betweenness centrality, and edge betweenness centrality. We evaluate experimentally the feasibility of estimating these metrics in the projection graph from the metrics of the social graph. Our experiments on real networks show that when mapping communities of 50–150 users on a peer, there is an optimal organization of the projection graph with respect to degree and node betweenness centrality. In this range, the association between the properties of the social graph and the projection graph is the highest, and thus the properties of the (dynamic) projection graph can be inferred from the properties of the (slower changing) social graph. We discuss the applicability of our findings to aspects of peer-to-peer systems such as data dissemination, social search, peer vulnerability, and data placement and caching.
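The kind of experiment the abstract describes can be reproduced with off-the-shelf graph tooling: contract each community of users onto its hosting peer and compare centralities in the two graphs. In the sketch below the user-to-peer assignment is simply an input dictionary; how many users land on each peer (the 50–150 range the paper identifies) is exactly the knob being studied.

```python
import networkx as nx

def projection_graph(social_graph, user_to_peer):
    """Map a social graph onto peers: one node per peer, and an edge between
    two peers whenever some pair of their users are friends."""
    proj = nx.Graph()
    proj.add_nodes_from(set(user_to_peer.values()))
    for u, v in social_graph.edges():
        pu, pv = user_to_peer[u], user_to_peer[v]
        if pu != pv:
            proj.add_edge(pu, pv)
    return proj

def compare_centralities(social_graph, user_to_peer):
    """Compute the three metrics discussed in the paper on both graphs."""
    proj = projection_graph(social_graph, user_to_peer)
    return {
        "social_degree": nx.degree_centrality(social_graph),
        "proj_degree": nx.degree_centrality(proj),
        "proj_node_betweenness": nx.betweenness_centrality(proj),
        "proj_edge_betweenness": nx.edge_betweenness_centrality(proj),
    }
```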

Proceedings ArticleDOI
10 Oct 2011
TL;DR: The similarity-based replication of filters that is a key element of the solution is shown to mitigate the effect of hotspots that arise because some document terms are substantially more popular than others, both in documents and in queries.
Abstract: Keyword-based content alert services, e.g., Google Alerts and Microsoft Live Alerts, empower end users with the ability to automatically receive useful and most recent content. In this paper, we leverage the favorable properties of DHTs, such as scalability, and propose the design of a scalable keyword-based content alert service. The DHT-based architecture matches textual documents with queries based on document terms: for each term, the implementation assigns a home node that is responsible for handling documents and queries that contain the term. The main challenge of this keyword-based matching scheme is the high number of terms that appear in a typical document, resulting in a high publication cost. Fortunately, a document can be forwarded to the home nodes of a carefully selected subset of terms without incurring false negatives. In this paper we focus on the MTAF problem of minimizing the number of selected terms to forward the published content. We show that the problem is NP-hard, and consider centralized and DHT-based solutions. Experimental results based on real datasets indicate that the proposed solutions are efficient compared to existing approaches. In particular, the similarity-based replication of filters that is a key element of our solution is shown to mitigate the effect of hotspots that arise because some document terms are substantially more popular than others, both in documents and in queries.
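Although the paper proves MTAF NP-hard and develops its own centralized and DHT-based solutions, its flavor is close to set cover, so a greedy heuristic is a useful illustration: repeatedly pick the document term that reaches the most still-uncovered matching queries. The representation of queries as term sets, and the assumption that a query is reachable through any one of its terms' home nodes, are simplifications for this sketch.

```python
def greedy_forwarding_terms(doc_terms, queries):
    """Pick a small set of document terms whose home nodes suffice to reach
    every query that the document could match (greedy set-cover heuristic).

    `doc_terms` is the set of terms in the document; `queries` is a list of
    term sets. A query matches only if all its terms are in the document,
    and it is reached as soon as any one of its terms is selected.
    """
    matchable = [q for q in queries if q and q <= doc_terms]
    uncovered = list(range(len(matchable)))
    selected = set()
    while uncovered:
        best = max(doc_terms - selected,
                   key=lambda t: sum(1 for i in uncovered if t in matchable[i]))
        selected.add(best)
        uncovered = [i for i in uncovered if best not in matchable[i]]
    return selected

# {"dht", "alerts"} covers all three example subscriptions:
print(greedy_forwarding_terms({"p2p", "dht", "alerts", "scalable"},
                              [{"p2p", "dht"}, {"alerts"}, {"dht", "scalable"}]))
```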

Proceedings ArticleDOI
10 Oct 2011
TL;DR: This work defines a common set of workloads and metrics for peer-to-peer overlays for Networked Virtual Environments and presents a benchmarking methodology which allows for a fair comparison of those systems.
Abstract: Peer-to-peer overlays for Networked Virtual Environments have recently gained much research interest, resulting in a variety of different approaches for spatial information dissemination. Although designed for the same purpose, the evaluation methodologies used by particular authors differ widely. This makes any comparison of existing systems difficult, if not impossible. To overcome this problem we present a benchmarking methodology which allows for a fair comparison of those systems. We, therefore, define a common set of workloads and metrics. We demonstrate the feasibility of our approach by testing four typical systems for spatial information dissemination and discovering their specific performance profiles.

Proceedings ArticleDOI
10 Oct 2011
TL;DR: A messaging system called MONAC is implemented, by which users can disseminate Twitter messages via a smartphone-based DTN, with dynamic selection of the available underlay network and seamless authentication with cloud SNS accounts.
Abstract: In this work, we implemented a messaging system called MONAC, by which users can disseminate Twitter messages via a smartphone-based DTN. MONAC is implemented on an overlay network middleware called PIAX. We extended PIAX to enable DTN with dynamic selection of the available underlay network and seamless authentication with cloud SNS accounts. With these mechanisms, messages can be disseminated via DTN over heterogeneous underlay networks without falsification or user ID fraud.

Proceedings ArticleDOI
10 Oct 2011
TL;DR: This paper presents an accurate model for capturing the influence of churn on the process of building reputations and reports reductions of 50% or greater in the convergence time in environments with high churn rates.
Abstract: Reputation systems rely on historical information to account for uncertainty about the intention of users to cooperate. In peer-to-peer (P2P) systems, however, accumulating experience tends to be slow due to the high rates of churn — the continuous process of arrival and departure of peers. The flow of transactions is continuously interrupted by departures, which can significantly affect the convergence of reputation systems. To shed light on this, this paper presents an accurate model for capturing the influence of churn on the process of building reputations. Using our model, system architects can determine the minimal transaction rate that guarantees fast convergence and design their systems accordingly. Unfortunately, the natural transaction rate of users is sometimes so low (e.g., due to physical constraints like network bandwidth) that many of them are likely to experience significant delays in the process of building reputations for their neighbors. We address this problem by leveraging the inherent trust in social networks. The basic idea is that users ask their social links to transact with strangers and together generate reputation ratings on a short time scale. Our simulation results report reductions of 50% or more in the convergence time in environments with high churn rates.

Proceedings ArticleDOI
Zhi Yang, Yuanjian Xing, Feng Xiao, Zhi Qu, Xiaoming Li, Yafei Dai 
10 Oct 2011
TL;DR: This work makes a first attempt to experimentally reduce heterogeneous churn and resource dynamics to simple distributions with individual-specific parameters, and provides empirical support for the heterogeneous Markov model of churn.
Abstract: The heterogeneous nature of Peer-to-Peer (P2P) networks can be exploited to optimize a wide range of applications. But this requires an accurate characterization of peer heterogeneity, which is difficult due to the huge population and the disparate properties of individuals. To overcome this barrier, we conduct a thorough inspection of peer heterogeneity in terms of individual churn and resource capacity. We make a first attempt to (i) experimentally reduce heterogeneous churn and resource dynamics to simple distributions with individual-specific parameters, and to (ii) provide empirical support for the heterogeneous Markov model of churn. We further demonstrate how our characterization can be leveraged to optimize practical systems through two case studies: fast backup in online storage and fault tolerance in cycle-sharing systems. Both applications gain remarkable performance improvements by incorporating our model of peer heterogeneity.