Journal ArticleDOI

Swarm Intelligence Based File Replication and Consistency Maintenance in Structured P2P File Sharing Systems

01 Oct 2015-IEEE Transactions on Computers (IEEE)-Vol. 64, Iss: 10, pp 2953-2967
TL;DR: SWARM, a file replication mechanism based on swarm intelligence, is presented; it reduces querying latency, reduces the number of replicas, and cuts consistency maintenance overhead by 49-99 percent compared to previous consistency maintenance methods.
Abstract: In peer-to-peer file sharing systems, file replication helps to avoid overloading file owners and improve file query efficiency. There exists a tradeoff between minimizing the number of replicas (i.e., replication overhead) and maximizing the replica hit rate (which reduces file querying latency). More replicas lead to increased replication overhead and higher replica hit rates, and vice versa. An ideal replication method should impose low overhead on the system while providing low query latency to users. However, previous replication methods either achieve high hit rates at the cost of many replicas or produce low hit rates. To reduce replicas while guaranteeing a high hit rate, this paper presents SWARM, a file replication mechanism based on swarm intelligence. Recognizing the power of collective behaviors, SWARM identifies node swarms with common node interests and close proximity. Unlike most earlier methods, SWARM determines the placement of a file replica based on the accumulated query rates of the nodes in a swarm rather than of a single node. Replicas are shared by the nodes in a swarm, leading to fewer replicas and high querying efficiency. In addition, SWARM has a novel consistency maintenance algorithm that propagates an update message between proximity-close nodes in a tree fashion from the top to the bottom. Experimental results from the real-world PlanetLab testbed and the PeerSim simulator demonstrate the effectiveness of the SWARM mechanism in comparison with other file replication and consistency maintenance methods. SWARM reduces querying latency by 40-58 percent, reduces the number of replicas by 39-76 percent, and achieves hit rates more than 84 percent higher than previous methods. It also reduces consistency maintenance overhead by 49-99 percent compared to previous consistency maintenance methods.
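To make the placement idea concrete, here is a minimal Python sketch of the rule described above: a replica is placed for a swarm when the accumulated query rate of its members, rather than the rate of any single node, crosses a threshold. The Node, Swarm, and should_replicate names and the threshold parameter are illustrative assumptions, not the paper's actual data structures or constants.

```python
# Minimal illustrative sketch (not the authors' implementation) of SWARM's
# core placement idea: a file is replicated in a swarm when the swarm's
# accumulated query rate, not any single node's rate, justifies a replica.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Node:
    node_id: str
    query_rate: Dict[str, float] = field(default_factory=dict)  # file_id -> queries/sec

@dataclass
class Swarm:
    """Nodes sharing a common interest and close proximity."""
    members: List[Node]

    def accumulated_rate(self, file_id: str) -> float:
        return sum(n.query_rate.get(file_id, 0.0) for n in self.members)

def should_replicate(swarm: Swarm, file_id: str, threshold: float) -> bool:
    """Place one shared replica in the swarm if collective demand is high enough.

    `threshold` is a hypothetical tuning parameter; the paper bases its
    placement decision on accumulated query rates, but the exact rule and
    constants are not reproduced here.
    """
    return swarm.accumulated_rate(file_id) >= threshold

# Example: three nodes whose individual rates are too low to justify a
# per-node replica, but whose combined rate crosses the threshold.
swarm = Swarm(members=[
    Node("n1", {"fileA": 0.4}),
    Node("n2", {"fileA": 0.5}),
    Node("n3", {"fileA": 0.3}),
])
print(should_replicate(swarm, "fileA", threshold=1.0))  # True
```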
Citations
01 Jan 2003
TL;DR: A super-peer is a node in a peer-to-peer network that operates both as a server to a set of clients, and as an equal in a network of super-peers.
Abstract: A super-peer is a node in a peer-to-peer network that operates both as a server to a set of clients, and as an equal in a network of super-peers. Super-peer networks strike a balance between the efficiency of centralized search, and the autonomy, load balancing and robustness to attacks provided by distributed search. Furthermore, they take advantage of the heterogeneity of capabilities (e.g., bandwidth, processing power) across peers, which recent studies have shown to be enormous. Hence, new and old P2P systems like KaZaA and Gnutella are adopting super-peers in their design. Despite their growing popularity, the behavior of super-peer networks is not well understood. For example, what are the potential drawbacks of super-peer networks? How can super-peers be made more reliable? How many clients should a super-peer take on to maximize efficiency? We examine super-peer networks in detail, gaining an understanding of their fundamental characteristics and performance tradeoffs. We also present practical guidelines and a general procedure for the design of an efficient super-peer network.

916 citations

Journal ArticleDOI
TL;DR: A new strategy of online replica deduplication (SORD) is proposed that reduces the impact on other nodes when deleting a redundant replica; it improves access latency by around 5-15% on average and achieves better load balance than other similar methods.
Abstract: In an online Cloud-P2P system, more replicas lead to lower access delay but higher maintenance overhead, and vice versa. Traditional strategies of online replica deduplication usually rely on a dynamic threshold to delete redundant replicas. Because a replica's access volume varies over time and every replica can bear a certain amount of requests, deleting a replica may shift load onto other nodes, overloading them and degrading system performance. Traditional strategies pay little attention to this impact. To deal with this problem, this paper proposes a new strategy of online replica deduplication (SORD) that reduces the impact on other nodes when a redundant replica is deleted. To this end, SORD adopts prediction evaluation before deleting a replica: it applies fuzzy clustering analysis to select the optimal deletion candidate from the file's replica set and then, based on the candidate's historical access information and the capacity of nodes, evaluates the impact on other nodes to decide whether the replica can be deleted. Extensive experiments demonstrate that SORD achieves superior access latency, around 5-15% better on average, and better load balance than other similar methods, while removing about 65% of redundant replicas.
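The deletion check at the heart of this idea can be sketched under a simple capacity model: a replica may be deleted only if the spare capacity of the remaining replica nodes covers the load it is predicted to shed. The ReplicaNode class and can_delete function below are hypothetical simplifications, not SORD's prediction-evaluation or fuzzy-clustering machinery.

```python
# Minimal sketch (an assumption, not SORD's actual algorithm) of checking
# whether the remaining replica nodes can absorb a candidate replica's
# request load before deleting it.
from dataclasses import dataclass
from typing import List

@dataclass
class ReplicaNode:
    capacity: float      # requests/sec the node can serve
    current_load: float  # requests/sec it already serves

def can_delete(candidate_load: float, remaining: List[ReplicaNode]) -> bool:
    """Return True if spare capacity on the remaining replica nodes covers
    the load the deleted replica is predicted to shed onto them.

    `candidate_load` would come from the replica's historical access
    information; here it is just a number supplied by the caller.
    """
    spare = sum(max(n.capacity - n.current_load, 0.0) for n in remaining)
    return spare >= candidate_load

# Example: deleting a replica serving 20 req/s is safe only if the other
# replicas jointly have at least 20 req/s of headroom.
others = [ReplicaNode(capacity=50, current_load=40),
          ReplicaNode(capacity=30, current_load=15)]
print(can_delete(20.0, others))  # True: 10 + 15 >= 20
```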

83 citations

Journal ArticleDOI
TL;DR: DARS addresses the problem of replica creation, including determining the opportune moment for creating a replica and finding the optimal replica placement node, in a decentralized, self-adaptive manner.

26 citations


Cites methods from "Swarm Intelligence Based File Repli..."

  • ...performance with two other replica strategies in dynamic environments: SWARM [18] and DARS (Without Replicas)....

    [...]

  • ...Paper [18] presents a file replication mechanism SWARM based on swarm intelligence....

    [...]

Journal ArticleDOI
TL;DR: RRSD has superior performance regarding load balance, data reliability, and storage consumption; compared with other similar methods, it delivers a 10% improvement in load balance and reduces storage consumption by 60% while meeting the data reliability requirement.

19 citations

Journal ArticleDOI
TL;DR: In this paper, a biologically inspired, derivative-free global optimization algorithm called Firebug Swarm Optimization (FSO), modeled on the reproductive swarming behaviour of firebugs (Pyrrhocoris apterus), is proposed.
Abstract: A new biologically inspired, derivative-free global optimization algorithm called Firebug Swarm Optimization (FSO), modeled on the reproductive swarming behaviour of firebugs (Pyrrhocoris apterus), is proposed. The search for fit reproductive partners by individual bugs in a swarm of firebugs can be viewed naturally as a search for optimal solutions in a search space. This work proposes a mathematical model for five firebug behaviours most relevant to optimization and uses these behaviours as the basis of a new global optimization algorithm. Performance of the FSO algorithm is compared with 17 popular heuristic algorithms on the Congress on Evolutionary Computation 2013 (CEC 2013) benchmark suite, which contains high-dimensional multimodal as well as shifted and rotated functions. Statistical analysis based on the Wilcoxon rank-sum test indicates that the proposed FSO algorithm outperforms 17 popular state-of-the-art heuristic global optimization algorithms such as the Guided Sparks Fireworks Algorithm (GFWA), Dynamic Learning PSO (DNLPSO), and Artificial Bee Colony Bollinger Bands (ABCBB) on the CEC 2013 benchmark.
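As an illustration of the kind of statistical comparison mentioned above (not the paper's evaluation code, and with hypothetical numbers), the Wilcoxon rank-sum test can be applied to the final errors of two optimizers over repeated runs on one benchmark function; a small p-value indicates a statistically significant difference between the two error distributions.

```python
# Illustrative sketch of a Wilcoxon rank-sum comparison between two
# optimizers on one benchmark function. The error values are hypothetical.
from scipy.stats import ranksums

# hypothetical final objective errors over 30 independent runs of each algorithm
fso_errors   = [0.12, 0.09, 0.15, 0.11, 0.10, 0.14, 0.13, 0.08, 0.12, 0.11] * 3
other_errors = [0.20, 0.22, 0.19, 0.25, 0.21, 0.23, 0.18, 0.24, 0.20, 0.22] * 3

statistic, p_value = ranksums(fso_errors, other_errors)
print(f"rank-sum statistic={statistic:.2f}, p={p_value:.3g}")
```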

15 citations

References
Book ChapterDOI
TL;DR: Pastry as mentioned in this paper is a scalable, distributed object location and routing substrate for wide-area peer-to-peer applications, which performs application-level routing and object location in a potentially very large overlay network of nodes connected via the Internet.
Abstract: This paper presents the design and evaluation of Pastry, a scalable, distributed object location and routing substrate for wide-area peer-to-peer applications. Pastry performs application-level routing and object location in a potentially very large overlay network of nodes connected via the Internet. It can be used to support a variety of peer-to-peer applications, including global data storage, data sharing, group communication and naming. Each node in the Pastry network has a unique identifier (nodeId). When presented with a message and a key, a Pastry node efficiently routes the message to the node with a nodeId that is numerically closest to the key, among all currently live Pastry nodes. Each Pastry node keeps track of its immediate neighbors in the nodeId space, and notifies applications of new node arrivals, node failures and recoveries. Pastry takes into account network locality; it seeks to minimize the distance messages travel, according to a scalar proximity metric like the number of IP routing hops. Pastry is completely decentralized, scalable, and self-organizing; it automatically adapts to the arrival, departure and failure of nodes. Experimental results obtained with a prototype implementation on an emulated network of up to 100,000 nodes confirm Pastry's scalability and efficiency, its ability to self-organize and adapt to node failures, and its good network locality properties.
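A minimal sketch of Pastry's delivery rule, ignoring the prefix-based routing tables, leaf sets, and proximity metric, is simply "deliver to the live node whose nodeId is numerically closest to the key"; the function below is an illustrative assumption of that rule in a tiny identifier space, not Pastry's routing procedure.

```python
# Simplified sketch: Pastry delivers a message keyed by `key` to the live
# node whose nodeId is numerically closest to the key. Real Pastry reaches
# that node in O(log N) hops via prefix-based routing tables.
def closest_node(key: int, live_node_ids: list[int]) -> int:
    """Pick the nodeId numerically closest to the key (ties broken by smaller id)."""
    return min(live_node_ids, key=lambda nid: (abs(nid - key), nid))

nodes = [0x10, 0x42, 0x7F, 0xC3]
print(hex(closest_node(0x45, nodes)))  # 0x42
```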

7,423 citations

Proceedings ArticleDOI
01 Aug 2000
TL;DR: This paper explores and evaluates the use of directed diffusion for a simple remote-surveillance sensor network and its implications for sensing, communication and computation.
Abstract: Advances in processor, memory and radio technology will enable small and cheap nodes capable of sensing, communication and computation. Networks of such nodes can coordinate to perform distributed sensing of environmental phenomena. In this paper, we explore the directed diffusion paradigm for such coordination. Directed diffusion is datacentric in that all communication is for named data. All nodes in a directed diffusion-based network are application-aware. This enables diffusion to achieve energy savings by selecting empirically good paths and by caching and processing data in-network. We explore and evaluate the use of directed diffusion for a simple remote-surveillance sensor network.
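A toy sketch of the data-centric pattern described above, under heavy simplifying assumptions (a static hypothetical topology, single-path gradients, no in-network aggregation or path reinforcement): a sink floods an interest for named data, each node records the neighbor the interest arrived from as a gradient, and matching data flows back along those gradients.

```python
# Toy sketch (assumptions only, not the full directed diffusion rule set).
from collections import deque

# adjacency of a small sensor network (hypothetical topology)
neighbors = {
    "sink": ["a"],
    "a": ["sink", "b", "c"],
    "b": ["a", "src"],
    "c": ["a"],
    "src": ["b"],
}

def flood_interest(interest_name: str, sink: str) -> dict:
    """Breadth-first flooding of an interest for named data; returns each
    node's gradient (the neighbor to send matching data towards)."""
    gradient, visited, queue = {}, {sink}, deque([sink])
    while queue:
        node = queue.popleft()
        for nb in neighbors[node]:
            if nb not in visited:
                visited.add(nb)
                gradient[nb] = node  # data matching the interest flows back via `node`
                queue.append(nb)
    return gradient

def deliver(source: str, gradient: dict, sink: str) -> list:
    """Follow gradients hop by hop from the data source back to the sink."""
    path, node = [source], source
    while node != sink:
        node = gradient[node]
        path.append(node)
    return path

g = flood_interest("temperature", "sink")
print(deliver("src", g, "sink"))  # ['src', 'b', 'a', 'sink']
```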

6,061 citations

Journal ArticleDOI
TL;DR: This work reviews localization techniques and evaluates the effectiveness of a very simple connectivity metric method for localization in outdoor environments that makes use of the inherent RF communications capabilities of these devices.
Abstract: Instrumenting the physical world through large networks of wireless sensor nodes, particularly for applications like environmental monitoring of water and soil, requires that these nodes be very small, lightweight, untethered, and unobtrusive. The problem of localization, that is, determining where a given node is physically located in a network, is a challenging one, and yet extremely crucial for many of these applications. Practical considerations such as the small size, form factor, cost and power constraints of nodes preclude the reliance on GPS of all nodes in these networks. We review localization techniques and evaluate the effectiveness of a very simple connectivity metric method for localization in outdoor environments that makes use of the inherent RF communications capabilities of these devices. A fixed number of reference points in the network with overlapping regions of coverage transmit periodic beacon signals. Nodes use a simple connectivity metric, which is more robust to environmental vagaries, to infer proximity to a given subset of these reference points. Nodes localize themselves to the centroid of their proximate reference points. The accuracy of localization is then dependent on the separation distance between two adjacent reference points and the transmission range of these reference points. Initial experimental results show that the accuracy for 90 percent of our data points is within one-third of the separation distance. However, future work is needed to extend the technique to more cluttered environments.
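The localization rule itself is short enough to sketch directly: assuming each node knows the coordinates of the reference points whose beacons it hears, it estimates its position as their centroid. The coordinates below are hypothetical.

```python
# Minimal sketch of the connectivity-based localization described above:
# a node localizes itself to the centroid of the reference points in range.
def centroid_localize(heard_references: list[tuple[float, float]]) -> tuple[float, float]:
    """Average the coordinates of all reference points whose beacons are heard."""
    xs = [x for x, _ in heard_references]
    ys = [y for _, y in heard_references]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

# A node hearing beacons from reference points at (0,0), (10,0) and (0,10)
# localizes itself to their centroid.
print(centroid_localize([(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]))  # (3.33..., 3.33...)
```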

3,723 citations

Journal ArticleDOI
TL;DR: Results from theoretical analysis and simulations show that Chord is scalable: Communication cost and the state maintained by each node scale logarithmically with the number of Chord nodes.
Abstract: A fundamental problem that confronts peer-to-peer applications is the efficient location of the node that stores a desired data item. This paper presents Chord, a distributed lookup protocol that addresses this problem. Chord provides support for just one operation: given a key, it maps the key onto a node. Data location can be easily implemented on top of Chord by associating a key with each data item, and storing the key/data pair at the node to which the key maps. Chord adapts efficiently as nodes join and leave the system, and can answer queries even if the system is continuously changing. Results from theoretical analysis and simulations show that Chord is scalable: Communication cost and the state maintained by each node scale logarithmically with the number of Chord nodes.
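A minimal sketch of Chord's single operation, assuming a tiny identifier space and ignoring finger tables and node churn: a key is assigned to its successor, the first node identifier clockwise from the key on the ring. The node identifiers below are illustrative.

```python
# Simplified sketch of Chord's key-to-node mapping: a key is stored at its
# successor, the first node id clockwise from the key on the identifier ring.
def successor(key: int, node_ids: list[int], id_space_bits: int = 6) -> int:
    """Return the node responsible for `key` in a 2**id_space_bits ring."""
    ring = sorted(node_ids)
    key %= 2 ** id_space_bits
    for nid in ring:
        if nid >= key:
            return nid
    return ring[0]  # wrap around the ring

nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
print(successor(10, nodes))  # 14: key 10 is stored at node 14
print(successor(54, nodes))  # 56
print(successor(60, nodes))  # 1 (wraps around the ring)
```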

3,518 citations


"Swarm Intelligence Based File Repli..." refers background or methods in this paper

  • ...2 shows an example of information marshaling in Chord with the nodes in Fig....

    [...]

  • ...To handle repository node dynamism, SWARM relies on Chord’s node join and leave algorithms [29] to transfer the responsibilities of repository nodes to their predecessors and successors as new repository nodes, and relies on its stabilization process to update the routing tables in repository nodes....

    [...]

  • ...We organized the 2,048 virtual peers into a Chord P2P network with a dimension of 11....

    [...]

  • ...Using the Chord routing algorithm [29], the path length for one message is log n in the average case....

    [...]

Journal ArticleDOI
12 Nov 2000
TL;DR: OceanStore's monitoring of usage patterns allows adaptation to regional outages and denial-of-service attacks; monitoring also enhances performance through proactive movement of data.
Abstract: OceanStore is a utility infrastructure designed to span the globe and provide continuous access to persistent information. Since this infrastructure is comprised of untrusted servers, data is protected through redundancy and cryptographic techniques. To improve performance, data is allowed to be cached anywhere, anytime. Additionally, monitoring of usage patterns allows adaptation to regional outages and denial of service attacks; monitoring also enhances performance through pro-active movement of data. A prototype implementation is currently under development.

3,376 citations