A Balanced Consistency Maintenance Protocol for Structured P2P Systems
Introduction
- P2P dynamic network characteristics combined with the diverse application consistency requirements and heterogeneous peer resource constraints also impose unique challenges for P2P consistency management.
- Since each replica node takes charge of its children in update propagation and consistency maintenance, the work of consistency maintenance is evenly distributed.
- Description of BCoM: BCoM aims to (1) provide bounded consistency for maintaining a large number of replicas of a mutable object; (2) balance consistency strictness, availability and performance in response to dynamic network conditions, update workload and resource constraints; and (3) make consistency maintenance robust against frequent node churn and failures.
- The authors first introduce the dDT structure, and then explain the three techniques in detail.
A. Dissemination Tree Structure
- For each object, BCoM builds a tree with node degree d, rooted at the node whose ID is closest to the object ID in the overlay identifier space.
- Node 1 assigned node 3 as its child, since it had space for a new child.
- Sec.II-C explains how to contact an ancestor for rejoining.
- Algorithm 1: dDT Construction(p, q). Input: node p receives node q's join request. Output: parent of node q in the dDT. If p does not have d children, then Subno.(p) += Subno.(q) and return p; else find a child f of p s.t. f has the smallest Subno.
- This makes the real time maintenance of the tree depth quite difficult and unnecessary when tree nodes are frequently joining and leaving.
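The join rule in Algorithm 1 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the excerpt of the algorithm is truncated after the "smallest Subno." step, so the sketch assumes that the request is forwarded recursively to that child and that every node on the path adds the newcomer to its subtree count. The names `Node`, `subno` and `ddt_join` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)
    # Assumed convention: number of nodes in the subtree rooted here, itself included.
    subno: int = 1


def ddt_join(p: Node, q: Node, d: int) -> Node:
    """Handle q's join request at p and return the node that becomes q's parent."""
    # q ends up somewhere below p in both branches, so p's subtree count grows.
    p.subno += q.subno
    if len(p.children) < d:
        # p still has a free slot: adopt q directly.
        p.children.append(q)
        return p
    # Assumption: forward the request to the child with the smallest subtree.
    smallest = min(p.children, key=lambda child: child.subno)
    return ddt_join(smallest, q, d)


if __name__ == "__main__":
    root = Node("root")
    for name in ["n1", "n2", "n3", "n4"]:
        parent = ddt_join(root, Node(name), d=2)
        print(name, "joins under", parent.name)
```

Always descending into the smallest subtree is what keeps the tree balanced without tracking its depth explicitly.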
B. Sliding Window Update Protocol
- 1) Basic Operation in Sliding Window Update: The sliding window regulates the consistency bound for update propagation to all replica nodes in a dDT.
- When receiving an RACK from a child, the root sends the next update to this child if there is a buffered update that has not been sent to this child.
- The sliding window size k_i plays a critical role in balancing consistency strictness, object availability and update dissemination performance.
- The aggregation is performed as follows: each leaf node initializes the tree height to zero (L = 0) and the bottleneck service time µL to its update propagation time.
- Once an internal node receives the maintenance messages from all children, it updates L as the maximum value of its children's tree height plus 1, and µL as the maximum value among its own and every child's service time.
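The bottom-up aggregation of L and µL described above can be illustrated with a short Python sketch. It is a simplification under assumptions: a recursive call stands in for the maintenance messages that children send to their parent, and the names `Replica`, `service_time` and `aggregate` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Replica:
    service_time: float  # this node's own update propagation time
    children: List["Replica"] = field(default_factory=list)


def aggregate(node: Replica) -> Tuple[int, float]:
    """Return (tree height L, bottleneck service time) for the subtree at node."""
    if not node.children:
        # A leaf reports height 0 and its own propagation time.
        return 0, node.service_time
    # Stand-in for waiting on one maintenance message per child.
    reports = [aggregate(child) for child in node.children]
    height = max(h for h, _ in reports) + 1
    bottleneck = max([node.service_time] + [mu for _, mu in reports])
    return height, bottleneck


if __name__ == "__main__":
    mid = Replica(1.0, [Replica(2.0), Replica(5.0)])
    root = Replica(0.5, [mid, Replica(3.0)])
    print(aggregate(root))
```

The root ends up with the height of the whole tree and the slowest service time found anywhere in it, which are the two inputs the window size setting relies on.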
C. Ancestor Cache Maintenance
- Each replica node maintains a cache of m ancestors, starting from its parent and leading toward the root in the dDT.
- This can be detected by ACK and maintenance message transmissions.
- The root is finally contacted for relocation if all the other ancestors crash.
- The ancestor cache provides fast recovery from node and link failures with a small overhead and high success probability.
- Each node refers to the newly received ancestor list to refresh its cache.
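A minimal sketch of the recovery path, assuming the cache is ordered from the parent upward and that ancestors are contacted one at a time, with the root as the last resort. The function name `find_rejoin_target` and the `is_alive` probe are hypothetical stand-ins for whatever contact mechanism a real node would use.

```python
from typing import Callable, Sequence


def find_rejoin_target(
    cached_ancestors: Sequence[str],
    root: str,
    is_alive: Callable[[str], bool],
) -> str:
    """Pick the node to contact for rejoining after a parent failure.

    cached_ancestors is assumed to be ordered parent first, then grandparent,
    and so on toward the root.
    """
    for ancestor in cached_ancestors:
        # Sequential contact: try the closest surviving ancestor first.
        if is_alive(ancestor):
            return ancestor
    # All cached ancestors failed: fall back to the root for relocation.
    return root


if __name__ == "__main__":
    alive = {"grandparent"}
    target = find_rejoin_target(
        ["parent", "grandparent", "great-grandparent"], "root", alive.__contains__
    )
    print(target)
```

Trying ancestors in order, rather than in parallel, means only one of them ever decides where the orphaned node is relocated.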
D. Tree Node Migration
- Any internal node, together with the subtree rooted at it, will be blocked from receiving new updates if its slowest child is blocked due to the sliding window constraint.
- When blocking occurs, node 0 can swap the bottleneck node 1 with a faster descendant that has more recent updates, such as node 4, to remove the blocking.
- The performance improvement through node migration is confirmed by the authors' queueing model of the dDT in Fig. 3.
- The non-blocking migration helps promote the faster nodes to upper layers, which makes the searching in blocking-triggered migration easier.
- Since the overlay DHT routing in structured P2P networks relies on cooperative nodes, the authors assume BCoM is run by these cooperative P2P nodes transparent to the end users.
E. Basic Operations in BCoM
- BCoM provides three basic operations. Subscribe: when a node p wants to read object i and keep it updated, p sends a subscription request to the root of dDT_i by overlay routing.
- The message overhead for a node leaving is O(1), since the number of affected nodes is no more than d, and each has constant overhead to update the related maintenance information.
- Update: after subscribing, if a node p wants to update the object, it sends the update request directly to the root using IP routing.
- Updates are serialized at the root by their arrival time.
- The specific policy for resolving conflicts is application dependent.
III. Analytical Model for Sliding Window Setting
- The instability of P2P systems rules out complicated optimization techniques that require several hours of computation on workstations (e.g. [28]) or information about every node in the entire system (e.g. [31]).
- BCoM adjusts the sliding window size promptly in response to dynamic P2P system conditions, relying only on limited information.
- This section presents the analytical model of the sliding window size k_i of object i, where the update propagation to all replica nodes is modeled by a queueing system.
- The authors first analyze the queueing behavior when an update is discarded, then calculate the update discard probability and the expected latency for a replica node to receive an update. Finally, they set k_i to balance availability and latency under the consistency bounds.
B. Availability and Latency Computation
- Define the update request intensity as ρ:
  ρ = λ µL−1 (1)
  Define the probability of n updates in the queue as π_n.
- Based on queueing theory for the M/M/1 finite queue [6], π_n is given by Eq. 2:
  π_n = ρ^n π_0 (2)
  The discard probability is π_{L·k_i}, the probability that the queue is full, which indicates buffer overflow.
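To make the discard probability concrete, the following Python sketch evaluates the state probabilities of an M/M/1 queue with finite capacity. Two points are assumptions rather than statements from the summary: the normalization that fixes π_0 (all probabilities must sum to 1) is the standard textbook result for this queue, and the capacity is taken to be L·k_i, read off from the index of the discard probability. The intensity ρ is passed in directly, and the function names are hypothetical.

```python
from typing import List


def state_probabilities(rho: float, capacity: int) -> List[float]:
    """Probabilities of n = 0..capacity updates in an M/M/1 finite queue."""
    if rho <= 0 or capacity < 0:
        raise ValueError("rho must be positive and capacity non-negative")
    # pi_n is proportional to rho**n; normalizing by the sum fixes pi_0.
    weights = [rho ** n for n in range(capacity + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def discard_probability(rho: float, tree_height: int, window_size: int) -> float:
    """Probability that the queue is full, assuming capacity L * k_i."""
    capacity = tree_height * window_size
    return state_probabilities(rho, capacity)[-1]


if __name__ == "__main__":
    for k in (1, 5, 20):
        print(k, discard_probability(rho=0.9, tree_height=3, window_size=k))
```

Under these assumptions a larger window lowers the probability that an arriving update finds the queue full, which is the availability side of the trade-off the window size setting has to balance against latency.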
C. Window Size Setting
- The effectiveness of a consistency protocol is measured by three attributes: consistency strictness, object availability and latency for receiving an update. The three are in subtle tension with one another.
- It is hard to accurately model the delay for an update to be received by each replica node, since besides the queueing delay at each node, the dynamic node joining and leaving cause disturbance on the update propagation process.
- In their simulation, empirically setting Ts to 1.3 achieves the good results shown in Fig. 6 and Fig. 7: the discard probability improves from almost 100% to 5%, at the cost of a latency increase of less than one third most of the time.
- The authors extend the P2PSim tool [1] to simulate the heterogeneous node capacities and transmission latency.
A. Simulation Setting
- The authors simulate a network of 1000 nodes because anything larger cannot be executed stably in P2PSim.
- Given that transmitting one update uses only 10 to 100 slots, the number of time slots covered in a simulation cycle (i.e., 7.2 × 10^6) is large enough to generate sustainable results.
- Network topology is simulated by two transit-stub topologies generated by GT ITM [9] to model dense and sparse networks: (1) ts1ksmall: 2 transit domains, each with 4 transit nodes, 4 stub domains attached to each transit node, and 31 nodes in each stub domain.
- The node degree is set to 5, since the average Gnutella node degree is 3 to 5.
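As a quick consistency check, the ts1ksmall parameters quoted above can be multiplied out in Python. The variable names are illustrative, and the sketch assumes the usual transit-stub reading in which transit nodes count toward the total alongside the nodes in the stub domains.

```python
# Parameters of the ts1ksmall topology as quoted in the text.
transit_domains = 2
transit_nodes_per_domain = 4
stub_domains_per_transit_node = 4
nodes_per_stub_domain = 31

# Transit nodes across all transit domains.
transit_nodes = transit_domains * transit_nodes_per_domain
# Stub nodes: every transit node carries several stub domains.
stub_nodes = transit_nodes * stub_domains_per_transit_node * nodes_per_stub_domain
total_nodes = transit_nodes + stub_nodes

print(transit_nodes, stub_nodes, total_nodes)  # prints: 8 992 1000
```

The total agrees with the 1000-node network size stated for the simulation.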
B. Efficiency of the Window Size
- This simulation explores the efficiency of applying sliding window protocol.
- The curves in Fig. 4 and Fig. 5 show that increasing the window size from 1 to 20 drops the discard rate from 80% to around 5% while increasing the latency by only 20%, which confirms that BCoM significantly improves availability with a slight sacrifice in latency compared to sequential consistency.
C. Scalability of BCoM
- This simulation verifies the scalability of BCoM with comparison to SCOPE by varying the number of replica nodes and the update rate of each object.
- The sliding window protocol and the adaptive window size setting contribute to good availability maintenance under dynamic system conditions.
- However, the increase stays within one third of the latency of SCOPE, which matches the latency increase bound used in the window size setting to improve the discard rate.
- The results of Fig.9 show that the latency of BCoM is similar to that of SCOPE when update rates are low, and longer than SCOPE when update rates are high.
- Such good balance confirms the objectives in the analytical model of the window size setting.
E. Fault Tolerance of BCoM
- This simulation examines BCoM’s robustness against node failures by varying the node mean life time.
- The smaller the life time is, the more frequently the nodes join and leave.
- The results of SCOPE are not presented because its discard rate is nearly 100% when nodes are joining or leaving.
- The results of Fig.11, Fig.12, and Fig.13 show that BCoM keeps the tree depth, the discard rate and the latency in good status for different frequencies of node joining and leaving.
- And adaptive window size setting keeps the availability and latency performance stable.
A. Consistency Maintenance in P2P systems
- In structured P2P systems, strong consistency is provided by organizing replica nodes into an auxiliary structure on top of the overlay for update propagation, like the tree structure in SCOPE [24], the two-tiered structure in OceanStore [16], and a hybrid of tree and two-tiered structure in [29].
- The tree constructions in [24] [29] follow the node ID partitioning. In contrast, dDT inserts each new node into the smallest subtree to keep the tree balanced under dynamic node joining and leaving.
- The impact of churn rate on discard rate needs to check validity with the source to serve the following read requests.
B. Overlay Content Distribution
- Update delivery in P2P overlay has four requirements: (1) a bounded delay for update delivery, (2) robustness to frequent node churns and update workload changes, (3) awareness of heterogeneous peer capacities, and (4) scalability with a large number of peers.
- The major difference is that LagOver improves the performance to meet the individual replica node’s requirement, while node migration improves performance system-wide.
- The “side link” is used in content dissemination tree in [11] to address (2), where each node keeps multiple side links from other subtrees to minimize the impact of loss multiplication in a tree structure.
- The authors' ancestor cache achieves the same goal by only caching ancestors and contacting the ancestor one layer above the failed nodes.
- Besides, in BCoM a node contacts the cached ancestors sequentially to avoid conflicting relocation decisions, while in [11] a node uses multiple side links in parallel to retrieve the lost packets; the two mechanisms serve different aims.
C. Tunable Consistency Models
- Previous works have explored continuous models for consistency maintenance [17] [27], which have been extended by a composable consistency model [23] for P2P applications.
- Hybrid push and pull methods are also used to provide application tailored cache consistency [32] [25].
- In BCoM, by contrast, updates are serialized to eliminate update conflicts and potential cascading effects.
References
"A Balanced Consistency Maintenance ..." refers background in this paper
...Examples are the tree structure in SCOPE [2], two-tiered structure in OceanStore [8], and a hybrid of tree and two-tiered structure in [9]....
"A Balanced Consistency Maintenance ..." refers methods in this paper
...[17] ZHAO, B., HUANG, L., STRIBLING, J., RHEA, S., JOSEPH, A., AND KUBIATOWICZ, J. Tapestry: a resilient global-scale overlay for service deployment....
...While BCoM can be applied to every type of structured P2P system, we choose Tapestry [17] as a representative network for simulations....
"A Balanced Consistency Maintenance ..." refers methods in this paper
...Network topology is simulated by two transit-stub topologies generated by GT ITM [4] to model dense and sparse networks....