Distributed Maintenance of Cache Freshness in Opportunistic Mobile Networks
Summary (5 min read)
Introduction
- In recent years, personal hand-held mobile devices such as smartphones have become capable of storing, processing and displaying various types of digital media content, including news, music, pictures and video clips.
- In these networks, it is generally difficult to maintain end-to-end communication links among mobile users.
- There is only limited research on maintaining the freshness of cached data in the network, despite the fact that media contents may be refreshed periodically.
- The authors' basic idea is to organize the caching nodes as a tree structure during data access, and let each caching node be responsible for refreshing the data cached at its children in a distributed and hierarchical manner.
A. Models
- 1) Network Model: Opportunistic contacts among nodes are described by a network contact graph 𝐺(𝑉,𝐸), where the contact process between a node pair 𝑖, 𝑗 ∈ 𝑉 is modeled as an edge 𝑒𝑖𝑗 ∈ 𝐸. Similar to previous work [1], [34], the authors consider the pairwise node inter-contact time to be exponentially distributed.
- There are cases where an application might have specific requirements on Δ and 𝑝 to achieve sufficient levels of data freshness.
- Letting 𝑢𝑖𝑗 denote the update of data from version 𝑖 to version 𝑗, the authors assume that any caching node is able to refresh the cached data as 𝑑𝑖⊗𝑢𝑖𝑗 → 𝑑𝑗 , where 𝑑𝑖 and 𝑑𝑗 denote the data with version 𝑖 and 𝑗, respectively.
- 𝑑𝑗 cannot be refreshed to 𝑑𝑘 by 𝑢𝑖𝑘 even if 𝑗 > 𝑖.
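The version-specific update rule above can be illustrated with a minimal Python sketch. The names `Data`, `Update` and `apply_update` are hypothetical and not taken from the paper; the only behavior modeled is that an update 𝑢𝑖𝑗 applies to data of version 𝑖 and to nothing else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Data:
    version: int


@dataclass(frozen=True)
class Update:
    src: int  # version the update was computed from (i in u_ij)
    dst: int  # version the update produces (j in u_ij)


def apply_update(data: Data, update: Update) -> Data:
    """Return d_j from d_i and u_ij; reject updates built for another version."""
    if update.src != data.version:
        raise ValueError(
            f"u_{update.src}{update.dst} cannot refresh d_{data.version}"
        )
    return Data(version=update.dst)


d1 = Data(version=1)
d3 = apply_update(d1, Update(src=1, dst=3))  # d1 combined with u13 gives d3

try:
    # d2 cannot be refreshed by u14, even though 2 > 1.
    apply_update(Data(version=2), Update(src=1, dst=4))
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```

In this sketch a mismatched update raises an error; a real caching node would presumably fall back to requesting the complete data instead.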
B. Caching Scenario
- Mobile nodes share data generated by themselves or obtained from the Internet.
- Each cached data item is associated with a finite lifetime and is automatically removed from cache when it expires.
- In practice, when multiple data items with varied popularity compete for the limited buffer of caching nodes, more popular data is prioritized to ensure that the cumulative data access delay is minimized.
- After having its query satisfied by 𝑆, 𝐴 may lose its connection with 𝑆 due to mobility, and hence 𝐴 is unaware of the data cached at nodes 𝐵, 𝐷 and 𝐸.
C. Basic Idea
- The authors' basic idea for maintaining cache freshness is to refresh the cached data in a distributed and hierarchical manner.
- In particular, the topology of the Data Access Tree (DAT) may change due to the expiration of cached data.
- When node 𝐴 contacts node 𝐷 at time 𝑡6, 𝐴 updates the data cached at 𝐷 from 𝑑1 to 𝑑3.
- Instead, 𝐴 has to transmit the complete data 𝑑3 to 𝐷. Note that the update 𝑢13 can only be calculated using 𝑑1 and 𝑑3.
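The hierarchical idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the class `CachingNode`, its `updates` dictionary and the labels "intentional", "opportunistic", "update" and "full" are hypothetical names, and the sketch ignores contact timing, data expiration and utilities.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(eq=False)  # compare nodes by identity, not by field values
class CachingNode:
    name: str
    version: int
    children: List["CachingNode"] = field(default_factory=list)
    # Updates this node carries, keyed by (from_version, to_version).
    updates: Dict[Tuple[int, int], bytes] = field(default_factory=dict)

    def on_contact(self, other: "CachingNode") -> Optional[Tuple[str, str]]:
        """Refresh `other` if it caches an older version.

        Returns (kind, payload): kind says whether `other` is a child in the
        tree, payload says whether a small update or the full data was sent.
        Returns None when `other` is already up to date.
        """
        if other.version >= self.version:
            return None
        is_child = any(c is other for c in self.children)
        kind = "intentional" if is_child else "opportunistic"
        has_delta = (other.version, self.version) in self.updates
        payload = "update" if has_delta else "full"
        other.version = self.version
        return kind, payload


a = CachingNode("A", version=3, updates={(2, 3): b"u23"})
b = CachingNode("B", version=2)
d = CachingNode("D", version=1)
a.children.append(b)  # B is A's child in the tree; D is not

first = a.on_contact(b)   # A carries u23, so a small update suffices
second = a.on_contact(d)  # no u13 available, so the full data is sent
third = a.on_contact(b)   # B is already at version 3
```

The sketch makes the cost asymmetry visible: a parent that carries the matching update sends only the delta, while a node without it must send the complete data.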
IV. REFRESHING PATTERNS OF WEB CONTENTS
- The authors investigate the refreshing patterns of realistic web contents, as well as their temporal variations during different time periods in a day.
- These patterns highlight the homogeneity of data refreshing behaviors among different data sources and categories, and suggest appropriate calculation of utilities of data updates for refreshing cached data.
B. Distribution of Inter-Refreshing Time
- The authors provide both empirical and analytical evidence of a dichotomy in the Complementary Cumulative Distribution Function (CCDF) of the inter-refreshing time.
- 1) Aggregate distribution: Figure 4 shows the aggregate CCDF of inter-refreshing time for all the RSS feeds, in log-log scale.
- For the remaining 10% of inter-refreshing times with values larger than the boundary, the CCDF values exhibit linear decay, which suggests a power-law tail.
- A similar test is performed on the inter-contact times with larger values for the generalized Pareto distribution.
- The significance levels (𝛼) for these null hypotheses being accepted are listed in Table II.
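As an illustration of the dichotomy described above, the following Python sketch computes an empirical CCDF and evaluates a two-piece model with exponential decay below a boundary and a power-law tail above it. The function names, parameters and sample values are hypothetical, and the plain power-law tail is a simplification rather than the generalized Pareto fit used in the paper.

```python
import math
from typing import List, Sequence, Tuple


def empirical_ccdf(samples: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (x, P(X > x)) pairs, one per distinct sample value."""
    ordered = sorted(samples)
    n = len(ordered)
    points = []
    for idx, x in enumerate(ordered):
        # Emit a point only at the last occurrence of each distinct value.
        if idx + 1 == n or ordered[idx + 1] != x:
            points.append((x, (n - idx - 1) / n))
    return points


def dichotomy_ccdf(x: float, rate: float, boundary: float, alpha: float) -> float:
    """Exponential decay up to `boundary`, power-law tail beyond it.

    The tail is scaled so that both pieces agree at the boundary.
    """
    if x <= boundary:
        return math.exp(-rate * x)
    return math.exp(-rate * boundary) * (x / boundary) ** (-alpha)


# Hypothetical inter-refreshing times, in minutes.
ccdf_points = empirical_ccdf([1.0, 2.0, 2.0, 4.0])
tail_value = dichotomy_ccdf(20.0, rate=0.1, boundary=10.0, alpha=1.0)
```

Plotting `ccdf_points` on log-log axes is the usual way to spot the straight-line tail that suggests power-law behavior.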
C. Temporal Variations
- Section IV-C shows that the refreshing patterns of web RSS data are temporally skewed, such that the majority of data updates are generated during specific time periods of a day.
- The authors evaluate such temporal variation on the DieselNet trace.
- In general, the temporal skewness can be found in all three evaluation metrics, and is determined by the temporal distributions of both node contacts and data updates available during different hours in a day.
- As shown in Figure 14(a), the refreshing ratio during the time period between 8AM and 4PM is generally higher than the average refreshing ratio, because the majority of node contacts are generated during this time period according to [15].
- In summary, the authors conclude that the transient performance of maintaining cache freshness differs a lot from the cumulative maintenance performance, and cache freshness can be further improved by appropriately exploiting the temporal variations of data refreshing pattern and node contact process.
A. Utility of Data Updates
- In practice, the requirement of cache freshness may not be satisfied due to the nodes' limited contact capability.
- When a node 𝐵 in the DAT maintains the data update for its child 𝐷, it calculates the utility of this update, which is equal to the probability that this update carried by 𝐵 satisfies the freshness requirement for data cached at 𝐷.
- According to Eq. (3), the utility should be calculated following Eq. (4) when the value of 𝑡−𝑡0−Δ is small.
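To give a feel for what such a utility measures, here is a deliberately simplified Python sketch. It is not the paper's Eq. (3) or Eq. (4), which are not reproduced in this summary. It assumes only the exponential inter-contact model, and it assumes that 𝑡0 is the time the new version was generated and that the data must reach 𝐷 by 𝑡0 + Δ. The function names and the rate parameter are hypothetical.

```python
import math


def contact_probability(rate: float, window: float) -> float:
    """P(at least one contact within `window`) for exponential inter-contact times."""
    if window <= 0:
        return 0.0
    return 1.0 - math.exp(-rate * window)


def simplified_utility(rate_bd: float, t: float, t0: float, delta: float) -> float:
    """Chance that B meets D before the assumed deadline t0 + delta."""
    return contact_probability(rate_bd, (t0 + delta) - t)


# Hypothetical numbers: 0.5 contacts per hour, deadline 1.5 h after t0.
early = simplified_utility(rate_bd=0.5, t=0.5, t0=0.0, delta=1.5)
late = simplified_utility(rate_bd=0.5, t=2.0, t0=0.0, delta=1.5)
```

Under these assumptions the utility shrinks as the current time approaches the deadline and drops to zero once the deadline has passed.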
B. Opportunistic Replication of Data Updates
- If a node in the DAT finds out that the utility of the data update it carries is lower than the required probability 𝑝 for maintaining cache freshness, it opportunistically replicates the data update to other nodes outside of the DAT.
- Such a replication process is illustrated in Figure 8.
- When a node in the DAT contacts a node 𝑅𝑘 outside of the DAT, it determines whether to replicate the data update for refreshing 𝐵 to 𝑅𝑘.
- The replication stops when the utilities of the data update at the 𝑘 selected relays satisfy $1 - \prod_{i=0}^{k} (1 - U_{R_i}) \ge p$ (7), i.e., the probability that the requirement of cache freshness at 𝐵 is satisfied by at least one relay is equal to or larger than 𝑝.
- Note that the selected relays are only able to refresh the specific data cached in the DAT, but are unable to provide data access to other nodes outside of the DAT.
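The stopping condition in Eq. (7) can be sketched in Python as below. Two points are assumptions made for illustration: index 0 is taken to be the node that already carries the update, and relays are accepted in contact order without any filtering. The function names are hypothetical.

```python
from typing import Iterable, List


def combined_success(utilities: Iterable[float]) -> float:
    """1 - prod(1 - U): chance that at least one carrier refreshes in time."""
    miss = 1.0
    for u in utilities:
        miss *= 1.0 - u
    return 1.0 - miss


def replicate_until_satisfied(
    own_utility: float, relay_utilities: Iterable[float], p: float
) -> List[float]:
    """Accept relays in contact order until Eq. (7) holds.

    Returns the utilities of the relays that received a replica.
    """
    selected: List[float] = []
    for u in relay_utilities:
        if combined_success([own_utility, *selected]) >= p:
            break  # requirement already met, stop replicating
        selected.append(u)
    return selected


# Hypothetical utilities for the carrier and three relays it meets in turn.
chosen = replicate_until_satisfied(0.3, [0.2, 0.4, 0.5], p=0.6)
achieved = combined_success([0.3, *chosen])
```

With these numbers the first two relays are enough, so the third contact receives no replica.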
VI. OPPORTUNISTIC REFRESHING
- In addition to intentionally refreshing data cached at its children in the DAT, a node also refreshes other cached data with older versions whenever possible upon opportunistic contacts.
- The authors propose a probabilistic approach to efficiently make cache refreshing decisions and optimize the tradeoff between cache freshness and network transmission overhead.
A. Probabilistic Decision
- Opportunistic refreshing is generally more expensive because the complete data usually needs to be transmitted, and its size is much larger than that of a data update.
- As a result, it is important to make appropriate decisions on opportunistic refreshing, so as to optimize the tradeoff between cache freshness and network transmission overhead, and to avoid inefficient consumption of network resources.
- The authors propose a probabilistic approach to efficiently refresh the cached data, in which the data is only refreshed if its required freshness cannot be satisfied by intentional refreshing.
- Hence, 𝑈𝐵𝐷(𝑡𝐶) can be calculated by 𝐷 and is available to 𝐴 when 𝐴 contacts 𝐷. Since additional relays may be used for delivering data updates in intentional refreshing as described in Section V-B, the utility 𝑈𝐵𝐷(𝑡𝐶) calculated by 𝐷 essentially provides a lower bound on the actual effectiveness of intentional refreshing.
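One way to turn the utility into a probabilistic decision is sketched below in Python. The rule used here, refreshing with probability 1 − 𝑈𝐵𝐷(𝑡𝐶), is an assumption made for illustration; the summary does not state the exact rule. The function name is hypothetical.

```python
import random


def should_refresh_opportunistically(u_bd: float, rng: random.Random) -> bool:
    """Refresh with probability 1 - U_BD(t_C) (an assumed decision rule).

    A high utility means intentional refreshing is likely to succeed,
    so the expensive full-data transfer is rarely worth it.
    """
    return rng.random() < 1.0 - u_bd


rng = random.Random(42)  # seeded only to make the sketch repeatable
trials = 10_000
share = sum(should_refresh_opportunistically(0.8, rng) for _ in range(trials)) / trials
```

Under this assumed rule, `share` estimates how often a contact would trigger a full-data transfer when the utility of intentional refreshing is 0.8.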
B. Side-Effect of Opportunistic Refreshing
- Due to possible version inconsistency among different data copies cached in the DAT, opportunistic refreshing may have some side-effects on cache freshness.
- Such a side-effect is illustrated in Figure 9.
- When 𝐴 opportunistically contacts node 𝐷 and refreshes 𝐷’s cached data from 𝑑1 to 𝑑3, it is unaware of the data cached at 𝐵 with a newer version 𝑑4.
VII. PERFORMANCE EVALUATIONS
- The authors compare the performance of their proposed cache refreshing scheme with the following schemes:
  - Passive Refreshing: a caching node only refreshes data cached at another node upon contact. It differs from the authors' opportunistic refreshing scheme in Section VI in that it does not consider the tradeoff between cache freshness and network transmission overhead.
  - Active Refreshing: every time the source updates data, it actively disseminates the data update to the whole network.
- The following metrics are used for evaluations.
- Each simulation is repeated multiple times with random data sources and user queries for statistical convergence.
A. Simulation Setup
- The authors' evaluations are conducted on two realistic opportunistic mobile network traces, which record contacts among users carrying Bluetooth-enabled mobile devices.
- These devices periodically detect their peers nearby, and a contact is recorded when two devices move close to each other.
- The datasets described in Section IV are used to simulate the data being cached in the network, as well as the inter-refreshing time of data.
- Since the pairwise node contact frequency is generally lower than the data refreshing frequency, the authors pick the 4 RSS feeds listed in Table I with an average inter-refreshing time longer than 0.5 hours for their evaluations.
- Every time period 𝑇, each node requests data 𝑗 with probability 𝑃𝑗.
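The query generation step can be sketched in Python as follows. The paper states that the query pattern follows a Zipf distribution; the normalized form of 𝑃𝑗 used here, the exponent value and the function names are assumptions made for illustration.

```python
import random
from typing import List, Sequence


def zipf_probabilities(num_items: int, exponent: float) -> List[float]:
    """P_j proportional to 1 / j**exponent, normalized to sum to 1."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def queries_in_period(probabilities: Sequence[float], rng: random.Random) -> List[int]:
    """Items (1-based) a node requests in one period T.

    Each item j is requested independently with probability P_j.
    """
    return [j for j, p in enumerate(probabilities, start=1) if rng.random() < p]


probs = zipf_probabilities(num_items=4, exponent=1.0)  # matches the 4 RSS feeds
requested = queries_in_period(probs, random.Random(7))
```

Lower-ranked items receive smaller probabilities, so most simulated queries concentrate on the most popular data.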
B. Performance of Maintaining Cache Freshness
- The authors first compare the performance of their proposed hierarchical refreshing scheme with other schemes by varying the lifetime (𝐿) of the cached data.
- The evaluation results are shown in Figure 11.
- Active Refreshing outperforms their scheme by 10%-15%, but Figure 11(c) shows that such performance is achieved at the cost of much higher refreshing overhead.
- The parameter values are set by default as Δ = 1.5 hours and 𝑝 = 60%, and are varied during different simulations.
- As described in Section V-B, increasing 𝑝 stimulates the caching nodes to replicate data updates, and hence increases the refreshing overhead as shown in Figure 13(b).
VIII. CONCLUSION
- The authors focus on maintaining the freshness of cached data in opportunistic mobile networks.
- The authors' basic idea is to let each caching node be responsible only for refreshing a specific set of caching nodes, so as to maintain cache freshness in a distributed and hierarchical manner.
- Based on the experimental investigation results on the refreshing patterns of real websites, the authors probabilistically replicate data updates, and analytically ensure that the freshness requirements of cached data are satisfied.
- The performance of their proposed scheme on maintaining cache freshness is evaluated by extensive trace-driven simulations on realistic mobile traces.
Citations
2 citations
Cites methods from "Distributed Maintenance of Cache Fr..."
...Method Methods for Comparison Social Properties PodNet [73] No caching, Most solicited, Least solicited, Uniform, Inverse proportional Community Cooperative [77] Selfish Community, Friendship ContentPlace [74] [75] [76] MFV, MLN, F, P, US Community, Friendship Mixcommunity [78] Withincommunity Community, Similarity SocialCast [79] No Prediction Community, Friendship MOPS [80] Push, Pull, Neighbors Community, Closeness centrality Habit [81] Epidemic, Wait-For-Destination, Oracle Community, Friendship MF-RRWP [82] MF-ORWP, SODA, GADA Community Passive Refreshing [83] Active Refreshing, Publish/Subscribe, Hierarchical Rereshing Community...
[...]
2 citations
1 citations
Cites background from "Distributed Maintenance of Cache Fr..."
...The incredibly rapid growth of mobile users is leading to a promising future, in which they can form a mobile opportunistic network (MON) [40,90], which is also known as pocket switched network (PSNs) [52], to freely share files or forward packets between each other without the support of cellular infrastructures....
[...]
..., smartphones, laptops, and tablets), mobile opportunistic networks (MONs) [40,90] consisting of portable devices have attracted much attention recently....
[...]
1 citations
Cites background from "Distributed Maintenance of Cache Fr..."
...al.[40] propose to efficiently maintain the cache freshness by organizing the caching users as a tree structure during content access....
[...]
Cites background from "Distributed Maintenance of Cache Fr..."
...A special approach is suggested to maintain the freshness of the cache [26]....
[...]
...[26] Wei Gao, Guohong Cao, Mudhakar Srivatsa, Arun Iyengar....
[...]
References
4,945 citations
4,355 citations
"Distributed Maintenance of Cache Fr..." refers background in this paper
...Since the data source does not maintain any information regarding the caching nodes, such dissemination is generally realized via Epidemic routing [31]....
[...]
3,694 citations
"Distributed Maintenance of Cache Fr..." refers methods in this paper
...Such prioritization is generally formulated as a knapsack problem [17] and can be solved in pseudopolynomial time using a dynamic programming approach [25]....
[...]
3,582 citations
"Distributed Maintenance of Cache Fr..." refers methods in this paper
...We assume that the query pattern follows a Zipf distribution which has been widely used for modelling web data access [3]....
[...]
Frequently Asked Questions (7)
Q2. Why is data forwarded in a “carry-and-forward” manner?
Due to the intermittent network connectivity in opportunistic mobile networks, data is forwarded in a “carry-and-forward” manner.
Q3. What is the effect of opportunistic refreshing on the cache freshness of data?
Due to possible version inconsistency among different data copies cached in the DAT, opportunistic refreshing may have some side-effects on cache freshness.
Q4. How does the CCDF of the inter-refreshing time decay?
Their results show that up to a boundary on the order of several minutes, the decay of the CCDF is well approximated as exponential.
Q5. What is the effect of changing the value of p on the refreshing overhead?
Increasing 𝑝 increases the refreshing overhead, but since different values of 𝑝 do not affect the calculation of utilities of data updates, this increase is smaller than that caused by decreasing Δ.
Q6. How is the performance of the proposed scheme evaluated?
The performance of their proposed scheme on maintaining cache freshness is evaluated by extensive trace-driven simulations on realistic mobile traces.
Q7. What is the effect of reducing the refreshing delay?
From Figure 12 the authors observe that, when the value of Δ is small, the cache freshness is mainly constrained by the network contact capability, and the actual refreshing delay is much higher than the required Δ. Such inability to satisfy the cache freshness requirements leads to more replications of data updates as described in Section V-B, and makes caching nodes more prone to perform opportunistic refreshing.