Proceedings Article

Web caching and Zipf-like distributions: evidence and implications

TLDR
This paper investigates the page request distribution seen by Web proxy caches using traces from a variety of sources. It considers a simple model in which Web accesses are independent and the reference probability of documents follows a Zipf-like distribution, and the results suggest that the observed properties of hit-ratios and temporal locality are indeed inherent to Web accesses observed by proxies.
Abstract
This paper addresses two unresolved issues about Web caching. The first issue is whether Web requests from a fixed user community are distributed according to Zipf's (1929) law. The second issue relates to a number of studies on the characteristics of Web proxy traces, which have shown that the hit-ratios and temporal locality of the traces exhibit certain asymptotic properties that are uniform across the different sets of traces. In particular, the question is whether these properties are inherent to Web accesses or whether they are simply an artifact of the traces. An answer to these unresolved issues will facilitate both Web cache resource planning and cache hierarchy design. We show that the answers to the two questions are related. We first investigate the page request distribution seen by Web proxy caches using traces from a variety of sources. We find that the distribution does not follow Zipf's law precisely, but instead follows a Zipf-like distribution with the exponent varying from trace to trace. Furthermore, we find that there is only (i) a weak correlation between the access frequency of a Web page and its size and (ii) a weak correlation between access frequency and its rate of change. We then consider a simple model where the Web accesses are independent and the reference probability of the documents follows a Zipf-like distribution. We find that the model yields asymptotic behaviour that is consistent with the experimental observations, suggesting that the various observed properties of hit-ratios and temporal locality are indeed inherent to Web accesses observed by proxies. Finally, we revisit Web cache replacement algorithms and show that the algorithm that is suggested by this simple model performs best on real trace data.
The results indicate that while page requests do indeed reveal short-term correlations and other structures, a simple model for an independent request stream following a Zipf-like distribution is sufficient to capture certain asymptotic properties observed at Web proxies.
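The simple model described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' code: it takes "Zipf-like" to mean that the document of popularity rank i is requested with probability proportional to 1/i^alpha, draws requests independently, and measures the hit ratio of a hypothetical cache that simply holds the `cache_size` most popular documents. The function names, the exponent 0.8, the document count, and the cache sizes are all assumptions chosen for illustration and do not come from the paper's traces.

```python
import random
from itertools import accumulate


def zipf_like_probs(n_docs, alpha):
    """Return request probabilities P(i) proportional to 1 / i**alpha
    for popularity ranks i = 1..n_docs (index 0 is the most popular)."""
    weights = [1.0 / (i ** alpha) for i in range(1, n_docs + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def ideal_hit_ratio(probs, cache_size):
    """Expected hit ratio under independent requests when the cache holds
    the cache_size most popular documents (an assumed, idealized policy)."""
    return sum(probs[:cache_size])


def simulated_hit_ratio(probs, cache_size, n_requests, seed=0):
    """Draw n_requests independent requests and count how many fall on
    documents held by the same idealized cache."""
    rng = random.Random(seed)
    cumulative = list(accumulate(probs))
    ranks = rng.choices(range(len(probs)), cum_weights=cumulative, k=n_requests)
    hits = sum(1 for r in ranks if r < cache_size)
    return hits / n_requests


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show how the hit ratio
    # changes as the cache grows by factors of ten.
    probs = zipf_like_probs(n_docs=10_000, alpha=0.8)
    for cache_size in (10, 100, 1000):
        exact = ideal_hit_ratio(probs, cache_size)
        sampled = simulated_hit_ratio(probs, cache_size, n_requests=50_000)
        print(cache_size, round(exact, 3), round(sampled, 3))
```

Running the script prints, for each cache size, the analytically expected hit ratio next to the value measured on a simulated independent request stream. Varying `alpha` gives a rough feel for how the exponent, which the abstract reports as differing from trace to trace, affects how quickly the hit ratio grows with cache size.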


Citations
Journal Article

Improving lookup latency in distributed hash table systems using random sampling

TL;DR: This paper analytically proves that LPRS can result in lookup latencies proportional to the average unicast latency of the network, provided the underlying physical topology has a power-law latency expansion, and indicates that the Internet router-level topology resembles power-law latency expansion.
Proceedings Article

Caching-aided coded multicasting with multiple random requests

TL;DR: In this article, the authors consider the case where each user places L ≥ 1 independent requests according to the same common demand distribution and propose an achievable scheme based on random vector (packetized) caching placement and multiple groupcast index coding, shown to be order-optimal in the asymptotic regime in which the number of packets per file B goes to infinity.
Journal Article

Double Coded Caching in Ultra Dense Networks: Caching and Multicast Scheduling via Deep Reinforcement Learning

TL;DR: Numerical results demonstrate that the proposed double coded caching scheme increases the probability of the successful transmission, and the caching and scheduling policy can effectively reduce the delay and the power consumption.
Proceedings Article

Power-aware prefetch in mobile environments

TL;DR: It is shown by analysis that the proposed VAP scheme can indeed achieve the optimal performance in terms of stretch when power consumption is considered, and it is demonstrated that the algorithm significantly outperforms existing prefetching algorithms under various scenarios.
Journal Article

Adaptive SOA Solution Stack

TL;DR: This paper presents the concept of an Adaptive SOA Solution Stack (AS3), an extension of the S3 model, implemented via uniform application of the AS3 element pattern across different layers of the model.
References
Proceedings Article

Cost-aware WWW proxy caching algorithms

TL;DR: GreedyDual-Size, as discussed by the authors, incorporates locality with cost and size concerns in a simple and nonparameterized fashion for high performance, and can potentially improve the performance of main-memory caching of Web documents.
Book

Operating Systems Theory


Characteristics of WWW Client-based Traces

TL;DR: This paper presents a descriptive statistical summary of the traces of actual executions of NCSA Mosaic, and shows that many characteristics of WWW use can be modelled using power-law distributions, including the distribution of document sizes, the popularity of documents as a function of size, and the distribution of user requests for documents.
Proceedings Article

Characterizing reference locality in the WWW

TL;DR: The authors propose models for both temporal and spatial locality of reference in streams of requests arriving at Web servers. They show that temporal locality can be characterized by the marginal distribution of the stack distance trace, propose models for typical distributions, and compare their cache performance to the traces.
Journal Article

Working Sets Past and Present

TL;DR: This paper outlines the argument why it is unlikely that anyone will find a cheaper nonlookahead memory policy that delivers significantly better performance and suggests that a working set dispatcher should be considered.