Main memory caching of Web documents

doi:10.1016/0169-7552(96)00035-9

Journal Article

- Vol. 28, pp 893-905

TL;DR

It is shown that even a small amount of main memory used as a document cache is enough to hold more than 60% of the documents requested, and that traditional file system cache management methods are inappropriate for managing Main Memory Web caches.

Abstract:

An increasing amount of information is currently becoming available through World Wide Web servers. Document requests to popular Web servers arrive every few tens of milliseconds at peak rate. To reduce the overhead imposed by frequent document requests, we propose the notion of caching a World Wide Web server's documents in its main memory (which we call Main Memory Web Caching). We show that even a small amount of main memory (512 Kbytes) used as a document cache is enough to hold more than 60% of the documents requested. We also show that traditional file system cache management methods are inappropriate for managing Main Memory Web caches, and may result in poor performance. Based on trace-driven simulations of several server traces, we quantify our claims and propose a new cache management policy that dynamically adjusts itself to the clients' request pattern and cache size. We show that our policy is robust over a variety of parameters and results in better overall performance.
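As an illustration of what a trace-driven simulation of a main memory document cache might look like, the sketch below replays a request trace against a byte-bounded cache and reports the document hit rate and byte hit rate. It is a minimal sketch under stated assumptions, not the policy proposed in the paper: the replacement rule is plain LRU, used here only as a familiar baseline, and the trace format (pairs of document id and size in bytes) and the function name `simulate_lru` are hypothetical.

```python
from collections import OrderedDict


def simulate_lru(trace, capacity_bytes):
    """Replay (doc_id, size_bytes) requests against a byte-bounded LRU cache.

    Assumption: LRU is a stand-in baseline, not the paper's policy.
    Returns (hit_rate, byte_hit_rate).
    """
    cache = OrderedDict()  # doc_id -> size, least recently used first
    used = 0               # bytes currently held in the cache
    hits = hit_bytes = total_bytes = 0

    for doc_id, size in trace:
        total_bytes += size
        if doc_id in cache:
            # Hit: count it and mark the document as most recently used.
            hits += 1
            hit_bytes += size
            cache.move_to_end(doc_id)
            continue
        if size > capacity_bytes:
            # The document can never fit; serve it without caching.
            continue
        # Miss: evict least recently used documents until the new one fits.
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)
            used -= evicted_size
        cache[doc_id] = size
        used += size

    requests = len(trace)
    hit_rate = hits / requests if requests else 0.0
    byte_hit_rate = hit_bytes / total_bytes if total_bytes else 0.0
    return hit_rate, byte_hit_rate
```

Running the function over the same trace with several values of `capacity_bytes` (for example 512 * 1024) gives a hit-rate curve against cache size, which is the kind of measurement the abstract refers to.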

Citations

Journal Article

Internet Web servers: workload characterization and performance implications

Martin Arlitt, +1 more

- 01 Oct 1997 -

IEEE ACM Transactions on Networking

TL;DR: The paper concludes with a discussion of caching and performance issues, using the observed workload characteristics to suggest performance enhancements that seem promising for Internet Web servers.

Proceedings Article

Characterizing reference locality in the WWW

Virgilio Almeida, +3 more

TL;DR: The authors propose models for both temporal and spatial locality of reference in streams of requests arriving at Web servers. They show that temporal locality can be characterized by the marginal distribution of the stack distance trace, propose models for typical distributions, and compare their cache performance to the traces.

Journal Article

Changes in Web client access patterns: Characteristics and caching implications

Paul Barford, +3 more

- 15 Jan 1999 -

World Wide Web

TL;DR: This study compares two measurements of Web client workloads separated in time by three years, both captured from the same computing facility at Boston University. It finds that, for the computing facility represented by traces between 1995 and 1998, the benefits of using size‐based caching policies have diminished and the potential for caching requested files in the network has declined.

Journal Article

Caching on the World Wide Web

Charu C. Aggarwal, +2 more

- 01 Jan 1999 -

IEEE Transactions on Knowledge and Data ...

TL;DR: This paper gives an overview of caching policies designed specifically for Web objects and presents a new algorithm that can be regarded as a generalization of the standard LRU algorithm.

Journal Article

Size-based scheduling to improve web performance

Mor Harchol-Balter, +3 more

- 01 May 2003 -

ACM Transactions on Computer Systems

TL;DR: The paper proposes a method for improving the performance of web servers servicing static HTTP requests. The idea is to give preference to requests for small files or requests with short remaining file size, in accordance with the SRPT (Shortest Remaining Processing Time) scheduling policy.

References

Report

A hierarchical internet object cache

Anawat Chankhunthod, +4 more

TL;DR: The design and performance of a hierarchical proxy-cache designed to make Internet information systems scale better are discussed, and performance measurements indicate that hierarchy does not measurably increase access latency.

Journal Article

Caching in the Sprite network file system

Michael N. Nelson, +2 more

- 01 Feb 1988 -

ACM Transactions on Computer Systems

TL;DR: The Sprite network operating system uses large main-memory disk block caches to achieve high performance in its file system, and provides non-write-through file caching on both client and server machines.

Journal Article

Information retrieval in the World-Wide Web: making client-based searching feasible

P.M.E. De Bra, +1 more

TL;DR: The paper highlights the properties and implementation of a client-based search tool called the “fish-search” algorithm, compares it to other approaches, and shows how combining the fish search with a cache greatly reduces the drawbacks of client-based searching.

Proceedings Article

Application-level document caching in the Internet

Azer Bestavros, +5 more

TL;DR: The results suggest that distinguishing between documents produced locally and those produced remotely can provide useful leverage in designing caching policies, because of differences in the potential for sharing these two document types among multiple users.

Proceedings Article

Demand-based document dissemination to reduce traffic and balance load in distributed information systems

Azer Bestavros

TL;DR: This work proposes a hierarchical demand-based replication strategy that optimally disseminates information from its producer to servers that are closer to its consumers, and shows that by disseminating the most popular documents on servers closer to clients, network traffic could be reduced considerably, while servers are load-balanced.

Related Papers (5)

Caching Proxies: Limitations and Potentials

Removal policies in network caches for World-Wide Web documents

Cost-aware WWW proxy caching algorithms

A hierarchical internet object cache

Web server workload characterization: the search for invariants