Journal ArticleDOI

Main memory caching of Web documents

TLDR
It is shown that even a small amount of main memory used as a document cache is enough to hold more than 60% of the documents requested, and that traditional file system cache management methods are inappropriate for managing main memory Web caches.
Abstract
An increasing amount of information is becoming available through World Wide Web servers. Document requests to popular Web servers arrive every few tens of milliseconds at peak rate. To reduce the overhead imposed by frequent document requests, we propose caching a World Wide Web server's documents in its main memory (which we call Main Memory Web Caching). We show that even a small amount of main memory (512 Kbytes) used as a document cache is enough to hold more than 60% of the documents requested. We also show that traditional file system cache management methods are inappropriate for managing Main Memory Web caches and may result in poor performance. We quantify these claims with trace-driven simulations of several server traces, and propose a new cache management policy that dynamically adjusts itself to the clients' request pattern and the cache size. We show that our policy is robust over a variety of parameters and results in better overall performance.
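To make the idea of a trace-driven simulation concrete, the following is a minimal sketch, not the authors' simulator or their adaptive policy. It replays a request trace against a byte-bounded cache managed with plain LRU, which stands in here only as a familiar baseline of the traditional kind the abstract argues against. The function name `simulate_lru` and the trace format of `(doc_id, size_bytes)` pairs are assumptions made for illustration.

```python
from collections import OrderedDict


def simulate_lru(trace, capacity_bytes):
    """Replay (doc_id, size_bytes) requests against a byte-bounded LRU cache.

    Returns the fraction of requests served from the cache (document hit rate).
    This is an illustrative baseline, not the policy proposed in the paper.
    """
    cache = OrderedDict()  # doc_id -> size in bytes, least recently used first
    used = 0               # bytes currently held in the cache
    hits = 0
    total = 0
    for doc_id, size in trace:
        total += 1
        if doc_id in cache:
            hits += 1
            cache.move_to_end(doc_id)  # mark as most recently used
            continue
        if size > capacity_bytes:
            continue  # document can never fit, so serve it without caching
        # Evict least recently used documents until the new one fits.
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)
            used -= evicted_size
        cache[doc_id] = size
        used += size
    return hits / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical five-request trace; sizes are in bytes.
    demo_trace = [("a", 100), ("b", 100), ("a", 100), ("c", 100), ("b", 100)]
    print(simulate_lru(demo_trace, capacity_bytes=200))
```

Running the same trace at several values of `capacity_bytes` gives a hit-rate-versus-cache-size curve, which is the kind of measurement a study like this would rely on. Swapping the eviction loop for a different rule is how alternative policies could be compared under the same trace.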


Citations
Journal ArticleDOI

Internet Web servers: workload characterization and performance implications

TL;DR: The paper concludes with a discussion of caching and performance issues, using the observed workload characteristics to suggest performance enhancements that seem promising for Internet Web servers.
Proceedings ArticleDOI

Characterizing reference locality in the WWW

TL;DR: The authors propose models for both temporal and spatial locality of reference in streams of requests arriving at Web servers, show that temporal locality can be characterized by the marginal distribution of the stack distance trace, and propose models for typical distributions whose cache performance they compare to that of the traces.
Journal ArticleDOI

Changes in Web client access patterns: Characteristics and caching implications

TL;DR: This study compares two measurements of Web client workloads separated in time by three years, both captured from the same computing facility at Boston University and finds that for the computing facility represented by traces between 1995 and 1998, the benefits of using size‐based caching policies have diminished and the potential for caching requested files in the network has declined.
Journal ArticleDOI

Caching on the World Wide Web

TL;DR: This paper gives an overview of caching policies designed specifically for Web objects and provides a new algorithm of their own, regarded as a generalization of the standard LRU algorithm.
Journal ArticleDOI

Size-based scheduling to improve web performance

TL;DR: A method for improving the performance of web servers servicing static HTTP requests by giving preference to requests for small files or requests with short remaining file size, in accordance with the SRPT (Shortest Remaining Processing Time) scheduling policy.
References
ReportDOI

A hierarchical internet object cache

TL;DR: The design and performance of a hierarchical proxy-cache designed to make Internet information systems scale better are discussed, and performance measurements indicate that hierarchy does not measurably increase access latency.
Journal ArticleDOI

Caching in the Sprite network file system

TL;DR: The Sprite network operating system uses large main memory disk block caches to achieve high performance in its file system and provides non-write-through file caching on both client and server machines.
Journal ArticleDOI

Information retrieval in the World-Wide Web: making client-based searching feasible

TL;DR: The paper shows how combining the fish search with a cache greatly reduces these problems and highlights the properties and implementation of a client-based search tool called the “fish-search” algorithm, and compares it to other approaches.
Proceedings ArticleDOI

Application-level document caching in the Internet

TL;DR: The results suggest that distinguishing between documents produced locally and those produced remotely can provide useful leverage in designing caching policies, because of differences in the potential for sharing these two document types among multiple users.
Proceedings ArticleDOI

Demand-based document dissemination to reduce traffic and balance load in distributed information systems

TL;DR: This work proposes a hierarchical demand-based replication strategy that optimally disseminates information from its producer to servers that are closer to its consumers, and shows that by disseminating the most popular documents on servers closer to clients, network traffic could be reduced considerably, while servers are load-balanced.