Proceedings ArticleDOI

The effect of client caching on file server workloads

03 Jan 1996-Vol. 1, pp 150-159
TL;DR: The results demonstrate that client caches alter workload characteristics in a way that leaves a profound impact on server cache performance, and suggest worthwhile directions for the future development of server caching strategies.
Abstract: A distributed file system provides a file service from one or more shared file servers to a community of client workstations over a network. While the client-server paradigm has many advantages, it also presents new challenges to system designers concerning performance and reliability. As both client workstations and file servers become increasingly well-resourced, a number of system design decisions need to be re-examined. This research concerns the caching of disk-blocks in a distributed client-server environment. Some recent research has suggested that various strategies for cache management may not be equally suited to the circumstances at both the client and the server. Since any caching strategy is based on assumptions concerning the characteristics of the demand, the performance of the strategy is only as good as the accuracy of these assumptions. The performance of a caching strategy at a file server is strongly influenced by the presence of client caches, since these caches alter the characteristics of the stream of requests that reaches the server. This paper presents the results of an investigation of the effect of client caching on the nature of the server workload as a step towards understanding the performance of caching strategies at the server. The results demonstrate that client caches alter workload characteristics in a way that leaves a profound impact on server cache performance, and suggest worthwhile directions for the future development of server caching strategies.
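The filtering effect the abstract describes can be illustrated with a toy simulation: pass a block-reference trace through a small LRU client cache and observe the miss stream that would reach the server. The function name and trace below are illustrative assumptions, not from the paper.

```python
from collections import OrderedDict

def filter_through_client_cache(trace, cache_size):
    """Illustrative sketch: run a block-reference trace through an LRU
    client cache and return the miss stream forwarded to the server.
    Repeated references are absorbed by the client cache, so the server
    sees a stream with much weaker temporal locality."""
    cache = OrderedDict()      # LRU order: oldest entry first
    server_stream = []
    for block in trace:
        if block in cache:
            cache.move_to_end(block)       # client hit: server never sees it
        else:
            server_stream.append(block)    # client miss: request reaches the server
            cache[block] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used block
    return server_stream
```

With a client cache of two blocks, a trace such as [1, 2, 1, 1, 3, 2, 1] loses its repeated references to block 1 before reaching the server, which is exactly the kind of workload change the paper measures.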
Citations
Journal ArticleDOI
15 May 1996
TL;DR: This paper concludes with a discussion of caching and performance issues, using the invariants to suggest performance enhancements that seem most promising for Internet Web servers.
Abstract: The phenomenal growth in popularity of the World Wide Web (WWW, or the Web) has made WWW traffic the largest contributor to packet and byte traffic on the NSFNET backbone. This growth has triggered recent research aimed at reducing the volume of network traffic produced by Web clients and servers, by using caching, and reducing the latency for WWW users, by using improved protocols for Web interaction. Fundamental to the goal of improving WWW performance is an understanding of WWW workloads. This paper presents a workload characterization study for Internet Web servers. Six different data sets are used in this study: three from academic (i.e., university) environments, two from scientific research organizations, and one from a commercial Internet provider. These data sets represent three different orders of magnitude in server activity, and two different orders of magnitude in time duration, ranging from one week of activity to one year of activity. Throughout the study, emphasis is placed on finding workload invariants: observations that apply across all the data sets studied. Ten invariants are identified. These invariants are deemed important since they (potentially) represent universal truths for all Internet Web servers. The paper concludes with a discussion of caching and performance issues, using the invariants to suggest performance enhancements that seem most promising for Internet Web servers.

858 citations

Journal ArticleDOI
TL;DR: The paper concludes with a discussion of caching and performance issues, using the observed workload characteristics to suggest performance enhancements that seem promising for Internet Web servers.
Abstract: This paper presents a workload characterization study for Internet Web servers. Six different data sets are used in the study: three from academic environments, two from scientific research organizations, and one from a commercial Internet provider. These data sets represent three different orders of magnitude in server activity, and two different orders of magnitude in time duration, ranging from one week of activity to one year. The workload characterization focuses on the document type distribution, the document size distribution, the document referencing behavior, and the geographic distribution of server requests. Throughout the study, emphasis is placed on finding workload characteristics that are common to all the data sets studied. Ten such characteristics are identified. The paper concludes with a discussion of caching and performance issues, using the observed workload characteristics to suggest performance enhancements that seem promising for Internet Web servers.

771 citations

ReportDOI
10 Jun 2002
TL;DR: In this article, the authors explore the benefits of a simple scheme to achieve exclusive caching, in which a data block is cached at either a client or the disk array, but not both.
Abstract: Modern high-end disk arrays often have several gigabytes of cache RAM. Unfortunately, most array caches use management policies which duplicate the same data blocks at both the client and array levels of the cache hierarchy: they are inclusive. Thus, the aggregate cache behaves as if it was only as big as the larger of the client and array caches, instead of as large as the sum of the two. Inclusiveness is wasteful: cache RAM is expensive. We explore the benefits of a simple scheme to achieve exclusive caching, in which a data block is cached at either a client or the disk array, but not both. Exclusiveness helps to create the effect of a single, large unified cache. We introduce a DEMOTE operation to transfer data ejected from the client to the array, and explore its effectiveness with simulation studies. We quantify the benefits and overheads of demotions across both synthetic and real-life workloads. The results show that we can obtain useful—sometimes substantial—speedups. During our investigation, we also developed some new cache-insertion algorithms that show promise for multiclient systems, and report on some of their properties.
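The DEMOTE scheme described above can be sketched as two LRU caches in which a block ejected from the client is handed down to the array rather than discarded, keeping the pair exclusive. The class and method names below are hypothetical, not the paper's implementation.

```python
from collections import OrderedDict

class ExclusiveCachePair:
    """Sketch of DEMOTE-style exclusive caching: a block is cached at
    either the client or the disk array, never both (names illustrative)."""

    def __init__(self, client_size, array_size):
        self.client = OrderedDict()  # LRU order: oldest entry first
        self.array = OrderedDict()
        self.client_size = client_size
        self.array_size = array_size

    def read(self, block):
        if block in self.client:            # client hit
            self.client.move_to_end(block)
            return "client-hit"
        if block in self.array:             # array hit: promote, preserving exclusivity
            del self.array[block]
            self._insert_client(block)
            return "array-hit"
        self._insert_client(block)          # miss: fetch from disk into the client only
        return "miss"

    def _insert_client(self, block):
        self.client[block] = True
        if len(self.client) > self.client_size:
            victim, _ = self.client.popitem(last=False)
            self._demote(victim)            # DEMOTE: ejected block goes to the array

    def _demote(self, block):
        self.array[block] = True
        self.array.move_to_end(block)       # demoted blocks enter at the MRU end
        if len(self.array) > self.array_size:
            self.array.popitem(last=False)  # array evicts its own LRU block
```

Because a promoted block is deleted from the array and a demoted block is removed from the client, the two caches together behave like a single cache of their combined size, which is the effect the paper quantifies.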

285 citations

Book
01 Mar 2015
TL;DR: Using this book, readers will be able to analyze collected workload data and clean it if necessary, derive statistical models that include skewed marginal distributions and correlations, and consider the need for generative models and feedback from the system.
Abstract: Reliable performance evaluations require the use of representative workloads. This is no easy task since modern computer systems and their workloads are complex, with many interrelated attributes and complicated structures. Experts often use sophisticated mathematics to analyze and describe workload models, making these models difficult for practitioners to grasp. This book aims to close this gap by emphasizing the intuition and the reasoning behind the definitions and derivations related to the workload models. It provides numerous examples from real production systems, with hundreds of graphs. Using this book, readers will be able to analyze collected workload data and clean it if necessary, derive statistical models that include skewed marginal distributions and correlations, and consider the need for generative models and feedback from the system. The descriptive statistics techniques covered are also useful for other domains.

247 citations


Cites background from "The effect of client caching on fil..."

  • A case in point is the difference in locality observed by client and server caches in a distributed file system [270].

  • In addition, caching by proxies modifies the stream of requests en route [270, 265].

  • The most popular way to quantify temporal locality is by using a simulation of an LRU (least recently used) stack [474, 648, 647, 38, 270, 23].

  • For example, locality is reduced near the servers because repetitions are filtered out by caches, whereas the merging of different request streams causes interference [270].
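The LRU stack simulation mentioned above can be sketched as a small helper (illustrative, not taken from the cited works): each reference's stack distance is its current depth in an LRU-ordered stack, with first references reported as None.

```python
def lru_stack_distances(trace):
    """Compute LRU stack distances for a reference trace: the number of
    distinct items referenced since the last reference to the same item.
    First references have no distance and are recorded as None."""
    stack = []        # most recently used item at the front
    distances = []
    for ref in trace:
        if ref in stack:
            distances.append(stack.index(ref))  # depth = stack distance
            stack.remove(ref)
        else:
            distances.append(None)              # first reference
        stack.insert(0, ref)                    # move/insert to MRU position
    return distances
```

Strong temporal locality shows up as many small stack distances; a stream filtered by client caches shifts this distribution toward larger distances.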

Journal ArticleDOI
01 Jan 1997
TL;DR: Simulation results show that frequency-based caching strategies, using a variation of the Least Frequently Used (LFU) replacement policy, perform the best for the Web server workload traces considered, and thresholds and cache partitioning policies do not appear to be effective.
Abstract: Given the continued growth of the World-Wide Web, performance of Web servers is becoming increasingly important. File caching can be used to reduce the time that it takes a Web server to respond to client requests, by storing the most popular files in the main memory of the Web server, and by reducing the volume of data that must be transferred between secondary storage and the Web server. In this paper, we use trace-driven simulation to evaluate the effects of various replacement, threshold, and partitioning policies on the performance of a Web server. The workload traces for the simulations come from Web server access logs, from six different Internet Web servers. The traces represent three different orders of magnitude in server activity and two different orders of magnitude in time duration. The results from our simulation study show that frequency-based caching strategies, using a variation of the Least Frequently Used (LFU) replacement policy, perform the best for the Web server workload traces considered...
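A minimal trace-driven LFU simulation in the spirit of this study might look as follows; the function name, trace, and the exact tie-breaking behavior are illustrative assumptions, not the paper's policy.

```python
from collections import Counter

def simulate_lfu(trace, cache_size):
    """Trace-driven simulation of a simple LFU file cache: on a miss
    with a full cache, evict the cached file with the lowest reference
    count seen so far in the trace. Returns (hits, total_requests)."""
    counts = Counter()   # reference counts over the whole trace so far
    cache = set()
    hits = 0
    for f in trace:
        counts[f] += 1
        if f in cache:
            hits += 1
        else:
            if len(cache) >= cache_size:
                victim = min(cache, key=lambda x: counts[x])  # least frequently used
                cache.remove(victim)
            cache.add(f)
    return hits, len(trace)
```

Running such a simulator over real access logs, as the paper does, lets one compare hit ratios of LFU-style policies against LRU and size-threshold variants on identical workloads.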

78 citations

References
Journal ArticleDOI
TL;DR: One of the basic limitations of a digital computer is the size of its available memory; an approach that permits the programmer to use a sufficiently large address range can provide a "virtual" memory larger than the program, assuming that means are provided for automatic execution of the memory-overlay functions.
Abstract: One of the basic limitations of a digital computer is the size of its available memory. In most cases, it is neither feasible nor economical for a user to insist that every problem program fit into memory. The number of words of information in a program often exceeds the number of cells (i.e., word locations) in memory. The only way to solve this problem is to assign more than one program word to a cell. Since a cell can hold only one word at a time, extra words assigned to the cell must be held in external storage. Conventionally, overlay techniques are employed to exchange memory words and external-storage words whenever needed; this, of course, places an additional planning and coding burden on the programmer. For several reasons, it would be advantageous to rid the programmer of this function by providing him with a "virtual" memory larger than his program. An approach that permits him to use a sufficiently large address range can accomplish this objective, assuming that means are provided for automatic execution of the memory-overlay functions.

1,708 citations

Journal ArticleDOI
TL;DR: Observations of a prototype implementation are presented, changes in the areas of cache validation, server process structure, name translation, and low-level storage representation are motivated, and Andrew's ability to scale gracefully is quantitatively demonstrated.
Abstract: The Andrew File System is a location-transparent distributed file system that will eventually span more than 5000 workstations at Carnegie Mellon University. Large scale affects performance and complicates system operation. In this paper we present observations of a prototype implementation, motivate changes in the areas of cache validation, server process structure, name translation, and low-level storage representation, and quantitatively demonstrate Andrew's ability to scale gracefully. We establish the importance of whole-file transfer and caching in Andrew by comparing its performance with that of Sun Microsystems' NFS file system. We also show how the aggregation of files into volumes improves the operability of the system.

1,604 citations

Journal ArticleDOI
TL;DR: The design and implementation of Coda, a file system for a large-scale distributed computing environment composed of Unix workstations, is described, which provides resiliency to server and network failures through the use of two distinct but complementary mechanisms.
Abstract: The design and implementation of Coda, a file system for a large-scale distributed computing environment composed of Unix workstations, is described. It provides resiliency to server and network failures through the use of two distinct but complementary mechanisms. One mechanism, server replication, stores copies of a file at multiple servers. The other mechanism, disconnected operation, is a mode of execution in which a caching site temporarily assumes the role of a replication site. The design of Coda optimizes for availability and performance and strives to provide the highest degree of consistency attainable in the light of these objectives. Measurements from a prototype show that the performance cost of providing high availability in Coda is reasonable.

1,083 citations

Book
Maurice J. Bach1
01 Jan 1986
TL;DR: This document discusses the representation of processes in the distributed UNIX system, and some of the mechanisms used to achieve this representation are described.
Abstract: 1. General Review of the System. 2. Introduction to the Kernel. 3. The Buffer Cache. 4. Internal Representation of Files. 5. System Calls for the File System. 6. The System Representation of Processes. 7. Process Control. 8. Process Scheduling and Time. 9. Memory Management Policies. 10. Interprocess Communication. 11. Multiprocessor Systems. 12. Distributed UNIX System.

809 citations

Journal ArticleDOI
TL;DR: The Sprite network operating system uses large main-memory disk block caches to achieve high performance in its file system and provides non-write-through file caching on both client and server machines.
Abstract: The Sprite network operating system uses large main-memory disk block caches to achieve high performance in its file system. It provides non-write-through file caching on both client and server machines. A simple cache consistency mechanism permits files to be shared by multiple clients without danger of stale data. In order to allow the file cache to occupy as much memory as possible, the file system of each machine negotiates with the virtual memory system over physical memory usage and changes the size of the file cache dynamically. Benchmark programs indicate that client caches allow diskless Sprite workstations to perform within 0-12 percent of workstations with disks. In addition, client caching reduces server loading by 50 percent and network traffic by 90 percent.

553 citations
