scispace - formally typeset
Open Access

Characteristics of WWW Client-based Traces

TLDR
This paper presents a descriptive statistical summary of the traces of actual executions of NCSA Mosaic, and shows that many characteristics of WWW use can be modelled using power-law distributions, including the distribution of document sizes, the popularity of documents as a function of size, and the Distribution of user requests for documents.
Abstract
The explosion of WWW traffic necessitates an accurate picture of WWW use, and in particular requires a good understanding of client requests for WWW documents. To address this need, we have collected traces of actual executions of NCSA Mosaic, reflecting over half a million user requests for WWW documents. In this paper we present a descriptive statistical summary of the traces we collected, which identifies a number of trends and reference patterns in WWW use. In particular, we show that many characteristics of WWW use can be modelled using power-law distributions, including the distribution of document sizes, the popularity of documents as a function of size, the distribution of user requests for documents, and the number of references to documents as a function of their overall rank in popularity (Zipf''s law). In addition, we show how the power-law distributions derived from our traces can be used to guide system designers interested in caching WWW documents. --- Our client-based traces are available via FTP from http://www.cs.bu.edu/techreports/1995-010-www-client-traces.tar.gz http://www.cs.bu.edu/techreports/1995-010-www-client-traces.a.tar.gz

read more

Content maybe subject to copyright    Report

Citations
More filters
Proceedings ArticleDOI

Web caching and Zipf-like distributions: evidence and implications

TL;DR: This paper investigates the page request distribution seen by Web proxy caches using traces from a variety of sources and considers a simple model where the Web accesses are independent and the reference probability of the documents follows a Zipf-like distribution, suggesting that the various observed properties of hit-ratios and temporal locality are indeed inherent to Web accesse observed by proxies.
Journal ArticleDOI

Self-similarity in World Wide Web traffic: evidence and possible causes

TL;DR: It is shown that the self-similarity in WWW traffic can be explained based on the underlying distributions of WWW document sizes, the effects of caching and user preference in file transfer, the effect of user "think time", and the superimposition of many such transfers in a local-area network.
Journal ArticleDOI

Self-similarity in World Wide Web traffic: evidence and possible causes

TL;DR: It is shown that the self-similarity in WWW traffic can be explained based on the underlying distributions of WWW document sizes, the effects of caching and user preference in file transfer, the effect of user "think time", and the superimposition of many such transfers in a local area network.
Journal ArticleDOI

Summary cache: a scalable wide-area web cache sharing protocol

TL;DR: This paper demonstrates the benefits of cache sharing, measures the overhead of the existing protocols, and proposes a new protocol called "summary cache", which reduces the number of intercache protocol messages, reduces the bandwidth consumption, and eliminates 30% to 95% of the protocol CPU overhead, all while maintaining almost the same cache hit ratios as ICP.
Proceedings ArticleDOI

Generating representative Web workloads for network and server performance evaluation

TL;DR: This paper applies a number of observations of Web server usage to create a realistic Web workload generation tool which mimics a set of real users accessing a server and addresses the technical challenges to satisfying this large set of simultaneous constraints on the properties of the reference stream.
References
More filters
Book

The Fractal Geometry of Nature

TL;DR: This book is a blend of erudition, popularization, and exposition, and the illustrations include many superb examples of computer graphics that are works of art in their own right.
Journal ArticleDOI

The Fractal Geometry of Nature

TL;DR: A blend of erudition (fascinating and sometimes obscure historical minutiae abound), popularization (mathematical rigor is relegated to appendices) and exposition (the reader need have little knowledge of the fields involved) is presented in this article.
Journal ArticleDOI

Characterizing browsing strategies in the World-Wide Web

TL;DR: A study conducted at Georgia Institute of Technology that captured client-side user events of NCSA's XMosaic supplemented the understanding of user navigation strategies as well as provided real interface usage data.
ReportDOI

A hierarchical internet object cache

TL;DR: The design and performance of a hierarchical proxy-cache designed to make Internet information systems scale better are discussed, and performance measurements indicate that hierarchy does not measurably increase access latency.
Related Papers (5)