Internet Web servers: workload characterization and performance implications
Martin Arlitt, Carey Williamson
TLDR
The paper concludes with a discussion of caching and performance issues, using the observed workload characteristics to suggest performance enhancements that seem promising for Internet Web servers.
Abstract:
This paper presents a workload characterization study for Internet Web servers. Six different data sets are used in the study: three from academic environments, two from scientific research organizations, and one from a commercial Internet provider. These data sets represent three different orders of magnitude in server activity, and two different orders of magnitude in time duration, ranging from one week of activity to one year. The workload characterization focuses on the document type distribution, the document size distribution, the document referencing behavior, and the geographic distribution of server requests. Throughout the study, emphasis is placed on finding workload characteristics that are common to all the data sets studied. Ten such characteristics are identified. The paper concludes with a discussion of caching and performance issues, using the observed workload characteristics to suggest performance enhancements that seem promising for Internet Web servers.
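As a rough illustration of the kind of analysis the abstract describes, the following Python sketch tallies document types, document sizes, and referencing behavior from access-log lines. It is not the authors' method: it assumes the logs are in Common Log Format, and the names `CLF`, `TYPES`, `doc_type`, `characterize`, and `SAMPLE`, the extension-to-type mapping, and the sample log lines are all hypothetical. The geographic distribution of requests is left out.

```python
import re
from collections import Counter

# Assumed input: Common Log Format lines, i.e.
# host ident user [time] "METHOD path PROTOCOL" status bytes
CLF = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)[^"]*" (\d{3}) (\d+|-)$')

# Hypothetical extension-to-type mapping, chosen only for illustration.
TYPES = {"html": "HTML", "htm": "HTML", "gif": "Image", "jpg": "Image",
         "jpeg": "Image", "au": "Sound", "mpg": "Video", "ps": "Formatted"}


def doc_type(path):
    """Classify a requested path by its file extension."""
    name = path.split("?", 1)[0].rsplit("/", 1)[-1]
    if name == "":
        return "Directory"
    ext = name.rsplit(".", 1)[-1].lower() if "." in name else ""
    return TYPES.get(ext, "Other")


def characterize(lines):
    """Summarize successful (status 200) requests in an iterable of log lines."""
    by_type = Counter()   # requests per document type
    refs = Counter()      # requests per distinct document
    sizes = {}            # last observed transfer size per document
    for line in lines:
        m = CLF.match(line.strip())
        if not m or m.group(4) != "200" or m.group(5) == "-":
            continue      # skip malformed lines and unsuccessful requests
        path, nbytes = m.group(3), int(m.group(5))
        by_type[doc_type(path)] += 1
        refs[path] += 1
        sizes[path] = nbytes
    total = sum(refs.values())
    one_timers = sum(1 for count in refs.values() if count == 1)
    return {
        "requests": total,
        "type_share": {t: c / total for t, c in by_type.items()} if total else {},
        "distinct_docs": len(refs),
        "one_time_fraction": one_timers / len(refs) if refs else 0.0,
        "mean_doc_size": sum(sizes.values()) / len(sizes) if sizes else 0.0,
    }


# Made-up log lines, for demonstration only.
SAMPLE = [
    'a.example.edu - - [01/Jan/1995:00:00:01 -0600] "GET /index.html HTTP/1.0" 200 1000',
    'b.example.com - - [01/Jan/1995:00:00:02 -0600] "GET /logo.gif HTTP/1.0" 200 3000',
    'a.example.edu - - [01/Jan/1995:00:00:03 -0600] "GET /index.html HTTP/1.0" 200 1000',
    'c.example.org - - [01/Jan/1995:00:00:04 -0600] "GET /missing.html HTTP/1.0" 404 -',
]

if __name__ == "__main__":
    print(characterize(SAMPLE))
```

Counting only successful requests and keeping one size per distinct document are simplifying choices made for this sketch; a real study would need to decide how to treat errors, aborted transfers, and documents whose size changes over time.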
Citations
Journal Article
Web usage mining: discovery and applications of usage patterns from Web data
TL;DR: Web usage mining is the application of data mining techniques to discover usage patterns from Web data in order to understand and better serve the needs of Web-based applications. The paper describes preprocessing, pattern discovery, and pattern analysis in detail.
Journal Article
The World’s Technological Capacity to Store, Communicate, and Compute Information
Martin Hilbert, Priscila López
TL;DR: An inventory of the world’s technological capacity from 1986 to 2007 reveals the evolution from analog to digital technologies; the majority of the world’s technological memory has been in digital format since the early 2000s.
Proceedings Article
Youtube traffic characterization: a view from the edge
TL;DR: This paper presents a traffic characterization study of the popular video sharing service, YouTube, and finds that as with the traditional Web, caching could improve the end user experience, reduce network bandwidth consumption, and reduce the load on YouTube's core server infrastructure.
Proceedings Article
On the placement of Web server replicas
TL;DR: This work develops several placement algorithms that use workload information, such as client latency and request rates, to make informed placement decisions, and evaluates the placement algorithms using both synthetic and real network topologies, as well as Web server traces.
Proceedings Article
Workload analysis of a large-scale key-value store
TL;DR: This paper collects detailed traces from Facebook's Memcached deployment, arguably the world's largest, and analyzes the workloads from multiple angles, including: request composition, size, and rate; cache efficacy; temporal patterns; and application use cases.
References
Journal Article
Wide area traffic: the failure of Poisson modeling
Vern Paxson, Sally Floyd
TL;DR: It is found that user-initiated TCP session arrivals, such as remote-login and file-transfer, are well-modeled as Poisson processes with fixed hourly rates, but that other connection arrivals deviate considerably from Poisson.
Journal Article
Self-similarity in World Wide Web traffic: evidence and possible causes
Mark Crovella, Azer Bestavros
TL;DR: It is shown that the self-similarity in WWW traffic can be explained based on the underlying distributions of WWW document sizes, the effects of caching and user preference in file transfer, the effect of user "think time", and the superimposition of many such transfers in a local-area network.
Journal Article
Scale and performance in a distributed file system
John H. Howard, Michael Kazar, Sherri G. Menees, David A. Nichols, Mahadev Satyanarayanan, Robert N. Sidebotham, Michael J. West
TL;DR: Observations of a prototype implementation are presented; changes in the areas of cache validation, server process structure, name translation, and low-level storage representation are motivated; and Andrew's ability to scale gracefully is quantitatively demonstrated.
Journal Article
The World-Wide Web
TL;DR: The World Wide Web (W3) is described as a pool of human knowledge that allows collaborators at remote sites to share their ideas and all aspects of a common project.
Related Papers (5)
Self-similarity in World Wide Web traffic: evidence and possible causes
Mark Crovella, Azer Bestavros
Web server workload characterization: the search for invariants
Martin Arlitt, Carey Williamson