Journal ArticleDOI

The cache location problem

01 Oct 2000 - IEEE/ACM Transactions on Networking (IEEE Press) - Vol. 8, Iss. 5, pp. 568-582
TL;DR: There is a surprising consistency over time in the relative amount of web traffic from the server along a path, lending stability to the TERC location solution; these techniques can be used by network providers to reduce traffic load in their networks.
Abstract: This paper studies the problem of where to place network caches. Emphasis is given to caches that are transparent to the clients, since they are easier to manage and require no cooperation from the clients. Our goal is to minimize the overall flow or the average delay by placing a given number of caches in the network. We formulate these location problems both for general caches and for transparent en-route caches (TERCs), and identify that, in general, they are intractable. We give optimal algorithms for line and ring networks, and present closed form formulae for some special cases. We also present a computationally efficient dynamic programming algorithm for the single server case. This last case is of particular practical interest: it models a network that wishes to minimize the average access delay for a single web server. We experimentally study the effects of our algorithm using real web server data. We observe that a small number of TERCs is sufficient to reduce the network traffic significantly. Furthermore, there is a surprising consistency over time in the relative amount of web traffic from the server along a path, lending stability to our TERC location solution. Our techniques can be used by network providers to reduce traffic load in their networks.
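
The dynamic program for the single-server case is the practically interesting result. As a hedged sketch of the idea on the simplest topology the paper solves exactly, the code below places k idealized caches (hit ratio 1) on a line network with the server at one end, minimizing total request flow; the arrays d and r and all function names are illustrative assumptions, not the authors' notation.

# Sketch (not the paper's exact algorithm): optimal placement of k
# idealized transparent en-route caches (hit ratio 1) on a line
# network whose server sits at one end, minimizing total request flow.
def place_caches_line(d, r, k):
    """d[i]: hop distance of client i from the server (increasing);
    r[i]: request rate of client i; k: number of caches to place."""
    n = len(d)

    # Flow when one cache at node j serves clients j..i
    # (j == -1 means those clients go all the way to the server).
    def seg_cost(j, i):
        base = d[j] if j >= 0 else 0.0
        return sum(r[t] * (d[t] - base) for t in range(max(j, 0), i + 1))

    INF = float("inf")
    best = [[INF] * n for _ in range(k + 1)]   # best[m][i]: clients 0..i, m caches
    choice = [[None] * n for _ in range(k + 1)]
    for i in range(n):
        best[0][i] = seg_cost(-1, i)
    for m in range(1, k + 1):
        for i in range(n):
            best[m][i] = best[m - 1][i]        # m-th cache left unused
            for j in range(i + 1):             # m-th cache placed at node j
                prev = best[m - 1][j - 1] if j > 0 else 0.0
                c = prev + seg_cost(j, i)
                if c < best[m][i]:
                    best[m][i], choice[m][i] = c, j
    pos, m, i = [], k, n - 1                   # backtrack cache positions
    while m > 0 and i >= 0:
        j = choice[m][i]
        if j is None:
            m -= 1
            continue
        pos.append(j)
        i, m = j - 1, m - 1
    return best[k][n - 1], sorted(pos)

# Two caches for six equally active clients at hop distances 1..6.
flow, where = place_caches_line([1, 2, 3, 4, 5, 6], [1] * 6, 2)
print(flow, where)   # -> 5.0 [1, 3]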


Citations
Proceedings ArticleDOI
23 Apr 2006
TL;DR: This paper develops a basic scheme as a building block for more advanced VN assignment algorithms, and a selective VN reconfiguration scheme that prioritizes the reconfiguration of the most critical VNs.
Abstract: Recent proposals for network virtualization provide a promising way to overcome Internet ossification. The key idea of network virtualization is to build a diversified Internet that supports a variety of network services and architectures through a shared substrate. A major challenge in network virtualization is assigning substrate resources to virtual networks (VNs) efficiently and on demand. This paper focuses on two versions of the VN assignment problem: VN assignment without reconfiguration (VNA-I) and VN assignment with reconfiguration (VNA-II). For the VNA-I problem, we develop a basic scheme as a building block for all other advanced algorithms. Subdividing heuristics and adaptive optimization strategies are then presented to further improve the performance. For the VNA-II problem, we develop a selective VN reconfiguration scheme that prioritizes the reconfiguration of the most critical VNs. Extensive simulation experiments demonstrate that the proposed algorithms can achieve good performance under a wide range of network conditions.
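
As a rough, hypothetical illustration of what an assignment scheme must do (not the paper's actual basic scheme), the sketch below greedily maps each virtual node to the currently least-stressed substrate node and routes each virtual link over a shortest substrate path; the stress bookkeeping and all names are assumptions.

import networkx as nx

# Hypothetical greedy baseline for VN assignment (VNA-I flavour):
# map each virtual node to the least-stressed free substrate node,
# then route each virtual link over a shortest substrate path.
def assign_vn(substrate: nx.Graph, vn: nx.Graph):
    node_map = {}
    for v in vn.nodes:
        free = [s for s in substrate.nodes if s not in node_map.values()]
        node_map[v] = min(free, key=lambda s: substrate.nodes[s].get("stress", 0))
        substrate.nodes[node_map[v]]["stress"] = \
            substrate.nodes[node_map[v]].get("stress", 0) + 1
    link_map = {}
    for a, b in vn.edges:
        path = nx.shortest_path(substrate, node_map[a], node_map[b])
        link_map[(a, b)] = path
        for u, w in zip(path, path[1:]):   # account for link stress
            substrate[u][w]["stress"] = substrate[u][w].get("stress", 0) + 1
    return node_map, link_map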

818 citations

Proceedings ArticleDOI
17 Aug 2012
TL;DR: The results show reduction of up to 20% in server hits, and up to 10% in the number of hops required to hit cached contents, but, most importantly, reduction of cache-evictions by an order of magnitude in comparison to universal caching.
Abstract: In-network caching necessitates the transformation of centralised operations of traditional, overlay caching techniques to a decentralised and uncoordinated environment. Given that caching capacity in routers is relatively small in comparison to the amount of forwarded content, a key aspect is balanced distribution of content among the available caches. In this paper, we are concerned with decentralised, real-time distribution of content in router caches. Our goal is to reduce caching redundancy and, in turn, make more efficient utilisation of available cache resources along a delivery path. Our in-network caching scheme, called ProbCache, approximates the caching capability of a path and caches contents probabilistically in order to: i) leave caching space for other flows sharing (part of) the same path, and ii) fairly multiplex contents of different flows among caches of a shared path. We compare our algorithm against universal caching and against schemes proposed in the past for Web caching architectures, such as Leave Copy Down (LCD). Our results show reduction of up to 20% in server hits, and up to 10% in the number of hops required to hit cached contents, but, most importantly, reduction of cache evictions by an order of magnitude in comparison to universal caching.
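
Assuming uniform cache sizes along the path, ProbCache's weighting can be read as the product of a factor measuring the caching room left on the rest of the path and a factor favouring nodes closer to the client. The sketch below encodes that simplified reading; the hop indexing and the time-window constant tw are assumptions, and the paper's full formula also weights by per-node cache capacity.

import random

# Simplified ProbCache-style decision, assuming uniform cache sizes
# along the path; x is the hop index from the server (1..c), c is the
# path length, and tw is a target time-window constant.
def should_cache(x: int, c: int, tw: float = 10.0) -> bool:
    times_in = (c - x + 1) / tw      # caching room left on the rest of the path
    cache_weight = x / c             # favour caching closer to the client
    return random.random() < times_in * cache_weight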

615 citations


Cites background from "The cache location problem"

  • ...Our goal is to reduce caching redundancy and in turn, make more efficient utilisation of available cache resources along a delivery path....

    [...]

Journal ArticleDOI
TL;DR: This paper formulates a model for studying the benefits of cooperation between nodes, which provides insight into peer-to-peer content distribution, and shows that the problem of optimally replicating objects in CDN servers is NP-complete.

471 citations


Cites background from "The cache location problem"

  • ...In [9] the authors consider the placement of intercepting proxies inside the network to reduce the download time....

    [...]

Journal ArticleDOI
28 Aug 2000
TL;DR: Introduces clusters, a grouping of clients that are close together topologically and likely to be under common administrative control, identified with a "network-aware" method based on information available from BGP routing table snapshots.
Abstract: Being able to identify the groups of clients that are responsible for a significant portion of a Web site's requests can be helpful to both the Web site and the clients. In a Web application, it is beneficial to move content closer to groups of clients that are responsible for large subsets of requests to an origin server. We introduce clusters, a grouping of clients that are close together topologically and likely to be under common administrative control. We identify clusters using a "network-aware" method, based on information available from BGP routing table snapshots.
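
A minimal sketch of the network-aware method, assuming the BGP snapshot has been reduced to a plain list of prefixes: each client address joins the cluster of its longest matching prefix. The prefixes and client IPs below are illustrative documentation addresses.

import ipaddress

# Group client IPs under the longest matching prefix taken from a
# BGP routing-table snapshot.
def cluster_clients(client_ips, bgp_prefixes):
    nets = sorted((ipaddress.ip_network(p) for p in bgp_prefixes),
                  key=lambda n: n.prefixlen, reverse=True)
    clusters = {}
    for ip in client_ips:
        addr = ipaddress.ip_address(ip)
        for net in nets:          # longest prefix wins
            if addr in net:
                clusters.setdefault(net, []).append(ip)
                break
    return clusters

print(cluster_clients(["192.0.2.7", "192.0.2.200", "198.51.100.3"],
                      ["192.0.2.0/25", "192.0.2.0/24", "198.51.100.0/24"]))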

383 citations

Book ChapterDOI
21 May 2012
TL;DR: A centrality-based caching algorithm is proposed by exploiting the concept of (ego network) betweenness centrality to improve the caching gain and eliminate the uncertainty in the performance of the simplistic random caching strategy.
Abstract: Ubiquitous in-network caching is one of the key aspects of information-centric networking (ICN), which has recently received widespread research interest. In one of the key relevant proposals, known as Networking Named Content (NNC), the premise is that leveraging in-network caching to store content in every node it traverses along the delivery path can enhance content delivery. We question such an indiscriminate universal caching strategy and investigate whether caching less can actually achieve more. Specifically, we investigate if caching only in a subset of node(s) along the content delivery path can achieve better performance in terms of cache and server hit rates. In this paper, we first study the behavior of NNC's ubiquitous caching and observe that even naive random caching at one intermediate node within the delivery path can achieve similar and, under certain conditions, even better caching gain. We propose a centrality-based caching algorithm by exploiting the concept of (ego network) betweenness centrality to improve the caching gain and eliminate the uncertainty in the performance of the simplistic random caching strategy. Our results suggest that our solution can consistently achieve better gain across both synthetic and real network topologies that have different structural properties.
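
A minimal sketch of the selective-caching idea, assuming the topology is known: along a delivery path, cache only at the node with the highest betweenness centrality. The paper uses ego-network betweenness as a cheaper local approximation; full betweenness is computed here for brevity, and the graph and endpoints are illustrative.

import networkx as nx

# Pick the single caching node on a delivery path: the node with the
# highest betweenness centrality in the topology.
def caching_node(topology: nx.Graph, path):
    bc = nx.betweenness_centrality(topology)
    return max(path, key=lambda n: bc[n])

G = nx.barbell_graph(5, 3)                 # two clusters joined by a bridge
path = nx.shortest_path(G, 0, len(G) - 1)  # server 0 -> client at far end
print(caching_node(G, path))               # a bridge node is chosen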

360 citations


Cites background from "The cache location problem"

  • ...While it was shown in [10] that, in its general formulation, this class of cache location problem is intractable, we have shown in our previous work in [11] that the critical point is to cache content closer to the users regardless of their relative locations in the topology....

    [...]

References
Book
01 Jan 1979
TL;DR: A quarterly column providing a continuing update to the list of problems (NP-complete and harder) presented by M. R. Garey and D. S. Johnson in their book "Computers and Intractability: A Guide to the Theory of NP-Completeness," W. H. Freeman & Co., San Francisco, 1979.
Abstract: This is the second edition of a quarterly column the purpose of which is to provide a continuing update to the list of problems (NP-complete and harder) presented by M. R. Garey and myself in our book ‘‘Computers and Intractability: A Guide to the Theory of NP-Completeness,’’ W. H. Freeman & Co., San Francisco, 1979 (hereinafter referred to as ‘‘[G&J]’’; previous columns will be referred to by their dates). A background equivalent to that provided by [G&J] is assumed. Readers having results they would like mentioned (NP-hardness, PSPACE-hardness, polynomial-time-solvability, etc.), or open problems they would like publicized, should send them to David S. Johnson, Room 2C355, Bell Laboratories, Murray Hill, NJ 07974, including details, or at least sketches, of any new proofs (full papers are preferred). In the case of unpublished results, please state explicitly that you would like the results mentioned in the column. Comments and corrections are also welcome. For more details on the nature of the column and the form of desired submissions, see the December 1981 issue of this journal.

40,020 citations

Proceedings Article
01 Jan 1997
TL;DR: The Hypertext Transfer Protocol is an application-level protocol for distributed, collaborative, hypermedia information systems, which can be used for many tasks beyond its use for hypertext through extension of its request methods, error codes and headers.
Abstract: The Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems. It is a generic, stateless, protocol which can be used for many tasks beyond its use for hypertext, such as name servers and distributed object management systems, through extension of its request methods, error codes and headers [47]. A feature of HTTP is the typing and negotiation of data representation, allowing systems to be built independently of the data being transferred.

3,881 citations


"The cache location problem" refers background in this paper

  • ...Current protocols allow caches to validate the freshness of locally stored data [2], [16]....

    [...]

Proceedings ArticleDOI
21 Mar 1999
TL;DR: This paper investigates the page request distribution seen by Web proxy caches using traces from a variety of sources and considers a simple model where the Web accesses are independent and the reference probability of the documents follows a Zipf-like distribution, suggesting that the various observed properties of hit-ratios and temporal locality are indeed inherent to Web accesses observed by proxies.
Abstract: This paper addresses two unresolved issues about Web caching. The first issue is whether Web requests from a fixed user community are distributed according to Zipf's (1929) law. The second issue relates to a number of studies on the characteristics of Web proxy traces, which have shown that the hit-ratios and temporal locality of the traces exhibit certain asymptotic properties that are uniform across the different sets of the traces. In particular, the question is whether these properties are inherent to Web accesses or whether they are simply an artifact of the traces. An answer to these unresolved issues will facilitate both Web cache resource planning and cache hierarchy design. We show that the answers to the two questions are related. We first investigate the page request distribution seen by Web proxy caches using traces from a variety of sources. We find that the distribution does not follow Zipf's law precisely, but instead follows a Zipf-like distribution with the exponent varying from trace to trace. Furthermore, we find that there is only (i) a weak correlation between the access frequency of a Web page and its size and (ii) a weak correlation between access frequency and its rate of change. We then consider a simple model where the Web accesses are independent and the reference probability of the documents follows a Zipf-like distribution. We find that the model yields asymptotic behaviour that is consistent with the experimental observations, suggesting that the various observed properties of hit-ratios and temporal locality are indeed inherent to Web accesses observed by proxies. Finally, we revisit Web cache replacement algorithms and show that the algorithm that is suggested by this simple model performs best on real trace data. The results indicate that while page requests do indeed reveal short-term correlations and other structures, a simple model for an independent request stream following a Zipf-like distribution is sufficient to capture certain asymptotic properties observed at Web proxies.
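
The simple model in this abstract is easy to reproduce: draw independent requests from a Zipf-like law and measure cache behaviour. The sketch below does this with an LRU cache; alpha, the catalogue size, and the cache size are illustrative choices rather than values from the paper, and LRU stands in for whichever replacement policy one wants to test.

import random
from collections import OrderedDict

# Independent-reference model: requests for M pages drawn from a
# Zipf-like law p(i) ~ 1/i**alpha, served by an LRU cache of size C.
def zipf_lru_hit_ratio(M=10000, alpha=0.8, C=500, n_requests=200000):
    weights = [1.0 / (i + 1) ** alpha for i in range(M)]
    cache, hits = OrderedDict(), 0
    for page in random.choices(range(M), weights=weights, k=n_requests):
        if page in cache:
            hits += 1
            cache.move_to_end(page)        # refresh recency
        else:
            cache[page] = True
            if len(cache) > C:
                cache.popitem(last=False)  # evict least recently used
    return hits / n_requests

print(zipf_lru_hit_ratio())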

3,582 citations


"The cache location problem" refers background or result in this paper

  • ...servers [5]....

    [...]

  • ...It has been shown [4], [5] that amongst all pages on a server, only a small fraction are very popular, and our measurements support this observation....

    [...]

  • ...[5], most of the traffic generated in the Internet comes from a handful of very popular web servers....

    [...]