
Showing papers on "Smart Cache published in 2000"


Journal ArticleDOI
TL;DR: This paper demonstrates the benefits of cache sharing, measures the overhead of the existing protocols, and proposes a new protocol called "summary cache", which reduces the number of intercache protocol messages, reduces the bandwidth consumption, and eliminates 30% to 95% of the protocol CPU overhead, all while maintaining almost the same cache hit ratios as ICP.
Abstract: The sharing of caches among Web proxies is an important technique to reduce Web traffic and alleviate network bottlenecks. Nevertheless it is not widely deployed due to the overhead of existing protocols. In this paper we demonstrate the benefits of cache sharing, measure the overhead of the existing protocols, and propose a new protocol called "summary cache". In this new protocol, each proxy keeps a summary of the cache directory of each participating proxy, and checks these summaries for potential hits before sending any queries. Two factors contribute to our protocol's low overhead: the summaries are updated only periodically, and the directory representations are very economical, as low as 8 bits per entry. Using trace-driven simulations and a prototype implementation, we show that, compared to existing protocols such as the Internet cache protocol (ICP), summary cache reduces the number of intercache protocol messages by a factor of 25 to 60, reduces the bandwidth consumption by over 50%, eliminates 30% to 95% of the protocol CPU overhead, all while maintaining almost the same cache hit ratios as ICP. Hence summary cache scales to a large number of proxies. (This paper is a revision of Fan et al. 1998; we add more data and analysis in this version.).
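
The compact directory summaries described above are Bloom-filter-like structures. A minimal Python sketch of the idea, assuming a SHA-1-based hash family and illustrative sizes rather than the authors' implementation:

```python
import hashlib

class BloomSummary:
    """Compact, lossy summary of one proxy's cache directory."""

    def __init__(self, num_bits=8 * 1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, url):
        # Derive k bit positions from independent hashes of the URL.
        for i in range(self.num_hashes):
            digest = hashlib.sha1(f"{i}:{url}".encode()).digest()
            yield int.from_bytes(digest[:4], "big") % self.num_bits

    def add(self, url):
        for p in self._positions(url):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, url):
        # May report false positives, never false negatives.
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(url))


def peers_to_query(url, peer_summaries):
    """Only peers whose summary reports a potential hit are queried at all."""
    return [peer for peer, s in peer_summaries.items() if s.might_contain(url)]
```

A proxy sends inter-cache queries only to the peers returned by peers_to_query, trading a small false-positive rate for far fewer protocol messages.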

2,174 citations


Patent
24 Nov 2000
TL;DR: In this paper, a preloader uses a cache manager to handle retrieval, insertion, and removal of web page components in a component cache, a cache replacement manager to decide which components to replace, and a profile server to predict a user's next content request.
Abstract: A preloader works in conjunction with a web/app server and optionally a profile server to cache web page content elements or components for faster on-demand and anticipatory dynamic web page delivery. The preloader uses a cache manager to manage requests for retrievals, insertions, and removal of web page components in a component cache. The preloader uses a cache replacement manager to manage the replacement of components in the cache. While the cache replacement manager may utilize any cache replacement policy, a particularly effective replacement policy utilizes predictive information to make replacement decisions. Such a policy uses a profile server, which predicts a user's next content request. The components that can be cached are identified by tagging them within the dynamic scripts that generate them. The preloader caches components that are likely to be accessed next, thus improving a web site's scalability.

366 citations


Patent
23 Mar 2000
TL;DR: In this paper, a plurality of cache servers capable of caching WWW information provided by the information servers are provided in association with the wireless network; the cache servers are managed by receiving a message indicating at least the connected location of a mobile computer in the wireless network, selecting one or more cache servers located near the mobile computer according to the message, and controlling those cache servers to cache WWW information selected for the mobile computer, so as to enable faster access to that information by the mobile computer.
Abstract: In the disclosed information delivery scheme for delivering WWW information provided by information servers on the Internet to mobile computers connected to the Internet through a wireless network, a plurality of cache servers capable of caching WWW information provided by the information servers are provided in association with the wireless network. The cache servers can be managed by receiving a message indicating at least a connected location of a mobile computer in the wireless network from the mobile computer, selecting one or more cache servers located nearby the mobile computer according to the message, and controlling these one or more cache servers to cache selected WWW information selected for the mobile computer, so as to enable faster accesses to the selected WWW information by the mobile computer. Also, the cache servers can be managed by selecting one or more cache servers located within a geographic range defined for an information provider who provides WWW information from an information server, and controlling these one or more cache servers to cache selected WWW information selected for the information provider, so as to enable faster accesses to the selected WWW information by the mobile computer.

300 citations


Proceedings ArticleDOI
01 Aug 2000
TL;DR: The results show that semantic caching is more flexible and effective for use in LDD applications than page caching, whose performance is quite sensitive to the database physical organization.
Abstract: Location-dependent applications are becoming very popular in mobile environments. To improve system performance and facilitate disconnection, caching is crucial to such applications. In this paper, a semantic caching scheme is used to access location dependent data in mobile computing. We first develop a mobility model to represent the moving behaviors of mobile users and formally define location dependent queries. We then investigate query processing and cache management strategies. The performance of the semantic caching scheme and its replacement strategy FAR is evaluated through a simulation study. Our results show that semantic caching is more flexible and effective for use in LDD applications than page caching, whose performance is quite sensitive to the database physical organization. We also notice that the semantic cache replacement strategy FAR, which utilizes the semantic locality in terms of locations, performs robustly under different kinds of workloads.
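
FAR is described as exploiting semantic locality in terms of locations. A minimal sketch of that idea, assuming planar region centres and a heading vector as simplified stand-ins for the paper's mobility model:

```python
import math

def far_victim(cache_segments, user_pos, heading):
    """Pick a victim in the spirit of FAR (Furthest Away Replacement):
    prefer segments that lie behind the user's direction of movement,
    then the most distant of those. `cache_segments` maps a segment id
    to the (x, y) centre of its valid region; coordinates and the heading
    vector are illustrative assumptions."""
    def score(center):
        dx, dy = center[0] - user_pos[0], center[1] - user_pos[1]
        dist = math.hypot(dx, dy)
        ahead = dx * heading[0] + dy * heading[1] > 0   # in the moving direction?
        return (ahead, -dist)        # "behind" segments sort first, furthest first
    return min(cache_segments, key=lambda seg: score(cache_segments[seg]))
```

Segments behind the direction of travel are evicted first, and among those the furthest one goes first, keeping data the moving user is likely to need next.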

292 citations


Proceedings ArticleDOI
01 Aug 2000
TL;DR: This paper focuses on the features of the M340 cache sub-system and illustrates the effect on power and performance through benchmark analysis and actual silicon measurements.
Abstract: Advances in technology have allowed portable electronic devices to become smaller and more complex, placing stringent power and performance requirements on the device's components. The M·CORE M3 architecture was developed specifically for these embedded applications. To address the growing need for longer battery life and higher performance, an 8-Kbyte, 4-way set-associative, unified (instruction and data) cache with programmable features was added to the M3 core. These features allow the architecture to be optimized based on the application's requirements. In this paper, we focus on the features of the M340 cache sub-system and illustrate the effect on power and performance through benchmark analysis and actual silicon measurements.

253 citations


Proceedings ArticleDOI
01 Feb 2000
TL;DR: The rules of thumb for the design of data storage systems are reexamined with a particular focus on performance and price/performance, and the 5-minute rule for disk caching becomes a cache-everything rule for Web caching.
Abstract: This paper reexamines the rules of thumb for the design of data storage systems. Briefly, it looks at storage, processing, and networking costs, ratios, and trends with a particular focus on performance and price/performance. Amdahl's ratio laws for system design need only slight revision after 35 years, the major change being the increased use of RAM. An analysis also indicates storage should be used to cache both database and Web data to save disk bandwidth, network bandwidth, and people's time. Surprisingly, the 5-minute rule for disk caching becomes a cache-everything rule for Web caching.
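
The "cache-everything" conclusion follows from a break-even analysis in the style of the five-minute rule. A small worked example, with illustrative year-2000-style prices and rates that are assumptions rather than figures quoted from the paper:

```python
def break_even_seconds(pages_per_mb_ram, accesses_per_sec_per_disk,
                       price_per_disk, price_per_mb_ram):
    """Break-even caching interval: keep a page in RAM if it is
    re-referenced more often than once per this many seconds."""
    return (pages_per_mb_ram / accesses_per_sec_per_disk) * \
           (price_per_disk / price_per_mb_ram)

# Illustrative numbers (assumptions, not the paper's): 128 eight-KB pages
# per MB of RAM, 80 random I/Os per second per disk, $1000 per disk drive,
# $1 per MB of RAM.
print(break_even_seconds(128, 80, 1000, 1))   # -> 1600.0 seconds, about 27 minutes
```

When the break-even interval grows past the typical re-reference interval of Web objects, caching essentially everything becomes the economical choice.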

232 citations


Proceedings ArticleDOI
01 May 2000
TL;DR: A practical, fully associative, software-managed secondary cache system that provides performance competitive with or superior to traditional caches without OS or application involvement is presented.
Abstract: As DRAM access latencies approach a thousand instruction-execution times and on-chip caches grow to multiple megabytes, it is not clear that conventional cache structures continue to be appropriate. Two key features—full associativity and software management—have been used successfully in the virtual-memory domain to cope with disk access latencies. Future systems will need to employ similar techniques to deal with DRAM latencies. This paper presents a practical, fully associative, software-managed secondary cache system that provides performance competitive with or superior to traditional caches without OS or application involvement. We see this structure as the first step toward OS- and application-aware management of large on-chip caches. This paper has two primary contributions: a practical design for a fully associative memory structure, the indirect index cache (IIC), and a novel replacement algorithm, generational replacement, that is specifically designed to work with the IIC. We analyze the behavior of an IIC with generational replacement as a drop-in, transparent substitute for a conventional secondary cache. We achieve miss rate reductions from 8% to 85% relative to a 4-way associative LRU organization, matching or beating a (practically infeasible) fully associative true LRU cache. Incorporating these miss rates into a rudimentary timing model indicates that the IIC/generational replacement cache could be competitive with a conventional cache at today's DRAM latencies, and will outperform a conventional cache as these CPU-relative latencies grow.
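
A toy model of the two ideas named in the abstract: locating tags through a hash table (so any block can occupy any data frame) and reuse-based pools standing in for generational replacement. This is a sketch of the concepts under simplifying assumptions, not the authors' design:

```python
class IndirectIndexCacheModel:
    """Fully associative, software-manageable cache in the spirit of the IIC:
    tags are found through a hash table rather than a set index. The
    generation pools below are a simplified stand-in for the paper's
    generational replacement algorithm."""

    def __init__(self, num_frames, num_generations=4):
        self.tag_table = {}                      # block tag -> data frame number
        self.free_frames = list(range(num_frames))
        self.generations = [[] for _ in range(num_generations)]

    def access(self, tag):
        if tag in self.tag_table:                # hit via hash lookup, no set index
            self._promote(tag)
            return True
        frame = self.free_frames.pop() if self.free_frames else self._evict()
        self.tag_table[tag] = frame
        self.generations[0].append(tag)          # new blocks enter the lowest pool
        return False

    def _promote(self, tag):
        # Reused blocks climb toward higher-priority pools.
        for gen_idx, gen in enumerate(self.generations):
            if tag in gen:
                gen.remove(tag)
                dst = min(gen_idx + 1, len(self.generations) - 1)
                self.generations[dst].append(tag)
                return

    def _evict(self):
        # Evict from the least-reused pool first (a simplification).
        for gen in self.generations:
            if gen:
                victim = gen.pop(0)
                return self.tag_table.pop(victim)
        raise RuntimeError("cache is empty")
```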

224 citations


Patent
25 Jan 2000
TL;DR: In this paper, a relatively high-speed, intermediate-volume storage device is operated as a user-configurable cache, where data is preloaded and responsively cached in the cache memory based on user preferences.
Abstract: An apparatus and method for caching data in a storage device (26) of a computer system (10). A relatively high-speed, intermediate-volume storage device (25) is operated as a user-configurable cache. Requests to access a mass storage device (46) such as a disk or tape (26, 28) are intercepted by a device driver (32) that compares the access request against a directory (51) of the contents of the user-configurable cache (25). If the user-configurable cache contains the data sought to be accessed, the access request is carried out in the user-configurable cache instead of being forwarded to the device driver for the target mass storage device (46). Because the user-cache is implemented using memory having a dramatically shorter access time than most mechanical mass storage devices, the access request is fulfilled much more quickly than if the originally intended mass storage device was accessed. Data is preloaded and responsively cached in the user-configurable cache memory based on user preferences.

205 citations


Proceedings ArticleDOI
01 Dec 2000
TL;DR: This paper presents the design and evaluation of the compression cache (CC), a first-level cache designed so that each cache line can hold either one uncompressed line or two cache lines that have been compressed to at least half their lengths.
Abstract: Since the area occupied by cache memories on processor chips continues to grow, an increasing percentage of power is consumed by memory. We present the design and evaluation of the compression cache (CC), which is a first level cache that has been designed so that each cache line can either hold one uncompressed line or two cache lines which have been compressed to at least half their lengths. We use a novel data compression scheme based upon encoding of a small number of values that appear frequently during memory accesses. This compression scheme preserves the ability to randomly access individual data items. We observed that the contents of 40%, 52% and 51% of the memory blocks of size 4, 8, and 16 words respectively in SPECint95 benchmarks can be compressed to at least half their sizes by encoding the top 2, 4, and 8 frequent values respectively. Compression allows greater amounts of data to be stored leading to substantial reductions in miss rates (0-36.4%), off-chip traffic (3.9-48.1%), and energy consumed (1-27%). Traffic and energy reductions are in part derived by transferring data over external buses in compressed form.
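
The frequent-value scheme can be sketched in a few lines: words that match a small table are stored as short indices, everything else verbatim. The table contents, and the omission of the half-size admission test, are simplifications for illustration:

```python
FREQUENT_VALUES = [0, 1, 255, 0xFFFFFFFF]   # illustrative 4-entry table -> 2-bit indices

def compress_line(words, table=FREQUENT_VALUES):
    """Encode each word either as an index into the frequent-value table
    (flag=1) or verbatim (flag=0). In the paper a line is stored compressed
    only if the encoded form fits in half the original space; that size
    check is omitted here for brevity."""
    out = []
    for w in words:
        if w in table:
            out.append((1, table.index(w)))    # small index, a few bits
        else:
            out.append((0, w))                 # full-width literal
    return out

def decompress_line(encoded, table=FREQUENT_VALUES):
    # Individual words remain randomly accessible by position.
    return [table[v] if flag else v for flag, v in encoded]

line = [0, 0, 7, 1]
assert decompress_line(compress_line(line)) == line
```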

195 citations


Proceedings ArticleDOI
10 Apr 2000
TL;DR: This work presents an on-line algorithm that effectively captures and maintains an accurate popularity profile of Web objects requested through a caching proxy that is superior to a host of recently-proposed and widely-used algorithms using extensive trace-driven simulations and a variety of performance metrics.
Abstract: Web caching aims at reducing network traffic, server load and user-perceived retrieval delays by replicating popular content on proxy caches that are strategically placed within the network. While key to effective cache utilization, popularity information (e.g. relative access frequencies of objects requested through a proxy) is seldom incorporated directly in cache replacement algorithms. Rather other properties of the request stream (e.g. temporal locality and content size), which are easier to capture in an online fashion, are used to indirectly infer popularity information, and hence drive cache replacement policies. Recent studies suggest that the correlation between these secondary properties and popularity is weakening due in part to the prevalence of efficient client and proxy caches. This trend points to the need for proxy cache replacement algorithms that directly capture popularity information. We present an on-line algorithm that effectively captures and maintains an accurate popularity profile of Web objects requested through a caching proxy. We propose a novel cache replacement policy that uses such information to generalize the well-known greedy dual-size algorithm, and show the superiority of our proposed algorithm by comparing it to a host of recently-proposed and widely-used algorithms using extensive trace-driven simulations and a variety of performance metrics.
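
The proposed policy generalizes GreedyDual-Size with popularity information. The sketch below uses the common GDSF-style priority H = L + frequency * cost / size as an approximation of that idea; it is not the paper's exact algorithm:

```python
import heapq

class PopularityAwareGDS:
    """GreedyDual-Size extended with an access-frequency term (GDSF style).
    Priority H = L + freq * cost / size, where L is the running inflation value."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.L = 0.0
        self.entries = {}     # key -> (size, cost, freq, H)
        self.heap = []        # (H, key), stale entries removed lazily

    def _priority(self, cost, size, freq):
        return self.L + freq * cost / size

    def access(self, key, size, cost=1.0):
        if key in self.entries:                    # hit: bump frequency and priority
            s, c, f, _ = self.entries[key]
            h = self._priority(c, s, f + 1)
            self.entries[key] = (s, c, f + 1, h)
            heapq.heappush(self.heap, (h, key))
            return True
        while self.used + size > self.capacity and self.entries:
            h, victim = heapq.heappop(self.heap)
            if victim in self.entries and self.entries[victim][3] == h:
                self.L = h                         # inflate L to the evicted priority
                self.used -= self.entries.pop(victim)[0]
        h = self._priority(cost, size, 1)
        self.entries[key] = (size, cost, 1, h)
        heapq.heappush(self.heap, (h, key))
        self.used += size
        return False
```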

184 citations


Proceedings ArticleDOI
01 Dec 2000
TL;DR: Dynamic zero compression reduces the energy required for cache accesses by only writing and reading a single bit for every zero-valued byte, and an instruction recoding technique is described that increases instruction cache energy savings to 18%.
Abstract: Dynamic zero compression reduces the energy required for cache accesses by only writing and reading a single bit for every zero-valued byte. This energy-conscious compression is invisible to software and is handled with additional circuitry embedded inside the cache RAM arrays and the CPU. The additional circuitry imposes a cache area overhead of 9% and a read latency overhead of around two FO4 gate delays. Simulation results show that we can reduce total data cache energy by around 26% and instruction cache energy by around 10% for SPECint95 and MediaBench benchmarks. We also describe the use of an instruction recoding technique that increases instruction cache energy savings to 18%.
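
A byte-level sketch of the zero-indicator format; the encoding layout is illustrative, and the paper implements the mechanism inside the cache RAM arrays rather than in software:

```python
def dzc_encode(line_bytes):
    """Dynamic-zero-compression-style view of one cache line: one indicator
    bit per byte marks zero bytes, and only non-zero bytes carry a payload.
    In hardware, a zero byte then costs a single bit to read or write."""
    indicator = 0
    payload = bytearray()
    for i, b in enumerate(line_bytes):
        if b == 0:
            indicator |= 1 << i
        else:
            payload.append(b)
    return indicator, bytes(payload)

def dzc_decode(indicator, payload, length):
    it = iter(payload)
    return bytes(0 if indicator & (1 << i) else next(it) for i in range(length))

data = bytes([0, 0, 42, 0, 7, 0, 0, 0])
assert dzc_decode(*dzc_encode(data), len(data)) == data
```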

Patent
31 Jul 2000
TL;DR: In this paper, the authors present a cache control method for caching disk data in a disk drive configured to receive commands for both streaming and non-streaming data from a host, where a lossy state record is provided for memory segments in a cache memory.
Abstract: The present invention may be embodied in a cache control method for caching disk data in a disk drive configured to receive commands for both streaming and non-streaming data from a host. A lossy state record is provided for memory segments in a cache memory. The lossy state record allows host commands to be mixed for streaming and non-streaming data without flushing of cache data for a command mode change.

Proceedings ArticleDOI
01 May 2000
TL;DR: A generalization of time skewing for multiprocessor architectures is given, and techniques for using multilevel caches reduce the L1 cache requirement, which would otherwise be unacceptably high for some architectures when using arrays of high dimension.
Abstract: Time skewing is a compile-time optimization that can provide arbitrarily high cache hit rates for a class of iterative calculations, given a sufficient number of time steps and sufficient cache memory. Thus, it can eliminate processor idle time caused by inadequate main memory bandwidth. In this article, we give a generalization of time skewing for multiprocessor architectures, and discuss time skewing for multilevel caches. Our generalization for multiprocessors lets us eliminate processor idle time caused by any combination of inadequate main memory bandwidth, limited network bandwidth, and high network latency, given a sufficiently large problem and sufficient cache. As in the uniprocessor case, the cache requirement grows with the machine balance rather than the problem size. Our techniques for using multilevel caches reduce the L1 cache requirement, which would otherwise be unacceptably high for some architectures when using arrays of high dimension.
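
A compact illustration of the skewed traversal for a 1-D three-point stencil. It stores every time step so correctness is easy to check; the real optimization additionally keeps only a small working set so each tile remains cache-resident across time steps. The tile width B and the Jacobi kernel are illustrative assumptions:

```python
def jacobi_time_skewed(a0, T, B=64):
    """1-D three-point Jacobi relaxation traversed in time-skewed tiles."""
    N = len(a0)
    A = [list(a0) for _ in range(T + 1)]     # A[t][i]; row 0 is the initial data
    # Tile over the skewed index ip = i + t: every dependency of a point then
    # lies either in an earlier tile or earlier within the same tile.
    for jj in range(0, N + T + 1, B):
        for t in range(1, T + 1):
            for ip in range(jj, jj + B):
                i = ip - t
                if 1 <= i <= N - 2:
                    A[t][i] = (A[t-1][i-1] + A[t-1][i] + A[t-1][i+1]) / 3.0
    return A[T]
```

Only the traversal order changes, so the result matches the untiled loop nest, but consecutive time steps revisit the same tile of data while it is still in cache.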

Proceedings ArticleDOI
01 Jun 2000
TL;DR: This paper proposes a way to improve the performance of embedded processors running data-intensive applications by allowing software to allocate on-chip memory on an application-specific basis via a novel hardware mechanism called column caching.
Abstract: We propose a way to improve the performance of embedded processors running data-intensive applications by allowing software to allocate on-chip memory on an application-specific basis. On-chip memory in the form of cache can be made to act like scratch-pad memory via a novel hardware mechanism, which we call column caching. Column caching enables dynamic cache partitioning in software, by mapping data regions to specified sets of cache “columns” or “ways.” When a region of memory is exclusively mapped to an equivalently sized partition of the cache, column caching provides the same functionality and predictability as a dedicated scratchpad memory for time-critical parts of a real-time application. The ratio between scratchpad size and cache size can be easily and quickly varied for each application, or each task within an application. Thus, software has much finer control of on-chip memory, providing the ability to dynamically trade off performance for on-chip memory.
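
A sketch of the replacement-side behaviour of column caching: software maps address regions to a bit vector of permitted ways, and misses may only evict within those ways. The region granularity and mapping interface are illustrative assumptions, not the paper's hardware mechanism:

```python
import random

class ColumnCacheModel:
    """Set-associative cache where each address region carries a bit vector of
    permitted ways ("columns"). Lookups behave normally, but replacement is
    restricted to the permitted ways, so an exclusively mapped partition acts
    like a scratchpad for that region."""

    def __init__(self, num_sets=64, num_ways=4, line_size=32):
        self.num_sets, self.num_ways, self.line_size = num_sets, num_ways, line_size
        self.tags = [[None] * num_ways for _ in range(num_sets)]
        self.way_mask = {}                       # region id -> allowed-way bit vector

    def map_region(self, region, mask):
        self.way_mask[region] = mask             # e.g. 0b0001 pins a region to way 0

    def access(self, addr, region):
        line = addr // self.line_size
        index, tag = line % self.num_sets, line // self.num_sets
        ways = self.tags[index]
        if tag in ways:
            return True                          # hit: any way may hold the line
        mask = self.way_mask.get(region, (1 << self.num_ways) - 1)
        allowed = [w for w in range(self.num_ways) if mask & (1 << w)]
        empty = [w for w in allowed if ways[w] is None]
        ways[random.choice(empty or allowed)] = tag   # replace only inside the partition
        return False
```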

Journal ArticleDOI
TL;DR: The importance of different Web proxy workload characteristics in making good cache replacement decisions is analyzed, and results indicate that higher cache hit rates are achieved using size-based replacement policies.

Patent
26 Apr 2000
TL;DR: In this paper, the cache employs one or more prefetch ways for storing prefetch cache lines and one or more non-prefetch ways for storing accessed cache lines; cache lines fetched in response to cache misses for requests initiated by a microprocessor connected to the cache are stored into the non-prefetch ways.
Abstract: A cache employs one or more prefetch ways for storing prefetch cache lines and one or more ways for storing accessed cache lines. Prefetch cache lines are stored into the prefetch way, while cache lines fetched in response to cache misses for requests initiated by a microprocessor connected to the cache are stored into the non-prefetch ways. Accessed cache lines are thereby maintained within the cache separately from prefetch cache lines. When a prefetch cache line is presented to the cache for storage, the prefetch cache line may displace another prefetch cache line but does not displace an accessed cache line. A cache hit in either the prefetch way or the non-prefetch ways causes the cache line to be delivered to the requesting microprocessor in a cache hit fashion. The cache is further configured to move prefetch cache lines from the prefetch way to the non-prefetch way if the prefetch cache lines are requested (i.e. they become accessed cache lines). Instruction cache lines may be moved immediately upon access, while data cache line accesses may be counted and a number of accesses greater than a predetermined threshold value may occur prior to moving the data cache line from the prefetch way to the non-prefetch way. Additionally, movement of an accessed cache line from the prefetch way to the non-prefetch way may be delayed until the accessed cache line is to be replaced by a prefetch cache line.
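
A sketch of one cache set under this scheme, assuming a single prefetch way, FIFO demand replacement, and a promotion threshold of two accesses; these parameters are illustrative, not taken from the patent:

```python
class PrefetchWaySet:
    """One cache set with a dedicated prefetch way and several demand ways:
    prefetched lines never displace accessed lines, and a prefetched data line
    is promoted to a demand way only after enough accesses (instruction lines
    are promoted immediately)."""

    def __init__(self, num_demand_ways=3, promote_threshold=2):
        self.demand = []                      # accessed lines (FIFO for brevity)
        self.num_demand_ways = num_demand_ways
        self.prefetch = None                  # (tag, access_count) or None
        self.promote_threshold = promote_threshold

    def prefetch_line(self, tag):
        self.prefetch = (tag, 0)              # may displace only an older prefetch

    def access(self, tag, is_instruction=False):
        if tag in self.demand:
            return True
        if self.prefetch and self.prefetch[0] == tag:
            count = self.prefetch[1] + 1
            if is_instruction or count >= self.promote_threshold:
                self.prefetch = None
                self._install_demand(tag)     # becomes an accessed cache line
            else:
                self.prefetch = (tag, count)
            return True
        self._install_demand(tag)             # ordinary demand-miss fill
        return False

    def _install_demand(self, tag):
        if len(self.demand) >= self.num_demand_ways:
            self.demand.pop(0)
        self.demand.append(tag)
```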

Patent
07 Dec 2000
TL;DR: In this paper, the cache memory is partitioned among a set of threads of a multi-threaded processor, and when a cache miss occurs, a replacement line is selected in a partition of the cache space which is allocated to the particular thread from which the access causing the cache miss originated, thereby preventing pollution to partitions belonging to other threads.
Abstract: A method and apparatus which provides a cache management policy for use with a cache memory for a multi-threaded processor. The cache memory is partitioned among a set of threads of the multi-threaded processor. When a cache miss occurs, a replacement line is selected in a partition of the cache memory which is allocated to the particular thread from which the access causing the cache miss originated, thereby preventing pollution to partitions belonging to other threads.

Patent
14 Dec 2000
TL;DR: In this paper, a cache system and method in accordance with the invention includes a cache near the target devices and another cache at the requesting host side so that the data traffic across the computer network is reduced.
Abstract: A cache system and method in accordance with the invention includes a cache near the target devices and another cache at the requesting host side so that the data traffic across the computer network is reduced. Cache updating and invalidation methods are described.

Patent
09 May 2000
TL;DR: In this article, a technique for cache segregation utilizes logic for storing and communicating thread identification (TID) bits, which can be inserted at the most significant bits of the cache index.
Abstract: A processor includes logic (612) for tagging a thread identifier (TID) for usage with processor blocks that are not stalled. Pertinent non-stalling blocks include caches, translation look-aside buffers (TLB) (1258, 1220), a load buffer asynchronous interface, an external memory management unit (MMU) interface (320, 330), and others. A processor (300) includes a cache that is segregated into a plurality of N cache parts. Cache segregation avoids interference, 'pollution', or 'cross-talk' between threads. One technique for cache segregation utilizes logic for storing and communicating thread identification (TID) bits. The cache utilizes cache indexing logic. For example, the TID bits can be inserted at the most significant bits of the cache index.

Patent
20 Oct 2000
TL;DR: In this paper, a cluster server apparatus is presented that continues data distribution to terminals even if one of its plurality of cache servers becomes unable to carry out distribution, while optimally distributing loads across the plurality of cache servers.
Abstract: A cluster server apparatus operable to continue data distribution to terminals even if one of its plurality of cache servers becomes unable to carry out distribution, while optimally distributing loads across the plurality of cache servers. A cluster control unit of the cluster server apparatus distributes requests from the terminals based on the load of each of the plurality of cache servers. A cache server among the plurality of cache servers distributes requested data (streaming data) to a terminal if the requested data is stored in its streaming data storage unit, and otherwise distributes the requested data obtained from a content server. The data distributed from the content server is redundantly stored in the respective streaming data storage units of two or more cache servers. One cache server detects the state of distribution of another cache server that stores the same data as that stored in the one cache server. If the one cache server becomes unable to carry out distribution, the other cache server continues the data distribution instead.

Journal ArticleDOI
TL;DR: This work proposes sacrificing some performance in exchange for energy efficiency by filtering cache references through an unusually small first level cache, which results in a 51 percent reduction in the energy-delay product when compared to a conventional design.
Abstract: Most modern microprocessors employ one or two levels of on-chip caches in order to improve performance. Caches typically are implemented with static RAM cells and often occupy a large portion of the chip area. Not surprisingly, these caches can consume a significant amount of power. In many applications, such as portable devices, energy efficiency is more important than performance. We propose sacrificing some performance in exchange for energy efficiency by filtering cache references through an unusually small first level cache. We refer to this structure as the filter cache. A second level cache, similar in size and structure to a conventional first level cache, is positioned behind the filter cache and serves to mitigate the performance loss. Extensive experiments indicate that a small filter cache still can achieve a high hit rate and good performance. This approach allows the second level cache to be in a low power mode most of the time, thus resulting in power savings. The filter cache is particularly attractive in low power applications, such as the embedded processors used for communication and multimedia applications. For example, experimental results across a wide range of embedded applications show that a direct mapped 256-byte filter cache achieves a 58 percent power reduction while reducing performance by 21 percent. This trade-off results in a 51 percent reduction in the energy-delay product when compared to a conventional design.

Patent
03 Jul 2000
TL;DR: In this article, a cache management processor is coupled with a cache cache management memory by a second link to manipulate the cache management structure in a hash table with linked lists at each hash queue element in accordance with cache management command and search key.
Abstract: A system and method for managing data stored in a cache block in a cache memory includes a cache block is located at a cache block address in the cache memory, and the data in the cache block corresponds to a storage location in a storage array identified by a storage location identifier. A storage processor accesses the cache block in the cache memory and provides a cache management command to a command processor. A processor memory coupled to the storage processor stores a search key based on the storage location identifier corresponding to the cache block. A command processor coupled to the storage processor receives a cache management command specified by the storage processor and transfers the storage location identifier from the processor memory. A cache management memory stores a cache management structure including the cache block address and the search key. A cache management processor is coupled to the cache management memory by a second link to manipulate the cache management structure in a hash table with linked lists at each hash queue element within the cache management memory in accordance with the cache management command and the search key.

Journal ArticleDOI
TL;DR: The results show that Cache Investment can significantly improve the overall performance of a system and demonstrate the trade-offs among various alternative policies.
Abstract: Emerging distributed query-processing systems support flexible execution strategies in which each query can be run using a combination of data shipping and query shipping. As in any distributed environment, these systems can obtain tremendous performance and availability benefits by employing dynamic data caching. When flexible execution and dynamic caching are combined, however, a circular dependency arises: Caching occurs as a by-product of query operator placement, but query operator placement decisions are based on (cached) data location. The practical impact of this dependency is that query optimization decisions that appear valid on a per-query basis can actually cause suboptimal performance for all queries in the long run.To address this problem, we developed Cache Investment - a novel approach for integrating query optimization and data placement that looks beyond the performance of a single query. Cache Investment sometimes intentionally generates a “suboptimal” plan for a particular query in the interest of effecting a better data placement for subsequent queries. Cache Investment can be integrated into a distributed database system without changing the internals of the query optimizer. In this paper, we propose Cache Investment mechanisms and policies and analyze their performance. The analysis uses results from both an implementation on the SHORE storage manager and a detailed simulation model. Our results show that Cache Investment can significantly improve the overall performance of a system and demonstrate the trade-offs among various alternative policies.

Patent
27 Jun 2000
TL;DR: The "active cache" presented in this paper, for use by On-Line Analytic Processing (OLAP) systems, can not only answer queries that match data stored in the cache, but can also handle queries that require aggregation or other computation of the stored data.
Abstract: An “active cache”, for use by On-Line Analytic Processing (OLAP) systems, that can not only answer queries that match data stored in the cache, but can also answer queries that require aggregation or other computation of the data stored in the cache.

Patent
Achmed R. Zahir, Jeffrey Baxter
31 Mar 2000
TL;DR: In this paper, the authors propose program instructions that let software identify data unlikely to be used during further program execution, so that a processor can evict that data from a cache ahead of other eviction candidates, thus making better use of cache memory.
Abstract: Program instructions permit software management of a processor cache. The program instructions may permit a software designer to provide software deallocation hints identifying data that is not likely to be used during further program execution. The program instructions may permit a processor to evict the identified data from a cache ahead of other eviction candidates that are likely to be used during further program execution. Thus, these software hints provide for better use of cache memory.
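
A minimal sketch of how such a deallocation hint could interact with an otherwise LRU cache; the dealloc_hint interface is a hypothetical library call, whereas the patent describes processor instructions:

```python
from collections import OrderedDict

class HintedCache:
    """LRU cache with a software deallocation hint: hinted lines are evicted
    ahead of other candidates that are still likely to be used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()            # addr -> data, maintained in LRU order
        self.hinted = set()

    def dealloc_hint(self, addr):
        self.hinted.add(addr)                 # "not likely to be used again"

    def access(self, addr, data=None):
        if addr in self.lines:
            self.lines.move_to_end(addr)      # refresh LRU position on a hit
            return self.lines[addr]
        if len(self.lines) >= self.capacity:
            hinted_resident = [a for a in self.lines if a in self.hinted]
            victim = hinted_resident[0] if hinted_resident else next(iter(self.lines))
            self.lines.pop(victim)
            self.hinted.discard(victim)
        self.lines[addr] = data
        return data
```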

Patent
27 Dec 2000
TL;DR: In this paper, the authors proposed a data storage and retrieval technique that utilizes a cache which is preferred to a consumer of a data element stored within that cache, so that the consumer has less contention for access to the preferred cache vis-a-vis a cache of a conventional data storage system implementation.
Abstract: The invention is directed to data storage and retrieval techniques that utilize a cache which is preferred to a consumer of a data element stored within that cache. Since the cache is preferred to the consumer, the consumer has less contention for access to the preferred cache vis-a-vis a cache of a conventional data storage system implementation which is typically equally shared throughout the data storage system. Preferably, the preferred cache is on the same circuit board as the consumer so that memory accesses are on the order of a few hundred nanoseconds, rather than several microseconds when the cache and the consumer are on different circuit boards as in a conventional data storage implementation. One arrangement of the invention is directed to a data storage system having a first circuit board, a second circuit board and a connection mechanism that connects the first and second circuit boards together. The first circuit board includes (i) a front-end interface circuit for connecting to an external host, (ii) an on-board cache, and (iii) an on-board switch having a first port that connects to the front-end interface circuit, a second port that connects to the on-board cache, and a third port that connects to the connection mechanism. The second circuit board has a back-end interface circuit for connecting to a storage device. When the front-end interface circuit retrieves (on behalf of a host) a data element (e.g., a block of data) from the storage device through the on-board switch of the first circuit board, the connection mechanism and the back-end interface circuit of the second circuit board, the on-board cache of the first circuit board can retain a copy of the data element for quick access in the future. By configuring the on-board cache to be preferred to the front-end interface circuit and because both the on-board cache and the front-end interface circuit reside on the first circuit board, when the front-end interface circuit accesses the copy of the data element in the on-board cache, there will be less contention and latency compared to that for a highly shared cache of a conventional data storage system implementation.

Patent
Le Trong Nguyen
10 Oct 2000
TL;DR: In this paper, the authors propose an integrated digital signal processor (IDS) architecture which includes a cache subsystem, a first bus and a second bus, which provides caching and data routing for the processors.
Abstract: To achieve high performance at low cost, an integrated digital signal processor uses an architecture which includes both a general purpose processor and a vector processor. The integrated digital signal processor also includes a cache subsystem, a first bus and a second bus. The cache subsystem provides caching and data routing for the processors and buses. Multiple simultaneous communication paths can be used in the cache subsystem for the processors and buses. Furthermore, simultaneous reads and writes are supported to a cache memory in the cache subsystem.

Proceedings ArticleDOI
08 Jan 2000
TL;DR: This paper presents a method of decompressing programs using software that relies on using a software-managed instruction cache under control of the decompressor, and considers selective compression (determining which procedures in a program should be compressed).
Abstract: Compressed representations of programs can be used to improve the code density in embedded systems. Several hardware decompression architectures have been proposed recently. In this paper, we present a method of decompressing programs using software. It relies on using a software-managed instruction cache under control of the decompressor. This is achieved by employing a simple cache management instruction that allows explicit writing into a cache line. We also consider selective compression (determining which procedures in a program should be compressed) and show that selection based on cache miss profiles can substantially outperform the usual execution time based profiles for some benchmarks.

Patent
Bernd Lamberts
14 Sep 2000
TL;DR: A cooperative disk cache management and rotational positioning optimization (RPO) method for a data storage device such as a disk drive makes cache decisions that decrease the total access times for all data.
Abstract: A cooperative disk cache management and rotational positioning optimization (RPO) method for a data storage device, such as a disk drive, makes cache decisions that decrease the total access times for all data. The cache memory provides temporary storage for data either to be written to disk or that has been read from disk. Data access times from cache are significantly lower than data access times from the storage device, and it is advantageous to store in cache data that is likely to be referenced again. For each data block that is a candidate to store in cache, a cost function is calculated and compared with analogous cost functions for data already in cache. The data having the lowest cost function is removed from cache and replaced with data having a higher cost function. The cost function C measures the expected additional cost, in time, of not storing the data in cache, and is given by C = (T_d − T_c)·P, where T_d is the disk access time, T_c is the cache access time, and P is an access probability for the data. Access times are calculated according to an RPO algorithm that includes both seek times and rotational latencies.
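
A small sketch of the cost-function comparison, with C = (T_d − T_c)·P computed per block; the block parameters and the admit-versus-replace helper are illustrative assumptions:

```python
def cache_worth(t_disk, t_cache, p_access):
    """Expected extra cost, in seconds, of NOT keeping a block in cache:
    C = (T_d - T_c) * P. T_d would come from the RPO model (seek time plus
    rotational latency), T_c is the cache access time, P an access probability."""
    return (t_disk - t_cache) * p_access

def consider_for_cache(candidate_id, candidate_params, cached):
    """Replace the lowest-worth resident block when the candidate is worth more.
    `cached` maps block id -> (t_disk, t_cache, p_access); values are illustrative."""
    if not cached:
        cached[candidate_id] = candidate_params
        return
    victim = min(cached, key=lambda b: cache_worth(*cached[b]))
    if cache_worth(*candidate_params) > cache_worth(*cached[victim]):
        del cached[victim]
        cached[candidate_id] = candidate_params

# Example: an 8 ms disk access vs. a 0.1 ms cache access with P = 0.3 gives
# C = (0.008 - 0.0001) * 0.3 ≈ 2.4 ms of expected saving per reference.
```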

Proceedings ArticleDOI
06 Nov 2000
TL;DR: " # %$& ' ( ) * + ! ,*./ *&) 0 12 3 4 ! 5 ! 6 $/ &78 + 9 " 4* $/ 6 :; 6 < %*=(, < * ! 5 > 5< 1? @ A ! B CD 5 ! ' ; 5< @E' F *G ? C 9$/ 0:& 0 HCD 6 7I + ! "JLK#."
Abstract: " # %$& ' ( ) * + ! ,*./ *&) 0 12 3 4 ! 5 ! 6 $/ &78 + 9 " 4* $/ 6 :; 6 < %*=(, < * ! 5 > 5< 1? @ A ! B CD 5 ! ' ; 5< @E' F *G ? C 9$/ 0:& 0 HCD 6 7I + ! "JLK# $/ 6 ( 0 M !73 < N ;* ! 5 5< 1O P< @E' @ 2 < Q @ 6 5 9 # 6 * 6 @1 *+78 ! C <&) *F/ * ! 0 O 0: 0 6 J RS T ;$" 9 @ U V 6 (W* ! 5 C=* 9 ; ! ,*?7I E% ' N 6 @ ' ;* " P @ (Q < F* 1" G !7 6 @ ' X ! P< Y + 1 ' Z O " ) *& + * -& / * ! 0 X 0: 0 6 -D 6 " Q P< 1 "J. [ ; < @ .C, !CD 6 (, TC& "CD " O ! 4 P\ ' 1" )]*A 5< C ^ ! ' QCD " 6:'( _%`Qa&bc(,78 ! 3 " &) * + ! ,* -& / * ! 0 JG [ L_%`Qa&bc(X < M @ &d, N 7 * ! 5 F V (.* ! 5 A 6) $ &*& @ :'( ! eC& "-, ![:O *3 C=* 9 W7^ E' 6: ) @*& 6 *f "1" P < 6 JOg% @ 9 @ )]-, ! *;CD 6 7I + O $" ! , ! < 2 < ! + < ;C& "CD *?CD " 6:h 0 5 ' @ :i " & CD 6 78 ! < 3 6j& 0 @ 1k 0 5 ! 61" J 1. INTRODUCTION li < 6 " ' & " m*& C :& ' W 7= ! 6 @ n ! ,*k 6 @ ^ 9 Y 0:& 0) 6 ( 3 " C 1T *M ! ! 3-D 6 " 1 ! f CD 5 ' .C, 9 # !7X " & # 78 "J.oW / "*& 0 )]*f 7I + ! *& , ! p(D78 ! 3 # !C, !:; + T C; 2 4 ! 5 ! : % -D 6 + !7 6 (Q<, ! 2-D 6 " 4 q< ! + ! 5 < N C ~q 4 < N P $/ 6 Jh [ CD "(, < 6 $/ 6 Q* ! < 3 E' 6 0 *;* 9 5 k < ' # < 1 <, $/ f*& §= 6 ' V JG} + 6j C "(m < ;*& ' ' V " A < Y 6-41 P 5 :4 5 ! 1 #7I ̈ $" 6 5 e~% )]-':& 6 Q 2 $') 6 5 ! D 1/ 9) -':% Jeg& " . 7I + ! p(% 5<+ m / "* 5 !\k # ! ,* 0 ' 5~kC (/ + :T 0 5 ' : 5<, ! 1" 9$/ 6 W "JX”. @ &(% "@ ' + : 7^ E' ' :+1" T k* 6 *2 0 5 ! $/ @ ' 5 ! : € f $/ CD Y 6 P…3 * k ;7 ! @ & "JAyY< 6 +78 6 ! k + ~" < * 6 @1 h !7S ' ! P< N + ! , 1 ' 4 5<, ! @ 1" ; { " &) * + ! ,*+/ * ! 0 J [ q < C, !CD 6 ( { 6\ ' 1/ &) *{ ! P< 2 C @ 6 % CD " 6:'(k_%`Qa&bc( fC CD " *G78 f A ) * + ! ,*r" "*%) ! 0 JB L_&`.a&bc(W < f d, 6 ; 7 * ! 5 † © V (#* ! 5 F P) $ e*& @ :'(D* ! 5 2 6 SC& "-, ![:A ! ,*; C=* ! 37I E% P: e " @* 6 *=(' ! ,*O Q1/ ! T78 6 T < 5< ' 1 5 9 < 6 7 ! 6) Q "1" P < 6 # @ S* $/ 6 @ CD *cJ#yZ 2 $ , 9 O < TCD 6 7I + ! O 7 < +C& "CD " *{ 5< + C @ 6 ' CD @ P:'( f 6 7. @ 9) " ; Pj CD P @ 6 % @ ,*& 6 * k C, 9 _&`.a&bM <; ! < 6 5 "*& @ , ! e 5< O C @ 6 ' #CD @ 6 @ 6 (Z J "J (Za «+b4 ! ,*4a «+bp ¬;a ­®s uPw¢J ̄yY< A ^ 9 " ? N < ° <, 9 2 < fC CD " * CD " 6:f_%`Qa&b+ 0 5 ' @ : CD 6 7I < Q 6j& 0 1OCD " J ± T Y ! ~{* §= 6 7^ " 2 < NC& $% @ ~¥ + ! @ :q > Y CD 6 6 3 u …c ! P< m C ^ ! ' p 0 ,*& @ * 3 < W % Pj& Z 7 " &) * 6 + ,*3-& / "*& 0 ́%t"…c 0 6 X " T* 9 5 Q V "(/* ! 5 C=* ! * * 6 6 " p(& + * " 0 m 7c < C $% " m ~D(' Y 6 Permission to make digital or hard copies of part or all of this work or personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. CIKM 2000, McLean, VA USA © ACM 2000 1-58113-320-0/00/11 . . .$5.00