Showing papers on "Cache published in 1997"

PDF

Open Access

Patent•

Unified re-map and cache-index table with dual write-counters for wear-leveling of non-volatile flash RAM mass storage

[...]

Ricardo H. Bruce, Rolando H. Bruce, Earl T. Cohen, Allan J. Christie

25 Aug 1997

TL;DR: In this article, a unified re-map table in a RAM is used to arbitrarily remap all logical addresses from a host system to physical addresses of flash-memory devices, and wear-leveling is performed on a block being written when both total and incremental counts exceed system-wide total and incrementally thresholds.

...read moreread less

Abstract: A flash-memory system provides solid-state mass storage as a replacement to a hard disk. A unified re-map table in a RAM is used to arbitrarily re-map all logical addresses from a host system to physical addresses of flash-memory devices. Each entry in the unified re-map table contains a physical block address (PBA) of the flash memory allocated to the logical address, and a cache valid bit and a cache index. When the cache valid bit is set, the data is read or written to a line in the cache pointed to by the cache index. A separate cache tag RAM is not needed. When the cache valid bit is cleared, the data is read from the flash memory block pointed to by the PBA. Two write count values are stored with the PBA in the table entry. A total-write count indicates a total number of writes to the flash block since manufacture. An incremental-write count indicates the number of writes since the last wear-leveling operation that moved the block. Wear-leveling is performed on a block being written when both total and incremental counts exceed system-wide total and incremental thresholds. The incremental-write count is cleared after a block is wear-leveled, but the total-write count is never cleared. The incremental-write count prevents moving a block again immediately after wear-leveling. The thresholds are adjusted as the system ages to provide even wear.

...read moreread less

592 citations

Proceedings Article•DOI•

Prefetching using Markov predictors

[...]

Doug Joseph¹, Dirk Grunwald²•Institutions (2)

IBM¹, University of Colorado Boulder²

01 May 1997

TL;DR: The Markov prefetcher acts as an interface between the on-chip and off-chip cache, and can be added to existing computer designs and reduces the overall execution stalls due to instruction and data memory operations by an average of 54% for various commercial benchmarks while only using two thirds the memory of a demand-fetch cache organization.

...read moreread less

Abstract: Prefetching is one approach to reducing the latency of memory operations in modern computer systems. In this paper, we describe the Markov prefetcher. This prefetcher acts as an interface between the on-chip and off-chip cache, and can be added to existing computer designs. The Markov prefetcher is distinguished by prefetching multiple reference predictions from the memory subsystem, and then prioritizing the delivery of those references to the processor.This design results in a prefetching system that provides good coverage, is accurate and produces timely results that can be effectively used by the processor. In our cycle-level simulations, the Markov Prefetcher reduces the overall execution stalls due to instruction and data memory operations by an average of 54% for various commercial benchmarks while only using two thirds the memory of a demand-fetch cache organization.

...read moreread less

567 citations

Journal Article•DOI•

Continuous profiling: where have all the cycles gone?

[...]

Jennifer M. Anderson, Lance M. Berc, Jeffrey Dean, Sanjay Ghemawat, Monika Henzinger, Shun-Tak Albert Leung, Richard L. Sites, Mark T. Vandevoorde, Carl A. Waldspurger, William E. Weihl - Show less +6 more

01 Oct 1997

TL;DR: The Digital Continuous Profiling Infrastructure is a sampling-based profiling system designed to run continuously on production systems, supporting multiprocessors, works on unmodified executables, and collects profiles for entire systems, including user programs, shared libraries, and the operating system kernel.

...read moreread less

Abstract: This article describes the Digital Continuous Profiling Infrastructure, a sampling-based profiling system designed to run continuously on production systems. The system supports multiprocessors, works on unmodified executables, and collects profiles for entire systems, including user programs, shared libraries, and the operating system kernel. Samples are collected at a high rate (over 5200 samples/sec. per 333MHz processor), yet with low overhead (1–3% slowdown for most workloads). Analysis tools supplied with the profiling system use the sample data to produce a precise and accurate accounting, down to the level of pipeline stalls incurred by individual instructions, of where time is bring spent. When instructions incur stalls, the tools identify possible reasons, such as cache misses, branch mispredictions, and functional unit contention. The fine-grained instruction-level analysis guides users and automated optimizers to the causes of performance problems and provides important insights for fixing them.

...read moreread less

545 citations

Proceedings Article•DOI•

The filter cache: an energy efficient memory structure

[...]

Johnson Kin¹, Munish Gupta¹, William H. Mangione-Smith¹•Institutions (1)

University of California, Los Angeles¹

01 Dec 1997

TL;DR: Experimental results across a wide range of embedded applications show that the filter cache results in improved memory system energy efficiency, and this work proposes to trade performance for power consumption by filtering cache references through an unusually small L1 cache.

...read moreread less

Abstract: Most modern microprocessors employ one or two levels of on-chip caches in order to improve performance. These caches are typically implemented with static RAM cells and often occupy a large portion of the chip area. Not surprisingly, these caches often consume a significant amount of power. In many applications, such as portable devices, low power is more important than performance. We propose to trade performance for power consumption by filtering cache references through an unusually small L1 cache. An L2 cache, which is similar in size and structure to a typical L1 cache, is positioned behind the filter cache and serves to reduce the performance loss. Experimental results across a wide range of embedded applications show that the filter cache results in improved memory system energy efficiency. For example, a direct mapped 256-byte filter cache achieves a 58% power reduction while reducing performance by 21%, corresponding to a 51% reduction in the energy-delay product over conventional design.

...read moreread less

544 citations

Patent•

Recovery from failure of a data processor in a network server

[...]

Uresh K. Vahalia¹, Uday K. Gupta¹, Dennis Ting¹•Institutions (1)

EMC Corporation¹

26 Sep 1997

TL;DR: In this paper, a network file server includes a first set of data processors for receiving requests from clients, and a second set of file processors for accessing read-write file systems.

...read moreread less

Abstract: A network file server includes a first set of data processors for receiving requests from clients, and a second set of data processors for accessing read-write file systems. A respective data processor in the second set is assigned to each file system for exclusive management of locks on the file system. The file server can detect failure of a failed data processor and automatically recover from the failure. When a failure of a data processor in the first set is detected, a spare data processor is programmed with the logical and physical network addresses of the failed data processor so that the spare data processor assumes the network identity of the failed data processor. When a failure of a data processor in the second set is detected, responsibility for management of the locks on each file system managed by the failed data processor is transferred to an operational data processor. Preferably the responsibility is transferred to the operational data processors in such a way as to balance loading on the operational data processors. The data processors can be commodity digital computers for low cost, and a cached disk storage subsystem or file system caches and remote dual copy techniques can be used to ensure high performance and high data availability.

...read moreread less

481 citations

Journal Article•DOI•

A single-chip multiprocessor

[...]

Basem A. Nayfeh¹, Kunle Olukotun•Institutions (1)

Stanford University¹

01 Sep 1997-IEEE Computer

TL;DR: Presents the case for billion-transistor processor architectures that will consist of chip multiprocessors (CMPs): multiple (four to 16) simple, fast processors on one chip, and all processors share a larger level-two cache.

...read moreread less

Abstract: Presents the case for billion-transistor processor architectures that will consist of chip multiprocessors (CMPs): multiple (four to 16) simple, fast processors on one chip. In their proposal, each processor is tightly coupled to a small, fast, level-one cache, and all processors share a larger level-two cache. The processors may collaborate on a parallel job or run independent tasks (as in the SMT proposal). The CMP architecture lends itself to simpler design, faster validation, cleaner functional partitioning, and higher theoretical peak performance. However for this architecture to realize its performance potential, either programmers or compilers will have to make code explicitly parallel. Old ISAs will be incompatible with this architecture (although they could run slowly on one of the small processors).

...read moreread less

434 citations

Patent•

Data transmission over the internet using a cache memory file system

[...]

Jr. James A. Aviani¹•Institutions (1)

Cisco Systems, Inc.¹

25 Sep 1997

TL;DR: In this article, a method for storing a plurality of multimedia objects in a cache memory is described, where first ones of the multimedia objects are written into the cache memory sequentially from the beginning of the cache cache memory in the order in which they are received.

...read moreread less

Abstract: A method for storing a plurality of multimedia objects in a cache memory is described. First ones of the multimedia objects are written into the cache memory sequentially from the beginning of the cache memory in the order in which they are received. When a first memory amount from a most recently stored one of the first multimedia objects to the end of the cache memory is insufficient to accommodate a new multimedia object, the new multimedia object is written from the beginning of the cache memory, thereby writing over a previously stored one of the first multimedia objects. Second ones of the multimedia objects are then written into the cache memory sequentially following the new multimedia object in the order in which they are received, thereby writing over the first ones of the multimedia objects. This cycle is repeated, thereby maintaining a substantially full cache memory.

...read moreread less

398 citations

Patent•

Multi-state non-volatile flash memory capable of being its own two state write cache

[...]

Douglas J. Lee¹, Jian Chen¹•Institutions (1)

SanDisk¹

30 Jul 1997

TL;DR: In this article, a memory system including an array of flash EEPROM cells arranged in blocks of cells that are erasable together, with individual cells storing more than one bit of data as a result of operating the individual cells with more than two detectable threshold ranges or states.

...read moreread less

Abstract: A memory system including an array of flash EEPROM cells arranged in blocks of cells that are erasable together, with individual cells storing more than one bit of data as a result of operating the individual cells with more than two detectable threshold ranges or states. Any portion of the array in which data is not stored can be used as a write cache, where individual ones of the cells store a single bit of data by operating with only two detectable threshold ranges. Data coming into the memory is initially written in available blocks in two states since writing in more than two states takes significantly more time. At a later time, in the background, the cached data is read, compressed and written back into fewer blocks of the memory in multi-state for longer term storage at a reduced cost.

...read moreread less

375 citations

Proceedings Article•DOI•

Knowledge discovery from users Web-page navigation

[...]

Cyrus Shahabi¹, Amir M. Zarkesh, Jafar Adibi¹, V. Shah¹•Institutions (1)

University of Southern California¹

07 Apr 1997

TL;DR: A novel path clustering method based on the similarity of the history of user navigation which is capable of capturing the interests of the user which could persist through several subsequent hypertext link selections is introduced.

...read moreread less

Abstract: The authors propose to detect users' navigation paths to the advantage of Web site owners. First, they explain the design and implementation of a profiler which captures a client's selected links and page order, accurate page viewing time and cache references, using a Java based remote agent. The information captured by the profiler is then utilized by a knowledge discovery technique to cluster users with similar interests. They introduce a novel path clustering method based on the similarity of the history of user navigation. This approach is capable of capturing the interests of the user which could persist through several subsequent hypertext link selections. Finally, they evaluate their path clustering technique via a simulation study on a sample WWW site. They show that, depending on the level of inserted noise, they can recover the correct clusters by 10%-27% of average error margin.

...read moreread less

375 citations

Proceedings Article•DOI•

Potential benefits of delta encoding and data compression for HTTP

[...]

Jeffrey C. Mogul, Fred Douglis¹, Anja Feldmann¹, Balachander Krishnamurthy¹•Institutions (1)

AT&T Labs¹

01 Oct 1997

TL;DR: It is shown that delta encoding can provide remarkable improvements in response size and response delay for an important subset of HTTP content types, and that the combination of delta encoding and data compression yields the best results.

...read moreread less

Abstract: Caching in the World Wide Web currently follows a naive model, which assumes that resources are referenced many times between changes. The model also provides no way to update a cache entry if a resource does change, except by transferring the resource's entire new value. Several previous papers have proposed updating cache entries by transferring only the differences, or "delta," between the cached entry and the current value.In this paper, we make use of dynamic traces of the full contents of HTTP messages to quantify the potential benefits of delta-encoded responses. We show that delta encoding can provide remarkable improvements in response size and response delay for an important subset of HTTP content types. We also show the added benefit of data compression, and that the combination of delta encoding and data compression yields the best results.We propose specific extensions to the HTTP protocol for delta encoding and data compression. These extensions are compatible with existing implementations and specifications, yet allow efficient use of a variety of encoding techniques.

...read moreread less

357 citations

Proceedings Article•DOI•

Analytical energy dissipation models for low-power caches

[...]

Milind B. Kamble¹, Kanad Ghose¹•Institutions (1)

Binghamton University¹

01 Aug 1997

TL;DR: In this article, the authors present detailed analytical models for estimating the energy dissipation in conventional caches as well as low energy cache architectures using run time statistics such as hit/miss counts, fraction of read/write requests and assume stochastical distributions for signal values.

...read moreread less

Abstract: We present detailed analytical models for estimating the energy dissipation in conventional caches as well as low energy cache architectures. The analytical models use the run time statistics such as hit/miss counts, fraction of read/write requests and assume stochastical distributions for signal values. These models are validated by comparing the power estimated using these models against the power estimated using a detailed simulator called CAPE (CAache Power Estimator). The analytical models for conventional caches are found to be accurate to within 2% error. However, these analytical models over-predict the dissipations of low-power caches by as much as 30%. The inaccuracies can be attributed to correlated signal values and locality of reference, both of which are exploited in making some cache organizations energy efficient.

...read moreread less

Patent•

Distributed database using indexed into tags to tracks events according to type, update cache, create virtual update log on demand

[...]

Stephen P. W. Draper¹, Dale A Lowry•Institutions (1)

Novell¹

15 Oct 1997

TL;DR: In this paper, methods and systems for synchronizing local copies of a distributed database, such as a master copy and a partial copy stored in a replica or in a cache, are described.

...read moreread less

Abstract: Methods and systems are provided for synchronizing local copies of a distributed database, such as a master copy and a partial copy stored in a replica or in a cache. Each data item in the database has an associated timestamp or other tag. An index into the tags in maintained. The tag index may be used to create an event list to reduce the time and bandwidth needed to synchronize the local copies. The tag index may also be used to create a virtual update log, thereby removing the need to maintain one or more physical logs recording the history of the copies.

...read moreread less

Proceedings Article•DOI•

ProfileMe: hardware support for instruction-level profiling on out-of-order processors

[...]

Jeffrey Dean, James W. Hicks, Carl A. Waldspurger, William E. Weihl, George Z. Chrysos - Show less +1 more

01 Dec 1997

TL;DR: An inexpensive hardware implementation of ProfileMe is described, a variety of software techniques to extract useful profile information from the hardware are outlined, and several ways in which this information can provide valuable feedback for programmers and optimizers are explained.

...read moreread less

Abstract: Profile data is valuable for identifying performance bottlenecks and guiding optimizations. Periodic sampling of a processor's performance monitoring hardware is an effective, unobtrusive way to obtain detailed profiles. Unfortunately, existing hardware simply counts events, such as cache misses and branch mispredictions, and cannot accurately attribute these events to instructions, especially on out-of-order machines. We propose an alternative approach, called ProfileMe, that samples instructions. As a sampled instruction moves through the processor pipeline, a detailed record of all interesting events and pipeline stage latencies is collected. ProfileMe also supports paired sampling, which captures information about the interactions between concurrent instructions, revealing information about useful concurrency and the utilization of various pipeline stages while an instruction is in flight. We describe an inexpensive hardware implementation of ProfileMe, outline a variety of software techniques to extract useful profile information from the hardware, and explain several ways in which this information can provide valuable feedback for programmers and optimizers.

...read moreread less

Proceedings Article•DOI•

The performance of μ-kernel-based systems

[...]

Hermann Härtig¹, Michael Hohmuth¹, Jochen Liedtke², Sebastian Schönberg¹, J. Wolter¹ - Show less +1 more•Institutions (2)

Dresden University of Technology¹, IBM²

01 Oct 1997

TL;DR: In this paper, a second-generation μ-kernel, L4, has been compared with the first-generation Mach-derived μ-kernels in terms of performance and extensibility.

...read moreread less

Abstract: First-generation μ-kernels have a reputation for being too slow and lacking sufficient flexibility. To determine whether L4, a lean second-generation μ-kernel, has overcome these limitations, we have repeated several earlier experiments and conducted some novel ones. Moreover, we ported the Linux operating system to run on top of the L4 μ-kernel and compared the resulting system with both Linux running native, and MkLinux, a Linux version that executes on top of a first-generation Mach-derived μ-kernel. For L 4 Linux, the AIM benchmarks report a maximum throughput which is only 5% lower than that of native Linux. The corresponding penalty is 5 times higher for a co-located in-kernel version of MkLinux, and 7 times higher for a user-level version of MkLinux. These numbers demonstrate both that it is possible to implement a high-performance conventional operating system personality above a μ-kernel, and that the performance of the μ-kernel is crucial to achieve this. Further experiments illustrate that the resulting system is highly extensible and that the extensions perform well. Even real-time memory management including second-level cache allocation can be implemented at user-level, coexisting with L 4 Linux.

...read moreread less

Patent•

Performance optimizations for computer networks using http

[...]

Chandrashekhar W. Bhide, Jagdeep Singh, Don Oestreicher

20 Nov 1997

TL;DR: In this paper, a connection cache is maintained by an agent on the network access equipment (254) to more quickly respond to request for network connections to the server (256), and the agent may maintian a cache of information to more frequently respond to requests to get an object if it has been modified.

...read moreread less

Abstract: Systems and methods of increasing the performance of computer networks (208), especially networks connecting users to the Web, are provided. Performance is increased by reducing the latency the client (252) experiences between sending a request to the server (256) and receiving a response. A connection cache may be maintained by an agent (260) on the network access equipment (254) to more quickly respond to request for network connections to the server (256). Additionally, the agent may maintian a cache of information to more quickly respond to requests to get an object if it has been modified. These enhancements and other described herein may be implemented singly or in conjunction to reduce the latency involved in sending the requests to the server by saving round-trip times between computer network components.

...read moreread less

Patent•

User name authentication for gateway clients accessing a proxy cache server

[...]

Mark L. Shapiro¹, Anand Subramaniam¹, Muthukumar Muthumavadi¹•Institutions (1)

Novell¹

01 Aug 1997

TL;DR: In this paper, a system and method for regulating access to a proxy cache server residing on an institutional intranet or local network provides a directory for storing user names that are appended to client requests for remote web site information.

...read moreread less

Abstract: A system and method for regulating access to a proxy cache server residing on an institutional intranet or local network provides a directory for storing user names that are appended to client requests for remote web site information. The proxy cache server reads the appended requests and either accepts or denies access to the requested information based upon predetermined access control guidelines relative to the specific user name. The access control guidelines can be stored on the directory, and down-loaded to the proxy cache server's memory as needed. The proxy cache server stores and retrieves requested site information via the Internet, but only retrieves and delivers requested site information to clients if authorization is approved.

...read moreread less

Patent•

System and method for accelerated query evaluation of very large full-text databases

[...]

Graham Spencer

13 Jun 1997

TL;DR: In this paper, a system, method, and various software products provide for improved information retrieval in very large document databases through the use of a predetermined static cache including for terms that appear in a large number of documents, a plurality of documents ordered by a contribution that the term makes to the document score of the document.

...read moreread less

Abstract: A system, method, and various software products provide for improved information retrieval in very large document databases through the use of a predetermined static cache. The static cache includes for terms that appear in a large number of documents, a plurality of documents ordered by a contribution that the term makes to the document score of the document. The contribution is a scalar measure of the influence of the term in the computed document score. The contribution reflects both the within document frequency and the between document frequency of the term. In addition, the static cache includes for each term a lookup table that references selected entries for the term in an inverted index. Queries to the database are then processed by first traversing the static cache and obtaining the contribution information thereform and computing the document score from this information. Additional term frequency information for other terms in the query is obtained by looking up the document in the lookup tables of the other query terms, and obtaining the term frequency information for such terms from the inverted index, or by searching the contribution caches of the query terms.

...read moreread less

Patent•

Shared client-side web caching using globally addressable memory

[...]

John B. Carter, Scott H. Davis, Daniel J. Dietterich, Steven J. Frank, Hsin H. Lee - Show less +1 more

02 May 1997

TL;DR: In this paper, a shared client-side Web cache is provided by implementing a file system shared between nodes, where each browser application stores cached data in files stored in a globally addressable data store.

...read moreread less

Abstract: A shared client-side Web cache is provided by implementing a file system shared between nodes. Each browser application stores cached data in files stored in a globally addressable data store. Since the file system is a shared one, the client-side Web caches are also shared.

...read moreread less

Proceedings Article•DOI•

Improving data cache performance by pre-executing instructions under a cache miss

[...]

James D. Dundas¹, Trevor Mudge¹•Institutions (1)

University of Michigan¹

11 Jul 1997

Patent•

Determining how changes to underlying data affect cached objects

[...]

James R. H. Challenger¹, Paul M. Dantzig¹, Arun Iyengar¹, Gerald A. Spivak¹•Institutions (1)

IBM¹

01 Aug 1997

TL;DR: In this article, an object dependence graph (ODG) is used to represent the data dependencies between objects, which can be used to construct and maintain objects to associate changes in remote data with cached objects.

...read moreread less

Abstract: A determination can be made of how changes to underlying data affect the value of objects. Examples of applications include: caching dynamic Web pages; client-server applications whereby a server sending objects (which are changing all the time) to multiple clients can track which versions are sent to which clients and how obsolete the versions are; and any situation where it is necessary to maintain and uniquely identify several versions of objects, update obsolete objects, quantitatively assess how different two versions of the same object are, and/or maintain consistency among a set of objects. A directed graph, called an object dependence graph, may be used to represent the data dependencies between objects. Another aspect is constructing and maintaining objects to associate changes in remote data with cached objects. If data in a remote data source changes, database change notifications are used to "trigger" a dynamic rebuild of associated objects. Thus, obsolete objects can be dynamically replaced with fresh objects. The objects can be complex objects, such as dynamic Web pages or compound-complex objects, and the data can be underlying data in a database. The update can include either: storing a new version of the object in the cache; or deleting an object from the cache. Caches on multiple servers can also be synchronized with the data in a single common database. Updated information, whether new pages or delete orders, can be broadcast to a set of server nodes, permitting many systems to simultaneously benefit from the advantages of prefetching and providing a high degree of scaleability.

...read moreread less

Patent•

System and method for predictive caching of information pages

[...]

Craig Henry Becker¹, Andrew Frank-Loron¹, James G. McLean¹, Clifford A. Pickover¹•Institutions (1)

IBM¹

07 May 1997

TL;DR: In this article, the predicted-to-be selected page is added to a local cache of the requested pages in the client, and the client can update the appearance of the link to indicate to the user that the page represented by that link is available in the local cache.

...read moreread less

Abstract: A computer, e.g. a server or computer operated by a network provider sends one or more requesting computers (clients) a most likely predicted-to-be selected (predicted) page of information by determining a preference factor for this page based on one or more pages that are requested by the client. This page is added to a local cache of predicted-to-be-selected pages in the client. Once the predicted-to-be selected page is in the cache, the client can update the appearance of the link (i.e. by changing the color or otherwise changing the appearance of the link indicator) to indicate to the user that the page represented by that link is available in the local cache.

...read moreread less

Proceedings Article•DOI•

Maintaining strong cache consistency in the World-Wide Web

[...]

Chenjie Liu¹, Pei Cao¹•Institutions (1)

University of Wisconsin-Madison¹

27 May 1997

TL;DR: It is shown that, contrary to popular belief, strong cache consistency can be maintained for the Web with little or no extra cost than the current weak consistency approaches, and it should be maintained using an invalidation based protocol.

...read moreread less

Abstract: As the Web continues to explode in size, caching becomes increasingly important. With caching comes the problem of cache consistency. Conventional wisdom holds that strong cache consistency is too expensive for the Web, and weak consistency methods such as Time-To-Live (TTL) are most appropriate. The article compares three consistency approaches: adaptive TTL, polling-every-time, and invalidation, using prototype implementation and trace replay in a simulated environment. Our results show that invalidation generates less or a comparable amount of network traffic and server workload than adaptive TTL and has a slightly lower average client response time, while polling-every-time generates more network traffic and longer client response times. We show that, contrary to popular belief, strong cache consistency can be maintained for the Web with little or no extra cost than the current weak consistency approaches, and it should be maintained using an invalidation based protocol.

...read moreread less

Journal Article•DOI•

Proxy caching that estimates page load delays

[...]

Roland P. Wooster¹, Marc D. Abrams¹•Institutions (1)

Virginia Tech¹

01 Sep 1997

TL;DR: Two new caching algorithms that keep in the cache documents that take the longest to retrieve are explored, and are compared to the best three existing policies—LRU, LFU, and SIZE—using three measures-user response time and ability to minimize Web server loads and network bandwidth consumed.

...read moreread less

Abstract: Do users wait less if proxy caches incorporate estimates of the current network conditions into document replacement algorithms? To answer this, we explore two new caching algorithms: (1) keep in the cache documents that take the longest to retrieve; and (2) use a hybrid of several factors, trying to keep in the cache documents from servers that take a long time to connect to, that must be loaded over the slowest Internet links, that have been referenced the most frequently, and that are small. The algorithms work by estimating the Web page download delays or proxy-to-Web server bandwidth using recent page fetches. The new algorithms are compared to the best three existing policies—LRU, LFU, and SIZE—using three measures-user response time and ability to minimize Web server loads and network bandwidth consumed—on workloads from Virginia Tech and Boston University.

...read moreread less

Patent•

Database system with methods for improving query performance with cache optimization strategies

[...]

Brijesh Agarwal¹•Institutions (1)

Sybase¹

28 Feb 1997

TL;DR: In this article, an Optimizer communicates with a Buffer Manager before it formulates the query plan, and the Optimizer formulates a query strategy or plan with "hints", which are ultimately passed to the Cache or Buffer Manager.

...read moreread less

Abstract: Database system and methods are described for improving execution speed of database queries (e.g., for transaction processing and for decision support) by optimizing use of buffer caches. The system includes an Optimizer for formulating an optimal strategy for a given query. More particularly, the Optimizer communicates with a Buffer Manager before it formulates the query plan. For instance, the Optimizer may query the Buffer Manager for the purpose of determining whether the object of interest (e.g., table or index to be scanned) exists in its own buffer cache (i.e., whether it has been bound to a particular named cache). If the object exists in its own cache, the Optimizer may inquire as to how much of the cache (i.e., how much memory) the object requires, together with the optimal I/O size for the cache (e.g., 16K blocks). Based on this information, the Optimizer formulates a query strategy or plan with "hints," which are ultimately passed to the Cache or Buffer Manager. By formulating "hints" for the Buffer Manager at the level of the Optimizer, knowledge of the query is, in effect, passed down to the Buffer Manager so that it may service the query using an optimal caching strategy--one based on the dynamics of the query itself. Based on the "hints" received from the Optimizer, the Buffer Manager can fine tune input/output (i.e., cache management) for the query. Specific Optimizer strategies are described for each scan method available to the system, including heap scan, clustered index, and non-clustered index access. Additional strategies are described for multi-table access during processing of join queries.

...read moreread less

Patent•

Database computer system with application recovery and dependency handling write cache

[...]

David B. Lomet¹•Institutions (1)

Microsoft¹

10 Mar 1997

TL;DR: In this paper, a database computer system and a method for making applications recoverable from system crashes is described, where the application state (i.e., address space) is treated as a single object which can be atomically flushed in a manner akin to flushing individual pages in database recovery techniques.

...read moreread less

Abstract: This invention concerns a database computer system and method for making applications recoverable from system crashes. The application state (i.e., address space) is treated as a single object which can be atomically flushed in a manner akin to flushing individual pages in database recovery techniques. To enable this monolithic treatment of the application, executions performed by the application are mapped to logical loggable operations which can be posted to the stable log. Any modifications to the application state are accumulated and the application state is periodically flushed to stable storage using an atomic procedure. The application recovery integrates with database recovery, and effectively eliminates or at least substantially reduces the need for check pointing applications. In addition, optimization techniques are described to make the read, write, and recovery phases more efficient.

...read moreread less

Patent•

Data storage system having flash memory and disk drive

[...]

Hiroshi Sukegawa¹•Institutions (1)

Toshiba¹

14 Mar 1997

TL;DR: In this article, the storage area of the flash memory unit is logically divided into a permanent storage area, a non-volatile cache area, which are used as cache memory areas of the HDD, and a high-speed access area.

...read moreread less

Abstract: In a data storage system using a flash memory unit and an HDD, the storage area of the flash memory unit is logically divided into a permanent storage area, a non-volatile cache area, which are used as cache memory areas of the HDD, and a high-speed access area. These divided areas are individually managed. The permanent storage area stores data which is used frequently for a relatively long time period. The non-volatile cache area is used as an ordinary cache memory area in which data, which is updated relatively frequently, is stored. The high-speed access area is a storage area to be used by, e.g. an operating system (OS) of a host system. For example, a swap file, which needs to be accessed at high speed, is shifted into the high-speed access area.

...read moreread less

Proceedings Article•DOI•

OS-controlled cache predictability for real-time systems

[...]

Jochen Liedtke¹, Hermann Härtig², Michael Hohmuth²•Institutions (2)

IBM¹, Dresden University of Technology²

09 Jun 1997

TL;DR: An OS-controlled application-transparent cache-partitioning technique that can be transparently assigned to tasks for their exclusive use and the interaction of both are analysed with regard to cache-induced worst case penalties.

...read moreread less

Abstract: Cache-partitioning techniques have been invented to make modern processors with an extensive cache structure useful in real-time systems where task switches disrupt cache working sets and hence make execution times unpredictable. This paper describes an OS-controlled application-transparent cache-partitioning technique. The resulting partitions can be transparently assigned to tasks for their exclusive use. The major drawbacks found in other cache-partitioning techniques, namely waste of memory and additions on the critical performance path within CPUs, are avoided using memory coloring techniques that do nor require changes within the chips of modern CPUs or on the critical path for performance. A simple filter algorithm commonly used in real-time systems, a matrix-multiplication algorithm and the interaction of both are analysed with regard to cache-induced worst case penalties. Worst-case penalties are determined for different widely-used cache architectures. Some insights regarding the impact of cache architectures on worst-case execution are described.

...read moreread less

Proceedings Article•

The Measured Access Characteristics of World-Wide-Web Client Proxy Caches

[...]

Bradley M. Duska¹, David Marwood¹, Michael J. Feeley¹•Institutions (1)

University of British Columbia¹

08 Dec 1997

TL;DR: An analysis of access traces collected from seven proxy servers deployed in various locations throughout the Internet shows that a 2- to 10-GB second-level cache yields hit rates between 24% and 45% with 85% of these hits due to sharing among different clients.

...read moreread less

Abstract: The growing popularity of the World Wide Web is placing tremendous demands on the Internet. A key strategy for scaling the Internet to meet these increasing demands is to cache data near clients and thus improve access latency and reduce network and server load. Unfortunately, research in this area has been hampered by a poor understanding of the locality and sharing characteristics of Web-client accesses. The recent popularity of Web proxy servers provides a unique opportunity to improve this understanding, because a small number of proxy servers see accesses from thousands of clients. This paper presents an analysis of access traces collected from seven proxy servers deployed in various locations throughout the Internet. The traces record a total of 47.4 million requests made by 23,700 clients over a twenty-one day period. We use a combination of static analysis and trace-driven cache simulation to characterize the locality and sharing properties of these accesses. Our analysis shows that a 2- to 10-GB second-level cache yields hit rates between 24% and 45% with 85% of these hits due to sharing among different clients. Caches with more clients exhibit more sharing and thus higher hit rates. Between 2% and 7% of accesses are consistency misses to unmodified objects, using the Squid and CERN proxy cache coherence protocols. Sharing is bimodal. Requests for shared objects are divided evenly between objects that are narrowly shared and those that are shared by many clients; widely shared objects also tend to be shared by clients from unrelated traces.

...read moreread less

Proceedings Article•DOI•

Language model adaptation using mixtures and an exponentially decaying cache

[...]

P.R. Clarkson¹, A.J. Robinson•Institutions (1)

University of Cambridge¹

21 Apr 1997

TL;DR: Two techniques based on augmenting the standard trigram model with a cache component in which the words' recurrence probabilities decay exponentially over time yield a significant reduction in perplexity when faced with a multi-domain test text.

...read moreread less

Abstract: Presents two techniques for language model adaptation. The first is based on the use of mixtures of language models: the training text is partitioned according to topic, a language model is constructed for each component and, at recognition time, appropriate weightings are assigned to each component to model the observed style of language. The second technique is based on augmenting the standard trigram model with a cache component in which the words' recurrence probabilities decay exponentially over time. Both techniques yield a significant reduction in perplexity over the baseline trigram language model when faced with a multi-domain test text, the mixture-based model giving a 24% reduction and the cache-based model giving a 14% reduction. The two techniques attack the problem of adaptation at different scales, and as a result can be used in parallel to give a total perplexity reduction of 30%.

...read moreread less

Patent•

Remote data mirroring having preselection of automatic recovery or intervention required when a disruption is detected

[...]

Yuval Ofek¹•Institutions (1)

EMC Corporation¹

17 Mar 1997

TL;DR: In this article, a number of automatic and non-automatic recovery mechanisms are provided on a logical volume basis for a desired level of data integrity and degree of operator or application program involvement.

...read moreread less

Abstract: Two data storage systems are interconnected by a data link for remote mirroring of data. Each volume of data is configured as local, primary in a remotely mirrored volume pair, or secondary in a remotely mirrored volume pair. Normally, a host computer directly accesses either a local or a primary volume, and data written to a primary volume is automatically sent over the link to a corresponding secondary volume. Each remotely mirrored volume pair can operate in a selected synchronization mode including synchronous, semi-synchronous, adaptive copy--remote write pending, and adaptive copy--disk. A number of automatic and non-automatic recovery mechanisms are provided on a logical volume basis for a desired level of data integrity and degree of operator or application program involvement. If a "volume domino" mode is enabled for a remotely mirrored volume pair, access to a volume of the pair is denied when the other volume is inaccessible. In a "links domino" mode, access to all remotely mirrored volumes is denied when remote mirroring is disrupted by an all-links failure. The domino modes can be used to initiate application-based recovery, for example, recovering a secondary data file using a secondary log file. In an overwrite cache mode, remote write-pending data in cache can be overwritten. In a "log-in-cache" mode, multiple versions of write data can be stored in cache in order to recover from a "rolling disaster".

...read moreread less

Collapse