
Smart Cache

About: Smart Cache is a research topic. Over the lifetime, 7680 publications have been published within this topic receiving 180618 citations.


Papers
Proceedings ArticleDOI
13 Nov 2010
TL;DR: A classification of applications into four cache usage categories is introduced and how applications from different categories affect each other's performance indirectly through cache sharing is discussed and a scheme to optimize such sharing is devised.
Abstract: Contention for shared cache resources has been recognized as a major bottleneck for multicores--especially for mixed workloads of independent applications. While most modern processors implement instructions to manage caches, these instructions are largely unused due to a lack of understanding of how to best leverage them. This paper introduces a classification of applications into four cache usage categories. We discuss how applications from different categories affect each other's performance indirectly through cache sharing and devise a scheme to optimize such sharing. We also propose a low-overhead method to automatically find the best per-instruction cache management policy. We demonstrate how the indirect cache-sharing effects of mixed workloads can be tamed by automatically altering some instructions to better manage cache resources. Practical experiments demonstrate that our software-only method can improve application performance up to 35% on x86 multicore hardware.

46 citations
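The abstract above classifies applications into four cache usage categories but does not name them. The sketch below illustrates the general idea with two hypothetical metrics and made-up category names and thresholds (all assumptions, not the paper's definitions): "benefit" is how much an application's miss ratio improves when it is given cache, and "pollution" is how much cache it occupies without reuse.

```python
# Hypothetical sketch: classify applications by how they use a shared cache.
# Metric names, category names, and the 0.5 threshold are illustrative
# assumptions, not taken from the paper.

def classify(benefit, pollution, threshold=0.5):
    """Return one of four cache-usage categories (hypothetical names)."""
    if benefit >= threshold and pollution < threshold:
        return "cache-friendly"        # reuses its cached data
    if benefit >= threshold and pollution >= threshold:
        return "cache-hungry"          # benefits, but crowds out others
    if benefit < threshold and pollution >= threshold:
        return "cache-polluting"       # streams data it never reuses
    return "cache-insensitive"         # barely touches the cache

print(classify(0.8, 0.2))  # cache-friendly
print(classify(0.1, 0.9))  # cache-polluting
```

On real x86 hardware, the paper's "altering some instructions" step would correspond to using non-temporal memory operations for accesses in the polluting category, so streamed data bypasses the shared cache.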

Journal Article
TL;DR: This work explores adaptable caching strategies which balance the resource demands of each application and in turn lead to improvements in throughput for the collective workload, which provides chip designers the opportunity to maintain high performance as cache size and power budgets become a concern in the CMP design space.
Abstract: Chip multi-processors (CMP) are rapidly emerging as an important design paradigm for both high-performance and embedded processors. These machines provide an important performance alternative to increasing the clock frequency. In spite of the increase in potential performance, several issues related to resource sharing on the chip can negatively impact the performance of embedded applications. In particular, the shared on-chip caches make each job's memory access times dependent on the behavior of the other jobs sharing the cache. If not adequately managed, this can lead to problems in meeting hard real-time scheduling constraints. This work explores adaptable caching strategies which balance the resource demands of each application and in turn lead to improvements in throughput for the collective workload. Experimental results demonstrate speedups of up to 1.47X for workloads of two co-scheduled applications compared against a fully-shared two-level cache hierarchy. Additionally, the adaptable caching scheme is shown to achieve an average speedup of 1.10X over the leading cache partitioning model. By dynamically managing cache storage for multiple application threads at runtime, sizable performance levels are achieved, which provides chip designers the opportunity to maintain high performance as cache size and power budgets become a concern in the CMP design space.

46 citations
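The balancing act this abstract describes can be sketched as a static cache-partitioning search: given each co-scheduled application's miss count as a function of allocated cache ways, try every split and keep the one that minimizes total misses. The miss curves below are invented numbers, and the exhaustive search is only one possible policy, not the paper's algorithm.

```python
# Illustrative sketch of choosing a static cache partition between two
# co-scheduled applications. misses[i] = estimated misses when the app
# is given i cache ways (made-up curves).

def best_split(misses_a, misses_b):
    """Try every split of the ways; return (ways_for_a, total_misses)."""
    total_ways = len(misses_a) - 1          # curves cover 0..N ways
    best = min(range(total_ways + 1),
               key=lambda wa: misses_a[wa] + misses_b[total_ways - wa])
    return best, misses_a[best] + misses_b[total_ways - best]

# App A reuses data heavily (misses fall fast); app B streams (flat curve).
miss_a = [100, 40, 20, 12, 10, 9, 9, 9, 9]      # 0..8 ways
miss_b = [50, 48, 47, 46, 46, 46, 46, 46, 46]   # 0..8 ways

ways_a, total = best_split(miss_a, miss_b)
print(ways_a, total)  # 5 55
```

The heavy reuser gets most of the ways because extra capacity barely helps the streaming application, which is the intuition behind balancing resource demands across the workload.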

Patent
10 Jul 2008
TL;DR: In this article, a device driver monitors which software applications currently running on a microprocessor are in a predetermined list and responsively dynamically writes the values to the microprocessor to configure its operating modes, such as data prefetching, branch prediction, instruction cache eviction, instruction execution suspension, sizes of cache memories, reorder buffer, store/load/fill queues, hashing algorithms related to data forwarding and branch target address cache indexing.
Abstract: A computing system includes a microprocessor that receives values for configuring operating modes thereof. A device driver monitors which software applications currently running on the microprocessor are in a predetermined list and responsively dynamically writes the values to the microprocessor to configure its operating modes. Examples of the operating modes the device driver may configure relate to the following: data prefetching; branch prediction; instruction cache eviction; instruction execution suspension; sizes of cache memories, reorder buffer, store/load/fill queues; hashing algorithms related to data forwarding and branch target address cache indexing; number of instruction translation, formatting, and issuing per clock cycle; load delay mechanism; speculative page tablewalks; instruction merging; out-of-order execution extent; caching of non-temporal hinted data; and serial or parallel access of an L2 cache and processor bus in response to an instruction cache miss.

46 citations
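The patent's mechanism, in miniature, is a lookup: check which currently running applications appear in a predetermined list and emit the configuration values the driver would then write to the processor. The app names and mode settings below are invented for illustration; a real driver would write the values to hardware configuration registers rather than return a dict.

```python
# Sketch of the patent's idea: map recognized running applications to
# processor operating-mode overrides. All names and values are
# hypothetical examples.

MODE_TABLE = {
    "dbserver":  {"data_prefetch": "aggressive", "l2_access": "parallel"},
    "media_app": {"data_prefetch": "off", "branch_prediction": "static"},
}

DEFAULTS = {"data_prefetch": "default", "l2_access": "serial"}

def modes_for(running_apps):
    """Merge mode overrides for every listed application that is running."""
    modes = dict(DEFAULTS)
    for app in running_apps:
        modes.update(MODE_TABLE.get(app, {}))
    return modes

print(modes_for(["editor", "dbserver"]))
```

Unrecognized applications ("editor" here) leave the defaults untouched, so the processor only changes behavior for workloads in the predetermined list.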

Patent
28 Dec 2004
TL;DR: In this paper, the authors present a system and method of common cache management, where a shared external memory is provided and populated by the VMs in the system with cache state information responsive to caching activity.
Abstract: A system and method of common cache management. Plural VMs each have a cache infrastructure component used by one or more additional components within each VM. An external cache is provided and shared by the components of each of the VMs. In one embodiment, a shared external memory is provided and populated by the VMs in the system with cache state information responsive to caching activity. This permits external monitoring of caching activity in the system.

46 citations
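The shared-monitoring idea above can be sketched as follows: each VM's cache component publishes its statistics into a common area that an external monitor reads without touching the VMs themselves. A plain dict stands in for the shared external memory, and the field names are assumptions.

```python
# Minimal sketch: VMs publish cache state into a shared area; an external
# monitor aggregates it. The dict stands in for shared external memory.

shared_area = {}

def publish(vm_id, hits, misses):
    """Called by a VM's cache infrastructure component after caching activity."""
    shared_area[vm_id] = {"hits": hits, "misses": misses}

def monitor(area):
    """External monitor: aggregate hit ratio across all VMs."""
    hits = sum(s["hits"] for s in area.values())
    total = hits + sum(s["misses"] for s in area.values())
    return hits / total if total else 0.0

publish("vm1", hits=90, misses=10)
publish("vm2", hits=60, misses=40)
print(monitor(shared_area))  # 0.75
```

The point of the design is the decoupling: the monitor needs no hooks inside any VM, only read access to the shared area the VMs populate.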

Patent
06 Mar 2002
TL;DR: In this paper, the cache controllers and cache memory blocks are associated with the second-level cache; each processor accesses the second-level cache controllers upon missing in a first-level cache of fixed size.
Abstract: A processor integrated circuit capable of executing more than one instruction stream has two or more processors. Each processor accesses instructions and data through a cache controller. There are multiple blocks of cache memory. Some blocks of cache memory may optionally be directly attached to particular cache controllers. The cache controllers access at least some of the multiple blocks of cache memory through a high-speed interconnect, these blocks being dynamically allocable to more than one cache controller. A resource allocation controller determines which cache memory controller has access to the dynamically allocable cache memory block. In one embodiment, the cache controllers and cache memory blocks are associated with the second-level cache; each processor accesses the second-level cache controllers upon missing in a first-level cache of fixed size.

46 citations
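The resource allocation controller above decides which cache controller gets each dynamically allocable block, but the patent text here does not specify a policy. The sketch below assumes a simple greedy policy: grant each free block to the controller reporting the highest miss rate, discounting its need after each grant so allocations spread out. The policy, names, and numbers are all assumptions.

```python
# Hypothetical resource allocation controller: assign dynamically
# allocable cache blocks to L2 cache controllers by reported miss rate.
# The greedy discounting policy is an illustrative assumption.

def allocate(free_blocks, miss_rates):
    """Assign each free block id to the neediest controller, halving
    that controller's recorded need after every grant."""
    grants = {ctrl: [] for ctrl in miss_rates}
    need = dict(miss_rates)
    for block in free_blocks:
        ctrl = max(need, key=need.get)   # neediest controller wins
        grants[ctrl].append(block)
        need[ctrl] /= 2                  # diminishing need after a grant
    return grants

print(allocate([0, 1, 2, 3], {"ctrl_a": 0.8, "ctrl_b": 0.3}))
# {'ctrl_a': [0, 1, 3], 'ctrl_b': [2]}
```

The controller with the 0.8 miss rate receives three of the four blocks, but the discounting keeps the low-miss-rate controller from being starved entirely.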


Network Information
Related Topics (5)

Topic                 Papers    Citations    Relatedness
Cache                 59.1K     976.6K       92%
Server                79.5K     1.4M         88%
Scalability           50.9K     931.6K       88%
Network packet        159.7K    2.2M         85%
Quality of service    77.1K     996.6K       84%
Performance Metrics
No. of papers in the topic in previous years

Year    Papers
2023    50
2022    114
2021    5
2020    1
2019    8
2018    18