Home
/
Authors
/
Johnson Kin

Author

Johnson Kin

Bio: Johnson Kin is an academic researcher from University of California, Los Angeles. The author has contributed to research in topics: Compiler & Cache. The author has an hindex of 5, co-authored 11 publications receiving 754 citations.

Topics: Compiler, Cache, Instruction-level parallelism, CPU cache, Throughput (business) ...read more

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

The filter cache: an energy efficient memory structure

[...]

Johnson Kin¹, Munish Gupta¹, William H. Mangione-Smith¹•Institutions (1)

University of California, Los Angeles¹

01 Dec 1997

TL;DR: Experimental results across a wide range of embedded applications show that the filter cache results in improved memory system energy efficiency, and this work proposes to trade performance for power consumption by filtering cache references through an unusually small L1 cache.

...read moreread less

Abstract: Most modern microprocessors employ one or two levels of on-chip caches in order to improve performance. These caches are typically implemented with static RAM cells and often occupy a large portion of the chip area. Not surprisingly, these caches often consume a significant amount of power. In many applications, such as portable devices, low power is more important than performance. We propose to trade performance for power consumption by filtering cache references through an unusually small L1 cache. An L2 cache, which is similar in size and structure to a typical L1 cache, is positioned behind the filter cache and serves to reduce the performance loss. Experimental results across a wide range of embedded applications show that the filter cache results in improved memory system energy efficiency. For example, a direct mapped 256-byte filter cache achieves a 58% power reduction while reducing performance by 21%, corresponding to a 51% reduction in the energy-delay product over conventional design.

...read moreread less

544 citations

Journal Article•DOI•

Filtering memory references to increase energy efficiency

[...]

Johnson Kin¹, M. Gupta², William H. Mangione-Smith¹•Institutions (2)

University of California, Los Angeles¹, International Rectifier²

01 Jan 2000-IEEE Transactions on Computers

TL;DR: This work proposes sacrificing some performance in exchange for energy efficiency by filtering cache references through an unusually small first level cache, which results in a 51 percent reduction in the energy-delay product when compared to a conventional design.

...read moreread less

Abstract: Most modern microprocessors employ one or two levels of on-chip caches in order to improve performance. Caches typically are implemented with static RAM cells and often occupy a large portion of the chip area. Not surprisingly, these caches can consume a significant amount of power. In many applications, such as portable devices, energy efficiency is more important than performance. We propose sacrificing some performance in exchange for energy efficiency by filtering cache references through an unusually small first level cache. We refer to this structure as the filter cache. A second level cache, similar in size and structure to a conventional first level cache, is positioned behind the filter cache and serves to mitigate the performance loss. Extensive experiments indicate that a small filter cache still can achieve a high hit rate and good performance. This approach allows the second level cache to be in a low power mode most of the time, thus resulting in power savings. The filter cache is particularly attractive in low power applications, such as the embedded processors used for communication and multimedia applications. For example, experimental results across a wide range of embedded applications show that a direct mapped 255-byte filter cache achieves a 58 percent power reduction while reducing performance by 21 percent. This trade-off results in a 51 percent reduction in the energy-delay product when compared to a conventional design.

...read moreread less

96 citations

Proceedings Article•DOI•

Procedure based program compression

[...]

Darko Kirovski¹, Johnson Kin¹, William H. Mangione-Smith¹•Institutions (1)

University of California, Los Angeles¹

01 Dec 1997

TL;DR: This paper will discuss a new approach to implementing transparent program compression that requires little or no hardware support, and results in an average memory reduction of 40% with a runtime performance overhead of 10%.

...read moreread less

Abstract: Cost and power consumption are two of the most important design factors for many embedded systems, particularly consumer devices. Products such as personal digital assistants, pagers with integrated data services and smart phones have fixed performance requirements but unlimited appetites for reduced cost and increased battery life. Program compression is one technique that can be used to attack both of these problems. Compressed programs require less memory, thus reducing the cost of both direct materials and manufacturing. Furthermore, by relying on compressed memory, the total number of memory references is reduced. This reduction saves power by lowering the traffic on high-capacitance buses. This paper discusses a new approach to implementing transparent program compression that requires little or no hardware support. Procedures are compressed individually, and a directory structure is used to bind them together at run-time. Decompressed procedures are explicitly cached in ordinary RAM as complete units, thus resolving references within each procedure. This approach has been evaluated on a set of 25 embedded multimedia and communications applications, and results in an average memory reduction of 40% with a run-time performance overhead of 10%.

...read moreread less

64 citations

Proceedings Article•DOI•

Power efficient mediaprocessors: design space exploration

[...]

Johnson Kin¹, Chunho Lee¹, William H. Mangione-Smith¹, Miodrag Potkonjak¹•Institutions (1)

University of California, Los Angeles¹

01 Jun 1999

TL;DR: A framework for rapidly exploring the design space of low power application-specific programmable processors (ASPP), in particular mediaprocessors, which focuses on a category of processors that are programmable yet optimized to reduce power consumption for a specific set of applications.

...read moreread less

Abstract: We present a framework for rapidly exploring the design space of low power application-specific programmable processors (ASPP), in particular mediaprocessors. We focus on a category of processors that are programmable yet optimized to reduce power consumption for a specific set of applications. The key components of the framework presented in this paper are a retargetable instruction level parallelism (ILP) compiler, processor simulators, a set of complete media applications written in a high level language and an architectural component selection algorithm. The fundamental idea behind the framework is that with the aid of a retargetable ILP compiler and simulators it is possible to arrange architectural parameters (e.g., the issue width, the size of cache memory units, the number of execution units, etc.) to meet low power design goals under area constraints.

...read moreread less

33 citations

Proceedings Article•DOI•

Media architecture: general purpose vs. multiple application-specific programmable processor

[...]

Chunho Lee¹, Johnson Kin¹, Miodrag Potkonjak¹, William H. Mangione-Smith¹•Institutions (1)

University of California, Los Angeles¹

01 May 1998

TL;DR: A framework that makes it possible for a designer to rapidly explore the application-specific programmable processor design space under area constraints is reported, which can be valuable in making early design decisions such as area and architectural trade-offs, cache and instruction issue width trade-off under area constraint, and the number of branch units and issue width.

...read moreread less

Abstract: In this paper we report a framework that makes it possible for a designer to rapidly explore the application-specific programmable processor design space under area constraints. The framework uses a production-quality compiler and simulation tools to synthesize a high performance machine for an application. Using the framework we evaluate the validity of the fundamental assumption behind the development of application-specific programmable processors. Application-specific processors are based on the idea that applications differ from each other in key architectural parameters, such as the available instruction-level parallelism, demand on various hardware components (e.g. cache memory units, register files) and the need for different number of functional units. We found that the framework introduced in this paper can be valuable in making early design decisions such as area and architectural trade-off, cache and instruction issue width trade-off under area constraint, and the number of branch units and issue width.

...read moreread less

9 citations

Cited by

PDF

Open Access

More filters

Proceedings Article•DOI•

Wattch: a framework for architectural-level power analysis and optimizations

[...]

David Brooks¹, Vivek Tiwari², Margaret Martonosi¹•Institutions (2)

Princeton University¹, Intel²

01 May 2000

TL;DR: Wattch is presented, a framework for analyzing and optimizing microprocessor power dissipation at the architecture-level and opens up the field of power-efficient computing to a wider range of researchers by providing a power evaluation methodology within the portable and familiar SimpleScalar framework.

...read moreread less

Abstract: Power dissipation and thermal issues are increasingly significant in modern processors. As a result, it is crucial that power/performance tradeoffs be made more visible to chip architects and even compiler writers, in addition to circuit designers. Most existing power analysis tools achieve high accuracy by calculating power estimates for designs only after layout or floorplanning are complete. In addition to being available only late in the design process, such tools are often quite slow, which compounds the difficulty of running them for a large space of design possibilities.This paper presents Wattch, a framework for analyzing and optimizing microprocessor power dissipation at the architecture-level. Wattch is 1000X or more faster than existing layout-level power tools, and yet maintains accuracy within 10% of their estimates as verified using industry tools on leading-edge designs. This paper presents several validations of Wattch's accuracy. In addition, we present three examples that demonstrate how architects or compiler writers might use Wattch to evaluate power consumption in their design process.We see Wattch as a complement to existing lower-level tools; it allows architects to explore and cull the design space early on, using faster, higher-level tools. It also opens up the field of power-efficient computing to a wider range of researchers by providing a power evaluation methodology within the portable and familiar SimpleScalar framework.

...read moreread less

2,848 citations

Proceedings Article•DOI•

Scratchpad memory: a design alternative for cache on-chip memory in embedded systems

[...]

Rajeshwari M. Banakar¹, Stefan Steinke¹, Bo-Sik Lee¹, Mahesh Balakrishnan¹, Peter Marwedel¹ - Show less +1 more•Institutions (1)

Indian Institute of Technology Delhi¹

06 May 2002

TL;DR: The results clearly establish scratch pad memory as a low power alternative in most situations with an average energy reduction of 40% and the average area-time reduction for the scratchpad memory was 46% of the cache memory.

...read moreread less

Abstract: In this paper we address the problem of on-chip memory selection for computationally intensive applications, by proposing scratch pad memory as an alternative to cache. Area and energy for different scratch pad and cache sizes are computed using the CACTI tool while performance was evaluated using the trace results of the simulator. The target processor chosen for evaluation was AT91M40400. The results clearly establish scratehpad memory as a low power alternative in most situations with an average energy reducation of 40%. Further the average area-time reduction for the seratchpad memory was 46% of the cache memory.

...read moreread less

751 citations

Proceedings Article•DOI•

Selective cache ways: on-demand cache resource allocation

[...]

David H. Albonesi¹•Institutions (1)

University of Rochester¹

16 Nov 1999

TL;DR: In this paper, a tradeoff between performance and energy is made between a small performance degradation for energy savings, and the tradeoff can produce a significant reduction in cache energy dissipation.

...read moreread less

Abstract: Increasing levels of microprocessor power dissipation call for new approaches at the architectural level that save energy by better matching of on-chip resources to application requirements. Selective cache ways provides the ability to disable a subset of the ways in a set associative cache during periods of modest cache activity, while the full cache may remain operational for more cache-intensive periods. Because this approach leverages the subarray partitioning that is already present for performance reasons, only minor changes to a conventional cache are required, and therefore, full-speed cache operation can be maintained. Furthermore, the tradeoff between performance and energy is flexible, and can be dynamically tailored to meet changing application and machine environmental conditions. We show that trading off a small performance degradation for energy savings can produce a significant reduction in cache energy dissipation using this approach.

...read moreread less

733 citations

Proceedings Article•DOI•

Gated-V/sub dd/: a circuit technique to reduce leakage in deep-submicron cache memories

[...]

Michael D. Powell¹, Se-Hyun Yang¹, Babak Falsafi¹, Kaushik Roy¹, T. N. Vijaykumar¹ - Show less +1 more•Institutions (1)

Purdue University¹

01 Aug 2000

TL;DR: Results indicate that gated-Vdd together with a novel resizable cache architecture reduces energy-delay by 62% with minimal impact on performance.

...read moreread less

Abstract: Deep-submicron CMOS designs have resulted in large leakage energy dissipation in microprocessors. While SRAM cells in on-chip cache memories always contribute to this leakage, there is a large variability in active cell usage both within and across applications. This paper explores an integrated architectural and circuit-level approach to reducing leakage energy dissipation in instruction caches. We propose, gated-V/sub dd/, a circuit-level technique to gate the supply voltage and reduce leakage in unused SRAM cells. Our results indicate that gated-V/sub dd/ together with a novel resizable cache architecture reduces energy-delay by 62% with minimal impact on performance.

...read moreread less

731 citations

Proceedings Article•DOI•

A decade of reconfigurable computing: a visionary retrospective

[...]

Reiner W. Hartenstein¹•Institutions (1)

Kaiserslautern University of Technology¹

13 Mar 2001

TL;DR: The paper surveys a decade of R&D on coarse grain reconfigurable hardware and related CAD, points out why this emerging discipline is heading toward a dichotomy of computing science, and advocates the introduction of a new soft machine paradigm to replace CAD by compilation.

...read moreread less

Abstract: The paper surveys a decade of R&D on coarse grain reconfigurable hardware and related CAD, points out why this emerging discipline is heading toward a dichotomy of computing science, and advocates the introduction of a new soft machine paradigm to replace CAD by compilation.

...read moreread less

661 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146

Collapse