Topic

Program counter

About: Program counter is a research topic. Over its lifetime, 2,388 publications have been published within this topic, receiving 26,742 citations. The topic is also known as: PC and instruction pointer.



Papers


Journal Article

Effective hardware-based data prefetching for high-performance processors


Tien-Fu Chen¹, Jean-Loup Baer²•Institutions (2)

National Chung Cheng University¹, University of Washington²

01 May 1995-IEEE Transactions on Computers

TL;DR: The results show that the three hardware prefetching schemes all yield significant reductions in the data access penalty when compared with regular caches, the benefits are greater when the hardware assist augments small on-chip caches, and the lookahead scheme is the preferred one cost-performance wise.


Abstract: Memory latency and bandwidth are progressing at a much slower pace than processor performance. In this paper, we describe and evaluate the performance of three variations of a hardware function unit whose goal is to assist a data cache in prefetching data accesses so that memory latency is hidden as often as possible. The basic idea of the prefetching scheme is to keep track of data access patterns in a reference prediction table (RPT) organized as an instruction cache. The three designs differ mostly on the timing of the prefetching. In the simplest scheme (basic), prefetches can be generated one iteration ahead of actual use. The lookahead variation takes advantage of a lookahead program counter that ideally stays one memory latency time ahead of the real program counter and that is used as the control mechanism to generate the prefetches. Finally the correlated scheme uses a more sophisticated design to detect patterns across loop levels. These designs are evaluated by simulating the ten SPEC benchmarks on a cycle-by-cycle basis. The results show that 1) the three hardware prefetching schemes all yield significant reductions in the data access penalty when compared with regular caches, 2) the benefits are greater when the hardware assist augments small on-chip caches, and 3) the lookahead scheme is the preferred one cost-performance wise.
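As a rough illustration of why the lookahead variation tries to stay one memory latency ahead, the sketch below works out how many loop iterations in advance a prefetch for a strided reference would have to be issued so that the data arrives before it is used. This is only the motivating arithmetic, not the paper's lookahead program counter mechanism. The function names, the cycle counts, and the assumption of a fixed per-iteration loop time are all hypothetical.

```python
import math


def lookahead_iterations(memory_latency_cycles: int, loop_body_cycles: int) -> int:
    """Iterations ahead a prefetch must be issued to cover the memory latency.

    Assumes (hypothetically) that every loop iteration takes the same
    number of cycles.
    """
    if memory_latency_cycles < 0 or loop_body_cycles <= 0:
        raise ValueError("latency must be >= 0 and loop time must be > 0")
    # At least one iteration ahead, as in the 'basic' scheme of the abstract.
    return max(1, math.ceil(memory_latency_cycles / loop_body_cycles))


def prefetch_address(current_addr: int, stride: int, iterations_ahead: int) -> int:
    """Address a strided reference is expected to touch some iterations later."""
    return current_addr + stride * iterations_ahead


# Hypothetical numbers: 30-cycle memory latency, 8-cycle loop body, 8-byte stride.
ahead = lookahead_iterations(30, 8)
target = prefetch_address(0x1000, 8, ahead)
```

With these made-up numbers a prefetch issued one iteration ahead would arrive too late, which is the gap the lookahead scheme is described as closing.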


543 citations

Proceedings Article

Minos: Control Data Attack Prevention Orthogonal to Memory Model


Jedidiah R. Crandall¹, Frederic T. Chong¹•Institutions (1)

University of California, Davis¹

04 Dec 2004

TL;DR: A microarchitectural implementation of Minos is presented that achieves negligible impact on cycle time with a small investment in die area, and minor changes to the Linux kernel to handle the tag bits and perform virtual memory swapping.


Abstract: We introduce Minos, a microarchitecture that implements Biba's low-water-mark integrity policy on individual words of data. Minos stops attacks that corrupt control data to hijack program control flow but is orthogonal to the memory model. Control data is any data which is loaded into the program counter on control flow transfer, or any data used to calculate such data. The key is that Minos tracks the integrity of all data, but protects control flow by checking this integrity when a program uses the data for control transfer. Existing policies, in contrast, need to differentiate between control and non-control data a priori, a task made impossible by coercions between pointers and other data types such as integers in the C language. Our implementation of Minos for Red Hat Linux 6.2 on a Pentium-based emulator is a stable, usable Linux system on the network on which we are currently running a web server. Our emulated Minos systems running Linux and Windows have stopped several actual attacks. We present a microarchitectural implementation of Minos that achieves negligible impact on cycle time with a small investment in die area, and minor changes to the Linux kernel to handle the tag bits and perform virtual memory swapping.
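The tag-and-check idea described in the abstract can be sketched in a few lines. The sketch assumes a single integrity bit per word and a toy `add` operation; the names `Word`, `add`, `jump`, and `ControlDataViolation` are hypothetical and do not come from the Minos implementation. It illustrates only the low-water-mark propagation rule and the check at control transfer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    value: int
    trusted: bool  # hypothetical one-bit integrity tag: True = high integrity


class ControlDataViolation(Exception):
    """Raised when low-integrity data is about to be loaded into the PC."""


def add(a: Word, b: Word) -> Word:
    # Low-water-mark rule: the result is only as trustworthy as its
    # least trusted input.
    return Word((a.value + b.value) & 0xFFFFFFFF, a.trusted and b.trusted)


def jump(target: Word) -> int:
    """Return the new program counter, refusing low-integrity targets."""
    if not target.trusted:
        raise ControlDataViolation(f"untrusted jump target {target.value:#x}")
    return target.value


# A target computed only from trusted words is accepted.
pc = jump(add(Word(0x400000, True), Word(0x10, True)))
```

Note that the check happens only when a word is used for control transfer; no word is classified as control or non-control data in advance, which mirrors the distinction the abstract draws with existing policies.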


493 citations

Proceedings Article

An effective on-chip preloading scheme to reduce data access penalty


Jean-Loup Baer¹, Tien-Fu Chen¹•Institutions (1)

University of Washington¹

01 Aug 1991

TL;DR: A new hardware prefetching scheme based on the prediction of the execution of the instruction stream and associated operand references is proposed. The scheme consists of a reference prediction table and a look-ahead program counter with its associated logic.


Abstract: Conventional cache prefetching approaches can be either hardware-based, generally by using a one-block lookahead technique, or compiler-directed, with insertions of non-blocking prefetch instructions. We introduce a new hardware scheme based on the prediction of the execution of the instruction stream and associated operand references. It consists of a reference prediction table and a look-ahead program counter and its associated logic. With this scheme, data with regular access patterns is preloaded, independently of the stride size, and preloading of data with irregular access patterns is prevented. We evaluate our design through trace driven simulation by comparing it with a pure data cache approach under three different memory access models. Our experiments show that this scheme is very effective for reducing the data access penalty for scientific programs and that it has moderate success for other applications.
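A minimal sketch of a stride-based reference prediction table, indexed by the program counter of the load instruction, may help make the abstract concrete. It is a deliberate simplification: it uses a single "steady" flag instead of the paper's state machine, and an unbounded dictionary instead of a cache-like table. The class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RPTEntry:
    prev_addr: int        # last data address touched by this instruction
    stride: int = 0       # last observed address difference
    steady: bool = False  # True once the same stride is seen twice in a row


class ReferencePredictionTable:
    """Simplified stride predictor keyed by the PC of the memory instruction."""

    def __init__(self) -> None:
        self.entries: Dict[int, RPTEntry] = {}

    def access(self, pc: int, addr: int) -> Optional[int]:
        """Record an access; return an address to preload, or None."""
        entry = self.entries.get(pc)
        if entry is None:
            self.entries[pc] = RPTEntry(prev_addr=addr)
            return None
        observed = addr - entry.prev_addr
        entry.steady = observed == entry.stride
        entry.stride = observed
        entry.prev_addr = addr
        # Preload only for a confirmed, non-zero stride, so that irregular
        # patterns generate no preloads.
        if entry.steady and entry.stride != 0:
            return addr + entry.stride
        return None


rpt = ReferencePredictionTable()
regular = [rpt.access(0x40, a) for a in (100, 108, 116, 124)]
irregular = [rpt.access(0x80, a) for a in (100, 7, 300, 42)]
```

Because the stride is learned per instruction, the same table handles any stride size, while the irregular sequence never reaches the steady condition and therefore triggers no preloads.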


458 citations

Patent

Dynamic multi-mode parallel processing array


Peter M. Kogge¹•Institutions (1)

IBM¹

17 Oct 1994

TL;DR: A parallel RISC computer system is provided by a dynamic multi-mode parallel processor array. One embodiment is a tightly coupled VLSI design whose architecture can be extended to more widely placed processing elements through an interconnection network that couples multiple MIMD-capable processors to one another.


Abstract: A parallel RISC computer system is provided by a dynamic multi-mode parallel processor array. One embodiment illustrates a tightly coupled VLSI design with an architecture that can be extended to more widely placed processing elements through the interconnection network, which couples multiple processors capable of MIMD mode processing to one another, with broadcast of instructions to selected groups of units controlled by a controlling processor. The coupling of the processing element logic enables dynamic mode assignment and dynamic mode switching, allowing processors operating in SIMD mode to make maximum use of memory and cycle time. On an instruction-by-instruction basis, modes can be switched from SIMD to MIMD, and even into SISD mode on the controlling processor for inherently sequential computation, allowing a programmer or compiler to build a program for the computer system that uses the optimal kind of parallelism (SISD, SIMD, MIMD). Furthermore, this execution, particularly in SIMD mode, can be set up to run applications at the limit of memory cycle time. With the ALLNODE switch and alternative paths, a system can be dynamically achieved in a few cycles for many processors. Each processing element has memory and MIMD capability; the processor's instruction register, condition register, and program counter provide common resources that are used in both MIMD and SIMD modes. The program counter becomes a base register in SIMD mode.
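The last point, that the program counter is reused as a base register in SIMD mode, can be pictured with a toy model. Everything here is a hypothetical illustration rather than the patented design: the two-instruction set (`load`, `addi`), the `ProcessingElement` fields, and the `broadcast` helper are invented for the sketch.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple

Instr = Tuple[str, int]  # hypothetical (opcode, operand) pair


class Mode(Enum):
    SIMD = "simd"
    MIMD = "mimd"


@dataclass
class ProcessingElement:
    memory: List[int]
    program: List[Instr] = field(default_factory=list)
    pc: int = 0
    acc: int = 0
    mode: Mode = Mode.MIMD

    def execute(self, instr: Instr) -> None:
        op, operand = instr
        if op == "addi":
            self.acc += operand
        elif op == "load":
            # In SIMD mode the PC is not needed for instruction fetch,
            # so this toy model reuses it as a base register.
            base = self.pc if self.mode is Mode.SIMD else 0
            self.acc = self.memory[base + operand]
        else:
            raise ValueError(f"unknown opcode {op!r}")

    def step_mimd(self) -> None:
        """Fetch from the local program using the PC, then advance it."""
        if self.mode is not Mode.MIMD:
            raise RuntimeError("step_mimd requires MIMD mode")
        self.execute(self.program[self.pc])
        self.pc += 1


def broadcast(pes: List[ProcessingElement], instr: Instr) -> None:
    """Controller sends one instruction to every element in SIMD mode."""
    for pe in pes:
        if pe.mode is Mode.SIMD:
            pe.execute(instr)


pe0 = ProcessingElement(memory=[10, 20, 30, 40], pc=0, mode=Mode.SIMD)
pe1 = ProcessingElement(memory=[10, 20, 30, 40], pc=2, mode=Mode.SIMD)
broadcast([pe0, pe1], ("load", 1))  # same instruction, different base per PE

# Switch one element to MIMD: its PC goes back to sequencing instructions.
pe0.mode, pe0.pc, pe0.program = Mode.MIMD, 0, [("addi", 5)]
pe0.step_mimd()
```

The same broadcast `load` reads a different word on each element because each element's program counter holds a different base, while in MIMD mode the register returns to its usual role.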


296 citations

Proceedings Article

Improving code density using compression techniques


Charles R. Lefurgy¹, Peter L. Bird¹, I-Cheng Chen¹, Trevor Mudge¹•Institutions (1)

University of Michigan¹

01 Dec 1997

TL;DR: This work proposes a method for compressing programs in embedded processors where instruction memory size dominates cost, and achieves an average size reduction of 39%, 34%, and 26% for the PowerPC, ARM, and i386 instruction sets, respectively, on SPEC CINT95 programs.


Abstract: Proposes a method for compressing programs in embedded processors where the instruction memory size dominates the cost. A post-compilation analyzer examines a program and replaces common sequences of instructions with a single instruction codeword. A microprocessor executes the compressed instruction sequences by fetching codewords from the instruction memory, expanding them back to the original sequence of instructions in the decode stage, and issuing them to the execution stages. We apply our technique to the PowerPC, ARM and i386 instruction sets and achieve an average size reduction of 39%, 34% and 26%, respectively, for SPEC CINT95 programs.
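The replace-and-expand idea in the abstract can be sketched as a small dictionary compressor over a list of instruction strings. This is an illustrative simplification, not the authors' analyzer: the fixed sequence length, the `CW0`-style codeword names, and the function names are assumptions, and real codewords would be binary encodings chosen so that they cannot collide with ordinary instructions.

```python
from collections import Counter
from typing import Dict, List, Tuple


def build_dictionary(program: List[str], seq_len: int = 2,
                     max_entries: int = 4) -> Dict[str, Tuple[str, ...]]:
    """Map hypothetical codewords to the most common repeated sequences."""
    counts = Counter(
        tuple(program[i:i + seq_len])
        for i in range(len(program) - seq_len + 1)
    )
    common = [seq for seq, n in counts.most_common(max_entries) if n > 1]
    return {f"CW{i}": seq for i, seq in enumerate(common)}


def compress(program: List[str],
             dictionary: Dict[str, Tuple[str, ...]]) -> List[str]:
    """Replace each dictionary sequence with its single codeword."""
    lookup = {seq: cw for cw, seq in dictionary.items()}
    seq_lens = sorted({len(seq) for seq in lookup}, reverse=True)
    out: List[str] = []
    i = 0
    while i < len(program):
        for n in seq_lens:
            window = tuple(program[i:i + n])
            if window in lookup:
                out.append(lookup[window])
                i += n
                break
        else:
            # No sequence starts here: keep the instruction as-is.
            out.append(program[i])
            i += 1
    return out


def decompress(compressed: List[str],
               dictionary: Dict[str, Tuple[str, ...]]) -> List[str]:
    """Expand codewords back to instructions, as a decode stage would."""
    out: List[str] = []
    for item in compressed:
        out.extend(dictionary.get(item, (item,)))
    return out


program = ["lw r1", "add r2", "sw r3", "lw r1", "add r2",
           "mul r4", "lw r1", "add r2"]
dictionary = build_dictionary(program)
compressed = compress(program, dictionary)
restored = decompress(compressed, dictionary)
```

In this toy example the eight-instruction program shrinks to five entries plus a one-entry dictionary; the dictionary itself also occupies memory, so a real size comparison would have to count it.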


245 citations



Metrics

Papers: 2,388
Citations: 27,029

No. of papers in the topic in previous years
Year	Papers
2021	13
2020	15
2019	23
2018	26
2017	24
2016	29
