Dual call/return stack branch prediction system

Patent

TLDR

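This patent describes a branch prediction apparatus that employs dual call/return stacks to predict return addresses in a microprocessor. The first stack supplies a speculative return address early in the pipeline, before the instruction has been decoded and therefore before the microprocessor knows whether it is actually a return instruction. A second stack supplies a non-speculative return address after decode, and the microprocessor branches to it if the two addresses mismatch.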

Abstract:

A branch prediction apparatus employs dual call/return stacks to predict return addresses in a microprocessor. The apparatus includes a first call/return stack that provides a speculative return address when an instruction cache fetch address hits in a speculative branch target address cache (BTAC) as a return instruction, before the instruction is decoded to determine whether it is actually a return instruction. The speculative return address is provided early in the pipeline and the microprocessor speculatively branches to the speculative return address. Later in the pipeline, a second call/return stack provides a non-speculative return address after the instruction is decoded and verified to be a return instruction. A comparator compares the speculative and non-speculative return addresses, and if the two addresses mismatch, the microprocessor branches to the non-speculative return address.
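The interplay of the two stacks in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the class name `DualCallReturnStacks`, its method names, the addresses, and the rule that each stack is pushed when a call is seen at its own pipeline stage are all hypothetical choices made for illustration. The abstract itself describes only the return path and the comparator.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class DualCallReturnStacks:
    """Toy model of two call/return stacks (all names are hypothetical)."""
    speculative: List[int] = field(default_factory=list)      # updated early, on BTAC hits
    non_speculative: List[int] = field(default_factory=list)  # updated late, after decode

    def btac_call_hit(self, return_address: int) -> None:
        # Assumption: a BTAC hit flagged as a call pushes the speculative stack.
        self.speculative.append(return_address)

    def btac_return_hit(self) -> Optional[int]:
        # Early stage: predict the return target before the instruction is decoded.
        return self.speculative.pop() if self.speculative else None

    def decoded_call(self, return_address: int) -> None:
        # Assumption: a decoded, verified call pushes the non-speculative stack.
        self.non_speculative.append(return_address)

    def decoded_return(self, speculative_target: Optional[int]) -> Tuple[Optional[int], bool]:
        """Late stage: compare both predictions.

        Returns (target, redirected). redirected is True when the addresses
        mismatch and the model branches to the non-speculative address.
        """
        if not self.non_speculative:
            # Assumption: with nothing to compare against, keep the early target.
            return speculative_target, False
        actual = self.non_speculative.pop()
        return actual, actual != speculative_target


def demo() -> Tuple[Optional[int], bool]:
    crs = DualCallReturnStacks()
    # A real call is seen at both stages.
    crs.btac_call_hit(0x1004)
    crs.decoded_call(0x1004)
    # Hypothetical stale BTAC entry: flagged as a call early, never confirmed at decode.
    crs.btac_call_hit(0x2008)
    early = crs.btac_return_hit()     # speculative stack yields 0x2008
    return crs.decoded_return(early)  # non-speculative stack yields 0x1004


if __name__ == "__main__":
    target, redirected = demo()
    print(hex(target), redirected)
```

In `demo()`, the speculative stack is deliberately polluted so that the early and late predictions disagree, which exercises the comparator path in which the model branches to the non-speculative address.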

Citations

Patent

Apparatus and method for selectively overriding return stack prediction in response to detection of non-standard return sequence

G. Glenn Henry, +1 more

TL;DR: A microprocessor for predicting a target address of a return instruction is disclosed. It includes a BTAC and a return stack, each of which makes a prediction of the target address.

Patent

Register file segments for supporting code block execution by using virtual cores instantiated by partitionable engines

Mohammad Abdallah

TL;DR: A system for executing instructions using a plurality of register file segments for a processor is presented. It includes a global front-end scheduler for receiving an incoming instruction sequence, wherein the global front-end scheduler partitions the incoming instructions into a plurality and generates inheritance vectors describing interdependencies between instructions.

Patent

Method and apparatus for correcting an internal call/return stack in a microprocessor that detects from multiple pipeline stages incorrect speculative update of the call/return stack

Thomas C. McDonald

TL;DR: An internal call/return stack (CRS) correction apparatus in a pipelined microprocessor is disclosed. It includes two distinct stages that detect invalidating events, such as a branch misprediction or exception.

Patent

Apparatus and method for handling BTAC branches that wrap across instruction cache lines

Brent Bean, +2 more

TL;DR: A branch target address cache (BTAC) caches indications of whether a branch instruction wraps across two cache lines, and the target address is stored in a register to be used by the instruction cache in order to decode the branch instructions.

Patent

Apparatus and Method for Processing an Instruction Matrix Specifying Parallel and Dependent Operations

Mohammad A. Abdallah

TL;DR: A matrix of execution blocks forms a set of rows and columns for parallel and dependent execution of a single block of instructions. The rows support parallel execution of instructions and the columns support dependent instructions.

References

Proceedings ArticleDOI

A secure and reliable bootstrap architecture

William A. Arbaugh, +2 more

TL;DR: The AEGIS architecture for initializing a computer system validates integrity at each layer transition in the bootstrap process, and it is shown how this results in robust systems.

Patent

High performance, superscalar-based computer system with out-of-order instruction execution

Le Trong Nguyen, +8 more

TL;DR: A superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput is presented. The data results of each executed instruction are stored in temporary data registers until all prior instructions have been executed, so that executed instructions are retired in order.

Patent

Method for pipeline processing of instructions by controlling access to a reorder buffer using a register file outside the reorder buffer

Glenn J. Hinton, +4 more

TL;DR: A pipelined method for executing instructions in a computer system is presented. It includes providing multiple instructions as a continuous stream of operations in program order and executing the instructions in an out-of-order pipeline.

Patent

Two-level branch prediction cache

David R. Stiles, +2 more

TL;DR: An improved branch prediction cache (BPC) utilizes a hybrid cache structure that provides two levels of branch information caching. One level is a shallow but wide structure (36 32-byte entries), which caches full prediction information for a limited number of branch instructions.

Patent

Multi-instruction stream branch processing mechanism

Jeffrey F. Hughes, +3 more

TL;DR: In a high-performance computer that prefetches and predecodes instructions for sequential presentation to an execution unit, at least three separately gated and sequenced multi-instruction buffers for prefetched instructions permit continued sequential predecoding and buffering of instructions from three independent instruction streams identified by multiple branch instructions, some of which may be conditionally executed.

Related Papers (5)

High performance, superscalar-based computer system with out-of-order instruction execution

Apparatus and method for selectively overriding return stack prediction in response to detection of non-standard return sequence

Method and apparatus for branch instruction processing in a processor

Method and system for fetching noncontiguous instructions in a single clock cycle

Hybrid branch prediction using a global selection counter and a prediction method comparison table