Patent

Dual call/return stack branch prediction system

G. Henry, +1 more
TLDR
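A branch prediction apparatus employs dual call/return stacks to predict return addresses in a microprocessor. The first stack supplies a speculative return address before the instruction is decoded, that is, before the microprocessor knows whether it is actually a return instruction. The second stack supplies a non-speculative return address after decode, and a mismatch between the two redirects the microprocessor to the non-speculative address.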
Abstract
A branch prediction apparatus that employs dual call/return stacks to predict return addresses in a microprocessor. The apparatus includes a first call/return stack that provides a speculative return address based upon a return instruction hit in a speculative branch target address cache (BTAC) of an instruction cache fetch address prior to decoding of the instruction to know whether it is actually a return instruction. The speculative return address is provided early in the pipeline and the microprocessor speculatively branches to the speculative return address. Later in the pipeline, a second call/return stack provides a non-speculative return address after the instruction is decoded and verified to be a return instruction. A comparator compares the speculative and non-speculative return addresses, and if the two addresses mismatch, the microprocessor branches to the non-speculative return address.
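The two-stack flow described in the abstract can be sketched as a small software model. This is only an illustrative sketch, not the patented hardware: the class names, method names, the stack depth of 8, and the drop-oldest overflow policy are all assumptions introduced here, and the model presumes that return addresses are pushed onto each stack when calls are seen, which the abstract does not spell out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class CallReturnStack:
    """Bounded LIFO of return addresses (depth of 8 is an assumption)."""
    depth: int = 8
    entries: List[int] = field(default_factory=list)

    def push(self, return_address: int) -> None:
        # Assumed overflow policy: discard the oldest entry when full.
        if len(self.entries) >= self.depth:
            self.entries.pop(0)
        self.entries.append(return_address)

    def pop(self) -> Optional[int]:
        # Returns None when the stack is empty.
        return self.entries.pop() if self.entries else None


@dataclass
class DualStackPredictor:
    """Hypothetical model of the speculative and non-speculative stacks."""
    speculative: CallReturnStack = field(default_factory=CallReturnStack)
    non_speculative: CallReturnStack = field(default_factory=CallReturnStack)

    def predict_return_early(self) -> Optional[int]:
        # Early in the pipeline: a BTAC hit suggests a return, so the
        # speculative stack supplies a target before the instruction is decoded.
        return self.speculative.pop()

    def verify_return_late(
        self, speculative_target: Optional[int]
    ) -> Tuple[Optional[int], bool]:
        # Later in the pipeline: the instruction is decoded and confirmed to be
        # a return, so the non-speculative stack supplies its address and the
        # comparator checks it against the earlier speculative target.
        actual = self.non_speculative.pop()
        mismatch = speculative_target != actual
        # On a mismatch the pipeline would branch to `actual`.
        return actual, mismatch


predictor = DualStackPredictor()
predictor.speculative.push(0x1004)
predictor.non_speculative.push(0x1004)
early = predictor.predict_return_early()
target, redirect = predictor.verify_return_late(early)
```

In this usage the two stacks agree, so `redirect` is false and the speculative branch stands. If the speculative stack had drifted out of step with the non-speculative one, `redirect` would be true and `target` would hold the non-speculative address to branch to.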

Citations
Patent

Apparatus and method for selectively overriding return stack prediction in response to detection of non-standard return sequence

TL;DR: A microprocessor for predicting a target address of a return instruction is disclosed. It includes a BTAC and a return stack, each of which makes a prediction of the target address.
Patent

Register file segments for supporting code block execution by using virtual cores instantiated by partitionable engines

TL;DR: A system for executing instructions using a plurality of register file segments for a processor is presented. It includes a global front-end scheduler that receives an incoming instruction sequence, partitions it into a plurality of code blocks, and generates inheritance vectors describing interdependencies between instructions.
Patent

Method and apparatus for correcting an internal call/return stack in a microprocessor that detects from multiple pipeline stages incorrect speculative update of the call/return stack

TL;DR: In this article, an internal call/return stack (CRS) correction apparatus in a pipelined microprocessor is disclosed, which includes two distinct stages that detect invalidating events, such as a branch misprediction or exception.
Patent

Apparatus and method for handling BTAC branches that wrap across instruction cache lines

TL;DR: The branch target address cache (BTAC) caches indications of whether a branch instruction wraps across two cache lines, and the target address is stored in a register to be used by the instruction cache in order to decode the branch instructions.
Patent

Apparatus and Method for Processing an Instruction Matrix Specifying Parallel and Dependent Operations

TL;DR: A matrix of execution blocks forms a set of rows and columns for parallel and dependent execution of a single block of instructions: the rows support parallel execution of instructions and the columns support dependent instructions.
References
Proceedings ArticleDOI

A secure and reliable bootstrap architecture

TL;DR: The AEGIS architecture for initializing a computer system validates integrity at each layer transition in the bootstrap process, and it is shown how this results in robust systems.
Patent

High performance, superscalar-based computer system with out-of-order instruction execution

TL;DR: In this paper, a superscalar-based computer system with out-of-order instruction execution for enhanced resource utilization and performance throughput is presented, where the data results of each executed instruction are stored in the temporary data registers until all prior instructions have been executed, thereby retiring the executed instruction in-order.
Patent

Method for pipeline processing of instructions by controlling access to a reorder buffer using a register file outside the reorder buffer

TL;DR: A pipelined method for executing instructions in a computer system is presented. Multiple instructions are provided as a continuous stream of operations in program order and are executed in an out-of-order pipeline.
Patent

Two-level branch prediction cache

TL;DR: An improved branch prediction cache (BPC) utilizes a hybrid cache structure that provides two levels of branch information caching, one of which is a shallow but wide structure (36 32-byte entries) that caches full prediction information for a limited number of branch instructions.
Patent

Multi-instruction stream branch processing mechanism

TL;DR: In a high-performance computer that prefetches and predecodes instructions for sequential presentation to an execution unit, at least three separately gated and sequenced multi-instruction buffers for prefetched instructions permit continued sequential predecoding and buffering of instructions from three independent instruction streams identified by multiple branch instructions, some of which may be conditionally executed.