Author

Francisco Tirado Fernández

Bio: Francisco Tirado Fernández is an academic researcher. The author has contributed to research in the topics of Cache and Filter (video). The author has an h-index of 1 and has co-authored 2 publications receiving 11 citations.

Papers
Journal Article
TL;DR: The most significant proposals in this research field are reviewed, focusing on the authors' own contributions to optimizing address-based memory disambiguation logic, namely the load-store queue.
Abstract: One of the main challenges of modern processor designs is the implementation of scalable and efficient mechanisms to detect memory access order violations as a result of out-of-order execution. Conventional structures performing this task are complex, inefficient and power-hungry. This fact has generated a large body of work on optimizing address-based memory disambiguation logic, namely the load-store queue. In this paper we review the most significant proposals in this research field, focusing on our own contributions.
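For context, the order-violation check that a conventional load-store queue performs can be sketched as follows. This is a minimal illustration, not code from the paper; the entry layout and the names (lq_entry_t, store_detects_violation) are assumptions made for the example.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative load-queue entry; a real LSQ is a CAM-based hardware structure. */
typedef struct {
    uint64_t addr;      /* resolved effective address                   */
    uint64_t seq;       /* program-order sequence number                */
    bool     executed;  /* the load has already obtained its value      */
} lq_entry_t;

/* When a store resolves its address, scan the load queue for younger loads
 * that already executed to the same address: any match is a memory access
 * order violation and the offending load must be replayed. */
static bool store_detects_violation(const lq_entry_t *lq, size_t n,
                                    uint64_t store_addr, uint64_t store_seq)
{
    for (size_t i = 0; i < n; i++) {
        if (lq[i].executed &&
            lq[i].seq > store_seq &&   /* load is younger in program order */
            lq[i].addr == store_addr)  /* and touched the same address     */
            return true;
    }
    return false;
}

Because every resolving store must compare its address against every in-flight load, the hardware equivalent is a fully associative search, which is the complexity and power cost that the surveyed proposals try to reduce.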

11 citations

01 Jan 2010
TL;DR: Two techniques are proposed: one to filter accesses to the LSQ (Load-Store Queue) based on both timing and address information, and the other to filter accesses to the first-level data cache based on a forwarding predictor.
Abstract: In most modern processor designs, the HW dedicated to storing data and instructions (the memory hierarchy) has become a major consumer of power. In order to reduce this power consumption, we propose in this paper two techniques: one to filter accesses to the LSQ (Load-Store Queue) based on both timing and address information, and the other to filter accesses to the first level data cache based on a forwarding predictor. Our simulation results show that power consumption decreases by 30-40% in each structure, with a negligible performance penalty of less than 0.1%.
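A minimal sketch of the kind of forwarding predictor described above, assuming a PC-indexed table of 2-bit saturating counters; the table size, indexing function, and threshold are illustrative assumptions rather than the paper's actual design.

#include <stdbool.h>
#include <stdint.h>

/* Illustrative table of 2-bit saturating counters indexed by load PC.
 * It predicts whether a load will receive its data forwarded from an
 * in-flight store; if so, the L1 data-cache access can be skipped. */
#define FP_ENTRIES 1024

static uint8_t fp_table[FP_ENTRIES];   /* counters start at 0 = "no forwarding" */

static inline unsigned fp_index(uint64_t load_pc)
{
    return (unsigned)((load_pc >> 2) & (FP_ENTRIES - 1));
}

static bool fp_predict_forwarding(uint64_t load_pc)
{
    return fp_table[fp_index(load_pc)] >= 2;   /* weakly or strongly "yes" */
}

/* Train the counter with the actual outcome once the load resolves. */
static void fp_update(uint64_t load_pc, bool did_forward)
{
    uint8_t *c = &fp_table[fp_index(load_pc)];
    if (did_forward) { if (*c < 3) (*c)++; }
    else             { if (*c > 0) (*c)--; }
}

A load predicted to receive its data from an in-flight store can then skip the first-level data-cache read, which is how a predictor of this kind saves cache power.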

Cited by
Patent
10 Jun 2013
TL;DR: In this patent, a disambiguation-free out-of-order load store queue method is proposed, which includes implementing a memory resource that can be accessed by a plurality of asynchronous cores, implementing a store retirement buffer, and implementing speculative execution, wherein results of speculative execution can be saved in the store retirement/reorder buffer as a speculative state.
Abstract: In a processor, a disambiguation-free out of order load store queue method. The method includes implementing a memory resource that can be accessed by a plurality of asynchronous cores; implementing a store retirement buffer, wherein stores from a store queue have entries in the store retirement buffer in original program order; and implementing speculative execution, wherein results of speculative execution can be saved in the store retirement/reorder buffer as a speculative state. The method further includes, upon dispatch of a subsequent load from a load queue, searching the store retirement buffer for address matching; and, in cases where there are a plurality of address matches, locating a correct forwarding entry by scanning the store retirement buffer for a first match, and forwarding data from the first match to the subsequent load. Once speculative outcomes are known, the speculative state is retired to memory.
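The address-matching and forwarding step of the abstract can be illustrated with a small sketch; the entry layout and the youngest-to-oldest scan direction are assumptions made for the example, not details taken from the patent.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative store-retirement-buffer entry, kept in original program order. */
typedef struct {
    uint64_t addr;
    uint64_t data;
    bool     valid;
} srb_entry_t;

/* On dispatch of a load, search the buffer for an address match and forward
 * from the first match found.  Scanning youngest-to-oldest returns the most
 * recent prior store to that address (the scan direction is an assumption). */
static bool srb_forward(const srb_entry_t *srb, size_t n_entries,
                        uint64_t load_addr, uint64_t *out_data)
{
    for (size_t i = n_entries; i-- > 0; ) {      /* youngest entry first */
        if (srb[i].valid && srb[i].addr == load_addr) {
            *out_data = srb[i].data;             /* forward matching store data */
            return true;
        }
    }
    return false;                                /* no match: read from memory */
}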

16 citations

Patent
30 Mar 2012
TL;DR: In this patent, a method of memory disambiguation hardware to support software binary translation is provided, which includes unrolling a set of instructions to be executed within a processor, the set of instructions having a number of memory operations.
Abstract: A method of memory disambiguation hardware to support software binary translation is provided. This method includes unrolling a set of instructions to be executed within a processor, the set of instructions having a number of memory operations. An original relative order of memory operations is determined. Then, possible reordering problems are detected and identified in software. A reordering problem exists when a first memory operation has been reordered prior to, and aliases with, a second memory operation with respect to the original order of memory operations. The reordering problem is addressed, and the relative order of memory operations is communicated to the processor.
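A simplified software-side check in the spirit of the abstract's reordering-problem detection is sketched below; exact address equality is used as the alias test, which is a simplifying assumption, and all names are illustrative.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative memory operation emitted by a binary translator. */
typedef struct {
    uint64_t addr;        /* (possibly speculative) effective address */
    unsigned orig_index;  /* position in the original program order   */
    bool     is_store;
} mem_op_t;

/* After unrolling/reordering, flag the problem case the abstract describes:
 * an operation scheduled earlier than one it originally followed, where the
 * two may alias and at least one of them is a store. */
static bool has_reordering_problem(const mem_op_t *sched, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            bool hoisted   = sched[i].orig_index > sched[j].orig_index;
            bool aliases   = sched[i].addr == sched[j].addr;
            bool has_store = sched[i].is_store || sched[j].is_store;
            if (hoisted && aliases && has_store)
                return true;   /* must be repaired; order reported to the HW */
        }
    }
    return false;
}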

15 citations

Patent
10 Jun 2013
TL;DR: In this patent, a thread-agnostic unified store queue and unified load queue method for out-of-order loads in a memory consistency model using shared memory resources is proposed. The method does not support out-of-order caches.
Abstract: In a processor, a thread agnostic unified store queue and a unified load queue method for out of order loads in a memory consistency model using shared memory resources. The method includes implementing a memory resource that can be accessed by a plurality of asynchronous cores, wherein the plurality of cores share a unified store queue and a unified load queue; and implementing an access mask that functions by tracking which words of a cache line are accessed via a load, wherein the cache line includes the memory resource, wherein the load sets a mask bit within the access mask when accessing a word of the cache line, and wherein the mask bit blocks accesses from other loads from a plurality of cores. The method further includes checking the access mask upon execution of subsequent stores from the plurality of cores to the cache line, wherein stores from different threads can forward to loads of different threads while still maintaining in order memory consistency semantics; and causing a miss prediction when a subsequent store to the portion of the cache line sees a prior mark from a load in the access mask, wherein the subsequent store will signal a load queue entry corresponding to that load by using a tracker register and a thread ID register.
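The access-mask idea can be illustrated with a minimal sketch for one 64-byte cache line tracked at 4-byte word granularity; the granularity and the names are assumptions, not the patent's implementation.

#include <stdbool.h>
#include <stdint.h>

/* Illustrative access mask for one 64-byte cache line tracked at word
 * (4-byte) granularity: 16 words, so 16 mask bits. */
typedef struct {
    uint16_t load_mask;   /* bit i set => word i was read by an earlier load */
} line_access_mask_t;

/* A load marks the word it reads. */
static void load_marks_word(line_access_mask_t *m, unsigned word_index)
{
    m->load_mask |= (uint16_t)(1u << (word_index & 15));
}

/* A subsequent store checks the mask; a prior load mark on the stored word
 * means that load read stale data, so a misprediction/replay is signalled. */
static bool store_triggers_mispredict(const line_access_mask_t *m,
                                      unsigned word_index)
{
    return (m->load_mask >> (word_index & 15)) & 1u;
}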

14 citations

Patent
14 Jun 2013
TL;DR: In this patent, a disambiguation-free out-of-order load store queue method is proposed, which includes implementing a memory resource that can be accessed by a plurality of asynchronous cores, implementing a store retirement buffer, and forwarding data from the first match to the subsequent load.
Abstract: In a processor, a disambiguation-free out of order load store queue method. The method includes implementing a memory resource that can be accessed by a plurality of asynchronous cores; implementing a store retirement buffer, wherein stores from a store queue have entries in the store retirement buffer in original program order; and upon dispatch of a subsequent load from a load queue, searching the store retirement buffer for address matching. The method further includes, in cases where there are a plurality of address matches, locating a correct forwarding entry by scanning the store retirement buffer for a first match; and forwarding data from the first match to the subsequent load.

9 citations

Patent
12 Jun 2013
TL;DR: In this article, a lock-based method for out-of-order loads in a memory consistency model using shared memory resources is proposed, which includes implementing a memory resource that can be accessed by a plurality of cores; and implementing an access mask that functions by tracking which words of a cache line are accessed via a load.
Abstract: In a processor, a lock-based method for out of order loads in a memory consistency model using shared memory resources. The method includes implementing a memory resource that can be accessed by a plurality of cores; and implementing an access mask that functions by tracking which words of a cache line are accessed via a load, wherein the cache line includes the memory resource, wherein the load sets a mask bit within the access mask when accessing a word of the cache line, and wherein the mask bit blocks accesses from other loads from a plurality of cores. The method further includes checking the access mask upon execution of subsequent stores from the plurality of cores to the cache line; and causing a miss prediction when a subsequent store to the portion of the cache line sees a prior mark from a load in the access mask, wherein the subsequent store will signal a load queue entry corresponding to that load by using a tracker register and a thread ID register.
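Building on the access-mask sketch shown earlier, the tracker-register/thread-ID signalling can be illustrated as follows; the per-word arrays and the signal_load_replay hook are hypothetical, chosen only to show how a store could identify which load-queue entry in which thread to signal.

#include <stdint.h>

/* Illustrative per-word tracking so a store knows which load to signal:
 * alongside each mask bit, record the load-queue entry and thread that set it
 * (the "tracker register" and "thread ID register" of the abstract).
 * Eight tracked words per line are assumed here. */
typedef struct {
    uint8_t load_mask;     /* one bit per tracked word                     */
    uint8_t lq_entry[8];   /* tracker: load-queue index that set each bit  */
    uint8_t thread_id[8];  /* thread ID that set each bit                  */
} tracked_line_t;

/* Hypothetical replay hook; a real core would flush and replay from this load. */
static void signal_load_replay(uint8_t thread_id, uint8_t lq_entry)
{
    (void)thread_id;
    (void)lq_entry;
}

/* A store to word w that finds a prior load mark signals exactly that
 * load's queue entry in that load's thread. */
static void store_checks_word(tracked_line_t *t, unsigned w)
{
    w &= 7;
    if ((t->load_mask >> w) & 1u)
        signal_load_replay(t->thread_id[w], t->lq_entry[w]);
}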

8 citations