Frederico Pratas

Researcher at Intel

Publications -  41
Citations -  683

Frederico Pratas is an academic researcher at Intel. The author has contributed to research on topics including speedup and cache. The author has an h-index of 13 and has co-authored 41 publications receiving 632 citations. Previous affiliations of Frederico Pratas include Technical University of Lisbon and INESC-ID.

Papers
Journal Article

Cache-aware Roofline model: Upgrading the loft

TL;DR: This paper analyzes the original Roofline model and proposes a novel approach to provide a more insightful performance modeling of modern architectures by introducing cache-awareness, thus significantly improving the guidelines for application optimization.
Patent

Weight-shifting mechanism for convolutional neural networks

TL;DR: A processor includes a processor core and a calculation circuit, which includes logic to determine a set of weights for use in a convolutional neural network (CNN) calculation and to scale up the weights using a scale value.
Proceedings Article

Fine-grain Parallelism Using Multi-core, Cell/BE, and GPU Systems: Accelerating the Phylogenetic Likelihood Function

TL;DR: This paper focuses on exploiting fine-grain parallelism for a demanding bioinformatics application, MrBayes, and its Phylogenetic Likelihood Functions (PLF) using different architectures. General-purpose multi-core processors prove to be simpler to program and provide the best balance between efficient parallel and serial execution.
Patent

Storage device and method for performing convolution operations

TL;DR: A storage device and method for performing convolution operations are described, comprising a plurality of processing units to execute convolution operations on input data and partial results.
Patent

Method and apparatus for distributed and cooperative computation in artificial neural networks

TL;DR: An apparatus and method for distributed and cooperative computation in artificial neural networks are described, comprising an input/output (I/O) interface and a plurality of processing units communicatively coupled to the I/O interface to receive data for input neurons and synaptic weights associated with each of the input neurons. Each unit processes at least a portion of the data for the input neurons and weights to generate partial results.