Elmoustapha Ould-Ahmed-Vall

Researcher at Intel

Publications - 299
Citations - 1664

Elmoustapha Ould-Ahmed-Vall is a researcher at Intel. The author has contributed to research on the topics Operand and Opcode, has an h-index of 19, and has co-authored 299 publications receiving 1656 citations. Previous affiliations of Elmoustapha Ould-Ahmed-Vall include Georgia Institute of Technology and AMIT.

Papers
Patent

Memory prefetching in multiple gpu environment

TL;DR: An embodiment of an apparatus includes multiple processors, including a host processor and multiple graphics processing units (GPUs), to process data, each of the GPUs including a prefetcher and a cache; and a memory for storage of data, the memory including a plurality of memory elements.
Patent

Systems, apparatuses, and methods for arithmetic recurrence

TL;DR: This patent describes a decoded instruction to broadcast a data value from a least significant packed data element position of a first packed data source operand to a plurality of arithmetic circuits.
Patent

Systems, apparatuses, and methods for performing conflict detection and broadcasting contents of a register to data element positions of another register

TL;DR: This patent describes a computer processor that broadcasts data in response to a single vector packed broadcast instruction that includes a source writemask register operand, a destination vector register operand, and an opcode.
Patent

Systems, methods, and apparatuses for tile store

TL;DR: This patent describes the loading of a matrix (tile) from memory, with support in at least the form of decode circuitry to decode an instruction having fields for an opcode, a source matrix operand identifier, and destination memory information.
Patent

Multi-tile architecture for graphics operations

TL;DR: This patent describes a multi-tile architecture for graphics operations that includes one or more dies; multiple processor tiles installed on the one or more dies; and a structure to interconnect the processor tiles on the one or more dies, wherein the structure is to enable communications between the processor tiles.