Elmoustapha Ould-Ahmed-Vall

Researcher at Intel

Publications -  299
Citations -  1664

Elmoustapha Ould-Ahmed-Vall is a researcher at Intel. The author has contributed to research on the topics of Operand and Opcode. The author has an h-index of 19 and has co-authored 299 publications receiving 1656 citations. Previous affiliations of Elmoustapha Ould-Ahmed-Vall include Georgia Institute of Technology and AMIT.

Papers
Patent

Compression in machine learning and deep learning processing

TL;DR: An embodiment of an apparatus for compressing untyped data comprises a graphics processing unit (GPU) with a data compression pipeline; the pipeline includes a data port coupled with one or more shader cores.
Patent

Collapsing of multiple nested loops, methods and instructions

TL;DR: A multi-dimensional loop counter update instruction is presented, together with decode logic and execution logic for it. Methods to collapse loops using such instructions are also described and claimed.
Patent

Compute optimizations for neural networks

TL;DR: A single instruction is decoded into a decoded instruction that specifies multiple operands, including an input value and a quantized weight value associated with a neural network; the design also includes an arithmetic logic unit with a barrel shifter, an adder, and an accumulator register.
Patent

Compute optimizations for low precision machine learning operations

TL;DR: An accelerator module comprises a memory stack with multiple memory dies and a graphics processing unit (GPU) coupled with the memory stack via one or more memory controllers. The GPU includes a plurality of multiprocessors with a single instruction, multiple thread (SIMT) architecture, and at least one single instruction causes at least a portion of the GPU to perform a floating-point operation on inputs having differing precisions.
Patent

Systems, apparatuses, and methods for performing mask bit compression

TL;DR: A single mask bit compression instruction is described that includes a source writemask register operand, a destination writemask operand, and an opcode.