Elmoustapha Ould-Ahmed-Vall

Patent

Apparatus and method for multiply, add/subtract, and accumulate of packed data elements

TL;DR: An apparatus and method for performing dual concurrent multiplications, subtraction/addition, and accumulation of packed data elements. A decoder decodes an instruction to generate a decoded instruction, and execution circuitry comprises multiplier circuitry to multiply the first and third data elements to generate the first temporary product, and adder circuitry to concurrently add the second temporary product to a second accumulated packed data element from a third source register, which is at least twice as large as the first width.

Patent

Systems, methods, and apparatuses for tile matrix multiplication and accumulation

Valentine Robert, +12 more

TL;DR: Decoding circuitry decodes an instruction having fields for an opcode, an identifier for a first source matrix operand, an identifier of a second source matrix operand, and an identifier for a source/destination matrix operand.

Patent

Instructions and logic for bit field address and insertion

Elmoustapha Ould-Ahmed-Vall, +1 more

TL;DR: A processor includes a core to execute an instruction to return an address of a bit field in a packed bit array. The core includes logic to identify an index of the bit field, identify a length, multiply the index and length, and return the address and bit offset based upon the product of the index and length.
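
The address computation summarized above can be illustrated with a minimal Python sketch. It is not the patented logic: the function name `bit_field_address`, the `base_address` parameter, byte-addressed memory with 8-bit bytes, and contiguous packing of equal-length fields are all assumptions made for illustration.

```python
def bit_field_address(base_address: int, index: int, length: int) -> tuple[int, int]:
    """Locate a bit field in a packed bit array (hypothetical helper).

    Assumes every field is `length` bits wide, fields are packed back to
    back with no padding, and memory is byte-addressed with 8-bit bytes.
    """
    # Bit position of the field, counted from the start of the array:
    # the product of index and length, as in the summary.
    bit_position = index * length
    # Split the product into a whole-byte offset and a remaining bit offset.
    byte_offset, bit_offset = divmod(bit_position, 8)
    return base_address + byte_offset, bit_offset


# Field 5 of an array of 3-bit fields starting at address 0x1000.
address, bit_offset = bit_field_address(0x1000, 5, 3)
```

Under these assumptions, field 5 starts at bit 15 of the array, which is bit 7 of the byte at `0x1001`.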

Patent

Policy-based system interface for a real-time autonomous system

Ray Joydeep, +19 more

TL;DR: An embodiment of an apparatus for compression of untyped data includes a graphics processing unit (GPU) with a data compression pipeline; the pipeline includes a data port coupled with one or more shader cores.

Patent

Systems and methods for computing dot products of nibbles in two tile operands

Raanan Sade, +10 more

TL;DR: A processor includes decode circuitry to decode a tile dot product instruction having fields for an opcode, a destination identifier to identify an M by N destination matrix, a first source identifier and a second source identifier for identifying a K by N second source matrix, each of the matrices containing doubleword elements, and execution circuitry to execute the decoded instruction to perform a flow K times for each element of the identified destination matrix to generate eight products by multiplying each nibble of a doubleword element (M,K) of the first source matrix by a corresponding nib
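
The summary above is cut off, so the following Python sketch fills in only what the title suggests and labels the rest as assumptions: nibbles are treated as unsigned, the eight products are summed and accumulated into the destination element, the result wraps at 32 bits, and the first source matrix is M by K. The helper names `nibbles` and `tile_nibble_dot_product` are hypothetical.

```python
def nibbles(dword: int) -> list[int]:
    """Split a 32-bit doubleword into eight 4-bit nibbles, lowest first."""
    return [(dword >> (4 * i)) & 0xF for i in range(8)]


def tile_nibble_dot_product(a, b, c):
    """Accumulate nibble dot products of a (M x K) and b (K x N) into c (M x N).

    Assumptions (not confirmed by the source): unsigned nibbles, summation
    of the eight products, and 32-bit wraparound of the accumulator.
    """
    m_rows, k_dim, n_cols = len(a), len(b), len(b[0])
    out = [row[:] for row in c]  # leave the caller's matrix untouched
    for m in range(m_rows):
        for n in range(n_cols):
            acc = out[m][n]
            # The "flow" runs K times per destination element.
            for k in range(k_dim):
                # Eight products: each nibble of a[m][k] times the
                # corresponding nibble of b[k][n].
                products = [x * y for x, y in zip(nibbles(a[m][k]), nibbles(b[k][n]))]
                acc += sum(products)
            out[m][n] = acc & 0xFFFFFFFF
    return out


result = tile_nibble_dot_product([[0x00000021]], [[0x00000043]], [[10]])
```

Here the nibble pairs (1, 3) and (2, 4) contribute 3 + 8 = 11, which is added to the initial destination value 10.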
