
# Chester Rebeiro

Other affiliations: Centre for Development of Advanced Computing, Indian Institute of Technology Kharagpur, Indian Institutes of Technology ...

Bio: Chester Rebeiro is an academic researcher from the Indian Institute of Technology Madras. The author has contributed to research in topics: Block cipher & Cache. The author has an h-index of 17 and has co-authored 78 publications receiving 845 citations. Previous affiliations of Chester Rebeiro include the Centre for Development of Advanced Computing & the Indian Institute of Technology Kharagpur.

Topics: Block cipher, Cache, Cryptography, Timing attack, Cipher


##### Papers


••

09 Sep 2012

TL;DR: The resulting scalar multiplier is the fastest reported implementation for generic curves over binary finite fields, and its area requirements are significantly lower than those of other high-speed implementations.

Abstract: In this paper we present an FPGA implementation of a high-speed elliptic curve scalar multiplier for binary finite fields. High speeds are achieved by boosting the operating clock frequency while at the same time reducing the number of clock cycles required to do a scalar multiplication. To increase clock frequency, the design uses optimized implementations of the underlying field primitives and a mathematically analyzed pipeline design. To reduce clock cycles, a new scheduling scheme is presented that allows overlapped processing of scalar bits. The resulting scalar multiplier is the fastest reported implementation for generic curves over binary finite fields. Additionally, the optimized primitives lead to area requirements that are significantly lower than those of other high-speed implementations. Detailed implementation results are furnished in order to support the claims.

71 citations

••

08 Dec 2006

TL;DR: The impact of the architecture of the microprocessor on the performance of bitslice AES is analyzed and the implementation is optimized to best utilize the superscalar architecture and SIMD instruction set present in the processors.

Abstract: Network applications need to be fast and at the same time provide security. In order to minimize the overhead of the security algorithm on the performance of the application, the speeds of encryption and decryption of the algorithm are critical. To obtain maximum performance from the algorithm, efficient techniques for its implementation must be used and the implementation must be tuned for the specific hardware on which it is running.
Bitslice is a non-conventional but efficient way to implement DES in software. It involves breaking DES down into logical bit operations so that N parallel encryptions are possible on a single N-bit microprocessor. This results in tremendous throughput. AES is a symmetric block cipher introduced by NIST as a replacement for DES. It is rapidly becoming popular due to its good security features, efficiency, performance and simplicity. In this paper we present an implementation of AES using the bitslice technique. We analyze the impact of the architecture of the microprocessor on the performance of bitslice AES. We consider three processors: the Intel Pentium 4, the AMD Athlon 64 and the Intel Core 2. We optimize the implementation to best utilize the superscalar architecture and SIMD instruction set present in the processors.
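The bitslice idea described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation and does not implement DES or AES: the 3-bit nonlinear map `toy_round` and all helper names are hypothetical, chosen only to show how transposing the data lets one bitwise instruction act on every block at once.

```python
# Minimal bitslicing sketch. The toy 3-bit map below is a hypothetical
# stand-in for a cipher round; it is NOT DES or AES.

def to_bitslices(blocks, width):
    """Transpose blocks: slice i holds bit i of every block, one block per bit lane."""
    slices = [0] * width
    for lane, block in enumerate(blocks):
        for i in range(width):
            slices[i] |= ((block >> i) & 1) << lane
    return slices

def from_bitslices(slices, lanes):
    """Inverse transpose: rebuild one integer per lane from the bit slices."""
    blocks = [0] * lanes
    for i, s in enumerate(slices):
        for lane in range(lanes):
            blocks[lane] |= ((s >> lane) & 1) << i
    return blocks

def toy_round(x):
    """Reference version: apply the toy map to a single 3-bit block."""
    a, b, c = x & 1, (x >> 1) & 1, (x >> 2) & 1
    return (a ^ (b & c)) | ((b ^ (a & c)) << 1) | ((c ^ (a & b)) << 2)

def toy_round_bitsliced(slices):
    """Same map expressed as logical operations on whole slices.

    Each AND/XOR here processes every lane (block) simultaneously.
    """
    a, b, c = slices
    return [a ^ (b & c), b ^ (a & c), c ^ (a & b)]

blocks = [0, 1, 2, 3, 4, 5, 6, 7]          # eight blocks, one per bit lane
sliced = to_bitslices(blocks, 3)
result = from_bitslices(toy_round_bitsliced(sliced), len(blocks))
assert result == [toy_round(x) for x in blocks]
```

On an N-bit machine word the same pattern would process N blocks per logical instruction; Python integers are used here only to keep the sketch self-contained.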

68 citations

••

TL;DR: A theoretical model is used to approximate the delay of different characteristic-two primitives used in an elliptic curve scalar multiplier architecture (ECSMA) implemented on k-input lookup table (LUT)-based field-programmable gate arrays, with the aim of designing the fastest scalar multiplier over generic curves.

Abstract: This paper uses a theoretical model to approximate the delay of different characteristic-two primitives used in an elliptic curve scalar multiplier architecture (ECSMA) implemented on k-input lookup table (LUT)-based field-programmable gate arrays. Approximations are used to determine the delay of the critical paths in the ECSMA. This is then used to theoretically estimate the optimal number of pipeline stages and the ideal placement of each stage in the ECSMA. This paper illustrates suitable scheduling for performing point addition and doubling in a pipelined data path of the ECSMA. Finally, detailed analyses, supported with experimental results, are provided to design the fastest scalar multiplier over generic curves. Experimental results for GF(2^163) show that, when the ECSMA is suitably pipelined, the scalar multiplication can be performed in only 9.5 μs on a Xilinx Virtex V. Notably, the design has an area that is significantly smaller than other reported high-speed designs, owing to the better LUT utilization of the underlying field primitives.

55 citations

•

TL;DR: In this paper, the impact of the architecture of the microprocessor on the performance of bitslice AES is analyzed for three processors: the Intel Pentium 4, the AMD Athlon 64 and the Intel Core 2.

Abstract: Network applications need to be fast and at the same time provide security. In order to minimize the overhead of the security algorithm on the performance of the application, the speeds of encryption and decryption of the algorithm are critical. To obtain maximum performance from the algorithm, efficient techniques for its implementation must be used and the implementation must be tuned for the specific hardware on which it is running. Bitslice is a non-conventional but efficient way to implement DES in software. It involves breaking DES down into logical bit operations so that N parallel encryptions are possible on a single N-bit microprocessor. This results in tremendous throughput. AES is a symmetric block cipher introduced by NIST as a replacement for DES. It is rapidly becoming popular due to its good security features, efficiency, performance and simplicity. In this paper we present an implementation of AES using the bitslice technique. We analyze the impact of the architecture of the microprocessor on the performance of bitslice AES. We consider three processors: the Intel Pentium 4, the AMD Athlon 64 and the Intel Core 2. We optimize the implementation to best utilize the superscalar architecture and SIMD instruction set present in the processors.

55 citations

••

14 Dec 2008

TL;DR: Improved scheduling of elliptic curve point arithmetic results in a lower number of register files, thus reducing the area required and the critical delay of the circuit.

Abstract: This paper proposes an efficient high-speed implementation of an elliptic curve crypto processor (ECCP) for an FPGA platform. The main optimization goal for the ECCP is efficient implementation of the important underlying finite field primitives, namely multiplication and inverse. The techniques proposed maximize the utilization of FPGA resources. Additionally, improved scheduling of elliptic curve point arithmetic results in a lower number of register files, thus reducing the area required and the critical delay of the circuit. Through several comparisons with existing work, we demonstrate that the combination of the above techniques helps realize one of the fastest and most compact elliptic curve processors.

45 citations

##### Cited by


••

[...]

TL;DR: The author reflects on the ethereal nature of i, the square root of minus one, which seemed an odd beast when first encountered at school: an intruder hovering on the edge of reality.

Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality.
Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

33,785 citations

••

2,687 citations

••

TL;DR: An extremely strong type of attack is demonstrated, which requires knowledge of neither the specific plaintexts nor ciphertexts and works by merely monitoring the effect of the cryptographic process on the cache.

Abstract: We describe several software side-channel attacks based on inter-process leakage through the state of the CPU’s memory cache. This leakage reveals memory access patterns, which can be used for cryptanalysis of cryptographic primitives that employ data-dependent table lookups. The attacks allow an unprivileged process to attack other processes running in parallel on the same processor, despite partitioning methods such as memory protection, sandboxing, and virtualization. Some of our methods require only the ability to trigger services that perform encryption or MAC using the unknown key, such as encrypted disk partitions or secure network links. Moreover, we demonstrate an extremely strong type of attack, which requires knowledge of neither the specific plaintexts nor ciphertexts and works by merely monitoring the effect of the cryptographic process on the cache. We discuss in detail several attacks on AES and experimentally demonstrate their applicability to real systems, such as OpenSSL and Linux’s dm-crypt encrypted partitions (in the latter case, the full key was recovered after just 800 writes to the partition, taking 65 milliseconds). Finally, we discuss a variety of countermeasures which can be used to mitigate such attacks.

500 citations

•

31 Aug 2011

TL;DR: In this article, a method for modifying an image is presented, which consists of displaying an image that comprises a portion of an object; determining whether an edge of the object is in a location within the portion; and detecting movement, in a member direction, of an operating member with respect to the edge.

Abstract: A method is provided for modifying an image. The method comprises displaying an image, the image comprising a portion of an object; and determining if an edge of the object is in a location within the portion. The method further comprises detecting movement, in a member direction, of an operating member with respect to the edge. The method still further comprises moving, if the edge is not in the location, the object in an object direction corresponding to the detected movement; and modifying, if the edge is in the location, the image in response to the detected movement, the modified image comprising the edge in the location.

434 citations

15 Jul 2012

419 citations