Shakti-T: A RISC-V Processor with Light Weight Security Extensions

doi:10.1145/3092627.3092629

Proceedings Article

Arjun Menon¹, Subadra Murugan¹, Chester Rebeiro¹, Neel Gala¹, Kamakoti Veezhinathan¹

Indian Institute of Technology Madras¹

25 Jun 2017-pp 2

TL;DR: This work presents a unified hardware framework for handling spatial and temporal memory attacks, integrated with a RISC-V based micro-architecture whose enhanced application binary interface enables software layers to use these features to protect sensitive data.

Abstract: With increased usage of compute cores for sensitive applications, including e-commerce, there is a need to provide additional hardware support for securing information from memory-based attacks. This work presents a unified hardware framework for handling spatial and temporal memory attacks. The paper integrates the proposed hardware framework with a RISC-V based micro-architecture with an enhanced application binary interface that enables software layers to use these features to protect sensitive data. We demonstrate the effectiveness of the proposed scheme through practical case studies in addition to taking the design through a VLSI CAD design flow. The proposed processor reduces the metadata storage overhead by up to 4× in comparison with existing solutions, while incurring an area overhead of just 1914 LUTs and 2197 flip-flops on an FPGA, without affecting the critical path delay of the processor.

Citations

Proceedings Article

Xuantie-910: A Commercial Multi-Core 12-Stage Pipeline Out-of-Order 64-bit High Performance RISC-V Processor with Vector Extension: Industrial Product

Chen Chen¹, Xiaoyan Xiang¹, Chang Liu¹, Yunhai Shang¹, Ren Guo¹, Dongqi Liu¹, Lu Yimin¹, Ziyi Hao¹, Jiahui Luo¹, Zhijian Chen¹, Chunqiang Li¹, Yu Pu¹, Jianyi Meng¹, Xiaolang Yan¹, Yuan Xie¹, Xiaoning Qi¹

Alibaba Group¹

30 May 2020

TL;DR: Xuantie-910 is an industry leading 64-bit high performance embedded RISC-V processor from Alibaba T-Head division that features custom extensions to arithmetic operation, bit manipulation, load and store, TLB and cache operations, and implements the 0.7.1 stable release of the RISC-V vector extension specification for high efficiency vector processing.

Abstract: The open source RISC-V ISA has been quickly gaining momentum. This paper presents Xuantie-910, an industry leading 64-bit high performance embedded RISC-V processor from Alibaba T-Head division. It is fully based on the RV64GCV instruction set and it features custom extensions to arithmetic operation, bit manipulation, load and store, TLB and cache operations. It also implements the 0.7.1 stable release of RISC-V vector extension specification for high efficiency vector processing. Xuantie-910 supports multi-core multi-cluster SMP with cache coherence. Each cluster contains 1 to 4 core(s) capable of booting the Linux operating system. Each single core utilizes the state-of-the-art 12-stage deep pipeline, out-of-order, multi-issue superscalar architecture, achieving a maximum clock frequency of 2.5 GHz in the typical process, voltage and temperature condition in a TSMC 12nm FinFET process technology. Each single core with the vector execution unit costs an area of 0.8 mm² (excluding the L2 cache). The toolchain is enhanced significantly to support the vector extension and custom extensions. Through hardware and toolchain co-optimization, to date Xuantie-910 delivers the highest performance (in terms of IPC, speed, and power efficiency) for a number of industrial control flow and data computing benchmarks, when compared with its predecessors in the RISC-V family. Xuantie-910 FPGA implementation has been deployed in the data centers of Alibaba Cloud, for application-specific acceleration (e.g., blockchain transaction). The ASIC deployment at low-cost SoC applications, such as IoT endpoints and edge computing, is planned to facilitate Alibaba's end-to-end and cloud-to-edge computing infrastructure.

55 citations

Cites background from "Shakti-T: A RISC-V Processor with L..."

...Some prior arts extended RISC-V to domain-specific accelerators/coprocessors [22], [27]–[29]....
[...]

Proceedings Article

Lightweight Secure-Boot Architecture for RISC-V System-on-Chip

Jawad Haj-Yahya¹, Ming Ming Wong², Vikramkumar Pudi³, Shivam Bhasin², Anupam Chattopadhyay²

Agency for Science, Technology and Research¹, Nanyang Technological University², Indian Institutes of Technology³

06 Mar 2019

TL;DR: A lightweight hardware-based secure boot architecture that incorporates an optimized Physical Unclonable Function (PUF) for providing keys to the security blocks of the System on Chip (SoC), among which, secure boot and remote attestation are presented.

Abstract: Securing thousands of connected, resource-constrained computing devices is a major challenge nowadays. Adding to the challenge, third party service providers need regular access to the system. To ensure the integrity of the system and authenticity of the software vendor, secure boot is supported by several commercial processors. However, the existing solutions are either complex, or have been compromised by determined attackers. In this scenario, open-source secure computing architectures are poised to play an important role for designers and white hat attackers. In this manuscript, we propose a lightweight hardware-based secure boot architecture. The architecture uses efficient implementation of Elliptic Curve Digital Signature Algorithm (ECDSA), Secure Hash Algorithm 3 (SHA3) hashing algorithm and Direct Memory Access (DMA). In addition, the architecture includes Key Management Unit, which incorporates an optimized Physical Unclonable Function (PUF) for providing keys to the security blocks of the System on Chip (SoC), among which, secure boot and remote attestation. We demonstrated the framework on RISC-V based SoC. Detailed analysis of performance and security for the platform is presented.

29 citations

Cites background from "Shakti-T: A RISC-V Processor with L..."

...Shakti-T [8] employs the concept of base and bounds to ensure that pointers access only valid memory regions....
[...]

Proceedings Article

SHAKTI-MS: a RISC-V processor for memory safety in C

Sourav Das¹, R. Harikrishnan Unnithan², Arjun Menon¹, Chester Rebeiro¹, Kamakoti Veezhinathan¹

Indian Institute of Technology Madras¹, Birla Institute of Technology and Science²

23 Jun 2019

TL;DR: The proposal is to use stack-based cookies for crafting fat-pointers instead of having object-based identifiers, which eliminates the use of shadow memory space, or any table to store the pointer metadata, and reduces the storage overheads by a great extent.

Abstract: In this era of IoT devices, security is very often traded off for smaller device footprint and low power consumption. Considering the exponentially growing security threats of IoT and cyber-physical systems, it is important that these devices have built-in features that enhance security. In this paper, we present Shakti-MS, a lightweight RISC-V processor with built-in support for both temporal and spatial memory protection. At run time, Shakti-MS can detect and stymie memory misuse in C and C++ programs, with minimum runtime overheads. The solution uses a novel implementation of fat-pointers to efficiently detect misuse of pointers at runtime. Our proposal is to use stack-based cookies for crafting fat-pointers instead of having object-based identifiers. We store the fat-pointer on the stack, which eliminates the use of shadow memory space, or any table to store the pointer metadata. This reduces the storage overheads by a great extent. The cookie also helps to preserve control flow of the program by ensuring that the return address never gets modified by vulnerabilities like buffer overflows. Shakti-MS introduces new instructions in the microprocessor hardware, and also a modified compiler that automatically inserts these new instructions to enable memory protection. This co-design approach is intended to reduce runtime and area overheads, and also provides an end-to-end solution. The hardware has an area overhead of 700 LUTs on a Xilinx Virtex Ultrascale FPGA and 4100 cells on an open 55nm technology node. The clock frequency of the processor is not affected by the security extensions, while there is a marginal increase in the code size by 11% with an average runtime overhead of 13%.

14 citations

Cites background or methods from "Shakti-T: A RISC-V Processor with L..."

...Although [23] enhances a RISC-V processor to efficiently implement memory checks, the software support required for [23] is extremely complex....
[...]
...On the other hand, hardware solutions like [23, 25] reduce the run time overhead at the cost of hardware complexity....
[...]
...
Work | Spatial safety check | Temporal safety check | Hardware instrumentation | Compiler instrumentation | Metadata size | Hardware performance overhead | Software performance overhead
[33] | ✔ | × | × | ✔ | 128*n | NA | NA
[27] | ✔ | ✔ | × | ✔ | 256*n + 64 | NA | 29%
[25] | ✔ | ✔ | ✔ | ✔ | 256*n + 64 | NA | 25%
[23] | ✔ | ✔ | ✔ | × | 64*n + 128 | 0% | NA
[7] | ✔ | × | ✔ | × | 128*n | NA | 10%
Shakti-MS | ✔ | ✔ | ✔ | ✔ | 128*n | 0% | 13%
...
[...]
...Further, unlike [25], we are not using any separate shadow memory space and unlike [23], there are no additional tables or tag bits that are required in the processor to store pointer metadata....
[...]

Journal Article

Stack Redundancy to Thwart Return Oriented Programming in Embedded Systems

Cyril Bresch¹, David Hely¹, Athanasios Papadimitriou¹, Adrien Michelet-Gignoux², Laurent Amato², Thomas Meyer²

University of Grenoble¹, Grenoble Institute of Technology²

27 Mar 2018-IEEE Embedded Systems Letters

TL;DR: A hardware-based countermeasure against return address corruption in the processor stack is proposed and validated on the OpenRISC core with a minimal hardware modification of the targeted core and an easy integration at the application level.

Abstract: With the emergence of Internet of Things, embedded devices are increasingly the target of software attacks. The aim of these attacks is to maliciously modify the behavior of the software being executed by the device. The work presented in this letter has been developed for the Cyber Security Awareness Week Embedded Security Challenge. This contest focuses on memory corruption issues, such as stack overflow vulnerabilities. These low level vulnerabilities are the result of code errors. Once exploited, they allow an attacker to write arbitrary data in memory without limitations. We detail in this letter a hardware-based countermeasure against return address corruption in the processor stack. First, several exploitation techniques targeting stack return addresses are discussed, whereas a lightweight hardware countermeasure is proposed and validated on the OpenRISC core. The countermeasure presented follows the shadow stack concept with a minimal hardware modification of the targeted core and an easy integration at the application level.

12 citations

Cites background or methods from "Shakti-T: A RISC-V Processor with L..."

...On the other hand, ISA extensions such as Shakti-T [9] and Watchdog Lite [10] aim at mitigating pointer hijacking....
[...]
...First, those that use specific toolchains, compilers [9], [10] or library to adapt an applica-...
[...]
...To identify pointers, Shakti-T, and Watchdog Lite need to instrument the code in advance using compiler modification....
[...]

Journal Article

Towards Designing a Secure RISC-V System-on-Chip: ITUS

Vinay B. Y. Kumar¹, Suman Deb¹, Naina Gupta¹, Shivam Bhasin¹, Jawad Haj-Yahya², Anupam Chattopadhyay¹, Avi Mendelson²

Nanyang Technological University¹, Technion – Israel Institute of Technology²

01 Dec 2020

TL;DR: This manuscript discusses a set of primitive building blocks of a secure SoC and presents some of the implemented security subsystems using these building blocks—such as secure boot, memory protection, PUF-based key management, a countermeasure methodology for RISC-V micro-architectural side-channel leakage, and an integration of the open keystone-enclaves for TEE.

Abstract: A rising tide of exploits, in the recent years, following a steady discovery of the many vulnerabilities pervasive in modern computing systems has led to a growing number of studies in designing systems-on-chip (SoCs) with security as a first-class consideration. Following the momentum behind RISC-V-based systems in the public domain, much of this effort targets RISC-V-based SoCs; most ideas, however, are independent of this choice. In this manuscript, we present a consolidation of our early efforts along these lines in designing a secure SoC around RISC-V, named ITUS. In particular, we discuss a set of primitive building blocks of a secure SoC and present some of the implemented security subsystems using these building blocks—such as secure boot, memory protection, PUF-based key management, a countermeasure methodology for RISC-V micro-architectural side-channel leakage, and an integration of the open keystone-enclaves for TEE. The current ITUS SoC prototype, integrating the discussed security subsystems, was built on top of the lowRISC project; however, these are portable to any other SoC code base. The SoC prototype has been evaluated on an FPGA.

9 citations

References

Proceedings Article

Hardbound: architectural support for spatial safety of the C programming language

Joseph Devietti¹, Colin Blundell², Milo M. K. Martin², Steve Zdancewic²

University of Washington¹, University of Pennsylvania²

01 Mar 2008

TL;DR: A hardware bounded pointer architectural primitive that supports cooperative hardware/software enforcement of spatial memory safety for C programs is proposed, which is a new hardware primitive datatype for pointers that leaves the standard C pointer representation intact, but augments it with bounds information maintained separately and invisibly by the hardware.

Abstract: The C programming language is at least as well known for its absence of spatial memory safety guarantees (i.e., lack of bounds checking) as it is for its high performance. C's unchecked pointer arithmetic and array indexing allow simple programming mistakes to lead to erroneous executions, silent data corruption, and security vulnerabilities. Many prior proposals have tackled enforcing spatial safety in C programs by checking pointer and array accesses. However, existing software-only proposals have significant drawbacks that may prevent wide adoption, including: unacceptably high run-time overheads, lack of completeness, incompatible pointer representations, or need for non-trivial changes to existing C source code and compiler infrastructure. Inspired by the promise of these software-only approaches, this paper proposes a hardware bounded pointer architectural primitive that supports cooperative hardware/software enforcement of spatial memory safety for C programs. This bounded pointer is a new hardware primitive datatype for pointers that leaves the standard C pointer representation intact, but augments it with bounds information maintained separately and invisibly by the hardware. The bounds are initialized by the software, and they are then propagated and enforced transparently by the hardware, which automatically checks a pointer's bounds before it is dereferenced. One mode of use requires instrumenting only malloc, which enables enforcement of per-allocation spatial safety for heap-allocated objects for existing binaries. When combined with simple intraprocedural compiler instrumentation, hardware bounded pointers enable a low-overhead approach for enforcing complete spatial memory safety in unmodified C programs.
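
The bounded-pointer idea in this abstract (bounds set by the allocator, carried along through pointer arithmetic, checked at each dereference) can be illustrated with a small software model. This is only a sketch under stated assumptions: the names `BoundedPointer`, `Memory`, `offset`, and `BoundsViolation` are hypothetical, the bump allocator is a placeholder, and nothing here reflects how Hardbound or Shakti-T actually lay out metadata in hardware.

```python
from dataclasses import dataclass


class BoundsViolation(Exception):
    """Raised when a dereference falls outside the pointer's bounds."""


@dataclass(frozen=True)
class BoundedPointer:
    # Hypothetical model of a pointer plus its separately kept metadata.
    addr: int   # address currently held by the pointer
    base: int   # first valid address of the object
    bound: int  # one past the last valid address of the object

    def offset(self, delta: int) -> "BoundedPointer":
        # Pointer arithmetic changes addr only; base and bound propagate.
        return BoundedPointer(self.addr + delta, self.base, self.bound)


class Memory:
    """Toy byte-addressed memory with a bump allocator (an assumption)."""

    def __init__(self, size: int) -> None:
        self.cells = [0] * size
        self.next_free = 0

    def malloc(self, nbytes: int) -> BoundedPointer:
        # The allocator is the one place where software initialises bounds.
        base = self.next_free
        if nbytes <= 0 or base + nbytes > len(self.cells):
            raise MemoryError("out of toy memory")
        self.next_free += nbytes
        return BoundedPointer(base, base, base + nbytes)

    def _check(self, p: BoundedPointer) -> None:
        # The check performed before every dereference.
        if not (p.base <= p.addr < p.bound):
            raise BoundsViolation(f"address {p.addr} outside [{p.base}, {p.bound})")

    def load(self, p: BoundedPointer) -> int:
        self._check(p)
        return self.cells[p.addr]

    def store(self, p: BoundedPointer, value: int) -> None:
        self._check(p)
        self.cells[p.addr] = value
```

In this model an out-of-bounds pointer may be formed freely by `offset`; only the dereference in `load` or `store` is rejected, which mirrors the abstract's statement that bounds are checked before a pointer is dereferenced.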

231 citations

"Shakti-T: A RISC-V Processor with L..." refers background in this paper

...This overhead can be justified by the additional implicit security guarantee provided by the PLM against temporal memory attacks, which is absent in [14, 37]....
[...]
...Under this scenario, the solution proposed in [14, 37] would require a storage overhead of 2n words (base and bounds for each pointer), while the solution proposed in [23, 24] would incur a storage overhead of 4n + 1 words (base, bound, lock and key per pointer; and a single key value for the entire set)....
[...]
...• Reduced metadata storage overheads (of the order of 2× and 4× over [14] and [23, 24] respectively) by using a common memory region across all pointers to store the base and bounds....
[...]
...While Shakti-T clearly offers benefits in scenarios where aliased pointers exist, in scenarios where all pointers point to different memory regions it incurs relatively higher storage overheads (three words per pointer - ptr_id, base and bound) as compared to [14, 37] which incurs an overhead of only two words per pointer (base and bound)....
[...]
...Hardware solutions [14, 37, 27], on the contrary, incur minimal run-time overheads in addition to providing strong security guarantees....
[...]
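
The word counts quoted in these excerpts can be tabulated for a given number of pointers n. The sketch below only restates the three figures given in the excerpts (2n, 4n + 1, and three words per pointer for Shakti-T when no pointers alias); the function names are hypothetical, and the aliased-pointer case for Shakti-T is left out because the excerpts do not give a formula for it.

```python
def words_base_and_bounds(n: int) -> int:
    """Base and bound per pointer, as quoted for [14, 37]."""
    return 2 * n


def words_lock_and_key(n: int) -> int:
    """Base, bound, lock and key per pointer, plus one key value
    for the entire set, as quoted for [23, 24]."""
    return 4 * n + 1


def words_shakti_t_no_aliasing(n: int) -> int:
    """ptr_id, base and bound per pointer, as quoted for Shakti-T when
    every pointer points to a different memory region."""
    return 3 * n


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(
            n,
            words_base_and_bounds(n),
            words_lock_and_key(n),
            words_shakti_t_no_aliasing(n),
        )
```

With no aliasing, the Shakti-T figure sits between the other two, which matches the excerpt's remark that it costs more than base-and-bounds schemes in that scenario.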

Proceedings Article

The Performance Cost of Shadow Stacks and Stack Canaries

Thurston H. Y. Dang¹, Petros Maniatis², David Wagner¹

University of California, Berkeley¹, Google²

14 Apr 2015

TL;DR: This work studies the inherent overheads of shadow stack schemes, and designs a new scheme, the parallel shadow stack, and shows that its performance cost is significantly less than the traditional shadow stack: 3.5%.

Abstract: Control flow defenses against ROP either use strict, expensive, but strong protection against redirected RET instructions with shadow stacks, or much faster but weaker protections without. In this work we study the inherent overheads of shadow stack schemes. We find that the overhead is roughly 10% for a traditional shadow stack. We then design a new scheme, the parallel shadow stack, and show that its performance cost is significantly less: 3.5%. Our measurements suggest it will not be easy to improve performance on current x86 processors further, due to inherent costs associated with RET and memory load/store instructions. We conclude with a discussion of the design decisions in our shadow stack instrumentation, and possible lighter-weight alternatives.
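
As a rough illustration of the shadow stack concept discussed here, the model below keeps a second copy of each return address and compares the two on return. It is a sketch only: `ShadowStackMachine`, `call`, `ret`, and `ReturnAddressCorrupted` are hypothetical names, the compare-on-return policy is one common variant rather than the scheme measured in the paper, and the parallel shadow stack layout proposed by the authors is not modelled.

```python
class ReturnAddressCorrupted(Exception):
    """Raised when the stack and shadow stack disagree on return."""


class ShadowStackMachine:
    def __init__(self) -> None:
        self.stack = []   # ordinary stack; buggy code may overwrite it
        self.shadow = []  # protected copy holding return addresses only

    def call(self, return_addr: int) -> None:
        # On a call, the return address is pushed to both stacks.
        self.stack.append(return_addr)
        self.shadow.append(return_addr)

    def ret(self) -> int:
        # On a return, both copies are popped and must match.
        addr = self.stack.pop()
        expected = self.shadow.pop()
        if addr != expected:
            raise ReturnAddressCorrupted(
                f"stack holds {addr:#x}, shadow stack holds {expected:#x}"
            )
        return addr
```

A simulated overflow is then just an assignment such as `machine.stack[-1] = 0xdeadbeef` before calling `ret()`, which the comparison rejects.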

214 citations

Additional excerpts

...Some of the proposed solutions include: stack canaries [8]; encryption of the code pointer [9]; storing the return address in a shadow stack [11, 33, 12]; re-arranging argument locations, return addresses, previous frame pointers and local variables [34]; control flow integrity checks [1]; and, Address Space Layout Randomization (ASLR) [31]....
[...]

Journal Article

The Blaster worm: then and now

Michael Bailey¹, Evan Cooke¹, Farnam Jahanian¹, David W. Watson¹

University of Michigan¹

01 Jul 2005

TL;DR: Observing the Blaster worm's activity can provide insight into the evolution of Internet worms.

Abstract: The Blaster worm of 2003 infected at least 100000 Microsoft Windows systems and cost millions in damage. In spite of cleanup efforts, an antiworm, and a removal tool from Microsoft, the worm persists. Observing the worm's activity can provide insight into the evolution of Internet worms.

140 citations

"Shakti-T: A RISC-V Processor with L..." refers methods in this paper

...Researchers have also found several ways to exploit this vulnerability, such as the blaster worm [5] and the slammer worm [21] which have been used to perform Distributed Denial of Service attacks within a network....
[...]

Proceedings Article

Low-fat pointers: compact encoding and efficient gate-level implementation of fat pointers for spatial safety and capability-based security

Albert Kwon¹, Udit Dhawan¹, Jonathan M. Smith¹, Thomas F. Knight, André DeHon¹

University of Pennsylvania¹

04 Nov 2013

TL;DR: To achieve the safety of fat pointers without increasing program state, this work compactly encodes approximate base and bound pointers along with exact address pointers for a 46b address space into one 64-bit word with a worst-case memory overhead of 3%.

Abstract: Referencing outside the bounds of an array or buffer is a common source of bugs and security vulnerabilities in today's software. We can enforce spatial safety and eliminate these violations by inseparably associating bounds with every pointer (fat pointer) and checking these bounds on every memory access. By further adding hardware-managed tags to the pointer, we make them unforgeable. This, in turn, allows the pointers to be used as capabilities to facilitate fine-grained access control and fast security domain crossing. Dedicated checking hardware runs in parallel with the processor's normal datapath so that the checks do not slow down processor operation (0% runtime overhead). To achieve the safety of fat pointers without increasing program state, we compactly encode approximate base and bound pointers along with exact address pointers for a 46b address space into one 64-bit word with a worst-case memory overhead of 3%. We develop gate-level implementations of the logic for updating and validating these compact fat pointers and show that the hardware requirements are low and the critical paths for common operations are smaller than processor ALU operations. Specifically, we show that the fat-pointer check and update operations can run in a 4 ns clock cycle on a Virtex 6 (40nm) implementation while only using 1100 6-LUTs or about the area of a double-precision, floating-point adder.

130 citations

"Shakti-T: A RISC-V Processor with L..." refers methods in this paper

...A lightweight hardware implementation of fat-pointers (Low-Fat pointers), as proposed in [20], reduces this storage overhead by encoding the base and bounds into a custom 16-bit floating point format which is stored in the upper bits of the 64-bit virtual address....
[...]

Journal Article

Watchdog: hardware for safe and secure manual memory management and full memory safety

Santosh Nagarakatte¹, Milo M. K. Martin¹, Steve Zdancewic¹

University of Pennsylvania¹

09 Jun 2012

TL;DR: This paper proposes Watchdog, a hardware-based approach that streamlines identifier checking to reduce runtime overhead, and extends its mechanisms to detect bounds errors, thereby providing full hardware-enforced memory safety at low overheads.

Abstract: Languages such as C and C++ use unsafe manual memory management, allowing simple bugs (i.e., accesses to an object after deallocation) to become the root cause of exploitable security vulnerabilities. This paper proposes Watchdog, a hardware-based approach for ensuring safe and secure manual memory management. Inspired by prior software-only proposals, Watchdog generates a unique identifier for each memory allocation, associates these identifiers with pointers, and checks to ensure that the identifier is still valid on every memory access. This use of identifiers and checks enables Watchdog to detect errors even in the presence of reallocations. Watchdog stores these pointer identifiers in a disjoint shadow space to provide comprehensive protection and ensure compatibility with existing code. To streamline the implementation and reduce runtime overhead: Watchdog (1) uses micro-ops to access metadata and perform checks, (2) eliminates metadata copies among registers via modified register renaming, and (3) uses a dedicated metadata cache to reduce checking overhead. Furthermore, this paper extends Watchdog's mechanisms to detect bounds errors, thereby providing full hardware-enforced memory safety at low overheads.
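
The identifier-based check described in this abstract (a unique identifier per allocation, carried by every pointer and validated on each access) can be modelled in a few lines. This is a sketch under stated assumptions: `Heap`, `Pointer`, and `UseAfterFree` are hypothetical names, the set of valid identifiers stands in for Watchdog's shadow-space metadata, and micro-ops, register renaming, and the metadata cache are not modelled.

```python
import itertools
from dataclasses import dataclass


class UseAfterFree(Exception):
    """Raised when a pointer's allocation identifier is no longer valid."""


@dataclass(frozen=True)
class Pointer:
    addr: int   # location of the object
    ident: int  # identifier of the allocation this pointer came from


class Heap:
    def __init__(self) -> None:
        self._ids = itertools.count(1)  # identifiers are never reused
        self._valid_ids = set()
        self._free_addrs = []           # addresses available for reuse
        self._next_addr = 0
        self._cells = {}

    def malloc(self) -> Pointer:
        # Addresses may be recycled, but each allocation gets a fresh id.
        if self._free_addrs:
            addr = self._free_addrs.pop()
        else:
            addr = self._next_addr
            self._next_addr += 1
        ident = next(self._ids)
        self._valid_ids.add(ident)
        self._cells[addr] = 0
        return Pointer(addr, ident)

    def _check(self, p: Pointer) -> None:
        if p.ident not in self._valid_ids:
            raise UseAfterFree(f"identifier {p.ident} has been freed")

    def free(self, p: Pointer) -> None:
        self._check(p)  # also rejects a double free
        self._valid_ids.discard(p.ident)
        self._free_addrs.append(p.addr)

    def load(self, p: Pointer) -> int:
        self._check(p)
        return self._cells[p.addr]

    def store(self, p: Pointer, value: int) -> None:
        self._check(p)
        self._cells[p.addr] = value
```

Because identifiers are never reused in this model, a dangling pointer is rejected even after its address has been handed out again, which is the reallocation case the abstract highlights.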

124 citations