JASMIN KAUR, University of South Florida, USA ALVARO CINTAS CANTO, Marymount University, USA MEHRAN MOZAFFARI KERMANI, University of South Florida, USA REZA AZARDERAKHSH, Florida Atlantic University, USA

This survey is the first work on the current standard for lightweight cryptography, standardized in 2023. Lightweight cryptography plays a vital role in securing resource-constrained embedded systems such as deeply-embedded systems (implantable and wearable medical devices, smart fabrics, smart homes, and the like), radio frequency identification (RFID) tags, sensor networks, and privacy-constrained usage models. The National Institute of Standards and Technology (NIST) initiated a standardization process for lightweight cryptography and, after a multi-year effort, the competition ended in Feb. 2023 with ASCON as the winner. This lightweight cryptographic standard will be used in deeply-embedded architectures to provide security through confidentiality and integrity/authentication (the counterpart of the legacy AES-GCM, which is the NIST standard for symmetric-key cryptography). ASCON's lightweight design utilizes a 320-bit permutation which is bit-sliced into five 64-bit register words, providing 128-bit security. This work summarizes the different implementations of ASCON on field-programmable gate array (FPGA) and ASIC hardware platforms on the basis of area, power, throughput, energy, and efficiency overheads. It also reviews various differential and side-channel analysis attacks (SCAs) performed across variants of the ASCON cipher suite in terms of algebraic, cube/cube-like, forgery, fault injection, and power analysis attacks, as well as the countermeasures for these attacks. We also provide our insights and visions throughout this survey to suggest future directions in different domains. This survey is the first of its kind and a step forward towards scrutinizing the advantages and future directions of the NIST lightweight cryptography standard introduced in 2023.

CCS Concepts: • Hardware  $\rightarrow$  Application specific integrated circuits; Hardware reliability screening.

Additional Key Words and Phrases: ASCON, ASIC, differential cryptanalysis, field-programmable gate array (FPGA), lightweight cryptography (LWC), machine-learning (ML) attacks, NIST, side-channel analysis attacks (SCA).

#### **ACM Reference Format:**

Jasmin Kaur, Alvaro Cintas Canto, Mehran Mozaffari Kermani, and Reza Azarderakhsh. 2023. A Comprehensive Survey on the Implementations, Attacks, and Countermeasures of the Current NIST Lightweight Cryptography Standard. *ACM Comput. Surv.*-, -, Article - (April 2023), 16 pages. https://doi.org/10.1145/nnnnnn.nnnnnn

Authors' addresses: Jasmin Kaur, University of South Florida, Tampa, FL, 33620, USA, jasmink1@usf.edu; Alvaro Cintas Canto, Marymount University, Arlington, VA, 22207, USA, acintas@marymount.edu; Mehran Mozaffari Kermani, University of South Florida, Tampa, FL, 33620, USA, mehran2@usf.edu; Reza Azarderakhsh, Florida Atlantic University, Boca Raton, FL, 33431, USA, razarderakhsh@fau.edu.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

https://doi.org/10.1145/nnnnnnnnnnn

© 2023 Association for Computing Machinery.

0360-0300/2023/04-ART- \$15.00

# 1 INTRODUCTION

Lightweight cryptography (LWC) has become a necessity today as the world is extensively adopting the Internet of Things (IoT) and the Internet of Nano-Things. LWC is extensively used in resource-constrained devices such as radio frequency identification (RFID) tags, wireless sensor networks (WSNs), and embedded systems (implantable and wearable medical devices, smart fabrics, smart homes, and the like) to ensure their applications are secure. However, security does not imply reliability, and many lightweight cryptographic algorithms such as ASCON are vulnerable to side-channel analysis attacks (SCAs). Before any standardization efforts existed for lightweight cryptography, many research works were performed on various aspects of lightweight ciphers, including the works by the authors, e.g., on side-channel analysis (SCA) [1]-[7].

The National Institute of Standards and Technology (NIST) initiated a standardization process for lightweight cryptography and, after a multi-year effort, the competition ended in Feb. 2023 with ASCON [8] as the winner among the round-three candidates, the others being Elephant [9], GIFT-COFB [10], Grain128-AEAD [11], ISAP [12], Photon-Beetle [13], Romulus [14], Sparkle [15], TinyJambu [16], and Xoodyak [17]. This lightweight cryptographic standard will be used in deeply-embedded architectures to provide security. Previously, ASCON was also chosen as a finalist of the CAESAR competition for authenticated encryption.

ASCON is a lightweight cipher suite that provides authenticated encryption with associated data (AEAD) as well as hashing functionalities. It uses a duplex-based mode of operation [8]. The 320-bit permutation of ASCON iteratively applies a substitution-permutation network to encrypt/decrypt data in a bit-sliced fashion. This bit-sliced implementation of the ASCON permutation makes it scalable to 8-, 16-, 32-, and 64-bit platforms while remaining lightweight. ASCON has two variants with different message block lengths, ASCON-128 and ASCON-128a. ASCON-128 processes the message in 64-bit blocks while ASCON-128a processes it in 128-bit blocks. ASCON also has a post-quantum secure variant called ASCON-128pq, which is the same as ASCON-128 but uses a 160-bit key [8]. We note that post-quantum cryptography (PQC) refers to cryptography that withstands attacks enabled by the presence of powerful quantum computers. The algorithms for public-key cryptography were selected for standardization in 2022; symmetric-key cryptography, in contrast, has far fewer issues, and larger keys can be used against such threats. The current NIST PQC winners are CRYSTALS-KYBER [18], CRYSTALS-DILITHIUM [19], FALCON [20], and SPHINCS+ [21].

Over the years, various cryptanalyses, both differential and side-channel, have been performed on different ASCON variants. Madushan et al. [22] explore the various fault analyses of the NIST LWC standardization process finalists. Furthermore, Dobraunig et al. [23] perform an in-depth cryptanalysis of ASCON covering key-recovery attacks, forgery attacks, and algebraic attacks using zero-sum distinguishers. They leverage the low algebraic degree of ASCON to construct a zero-sum distinguisher, i.e., a set of input values and corresponding output values that each sum to zero over  $\mathbb{F}_2^n$ , for the 12-round ASCON permutation. This distinguisher is able to distinguish the ASCON permutation from a random permutation with a complexity of  $2^{130}$  by targeting the internal state after round 5. Recent cryptanalysis of ASCON has strived to improve the work of [23] as well as to propose new methodologies for determining new distinguishers for differential, cube, algebraic, and forgery-based key-recovery attacks. This study extends the work of [22] and summarizes the new differential cryptanalysis and SCA works performed on ASCON in hardware/software implementations.
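The zero-sum property mentioned above can be written compactly. In the sketch below, the symbols $p$ (the permutation under analysis) and $Z$ (the chosen input set) are introduced here for illustration only and are not notation taken from [23]:

$$\bigoplus_{x \in Z} x \;=\; 0 \qquad \text{and} \qquad \bigoplus_{x \in Z} p(x) \;=\; 0, \qquad Z \subseteq \mathbb{F}_2^{n},\; n = 320.$$

For a random permutation, finding such a set $Z$ is expected to be costly, so a set that can be constructed cheaply serves as a distinguisher.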

The SCA works reviewed in this survey also include statistical fault analysis (SFA), machine learning (ML), and differential power analysis-based key-recovery attacks performed on the hardware implementations of ASCON. In SFA, the attacker performs a statistical analysis of the injected fault on the output of the ASCON permutation to fully recover the secret key; ML strategies apply deep and reinforcement/unsupervised learning techniques to extract the secret key; and differential power analysis (DPA) exploits the vulnerabilities in the ASCON initialization operation to mount key-recovery attacks on the hardware implementations of ASCON. The SCA countermeasures, which are also reviewed in this work, include threshold implementation strategies, stronger S-box designs, error-detection mechanisms, as well as protected architectures against side-channel leakage. This paper also summarizes the various hardware implementations of ASCON that have improved the design for better area utilization, power consumption, throughput, energy, and efficiency on FPGA and ASIC hardware platforms. We also provide our insights and visions throughout this survey to suggest future directions in different domains. This survey is the first of its kind and a step forward towards scrutinizing the advantages and future directions of the NIST lightweight cryptography standard introduced in 2023.

Fig. 1. The associated encryption of ASCON [8].

The paper is organized as follows. Section 2 describes the design and architecture of ASCON. Section 3 explores the hardware implementations of ASCON presented in previous and current literature. Section 4 lists the existing differential cryptanalysis and SCAs performed on ASCON. Finally, we conclude the review in Section 5.

# 2 PRELIMINARIES

The entire design specification of ASCON is given in [8]. ASCON's encryption process (Fig. 1) is designed as a sponge-based MonkeyDuplex construction which consists of four stages, namely, the initialization operation, processing of the associated data, processing of the plaintext, and the finalization operation. The state is updated in these four stages using two 320-bit permutations, i.e.,  $p^a$  and  $p^b$ , where *a* and *b* are the numbers of rounds. Both of the permutations are bit-sliced into five



Fig. 2. The 5-bit S-box of ASCON [8, 53].

ACM Comput. Surv., Vol. -, No. -, Article -. Publication date: April 2023.

Table 1. LUT representation of the non-linear 5-bit S-box SB of ASCON-128 in hexadecimal form for input vector  $\mu$ 

| μ         | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | a  | b  | c  | d  | e  | f  |
|-----------|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
| $SB(\mu)$ | 4  | b  | 1f | 14 | 1a | 15 | 9  | 2  | 1b | 5  | 8  | 12 | 1d | 3  | 6  | 1c |
| μ         | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 1a | 1b | 1c | 1d | 1e | 1f |
| $SB(\mu)$ | 1e | 13 | 7  | e  | 0  | d  | 11 | 18 | 10 | c  | 1  | 19 | 16 | a  | f  | 17 |

64-bit register words which make up the 320-bit internal state. The permutations iteratively apply a substitution-permutation network (SPN)-based round transformation (12 rounds in the full permutation), which consists of adding a round constant, applying the substitution layer, and applying the linear layer for diffusion to the internal state.

The substitution layer consists of a non-linear 5-bit S-box (Fig. 2) whose hexadecimal form is shown in Table 1. This S-box is applied 64 times in parallel to update each bit-slice (column) of the internal state. The 5-bit S-box is designed using Boolean logic, which makes it highly compact and lightweight for implementations on both ASIC and FPGA hardware platforms. The linear layer of ASCON updates each 64-bit word of the internal state independently by first rotating the register word by two different shift values, and then performing a modulo-2 addition (XOR) of the word with its rotated copies.
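The two layers described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the lookup table is copied from Table 1, while the Boolean S-box sequence and the rotation offsets are not given in this survey and are taken from the ASCON specification [8] as we understand it. The helper names (`sbox_layer`, `linear_layer`, `column`) are hypothetical.

```python
MASK64 = (1 << 64) - 1

# Table 1: 5-bit S-box as a lookup table (index = input, value = output).
SBOX = [
    0x04, 0x0b, 0x1f, 0x14, 0x1a, 0x15, 0x09, 0x02,
    0x1b, 0x05, 0x08, 0x12, 0x1d, 0x03, 0x06, 0x1c,
    0x1e, 0x13, 0x07, 0x0e, 0x00, 0x0d, 0x11, 0x18,
    0x10, 0x0c, 0x01, 0x19, 0x16, 0x0a, 0x0f, 0x17,
]

# Rotation offsets per word x0..x4 (assumed from the ASCON specification [8]).
ROTATIONS = [(19, 28), (61, 39), (1, 6), (10, 17), (7, 41)]


def ror64(x, r):
    """Rotate a 64-bit word right by r bits."""
    return ((x >> r) | (x << (64 - r))) & MASK64


def sbox_layer(x):
    """Bit-sliced S-box: x is [x0, ..., x4]; bit j of each word forms column j,
    with x0 holding the most significant bit of the 5-bit S-box input."""
    x0, x1, x2, x3, x4 = x
    x0 ^= x4; x4 ^= x3; x2 ^= x1
    # t_i = (NOT x_i) AND x_{i+1}, computed before any word is updated.
    t0 = ~x0 & x1 & MASK64
    t1 = ~x1 & x2 & MASK64
    t2 = ~x2 & x3 & MASK64
    t3 = ~x3 & x4 & MASK64
    t4 = ~x4 & x0 & MASK64
    x0 ^= t1; x1 ^= t2; x2 ^= t3; x3 ^= t4; x4 ^= t0
    x1 ^= x0; x0 ^= x4; x3 ^= x2; x2 ^= MASK64  # last step complements x2
    return [x0, x1, x2, x3, x4]


def linear_layer(x):
    """XOR each word with two rotated copies of itself."""
    return [w ^ ror64(w, a) ^ ror64(w, b) for w, (a, b) in zip(x, ROTATIONS)]


def column(x, j):
    """Read the 5-bit value stored in column j of the state (x0 is the MSB)."""
    v = 0
    for w in x:
        v = (v << 1) | ((w >> j) & 1)
    return v


def check_against_table():
    """Place the value j in column j (j = 0..31) and compare with Table 1."""
    state = [sum(((j >> (4 - i)) & 1) << j for j in range(32)) for i in range(5)]
    out = sbox_layer(state)
    return all(column(out, j) == SBOX[j] for j in range(32))
```

Calling `check_against_table()` evaluates all 32 S-box inputs in a single pass over the five words, which is the property that makes the bit-sliced form attractive on 64-bit platforms and in hardware.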

# 3 HARDWARE IMPLEMENTATION OF ASCON

This section goes over the various hardware implementations of the ASCON family along with the optimizations that have been proposed in recent years. The overhead results in terms of area, power, and throughput are tabulated in Table 2 for the FPGA implementations and, together with energy utilization, in Table 3 for the ASIC implementations.

Various hardware designs of ASCON are implemented in [24] for applications such as RFID tags, WSNs, and embedded systems. The hardware implementations of ASCON proposed in [24] are ASCON-fast, ASCON-64-bit, and ASCON-x-low-area.

ASCON-fast [24] is a high-throughput design with minimal processing delay, which uses unrolled round transformations. At least one round transformation is performed each clock cycle without any pipelining. Unrolling allows multiple rounds to complete in a single clock cycle, and each ASCON-fast variant uses a different number of unrolled round transformations. The unrolled round transformation is connected to the data bus and key registers using a few additional multiplexers and XOR gates.

ASCON-64-bit [24] uses an arithmetic logic unit (ALU) where the control path executes similarly to sequential code. The design uses two temporary registers in addition to the five state registers which, along with the inputs from the control path, constitute the inputs to the ALU. The ALU takes the 64-bit data inputs and arranges them in either the high or low part of the selected operand using a barrel-shift unit, a data storage unit, and three logic operations. The result of the operation is selected at the output of the ALU and is then written to the destination register. The S-box and the linear layer are iteratively calculated using the ALU operations during the execution phase, so that one round operation takes 59 clock cycles. The design of the S-box is altered to use twenty-five 3-operand instructions and two temporary registers to decrease the area.

In the ASCON-x-low-area variant proposed in [24], the datapath is designed to use a radical low-area "one-bit operation per cycle" approach. The five state registers are clock-gated shift registers with independent shift-enable inputs. All the state registers are active during the S-box calculation and the data is shifted bit-slice-wise through a single S-box instance in 64 clock cycles. The linear diffusion layer updates each state register individually in five interleaved sub-iterations. A temporary shift register is used to store the results of the current linear layer in one iteration, which are then

| ASCON Architecture                     | FPGA Hardware Platform | Area (LUTs)  | Power (mW) | Throughput (Mbps) | Efficiency (Mbps/LUT) |
|----------------------------------------|---------------|--------------|-------|------------|------------|
| ASCON (original) [53]                  | Spartan 7     | 371 (slices) | 99    | 6646       | 17.914     |
| ASCON (original) [53]                  | Kintex 7      | 376 (slices) | 88    | 6709       | 17.843     |
| ASCON (unprotected) [58]               | Spartan 6     | 2048         | 11.5  | 255.4      | 0.1247     |
| ASCON [26]                             | Artix7        | 1808         | 26.8  | 39.0       | -          |
| ASCON-128 [27]                         | Artix7        | 1330         | 31    | 457        | 0.343      |
| RECO-HCON (128) [29]                   | Artix7        | 1548         | -     | 5926       | -          |
| RECO-HCON (128a) [29]                  | Artix7        | 1548         | -     | 9077       | -          |
| RECO-HCON (hash) [29]                  | Artix7        | 1548         | -     | 3160       | -          |
| RECO-HCON (hash-a) [29]                | Artix7        | 1548         | -     | 4534       | -          |
| ASCON (protected) [58]                 | Spartan 6     | 6364         | 37.5  | 134.6      | 0.0212     |
| ASCON-128 (Logic One-bit) [53]         | Spartan 7     | 373 (slices) | 99    | 6705       | 17.783     |
| ASCON-128 (Logic Interleaved-bit) [53] | Spartan 7     | 380 (slices) | 99    | 6687       | 17.444     |
| ASCON-128 (Logic CRC-3) [53]           | Spartan 7     | 407 (slices) | 99    | 6603       | 15.601     |
| ASCON-128 (LUT One-bit) [53]           | Spartan 7     | 372 (slices) | 100   | 6448       | 17.333     |
| ASCON-128 (LUT Interleaved-bit) [53]   | Spartan 7     | 377 (slices) | 100   | 6443       | 17.090     |
| ASCON-128 (LUT CRC-3) [53]             | Spartan 7     | 425 (slices) | 100   | 6431       | 15.131     |
| ASCON-128 (Logic One-bit) [53]         | Kintex 7      | 381 (slices) | 89    | 6705       | 17.598     |
| ASCON-128 (Logic Interleaved-bit) [53] | Kintex 7      | 384 (slices) | 89    | 6687       | 17.414     |
| ASCON-128 (Logic CRC-3) [53]           | Kintex 7      | 385 (slices) | 89    | 6603       | 17.150     |
| ASCON-128 (LUT One-bit) [53]           | Kintex 7      | 363 (slices) | 89    | 6776       | 18.667     |
| ASCON-128 (LUT Interleaved-bit) [53]   | Kintex 7      | 372 (slices) | 89    | 6409       | 17.228     |
| ASCON-128 (LUT CRC-3) [53]             | Kintex 7      | 384 (slices) | 89    | 6383       | 16.224     |
| ASCON-128 (unrolled) [30]              | Virtex4       | 26943        | -     | 817.41     | 0.031      |
| ASCON-128 (recursive) [30]             | Virtex4       | 4021         | -     | 506.29     | 0.125      |
| ASCON-128 (unrolled) [30]              | Virtex7       | 22636        | -     | 1342.31    | 0.059      |
| ASCON-128 (recursive) [30]             | Virtex7       | 2708         | -     | 721.53     | 0.266      |
| ASCON-128 (unrolled) [30]              | Spartan6      | 22636        | -     | 688.83     | 0.031      |
| ASCON-128 (recursive) [30]             | Spartan6      | 2781         | -     | 346.50     | 0.124      |
| ASCON-128a (unrolled) [30]             | Virtex4       | 30006        | -     | 1496.25    | 0.049      |
| ASCON-128a (recursive) [30]            | Virtex4       | 4215         | -     | 970.25     | 0.231      |
| ASCON-128a (unrolled) [30]             | Virtex7       | 25187        | -     | 2419.88    | 0.096      |
| ASCON-128a (recursive) [30]            | Virtex7       | 2916         | -     | 1357.08    | 0.465      |
| ASCON-128a (unrolled) [30]             | Spartan6      | 25187        | -     | 1247.22    | 0.049      |
| ASCON-128a (recursive) [30]            | Spartan6      | 2918         | -     | 638.44     | 0.218      |
| Fault-injected ASCON [49]              | SASEBO-GII    | 217          | 7.8   | 198        | -          |
| Key-bypass HT on ASCON [33]            | SoC Cyclone V | 827          | 22.4  | -          | -          |
| Round-reduction HT on ASCON [33]       | SoC Cyclone V | 771          | 22.3  | -          | -          |

Table 2. Overhead results of different hardware implementations of ASCON on FPGA hardware platforms

written back in the next iteration. This low-area design uses 512 clock cycles per round transformation. All the overhead results for the aforementioned implementations are tabulated in Table 3.

Diehl et al. [25] compare the protected and unprotected implementations of ASCON against first-order differential power analysis (DPA) using test vector leakage assessment (TVLA), implemented using the Flexible Opensource workBench fOr Side-channel analysis (FOBOS). The overhead results of the protected ASCON implementations are presented in Table 2.

FOBOS 2, an upgraded and optimized version of FOBOS, is proposed in [26] and used to evaluate power measurements and SCA resistance for the hardware implementations of various lightweight ciphers with AEAD functionality on the Xilinx Artix-7 FPGA board. The results of power consumption, frequency, throughput, and energy/bit obtained using FOBOS 2 are tabulated in Table 2. The results show that ASCON had the lowest power consumption (33.5 mW at 50 MHz) and the slowest growth of dynamic power with increasing frequency when compared with the NIST standardization process candidates SpoC, Spook, and GIFT-COFB. The energy per bit of ASCON was 0.86 nJ/bit while the static power consumption was around 27 mW.
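The energy-per-bit figure quoted above follows from dividing power by throughput. The short sketch below reproduces it under the assumption (ours, not stated explicitly in the text) that the 39.0 Mbps throughput listed for [26] in Table 2 corresponds to the same 50 MHz operating point as the 33.5 mW power figure. The function name is hypothetical.

```python
def energy_per_bit_nj(power_mw, throughput_mbps):
    """Energy per processed bit in nJ/bit.

    mW / Mbps = (1e-3 J/s) / (1e6 bit/s) = 1e-9 J/bit, i.e. nJ/bit,
    so the ratio of the two tabulated numbers is already in nJ/bit.
    """
    return power_mw / throughput_mbps


# Assumed operating point: 33.5 mW (text) and 39.0 Mbps (Table 2, [26]).
ascon_energy = energy_per_bit_nj(33.5, 39.0)
print(f"{ascon_energy:.2f} nJ/bit")
```

The result rounds to the 0.86 nJ/bit reported in the text, which suggests that the quoted energy figure is based on the total power at 50 MHz rather than on the 26.8 mW value listed in Table 2.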

An FPGA-based application of ASCON cipher in portable Internet of Medical Things (IoMT) devices is presented in [27], where the cipher is used to enhance the security of such devices using

| ASCON Architecture                         | Area (kGE)                 | Power (mW)   | Throughput (Mbps) | Energy              |
|--------------------------------------------|----------------------------|--------------|-------------------|---------------------|
| ASCON-fast (6 rounds w/ interface) [24]    | 25.80                      | 0.184        | 13,218            | 23 µJ/byte          |
| ASCON-64-bit (w/ interface) [24]           | 5.86                       | 0.032        | 72                | 1,397 µJ/byte       |
| ASCON-x-low-area (w/ interface) [24]       | 3.75                       | 0.015        | 14                | 5,706 µJ/byte       |
| ASCON-fast-TI (6 rounds w/ interface) [24] | 125.19                     | 0.830        | 9,028             | 104 µJ/byte         |
| ASCON-x-low-TI (w/ interface) [24]         | 9.19                       | 0.045        | 15                | 17,234 µJ/byte      |
| RECO-HCON (128) [29]                       | 25.1                       | 1.990        | 5926              | 0.335 pJ/bit        |
| RECO-HCON (128a) [29]                      | 25.1                       | 1.990        | 9077              | 0.219 pJ/bit        |
| RECO-HCON (hash) [29]                      | 25.1                       | 1.990        | 3160              | 0.637 pJ/bit        |
| RECO-HCON (hash-a) [29]                    | 25.1                       | 1.990        | 4534              | 0.439 pJ/bit        |
| ASCON CMOS [31]                            | 10.1529                    | 0.7459       | -                 | 478.9 pJ (64-bit)   |
| ASCON CMOS [31]                            | 10.1529                    | 0.7459       | -                 | 736.2 pJ (256-bit)  |
| ASCON CMOS [31]                            | 10.1529                    | 0.7459       | -                 | 1079.4 pJ (512-bit) |
| ASCON CMOS/STT-MRAM [31]                   | 10.7155                    | 0.7148       | -                 | 427.9 pJ (64-bit)   |
| ASCON CMOS/STT-MRAM [31]                   | 10.7155                    | 0.7148       | -                 | 556.5 pJ (256-bit)  |
| ASCON CMOS/STT-MRAM [31]                   | 10.7155                    | 0.7148       | -                 | 728.1 pJ (512-bit)  |
| Key-bypass HT on ASCON [33]                | 4024 (combinational cells) | 22.4         | -                 | -                   |
| Round-reduction HT on ASCON [33]           | 3971 (combinational cells) | 22.3         | -                 | -                   |

Table 3. Overhead results of different hardware implementations of ASCON on ASIC hardware platform

AEAD functionality. A round-based architecture of ASCON is designed to compute a round per clock cycle. The proposed design utilizes the dual-output LUT (LUT6) feature of the Xilinx 7-series FPGA boards to implement the 5-bit S-box of ASCON with optimized area. Implementing the 5-bit S-box of ASCON using LUT6 required only three LUTs in the FPGA implementation, significantly reducing the area when compared to other hardware implementations of ASCON. The hardware implementation is performed on the Xilinx Artix-7 FPGA family and the area (in terms of LUTs), throughput, frequency, and efficiency (throughput/area) results are shown in Table 2. The proposed implementation of the ASCON cipher consumed 35% less area and achieved 56% higher efficiency when compared to the architecture of ASCON in [28].
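The efficiency metric (throughput/area) used in Table 2 can be recomputed directly from the tabulated throughput and area. The sketch below does so for three rows whose area is reported in LUTs; the row labels and the helper name are ours, and the numbers are copied from Table 2.

```python
def efficiency_mbps_per_lut(throughput_mbps, area_luts):
    """Throughput-to-area ratio as used in the Efficiency column of Table 2."""
    return throughput_mbps / area_luts


# (label, throughput in Mbps, area in LUTs, efficiency reported in Table 2)
rows = [
    ("ASCON-128 [27]", 457.0, 1330, 0.343),
    ("ASCON (unprotected) [58]", 255.4, 2048, 0.1247),
    ("ASCON (protected) [58]", 134.6, 6364, 0.0212),
]

for label, throughput, area, reported in rows:
    computed = efficiency_mbps_per_lut(throughput, area)
    print(f"{label}: computed {computed:.4f}, reported {reported}")
```

Recomputing the ratio in this way is a quick consistency check when comparing designs across papers, since some works report area in slices rather than LUTs and the two are not directly comparable.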

In [29], a flexible, reconfigurable, and energy-efficient crypto-processor to run ASCON is introduced by Wei et al. The proposed ASCON crypto-processor runs in six different modes: encryption, decryption, and hash function with different data sizes. The crypto-processor consists of an ASCON core, shift registers (FIFO), and an I/O interface. First, the data and text inputs are loaded into 128-bit shift registers acting as FIFOs, while the key, the nonce, and the target instance mode are processed on the *Start* signal. The ASCON core stays occupied until the ciphertext reaches the same size as the input message in AEAD mode, the output reaches 256 bits in hashing mode, or the tag verification result is produced. The four shift registers provide flexibility in adapting the ASCON processor to various IoT systems with variable block sizes, as they are used to divide and pad the inputs to match the block size of the default variant ASCON-128a. The inputs for the other variants, ASCON-128 and ASCON-hash, are processed either by splitting the 128-bit inputs into two 64-bit inputs or by adopting a different counting technique to fully utilize the space, respectively.

The ASCON-core [29] consists of two stages: Selective XOR and parameterized permutation. It runs iteratively to keep the permutation block (Section 2) busy once started. For every fixed number of rounds, the permutation block reads a new message. Between two iterations of the ASCON permutation, the input for the next iteration is computed by XORing the current sponge state with either another message, key, or a constant. The challenge of supporting multiple instances with an arbitrary round number, variable XOR operand, and a block size is overcome by splitting the current sponge state into two parts: A head comprising 128 most significant bits (MSB), and a tail comprising 192 least significant bits (LSB). The area, frequency, power, throughput, and efficiency overheads of the proposed architecture on FPGA and ASIC hardware platforms are shown in Table 2 and Table 3, respectively.
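The head/tail partition of the sponge state described above can be illustrated as follows. This is only a sketch of the bookkeeping, assuming the 320-bit state is held as a single integer with the most significant bit first; the function names and the `into_head` flag are hypothetical and do not reflect the actual interface of the core in [29].

```python
STATE_BITS = 320
HEAD_BITS = 128                      # 128 most significant bits
TAIL_BITS = STATE_BITS - HEAD_BITS   # 192 least significant bits


def split_state(state):
    """Split a 320-bit state into (head, tail)."""
    head = state >> TAIL_BITS
    tail = state & ((1 << TAIL_BITS) - 1)
    return head, tail


def join_state(head, tail):
    """Inverse of split_state."""
    return (head << TAIL_BITS) | tail


def selective_xor(state, operand, into_head=True):
    """XOR an operand (message block, key, or constant) into one part only."""
    head, tail = split_state(state)
    if into_head:
        head ^= operand & ((1 << HEAD_BITS) - 1)
    else:
        tail ^= operand & ((1 << TAIL_BITS) - 1)
    return join_state(head, tail)


# Example: absorbing a 128-bit block touches only the head.
state = join_state(0xAB, 0xCD)
updated = selective_xor(state, 0x0F, into_head=True)
```

A 128-bit head matches the block size of the default variant ASCON-128a mentioned earlier, which is presumably why the split is placed at that boundary.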

Khan et al. [30] explore the hardware performance of ASCON for artificial intelligence (AI)-enabled IoT devices. Unrolled and recursive strategies have been adopted for ASCON implementations on the Virtex-4, Virtex-7, and Spartan-6 FPGA families. The unrolled scheme is designed to achieve high throughput while the recursive scheme helps in reducing hardware costs. For the unrolled architecture, the encryption/decryption is performed using combinational circuits and the ASCON permutation is deployed in an unrolled manner for the initialization, run, and finalization phases. This results in higher throughput at the cost of high area overhead. Moreover, since the permutation function utilizes the same hardware for every stage, a recursive strategy is implemented by the authors to reduce area while retaining a high throughput. To successfully implement hardware reuse in the ASCON permutation functions, the authors propose computing two permutation rounds per clock cycle, thus requiring a total of 24 clock cycles for encryption/decryption using ASCON-128 (as opposed to 36 clock cycles for 36 permutation rounds) and 26 clock cycles for the ASCON-128a variant (as opposed to 40 clock cycles for 40 permutation rounds). The area overhead of the additional XOR operations required is negligible. The overhead results of both unrolled and recursive implementations are tabulated in Table 2.

A CMOS/STT-MRAM-based hardware implementation of ASCON benchmarked on an ASIC hardware platform is proposed by Roussel et al. [31]. The implementation is made resilient to power failure by replacing volatile CMOS flip-flops with non-volatile flip-flops to save the intermediate state of ASCON computations, which can then be retrieved on startup. This hybrid CMOS/STT-MRAM implementation reduces energy utilization by between 11% and 48%, while incurring an area overhead of about 5.5% when compared to the CMOS-only implementation of ASCON. These results are presented in Table 3.

# 4 DIFFERENTIAL AND SIDE-CHANNEL CRYPTANALYSIS OF ASCON

Various structural and mathematical vulnerabilities of cryptographic ciphers with authenticated encryption such as ASCON can be exploited to gather information from the permutation during the encryption/decryption processes. This section summarizes the works presenting cryptanalysis of ASCON in terms of algebraic attacks, cube/cube-like attacks, differential attacks, fault attacks, and power attacks. Most of these attacks target reduced-round versions of ASCON to recover the secret key successfully.

## 4.1 Algebraic Attacks

In [32], Luo et al. successfully recover the entire 128-bit key of a software implementation of ASCON by attacking its permutation function with a soft analytical side-channel attack (SASCA) using a factor graph method. The factor graph of the inner permutation state is built using a template matching technique on the side-channel information leakage. Then, this factor graph is run through their proposed Belief Propagation (BP) algorithm to recover the secret key. From the simulations run on an 8-bit platform, it is observed that the proposed attack can recover the entire key using a simple Hamming-weight leakage model on only a few traces of leaked information while having low delay and memory overheads. The authors suggest that the attack could be improved by utilizing multivariate value templates or machine learning algorithms.
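To make the Hamming-weight leakage model concrete, the following toy sketch shows how noise-free Hamming-weight observations of an 8-bit intermediate narrow down an unknown key byte. It is deliberately much simpler than the attack in [32]: it uses exact elimination instead of templates, a factor graph, or belief propagation, and the intermediate `k ^ c` as well as all names and example values are our own illustrative assumptions.

```python
def hamming_weight(v):
    """Number of set bits in a non-negative integer."""
    return bin(v).count("1")


def candidates(observed_hw, bits=8):
    """All values of the given width whose Hamming weight matches the leak."""
    return [v for v in range(1 << bits) if hamming_weight(v) == observed_hw]


def consistent_keys(known_inputs, observed_hws, bits=8):
    """Key bytes k for which HW(k XOR c) matches every observation.

    Each known input c paired with a leaked Hamming weight acts as one
    constraint; intersecting the constraints shrinks the candidate set.
    """
    return [
        k for k in range(1 << bits)
        if all(hamming_weight(k ^ c) == hw
               for c, hw in zip(known_inputs, observed_hws))
    ]


# Simulated victim: hypothetical secret byte and a few known inputs.
secret = 0x3C
inputs = [0x00, 0x0F, 0x55, 0xA3]
leaks = [hamming_weight(secret ^ c) for c in inputs]
remaining = consistent_keys(inputs, leaks)
```

A single observation of weight 4 still leaves 70 candidates for a byte, so several observations (or, as in [32], probabilistic combination over many intermediates) are needed before the key is determined.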

## 4.2 Cube Attacks

Cube attacks aim to recover the secret key of a block or stream cipher, one bit at a time, by manipulating a set of cube variables [33]. The manipulated cube variables are used to generate encrypted messages which yield a system of linear equations that can be solved to obtain the key bits. Any remaining key bits that are not obtained as a solution of the linear equations can be obtained by brute force [33]. Several cube attack strategies and implementations have been applied to the reduced-round ASCON variants. The most successful attack has broken 7 out of 12 rounds of ASCON [37]. This section briefly compiles the cube attacks performed on software/hardware implementations of ASCON.
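The core step of a cube attack, summing the cipher output over all assignments of the cube variables to obtain a simple equation in the key bits, can be illustrated on a toy Boolean function. The function `f` below is a made-up example and has nothing to do with the ASCON round function; the names and the choice of cube are hypothetical.

```python
from itertools import product


def f(v, k):
    """Toy keyed Boolean function of public bits v0..v2 and key bits k0, k1."""
    v0, v1, v2 = v
    k0, k1 = k
    return (v0 & v1 & k0) ^ (v0 & v1 & v2) ^ (v1 & k1) ^ (v0 & k0 & k1) ^ k0 ^ v2


def cube_sum(k, cube_indices=(0, 1), fixed=(0, 0, 0)):
    """XOR of f over all assignments of the cube variables.

    Public bits outside the cube keep the values given in `fixed`.
    Monomials that do not contain every cube variable appear an even
    number of times and cancel, leaving only the superpoly.
    """
    total = 0
    for assignment in product((0, 1), repeat=len(cube_indices)):
        v = list(fixed)
        for idx, bit in zip(cube_indices, assignment):
            v[idx] = bit
        total ^= f(tuple(v), k)
    return total


# For the cube {v0, v1} with v2 = 0 the superpoly is k0, so four chosen
# queries reveal one key bit regardless of k1.
recovered = {k: cube_sum(k) for k in product((0, 1), repeat=2)}
```

In a real attack the superpoly is not known in closed form and must be found (or tested for linearity) in a preprocessing phase; the number of rounds that can be attacked is limited by how quickly the algebraic degree grows.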

Halak and Duarte-Sanchez [33] perform an attack on a reduced-round ASCON by mounting a cube attack on the initialization function of the ASCON permutation in a hardware trojan (HT)-compromised setting. To inject the HT in the hardware, they leverage vulnerabilities in the SoC FPGA hardware implementation of the ASCON-128/128a variants, such as unused or partially used states in the FSM operations, unused values in the initial inputs, and the possibility of manipulating the round number in the iterative implementation. The authors propose two HT designs to obtain the secret key of reduced-round ASCON, namely, the key-bypass trojan and the round-reduction trojan [33]. The key-bypass trojan inserts a malicious state in the FSM which manipulates the control signals to bypass the key directly to the output of the cipher. The round-reduction trojan makes the cipher vulnerable to a cube attack by decreasing the number of permutation rounds from 12 to 5 in the initialization phase of ASCON encryption, decreasing the time complexity to recover the secret key from 2<sup>103.9</sup> to 2<sup>24</sup>. The overhead results of the HT-compromised ASCON implementations are shown in Table 2 (FPGA) and Table 3 (ASIC). The area overhead incurred is about 7% when compared to the original design results [33]. Since the overheads of the injected trojans are low, they can go undetected in the hardware implementation. The authors propose countermeasures such as pre-silicon circuit verification techniques (UCI [34], VeriTrust [35], or FANCI [36]) to detect unused inputs, post-silicon functional/structural verification strategies, or runtime trojan detection techniques. Hardware modifications such as pipelined/unrolled implementations, better key management, and one-hot encoding of the FSM are also proposed to strengthen the ASCON architecture against HTs.

In [37], Li et al. mount a practical conditional cube attack on 5/6-round reduced ASCON. A cube-like key-subset technique [37] is utilized as a key-dividing strategy for specific key conditions to cover the entire key space. The practicality of such an attack is tested in a software environment. However, the full 12-round ASCON implementation is resilient to this attack. The success of this attack lies in the construction of 65-dimensional cubes enabled by the key-dividing strategy, which divides the key space into 63 key subsets. To determine the subset containing the correct key, the *cube sum* of each cube is analyzed. This process is continued until each of the 63 key subsets either passes or fails the cube tests, allowing the recovery of the secret key in a 7-round ASCON with a time complexity of  $2^{103.9}$ . The time complexity of the proposed attack is further reduced to  $2^{77}$  if the key is weak.

Similar to the work of [37], a practical conditional cube attack on ASCON in a nonce-misuse setting is investigated by Baudrin et al. [38]. The attack model aims to recover the capacity, i.e., the unknown inner part, of the 6-round ASCON state right before the encryption operation by reusing the key-nonce pair multiple times to recover the full state and gather information about the plaintext from the corresponding ciphertexts. In [38], a new strategy to search for conditional cubes in ASCON is presented, where the public variables are split based on the coefficients of quadratic monomials after two rounds. This allows the recovery of 64 to 128 bits of the internal state, which in turn allows the remaining 256 bits of the state to be recovered using brute force with a time complexity of up to $2^{40}$. However, this attack does not break the security of the original nonce-respecting ASCON design [8] and only aims to provide insight into the potential vulnerabilities of the cipher.

The authors of [39] also explore conditional cube attacks on the ASCON variants ASCON-128 and ASCON-80pq in a nonce-misuse scenario. The authors implement a conditional cube attack using a proposed partial-state-recovery method to recover 192 bits of the 320-bit ASCON permutation state with a data complexity of $2^{44.8}$ by misusing the nonce. The remaining permutation bits are recovered using brute force in $2^{128}$ time. After recovering all the permutation bits, the secret key is also recovered with a time complexity of $2^{129.5}$ using $2^{31.5}$ bits of data and $2^{31.5}$ bits of memory.

A machine learning (ML)-based known-plaintext attack utilizing deep learning (DL) is proposed for the encryption operation of ASCON [40]. The attack predicts the plaintext with 99.8% accuracy in a nonce-misuse setting with no finalization function. The proposed attack does not work in a real-world application of ASCON where nonce use is respected and a satisfactory amount of randomness is introduced in the cipher functions, so that an ML-based attack cannot match the inputs and outputs meaningfully.

In [41], Rohit et al. perform key-recovery attacks on a software implementation of 7-round ASCON using a superpoly-recovery technique called partial polynomial multiplication. Using this technique, the entire 128-bit key is recovered from $2^{64}$ bits of data with a time complexity of $2^{123}$, while utilizing only $2^{101}$ bits of memory. Using division properties [42], new cube distinguishers are also identified for 7-round ASCON, and the cube distinguishers for other reduced-round ASCON implementations are improved, to recover the secret key in a nonce/key-respecting scenario.

#### 4.3 Differential Cryptanalysis

In [43], Zong et al. mount two differential-based collision attack strategies on the sponge-based hash variants of ASCON, namely, ASCON-Hash and ASCON-Xof. The first attack is a non-practical collision attack on 2-round ASCON-Hash with a time complexity of $2^{125}$. A practical collision attack is also performed on 2-round ASCON-Xof with a time complexity of $2^{15}$ for a 64-bit output. The differential characteristics of the hash variants are found using the MILP method and the target differential algorithm [44].

Gerault et al. [45] utilize Constraint Programming (CP) to perform differential cryptanalysis, in software, on the permutation of the ASCON variants ASCON-128 and ASCON-128a. The capabilities of CP in finding good differentials are used to generate differential characteristics for ASCON to form limited-birthday distinguishers (for rounds 4-7) and rectangle attacks (for rounds 4 and 5). The distinguishers are divided into black-box and non-black-box types based on their usability for attacks on permutations with or without a key, respectively. High-probability differentials are introduced as well to improve collision attacks on ASCON-Hash (proposed in [43]) with a time complexity of $2^{103}$. Additionally, multiple differential characteristics for rounds 3 and 4 have been used for forgery attacks, in a nonce-respecting setting, on the permutation and finalization functions of reduced-round ASCON-128 and ASCON-128a. The time complexity of such an attack for three rounds of the finalization operation is determined to be $2^{32.76}$. For 4-round ASCON-128, the time complexity of such an attack in the finalization phase is $2^{96.61}$. However, in this scenario, the attack exceeds the recommended amount of data blocks processed for a single key. For ASCON-128a, the forgery attack is performed on both the iterative permutation and the finalization functions. For the iterative permutation, the time complexity of the proposed attack is $2^{117}$; however, it also exceeds the limit on processed data blocks for a given key established by the ASCON designers [8].

Using the undisturbed bits in the ASCON S-box, Tezcan [46] performs a truncated, impossible, and improbable differential analysis on reduced 4/5-round ASCON. It is observed that there are 35 undisturbed bits in the execution of the S-box which are used to generate the aforementioned differentials. The impossible differential analysis is based on the idea that for a specific difference, the differential cannot occur, i.e., its probability is zero. This helps in finding the correct key by removing the incorrect keys determined using impossible differentials which have a probability of 1. Truncated differentials with a probability of 1 are coupled with the cipher symmetry and can help determine whether the two key bits associated with the active S-boxes are 1. Thus, for a truncated attack mounted on 4-round ASCON, 16 key bits would be 1 and the remaining 48 key bits can be recovered by brute force using $3^{48}$ encryptions and $2^2$ bits of data. A truncated attack on 5-round ASCON can recover 70 key bits and the remaining key bits are recovered in either $2^{58}$ encryptions (for weak keys) or $2^{128} - 2^{64}$ encryptions with $2^{109}$ bits of data. The improbable differential attack on 5-round ASCON is similar to the aforementioned truncated differential attack and again uses $2^{109}$ data bits to distinguish the ASCON permutation from a random permutation by complementing the output differences. The miss-in-the-middle technique is combined with the truncated differential in the decryption operation to mount an impossible differential attack on 5-round ASCON by using only $2^{256}$ data bits.
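An undisturbed bit is an output-difference bit of the S-box that takes a constant value for a given nonzero input difference, i.e., it is known with probability 1. The following is a minimal sketch of how such bits can be enumerated exhaustively for the 5-bit ASCON S-box; it is an illustration and not the procedure of [46]. The function names are ours, and the S-box is computed from its bit-sliced description with $x_0$ as the most significant bit. The total reported by such a search depends on the counting convention (for instance, whether the inverse S-box is included), so no count is asserted here.

```python
def ascon_sbox(x):
    """5-bit ASCON S-box from its bit-sliced description (b[0] = x0 is the MSB)."""
    b = [(x >> (4 - i)) & 1 for i in range(5)]
    b[0] ^= b[4]; b[4] ^= b[3]; b[2] ^= b[1]
    # t[i] = (NOT x_i) AND x_{i+1}, computed from the values before the update below
    t = [(b[i] ^ 1) & b[(i + 1) % 5] for i in range(5)]
    for i in range(5):
        b[i] ^= t[(i + 1) % 5]
    b[1] ^= b[0]; b[0] ^= b[4]; b[3] ^= b[2]; b[2] ^= 1
    return sum(bit << (4 - i) for i, bit in enumerate(b))

def undisturbed_bits(sbox, n=5):
    """Map each nonzero input difference to {output bit index: constant difference value}.

    Bit index 0 is the least significant output bit.
    """
    found = {}
    for din in range(1, 1 << n):
        diffs = [sbox(x) ^ sbox(x ^ din) for x in range(1 << n)]
        for bit in range(n):
            values = {(d >> bit) & 1 for d in diffs}
            if len(values) == 1:  # the difference bit never changes: undisturbed
                found.setdefault(din, {})[bit] = values.pop()
    return found

if __name__ == "__main__":
    for din, bits in sorted(undisturbed_bits(ascon_sbox).items()):
        print(f"input difference {din:05b}: {bits}")
```

For example, a difference only in $x_2$ (input difference `00100`) always flips output bit $y_2$, because $y_2 = x_4 x_3 \oplus x_4 \oplus x_2 \oplus x_1 \oplus 1$ is linear in $x_2$.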

In [47], Hu et al. perform probabilistic and deterministic high-order differential/differential-linear (HD/HDL) attacks on a software implementation of ASCON. The probabilistic HD/HDL attack is performed using a higher-order algebraic transitional form function (HATF) technique which is helpful for the cryptanalysis of quadratic round functions. Using HATF, various highly biased 2nd-order HDL approximations are discovered for the initialization function of reduced 5-round ASCON up to eighth order. Thus, using HATF, the key-recovery attack on 5-round ASCON can be performed with a complexity of $2^{22}$, and the distinguishing attack on reduced-round ASCON can be performed with a complexity of $2^{12}$. A conditional 3rd-order HDL approximation is also proposed for the initialization function of 6-round ASCON. High-bias bits are observed in the S-box in the 6th round of ASCON using the HATF with 24 conditions, with a theoretical bias value of $2^{-22}$ [47]. This value is computed using the bias observed in the bits of the round-5 S-box ($2^{-14}$), the piling-up lemma, and 8 HDL approximations applied to $2^{30}$ test samples. The deterministic HD attack is also performed using the differential support function (DSF) technique, which helps in finding HD distinguishers by performing efficient linearizations on permutation inputs. Thus, the DSF technique improves the complexity of a distinguishing attack on the permutation of reduced 8-round ASCON from $2^{130}$ to $2^{46}$. Using a similar DSF method, the zero-sum distinguishers for the 12-round ASCON permutation are calculated with time and data complexities of $2^{55}$ each [47].
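For reference, the piling-up lemma in its standard form is restated here. This is a general statement only; how [47] combines it with the 8 HDL approximations is not reproduced. For independent binary random variables $X_1, \dots, X_n$ with $\Pr[X_i = 0] = \tfrac{1}{2} + \varepsilon_i$,

$$\Pr[X_1 \oplus \cdots \oplus X_n = 0] = \frac{1}{2} + 2^{n-1} \prod_{i=1}^{n} \varepsilon_i .$$

As a small worked instance with illustrative values, two approximations with biases $\varepsilon_1 = 2^{-3}$ and $\varepsilon_2 = 2^{-4}$ combine to a bias of $2^{1} \cdot 2^{-3} \cdot 2^{-4} = 2^{-6}$.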

#### 4.4 Fault Analysis

Ramezanpour et al. [48] successfully apply a statistical ineffective fault analysis (SIFA) using double-fault injection and key-dividing techniques on the S-box of ASCON in a software implementation to recover the secret key. The authors inject faults in any selected pair of S-boxes for every encryption performed in the last round of the finalization stage of ASCON. The faults are injected using clock and/or voltage glitches in a manner where they do not affect the result of the S-box. Then, the correct tag values resulting from the induced ineffective faults are analyzed to gather information about the secret key. The underlying probability distribution is assumed to be biased in the proposed attack, and the attack is successful as long as there is sufficient data available for analysis (12.5 to 2500 correct tag values in the proposed study). Thus, the SIFA-based fault model requirements are lower than those of other differential fault analysis techniques, and the attack is also noise tolerant. Fault attack countermeasures such as error-detection and error-randomization techniques fail in the presence of SIFA, as they rely on an incorrect output value in the cipher operations under fault induction. The best countermeasures against SIFA are those where the fault injection mechanisms can be detected or where the fault distribution is independent of secret data. Sensor-based techniques can detect fault injection mechanisms; however, they are limited in the fault mechanisms they can detect. The FPGA and ASIC hardware implementations are proposed as future work.

Surya et al. [49] implement a synchronous clock glitching strategy to induce delay faults in selected parts of the ASCON architecture. A SASEBO-GII FPGA board is used for the hardware implementation and a Digital Clock Manager (DCM) is used to generate synchronous clock signals. In every encryption round, the faults are injected into the ASCON S-box via a high-frequency faulty clock signal, resulting in faulty output in the ASCON linear layer. Surya et al. also try to implement an asynchronous clock glitching strategy; however, it incurs a higher utilization overhead when compared to the synchronous method [49]. The fault injection is performed by feeding the faulty clock signal only to a few parts of the design to better observe the error distribution and propagation, and to emulate EM injections easily. Possible countermeasures to this kind of attack include a threshold implementation (TI) scheme and a unified masking approach to protect the ASCON architecture. Hardware implementation overheads are given in Table 2.

Joshi and Mazumdar [50] perform a subset fault analysis (SSFA) on a software implementation of ASCON-128 by attacking the vulnerabilities in its S-box. The strategy tries to find correlations between the input and output bits of the ASCON S-box by determining which output bits become 0 for input bits set to 0. They also propose a key division strategy to decrease the search space for key recovery to $2^{64}$ for the worst case. Key masking before the key whitening operation, error detection via partial decryption, or using a new and strengthened S-box design resilient to 1-bit SSFA are proposed as the countermeasures against SSFA.

In [50], a fault attack called the preliminary attack, which focuses on vulnerabilities in the key whitening function and the tag creation function of the finalization stage, is also proposed. Both attacks are shown to retrieve the entire secret key using a key analysis methodology. For the preliminary attack strategy, the involution property of the XOR function is leveraged to recover the key value from the generated tag. The attack can be mounted on ASCON in three ways: 1) injecting faults into three selected S-boxes, which requires 374 fault injections to recover the full secret key; 2) injecting faults into a single selected S-box followed by an instruction skip error (induced by another fault injection), which requires 256 fault injections to recover the entire secret key; and 3) resetting a word register to 0 at the output of the substitution layer via 128 (for 1-bit faults), 16 (for 1-byte faults), or 2 (for 64-bit word faults) fault injections to recover the entire key.
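The role of the XOR involution can be sketched as follows. This is a simplified illustration under our own assumptions, not the procedure of [50]: the tag is modeled only as the last 128 bits of the state XORed with the key, the function names are hypothetical, and the fault is idealized as forcing those state bits to a value known to the attacker (here, zero).

```python
MASK128 = (1 << 128) - 1

def make_tag(state_low128: int, key: int) -> int:
    """Model of the tag creation step: T = (last 128 bits of the state) XOR K."""
    return (state_low128 ^ key) & MASK128

def recover_key(tag: int, known_state_low128: int) -> int:
    """XOR is an involution, so (S ^ K) ^ S == K whenever S is known."""
    return (tag ^ known_state_low128) & MASK128

if __name__ == "__main__":
    key = 0x000102030405060708090A0B0C0D0E0F  # example key, for illustration only
    faulted_state = 0                          # idealized fault: state words reset to 0
    tag = make_tag(faulted_state, key)
    print(hex(recover_key(tag, faulted_state)))
```

With the state reset to zero the tag equals the key directly, which is consistent with the small number of word-level fault injections reported for the third variant.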

Ramezanpour et al. [51] propose another statistical fault analysis attack called the fault intensity map analysis (FIMA) for a software implementation of ASCON. This attack can retrieve the entire 128-bit key of ASCON. The attack is designed to use different features such as faulty ciphertexts, SIFA-induced correct ciphertexts, and data-dependent bias to recover the secret key. It is also resilient to several countermeasures, such as error-detection techniques for DFA, where it can gather secret information from the increased sample size. Even with infective countermeasures, where a fault is injected in a wrong random round of ASCON, FIMA can recover the secret key with 453 data samples. Thus, compared to other fault analysis methods, FIMA is 6 times more powerful.

Ambili and Jose [52] propose an upgraded design of ASCON-128a using the pseudo-randomness of Cellular Automata (CA) to make the cipher resilient against SIFA and SSFA, verified mathematically in a software implementation. CA is a technique where a particular cell updates its value every iteration depending upon its state and a set of predefined rules. In [52], a null boundary CA (where the farthest cell neighbor is set to 0) is used to protect the architecture of ASCON by implementing it in the pseudorandom function of the ASCON permutation. The security against SIFA and SSFA stems from the randomness induced in the linear layer, because of which the XOR/linear equations derived for erroneous bits cannot be solved reliably. The authors leave the evaluation of the practicality of their work in a hardware implementation as future work.

Kaur et al. [53] propose low-cost error-detection mechanisms as countermeasures against fault attacks for the hardware implementations of ASCON. Parity, interleaved parity, and CRC-3-based techniques are formulated and applied to the 5-bit S-box of ASCON to detect natural faults and injected transient faults that may occur during the S-box operation and generate faulty outputs. Two kinds of error-detection implementations are introduced, using either Boolean logic or Look-up Tables (LUTs). The error coverage of the proposed error-detection schemes is tested for 640,000 injected faults and is determined to be over 99.99% [53]. The overhead results for the hardware implementations of ASCON on Spartan-7 and Kintex-7 FPGAs, protected using these error-detection mechanisms, show an increase in the area overhead of up to 15% for both types of implementations. The results have been tabulated in Table 2. These mechanisms aim to detect most of the injected single and multiple-bit faults leveraged in DFA and SSFA; however, detecting SIFA could be challenging since the error detection is performed at the output of the S-box and not the input.
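The idea of parity-based error detection on the S-box can be sketched in software as follows. This is a minimal illustration under our own assumptions and not the hardware construction of [53]: the predicted output parity is stored in a LUT indexed by the S-box input, the fault is modeled as an XOR mask on the S-box output, and all names are hypothetical.

```python
# 5-bit ASCON S-box as a look-up table
ASCON_SBOX = [
    0x04, 0x0B, 0x1F, 0x14, 0x1A, 0x15, 0x09, 0x02,
    0x1B, 0x05, 0x08, 0x12, 0x1D, 0x03, 0x06, 0x1C,
    0x1E, 0x13, 0x07, 0x0E, 0x00, 0x0D, 0x11, 0x18,
    0x10, 0x0C, 0x01, 0x19, 0x16, 0x0A, 0x0F, 0x17,
]

def parity(x: int) -> int:
    """Return the XOR of all bits of x."""
    return bin(x).count("1") & 1

# LUT-style predictor: expected output parity for every S-box input
PREDICTED_PARITY = [parity(y) for y in ASCON_SBOX]

def sbox_with_check(x: int, fault_mask: int = 0):
    """Return (output, error_flag); fault_mask models bit flips at the S-box output."""
    y = ASCON_SBOX[x] ^ fault_mask
    error_flag = parity(y) != PREDICTED_PARITY[x]
    return y, error_flag

if __name__ == "__main__":
    print(sbox_with_check(3))            # fault-free evaluation
    print(sbox_with_check(3, 0b00100))   # single-bit fault
    print(sbox_with_check(3, 0b00011))   # two-bit fault
```

A single parity bit flags every odd-weight fault but misses even-weight ones, a known limitation of plain parity that richer signatures such as interleaved parity or CRC are generally used to reduce.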

#### 4.5 Power Analysis

The research work of [54] introduces a machine learning (ML) based side-channel analysis with reinforcement learning (SCARL) to obtain confidential data by using unsupervised learning to extract leakage models from power measurements. SCARL attempts to obtain the secret key by analyzing the power consumed by the non-linear S-box computations in the initialization phase of ASCON. An autoencoder to process power measurement samples and reinforcement learning along with actor-critic networks are used to cluster the power features. The hardware implementation of the ASCON-128 on the Artix-7 FPGA board is attacked using SCARL, where FOBOS is used to gather the power measurements of 64 S-box computations. The authors successfully demonstrate that their proposed SCARL strategy can recover the secret key of the implemented ASCON-128 cipher on Artix-7 FPGA by using power measurements obtained during 24,000 encryption operations; the first 4 bits of the secret key are obtained using SCARL within 8 minutes.

The SCA countermeasure assessment based on power leakage is performed using FOBOS 2 and is presented in [26]. Test vector leakage assessment (TVLA) results are reported for the protected and unprotected ASCON architectures using Artix-7 and Spartan-6 FPGA boards. Significant leakage is noticed in the unprotected version, whereas in the protected version the values are within the threshold value; however, no confidential data is recovered through power leakage. The $\chi^2$-test is also performed for the protected and unprotected ASCON architectures in the leakage assessment flow for fixed and random frequency classes using the same test vectors as those for TVLA. Similar to TVLA, the $\chi^2$-test observes leakage in the unprotected ASCON while no leakage is observed in the protected version.
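The core of a TVLA-style assessment is Welch's $t$-test between traces captured for a fixed input and traces captured for random inputs. The following is a minimal sketch on synthetic data; it does not use FOBOS or reproduce the measurement flow of [26]. The threshold $|t| > 4.5$ is the value commonly used in the TVLA literature and is an assumption here, since the text above only refers to a threshold value.

```python
from math import sqrt
from statistics import mean, variance

TVLA_THRESHOLD = 4.5  # assumed, commonly used bound on |t|

def welch_t(a, b):
    """Welch's t-statistic for two sample sets (at least one must have nonzero variance)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

def leaks(fixed_traces, random_traces):
    """Report leakage if any time sample exceeds the threshold.

    Each argument is a list of traces; each trace is a list of power samples.
    """
    fixed_cols = zip(*fixed_traces)    # regroup the samples per time index
    random_cols = zip(*random_traces)
    return any(abs(welch_t(f, r)) > TVLA_THRESHOLD
               for f, r in zip(fixed_cols, random_cols))

if __name__ == "__main__":
    fixed = [[0, 0], [1, 1]] * 500
    same_distribution = [[1, 0], [0, 1]] * 500   # same mean at every time sample
    shifted_sample = [[0, 1], [1, 2]] * 500      # second time sample has a shifted mean
    print(leaks(fixed, same_distribution), leaks(fixed, shifted_sample))
```

In the synthetic data, the two classes share the same mean in the first case, while in the second case the mean of the second time sample differs between the classes, which the test flags.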

ASCON's initialization phase is the most vulnerable to power analysis attacks as only 2 input bits (out of 5) of the initial S-boxes are unknown (secret key bits) while the other 3 are known. In [24], in addition to optimized hardware implementations of ASCON, the authors propose countermeasures against side-channel analysis attacks, particularly first-order differential power analysis (DPA) attacks. This is achieved by using the TI scheme [55]-[56], a masking technique where the calculations on critical data are indirectly carried out on modified representations called shares. The proposed protected implementation of ASCON efficiently applies three shares, as in Keccak, since ASCON uses an affine transformation of Keccak's $\chi$ function [56]. ASCON's linear layer can be implemented on each share. However, the non-linear S-box layer needs to be transformed such that it maintains the following properties: 1) Correctness: the sum of the resulting output shares matches the S-box output when applied to the sum of the input shares; 2) Non-completeness: each of the three S-box functions is independent of at least one input share; and 3) Uniformity: each S-box function is invertible. The research work [24] also implements a three-share TI version of the ASCON-fast and ASCON-x-low-area variants. The ASCON-fast-TI is a microcontroller-based implementation proposed as a cryptographic co-processor where the initial state sharing and randomness are performed by the microcontroller. The ASCON-x-low-TI variant directly uses the output of an available random number generator in the S-box operation per cycle. The hardware implementation results of the TI-protected implementation are listed in Table 2.
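The correctness and non-completeness properties can be checked on the textbook three-share sharing of a single AND gate, the non-linear building block of $\chi$-like S-boxes. The following is a minimal sketch for illustration and not the shared S-box of [24]; the function name `ti_and` is ours.

```python
from itertools import product

def ti_and(x, y):
    """Three-share AND: x and y are 3-tuples of bit shares with x = x1^x2^x3, y = y1^y2^y3."""
    x1, x2, x3 = x
    y1, y2, y3 = y
    z1 = (x2 & y2) ^ (x2 & y3) ^ (x3 & y2)  # uses no bits of share 1
    z2 = (x3 & y3) ^ (x3 & y1) ^ (x1 & y3)  # uses no bits of share 2
    z3 = (x1 & y1) ^ (x1 & y2) ^ (x2 & y1)  # uses no bits of share 3
    return z1, z2, z3

def is_correct():
    """Correctness: the XOR of the output shares equals the AND of the unshared inputs."""
    for x in product((0, 1), repeat=3):
        for y in product((0, 1), repeat=3):
            z1, z2, z3 = ti_and(x, y)
            if z1 ^ z2 ^ z3 != (x[0] ^ x[1] ^ x[2]) & (y[0] ^ y[1] ^ y[2]):
                return False
    return True

if __name__ == "__main__":
    print(is_correct())
```

This particular sharing satisfies correctness and non-completeness but is not uniform on its own, which is one reason fresh randomness or additional correction terms are typically required in practice.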

In [57], the correlation power analysis (CPA) and DPA attacks are mounted on the parallel implementations of ASCON-128 and the TI-protected ASCON-128, respectively. The CPA attack is successfully implemented on ASCON-128 by attacking the vulnerabilities at the end of the initialization phase while requiring fewer power traces to obtain half of the secret key. These vulnerabilities in the initialization round are leveraged again for the attack on the TI-protected ASCON-128 by using the difference of skewness as the third-order attack [57].
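The principle behind CPA can be sketched on a single S-box of the initialization phase, where the 5-bit input consists of one IV bit, two secret key bits, and two known nonce bits. The following is a toy, noise-free simulation under our own assumptions and not the attack of [57]: leakage is modeled as the Hamming weight of the S-box output, and all function names are hypothetical. Each of the four key guesses is scored by the Pearson correlation between its predicted leakage and the observed traces.

```python
from math import sqrt

# 5-bit ASCON S-box as a look-up table
ASCON_SBOX = [
    0x04, 0x0B, 0x1F, 0x14, 0x1A, 0x15, 0x09, 0x02,
    0x1B, 0x05, 0x08, 0x12, 0x1D, 0x03, 0x06, 0x1C,
    0x1E, 0x13, 0x07, 0x0E, 0x00, 0x0D, 0x11, 0x18,
    0x10, 0x0C, 0x01, 0x19, 0x16, 0x0A, 0x0F, 0x17,
]

def hw(x):
    """Hamming weight, used here as the assumed leakage model."""
    return bin(x).count("1")

def pearson(a, b):
    """Pearson correlation coefficient of two equally long, non-constant sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / sqrt(va * vb)

def sbox_input(iv_bit, k1, k2, n1, n2):
    """Column of the initial state: IV bit (MSB), two key bits, two nonce bits."""
    return (iv_bit << 4) | (k1 << 3) | (k2 << 2) | (n1 << 1) | n2

def simulate_traces(key, nonces, iv_bit=0):
    """Noise-free leakage for the true key bits, one value per nonce pair."""
    return [hw(ASCON_SBOX[sbox_input(iv_bit, *key, *n)]) for n in nonces]

def cpa(traces, nonces, iv_bit=0):
    """Return the best-scoring key guess and the correlation of every guess."""
    scores = {}
    for guess in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        model = [hw(ASCON_SBOX[sbox_input(iv_bit, *guess, *n)]) for n in nonces]
        scores[guess] = pearson(model, traces)
    return max(scores, key=scores.get), scores

if __name__ == "__main__":
    nonces = [(0, 0), (0, 1), (1, 0), (1, 1)] * 25
    best, scores = cpa(simulate_traces((1, 0), nonces), nonces)
    print(best, scores)
```

In a real measurement the traces are noisy and the leakage model is only approximate, so many more traces are needed and the correct guess merely stands out rather than reaching a correlation of 1.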

Similarly, the research work of [58] implements a TI-based protection scheme for ASCON against first-order DPA by executing one round in 7 clock cycles. The implemented scheme instantiates a single hybrid 2-share/3-share TI-protected 64-bit AND module which uses 192 random bits every clock cycle: 128 bits for resharing between 2 and 3 shares, and the remaining 64 bits to maintain TI uniformity. This TI-protected ASCON implementation is shown to be resistant to first-order DPA by analyzing the results of the *t*-test leakage detection test [58].

#### 5 OUR INSIGHTS, VISIONS, AND CONCLUSION

This survey is the first work on the current standard for lightweight cryptography, standardized in 2023. This study covers various hardware implementations proposed for the NIST LWC winner ASCON in recent years on FPGA and ASIC hardware platforms. These hardware implementations suggest improvements on the original design in terms of area, throughput, efficiency, or energy/power utilization for applications of ASCON in resource-constrained devices. Differential and side-channel cryptanalysis performed on ASCON on both hardware and software platforms have also been reviewed. The differential cryptanalysis techniques highlight the vulnerabilities present in the permutation function of reduced-round ASCON but not of the full 12-round ASCON. The S-box is also shown to be vulnerable to SFA due to its design, where the secret key is retrieved using the correlation between the input and the output bits. Power analysis attacks were also mounted on ASCON using machine learning strategies or DPA to gather secret information from side-channel leakage; however, the TI-protected architecture of ASCON is considered safe against such attacks.

An insight here is to investigate augmenting the ASCON implementations with design-for-low-cost fault diagnosis and to treat that as a design decision factor. The merit of taking low overhead into account before designing cryptographic algorithms is that the resulting ASCON architectures will be designed for low-overhead error detection and the countermeasures are not devised as an afterthought.

One other interesting insight/vision related to fault and side-channel attacks is combined attacks and countermeasures. We believe this would be the future of such attacks for ASCON (there has been little prior work and none considers ASCON [59]-[62]).

## ACKNOWLEDGMENTS

This work was performed under the U.S. federal agency award 60NANB20D013 granted by the U.S. Department of Commerce, National Institute of Standards and Technology (NIST).

#### REFERENCES

- [1] J. Kaur, M. Mozaffari Kermani, and R. Azarderakhsh. 2022. Hardware constructions for lightweight cryptographic block cipher QARMA with error detection mechanisms. IEEE Trans. Emerg. Topics Comp., 10(1), 514-519.
- [2] J. Kaur, A. Sarker, M. Mozaffari Kermani, and R. Azarderakhsh. 2022. Hardware constructions for error detection in lightweight Welch-Gong (WG)-oriented streamcipher WAGE benchmarked on FPGA. IEEE Trans. Emerg. Topics Comp., 10(2), 1208-1215.
- [3] A. Aghaie, M. Mozaffari Kermani, and R. Azarderakhsh. 2018. Reliable and fault diagnosis architectures for hardware and software-efficient block cipher KLEIN benchmarked on FPGA. IEEE Transactions on Computer-Aided Design Integr. Circuits Syst., 37(4), 901-905.
- [4] S. Subramanian, M. Mozaffari Kermani, R. Azarderakhsh, and M. Nojoumian. 2017. Reliable hardware architectures for cryptographic block ciphers LED and HIGHT. *IEEE Trans. Computer-Aided Des. Integr. Circ. Syst.*, 36(10), 1750-1758.
- [5] P. Ahir, M. Mozaffari Kermani, and R. Azarderakhsh. 2017. Lightweight architectures for reliable and fault detection Simon and Speck cryptographic algorithms on FPGA. ACM Trans. Embedded Comput. Sys., 16(4), 109:1-109:17.
- [6] A. Aghaie, M. Mozaffari Kermani, and R. Azarderakhsh. 2017. Fault diagnosis schemes for low-energy block cipher Midori benchmarked on FPGA. IEEE Transactions on Very Large Scale Integrated (VLSI) Systems, 25 (4), 1528-1536.
- [7] M. Mozaffari Kermani and R. Azarderakhsh. 2013. Efficient Fault Diagnosis Schemes for Reliable Lightweight Cryptographic ISO/IEC Standard CLEFIA Benchmarked on ASIC and FPGA. *IEEE Transactions on Industrial Electronics*, 60(12), 5925-5932.
- [8] C. Dobraunig, M. Eichlseder, F. Mendel, and M. Schläffer. 2021. ASCON v1.2: Lightweight Authenticated Encryption and Hashing. J. Cryptol, 34, 1-42.
- [9] T. Beyne, Y. L. Chen, C. Dobraunig, B. Mennink. 2021. Elephant v2. 1-55. [Online] Available: https://www.esat.kuleuven.be/cosic/elephant.
- [10] S. Banik, A. Chakraborti, A. Inoue, T. Iwata, K. Minematsu, M. Nandi, T. Peyrin, Y. Sasaki, S. M. Sim, and Y. Todo. 2020. GIFT-COFB. J. Cryptology ePrint Archive. 732-768. [Online] Available: https://eprint.iacr.org/2020/738.
- [11] M. Hell, T. Johansson, A. Maximov, W. Meier, J. Sönnerup, and H. Yoshida. 2021. Grain-128AEADv2. A submission to the NIST lightweight cryptography standardization process, 1-38. Available [Online]: https://grain-128aead.github.io.
- [12] C. Dobraunig, M. Eichlseder, S. Mangard, F. Mendel, and T. Unterluggauer. 2017. ISAP Towards Side-Channel Secure Authenticated Encryption. In Proc. Trans. Symm. Cryptology (ToSC), 2017(1), 80–105.
- [13] Z. Bao, A. Chakraborti, N. Datta, J. Guo, M. Nandi, T. Peyrin, and K. Yasuda. 2021. PHOTON-beetle authenticated encryption and hash family. A submission to the NIST lightweight cryptography standardization process. 1-115. Available [Online]: https://www.isical.ac.in/~lightweight/beetle.
- [14] C. Guo, T. Iwata, M. Khairallah, K. Minematsu, and T. Peyrin. 2021. Romulus v1. 3. A submission to NIST Lightweight Cryptography., 1-57. Available [Online]: https://romulusae.github.io/romulus.
- [15] C. Beierle, A. Biryukov, L. Cardoso dos Santos, J. Großschädl, L. Perrin, A. Udovenko, V. Velichkov, and Q. Wang. 2020. Lightweight AEAD and Hashing using the Sparkle Permutation Family. J. ToSC. 208-261.
- [16] H. Wu and T. Huang. 2021. TinyJAMBU: A Family of Lightweight Authenticated Encryption Algorithms (Version 2). A submission to the NIST lightweight cryptography standardization process. 1-40. Available [Online]: https://csrc.nist.gov/Projects/lightweight-cryptography/finalists.
- [17] J. Daemen, S. Hoffert, M. Peeters, G. Van Assche, and R. Van Keer. 2020. Xoodyak a lightweight cryptographic scheme. J. ToSC, 60-87.
- [18] R. Avanzi, J. Bos, L. Ducas, E. Kiltz, T. Lepoint, V. Lyubashevsky, J. M. Schanck, P. Schwabe, G. Seiler, and D. Stehlé. 2017. CRYSTALS - Kyber: A CCA-Secure Module-Lattice-Based KEM. In *Proc. IEEE European Symposium on Security and Privacy (EuroS&P)*, 353-367.
- [19] L. Ducas, E. Kiltz, T. Lepoint, V. Lyubashevsky, P. Schwabe, G. Seiler, and D. Stehlé. 2018. CRYSTALS-Dilithium: A Lattice-Based Digital Signature Scheme. J. Trans. Crypto. Hard. Embed. Sys. (TCHES), 2018(1), 238–268.
- [20] P. A. Fouque, J. Hoffstein, P. Kirchner, V. Lyubashevsky, T. Pornin, T. Prest, T. Ricosset, G. Seiler, W. Whyte, and Z. Zhang. 2018. Falcon: Fast-Fourier lattice-based compact signatures over NTRU. Submission to the NIST's post-quantum cryptography standardization process. 36(5). 1-75.
- [21] D. J. Bernstein, A. Hülsing, S. Kölbl, R. Niederhagen, J. Rijneveld, and P. Schwabe. 2019. The SPHINCS+ Signature Framework. In Proc. ACM SIGSAC Conf. Comp. Comms. Sec. (CCS '19), 2129–2146.
- [22] H. Madushan, I. Salam, and J. Alawatugoda. 2022. A Review of the NIST Lightweight Cryptography Finalists and Their Fault Analyses. J. Electronics, 11(24), 4199.
- [23] C. Dobraunig, M. Eichlseder, F. Mendel, and M. Schläffer. 2015. Cryptanalysis of Ascon. In Proc. RSA, 371-387.
- [24] H. Gross, E. Wenger, C. Dobraunig, and C. Ehrenhöfer. 2017. Ascon hardware implementations and side-channel evaluation. J. Microprocessors and Microsystems, 52, 470–479.
- [25] W. Diehl, A. Abdulgadir, F. Farahmand, J.-P. Kaps, and K. Gaj. 2018. Comparison of Cost of Protection against Differential Power Analysis of Selected Authenticated Ciphers. J. Cryptography, 2, 3:1-26.
- [26] A. Abdulgadir, W. Diehl and J. -P. Kaps. 2019. An Open-Source Platform for Evaluation of Hardware Implementations of Lightweight Authenticated Ciphers. In Proc. Inter. Conf. ReConFig. Comp. FPGAs (ReConFig), 1-5.
- [27] K. Raj and S. Bodapati. 2022. FPGA Based Light Weight Encryption of Medical Data for IoMT Devices using ASCON Cipher. In Proc. IEEE Inter. Symp. Smart Elect. Sys. (iSES), 196-201.
- [28] S. Khan, W.K. Lee, and S. O. Hwang. 2021. Scalable and Efficient Hardware Architectures for Authenticated Encryption in IoT Applications. *IEEE Internet of Things*, 8, 14: 11260-11275.
- [29] X. Wei, M. El-Hadedy, S. Mosanu, Z. Zhu, W. -M. Hwu, and X. Guo. 2022. RECO-HCON: A High-Throughput Reconfigurable Compact ASCON Processor for Trusted IoT. In Proc. IEEE Inter. System-on-Chip Conf. (SOCC), 1-6.
- [30] S. Khan, W. K. Lee, and S. O. Hwang. 2022. Evaluating the Performance of Ascon Lightweight Authenticated Encryption for AI-Enabled IoT Devices. In Proc. TRON Symp. (TRONSHOW), 1-6.
- [31] N. Roussel, O. Potin, G. Di Pendina, J. -M. Dutertre, and J. -B. Rigaud. 2022. CMOS/STT-MRAM Based Ascon LWC: a Power Efficient Hardware Implementation. In Proc. IEEE Inter. Conf. Electro. Circs. Sys. (ICECS), 1-4.
- [32] S. Luo, W. Wu, Y. Li, R. Zhang, and Z. Liu. 2022. An Efficient Soft Analytical Side-Channel Attack on Ascon. In Proc. Inter. Conf. Wireless Algo. Sys. Apps., 389-400.
- [33] B. Halak and J. Duarte-Sanchez. 2020. Cube Attack on a Trojan-Compromised Hardware Implementation of Ascon. In Proc. IEEE Inter. SOCC, 43-47.
- [34] M. Hicks, M. Finnicum, S.T. King, M. M. K. Martin, and J. M. Smith. 2010. Overcoming an untrusted computing base: Detecting and removing malicious hardware automatically. In Proc. IEEE Symp. Sec. Priv., 159-172.
- [35] J. Zhang, Q. Xu, Z. Sun, F. Yuan, and L. Wei. 2013. VeriTrust: verification for hardware trust. In Proc. DAC, 61, 1-8.
- [36] A. Waksman, M. Suozzo, and S. Sethumadhavan. 2013. FANCI: identification of stealthy malicious logic using Boolean functional analysis. In Proc. ACM Conf. Comp. Commun. Sec. (CCS), 697-708.
- [37] Z. Li, X. Dong, and X. Wang. 2017. Conditional Cube Attack on Round-Reduced ASCON. J. Cryptology, 160-187.
- [38] J. Baudrin, A. Canteaut, and L. Perrin. 2022. Practical Cube Attack against Nonce-Misused Ascon. J. Transactions on Symmetric Cryptology (ToSC), 4, 120-144.
- [39] D. Chang, D. Hong, and J. Kang. 2022. Conditional Cube Attacks on Ascon-128 and Ascon-80pq in a Nonce-misuse Setting. J. Cryptology, 544-567.
- [40] D. Jankovikj, H. Mihajloska Trpceska, and V. Dimitrova. 2022. Cryptanalysis of Round-Reduced ASCON powered by ML. In Proc. Inter. Conf. on Informatics Info.Tech. (CIIT), 1-6.
- [41] R. Rohit, K. Hu, S. Sarkar, and S. Sun. 2021. Misuse-Free Key-Recovery and Distinguishing Attacks on 7-Round Ascon. J. Cryptology ePrint Archive, 194-220.
- [42] Y. Todo. 2015. Structural Evaluation by Generalized Integral Property. In Proc. Inter. Conf. Theory Apps. Crypto. Tech. (EUROCRYPT Adv. Crypto.), 9056, 287-314.
- [43] R. Zong, X. Dong, and X. Wang. 2019. Collision Attacks on Round-Reduced Gimli-Hash/Ascon-Xof/Ascon-Hash. J. Cryptology ePrint Archive, 1115-1135.
- [44] I. Dinur, O. Dunkelman, and A. Shamir. 2012. New Attacks on Keccak-224 and Keccak-256. In Proc. Inter. Workshop Fast Soft. Encryp., 7549, 442-461.
- [45] D. Gerault, T. Peyrin, and Q. Q. Tan. 2021. Exploring Differential-Based Distinguishers and Forgeries for ASCON. J. Cryptology ePrint Archive, 1103-1138.
- [46] C. Tezcan. 2016. Truncated, Impossible, and Improbable Differential Analysis of Ascon. J. Cryptology, 460-468.
- [47] K. Hu, T. Peyrin, Q. Q. Tan, and T. Yap. 2022. Revisiting Higher-Order Differential-Linear Attacks from an Algebraic Perspective. J. Cryptology ePrint Archive, 1335-1400.
- [48] K. Ramezanpour, P. Ampadu and W. Diehl. 2019. A Statistical Fault Analysis Methodology for the Ascon Authenticated Cipher. In Proc. IEEE Inter. Symp. Hard. Oriented Sec. Trust (HOST), 41-50.
- [49] G. Surya, P. Maistri, and S. Sankaran. 2020. Local Clock Glitching Fault Injection with Application to the ASCON Cipher. In Proc. IEEE Inter. Symp. Smart Electro. Sys. (iSES), 271-276.
- [50] P. Joshi and B. Mazumdar. 2021. SSFA: Subset fault analysis of ASCON-128 authenticated cipher. J. Microelectronics Reliability, 123, 114155:1-14.
- [51] K. Ramezanpour, P. Ampadu, and W. Diehl. 2019. FIMA: Fault Intensity Map Analysis. In Proc. Inter. Workshop Constructive SCA Sec. Design (COSADE), 11421, 63-79.
- [52] K. N. Ambili and J. Jose. 2022. Reinforcing Lightweight Authenticated Encryption Schemes against Statistical Ineffective Fault Attack. J. Cryptology ePrint Archive, 41-59.
- [53] J. Kaur, M. Mozaffari Kermani, and R. Azarderakhsh. 2022. Hardware Constructions for Error Detection in Lightweight Authenticated Cipher ASCON Benchmarked on FPGA. IEEE Trans. Circs. Sys. II: Express Briefs, 69(4), 2276-2280.
- [54] K. Ramezanpour, P. Ampadu, and W. Diehl. 2020. SCARL: Side-Channel Analysis with Reinforcement Learning on the Ascon Authenticated Cipher. 1-25. [Online]. Available: https://arxiv.org/abs/2006.03995v1.
- [55] S. Nikova, C. Rechberger, and V. Rijmen. 2006. Threshold implementations against side-channel attacks and glitches. In Proc. Inter. Conf. Info. Comms. Sec. (ICICS), 529-545.
- [56] B. Bilgin, J. Daemen, V. Nikov, S. Nikova, V. Rijmen, and G. Van Assche. 2014. Efficient and first-order DPA resistant implementations of Keccak. In Proc. Smart Card Research Adv. App. (CARDIS), 187-199.
- [57] N. Samwel and J. Daemen. 2017. DPA on hardware implementations of Ascon and Keyak. In Proc. Comp. Frontiers Conf., 415-424.
- [58] W. Diehl, A. Abdulgadir, F. Farahmand, J.-P. Kaps, and K. Gaj. 2018. Comparison of Cost of Protection against Differential Power Analysis of Selected Authenticated Ciphers. J. Crypt., 2(3), 1-26.
- [59] F. Regazzoni, T. Eisenbarth, L. Breveglieri, P. Ienne, and I. Koren. 2008. Can knowledge regarding the presence of countermeasures against fault attacks simplify power attacks on cryptographic devices? In Proc. DFT, 202-210.

- [60] F. Regazzoni, T. Eisenbarth, J. Großschädl, L. Breveglieri, P. Ienne, I. Koren, and C. Paar. 2007. Power attacks resistance of cryptographic S-Boxes with added error detection circuits. In Proc. DFT, 508-516.
- [61] J. Dofe, H. Pahlevanzadeh, and Q. Yu. 2016. A comprehensive FPGA-based assessment on fault-resistant AES against correlation power analysis attack. J. Electronic Testing, 32(5), 611-624.
- [62] H. Pahlevanzadeh, J. Dofe, and Q. Yu. 2016. Assessing CPA resistance of AES with different fault tolerance mechanisms. In Proc. Asia and South Pacific Design Automation Conference (ASP-DAC), 661-666.
