
Showing papers by "Alessandro Trifiletti" published in 2017


Journal ArticleDOI
TL;DR: This paper presents a CMOS operational transconductance amplifier (OTA), suitable for sub-1-V supply applications, whose (input) common-mode voltage can be set to (
Abstract: This paper presents a CMOS operational transconductance amplifier (OTA), suitable for sub-1-V supply applications, whose (input) common-mode voltage can be set to $(V_{DD}+V_{SS})/2$ thanks to two combined techniques applied to the differential pair, namely, threshold voltage lowering and elimination of the tail current generator. Both techniques are implemented through a single common-mode feedback loop, which embeds the shared bulk terminal of the pair. In contrast to other low-voltage approaches employing bulk driving, the proposed OTA is driven from the gate terminals and exploits only MOS transistors in strong inversion. Therefore, effective values of dc gain, gain bandwidth, and noise are found, suitable for high-accuracy switched-capacitor applications. Using a standard 0.35-$\mu$m technology with nominal MOS transistor thresholds around 0.7 V, a 0.9-V OTA with 0.45-V analog ground was designed and successfully tested. The measured gain and unity gain frequency were 65 dB and 1 MHz with phase margin of 60° for a capacitive load of 10 pF.

43 citations


Journal ArticleDOI
TL;DR: For the first time in the literature, the Attack Exploiting Static Power (AESP) is formulated as a univariate attack by using the mutual information approach to quantify the information that leaks through the static power side channel independently from the adopted leakage model.
Abstract: In this work we focus on Power Analysis Attacks (PAAs) which exploit the dependence of the static current of sub-50 nm CMOS integrated circuits on the internally processed data. SPICE simulations of static power have been carried out to show that the coefficient of variation of nanometer logic gates is increasing with the scaling of CMOS technology. We demonstrate that it is possible to recover the secret key of a cryptographic core by exploiting this data dependence by means of different statistical distinguishers. For the first time in the literature we formulate the Attack Exploiting Static Power (AESP) as a univariate attack by using the mutual information approach to quantify the information that leaks through the static power side channel independently from the adopted leakage model. This analysis shows that countermeasures conceived to protect cryptographic hardware from attacks based on dynamic power consumption (e.g., WDDL, MDPL, SABL) still exhibit a leakage through the static power side channel. Finally, we show that the Time Enclosed Logic (TEL) concept does not leak information through the static power and is suitable to be used as a countermeasure against both attacks exploiting dynamic power and attacks exploiting static power.

29 citations
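
The mutual-information metric named in the abstract above can be illustrated with a minimal sketch. This is a generic plug-in estimator on discrete samples, not the paper's evaluation flow; the data (`values`, `leaky`, `flat`) are hypothetical stand-ins for a key-dependent intermediate value and a quantized static-current reading.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi

# Hypothetical data: a 2-bit processed value and two kinds of readings.
values = [0, 1, 2, 3] * 100
leaky = list(values)          # reading fully determined by the value
flat = [0] * len(values)      # reading independent of the value

mi_leaky = mutual_information(values, leaky)
mi_flat = mutual_information(values, flat)
```

A reading that fully reveals a uniformly distributed 2-bit value carries 2 bits, while a constant reading carries none; real measurements would fall in between and would need a density estimate rather than exact counts.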


Journal ArticleDOI
TL;DR: This paper presents the design of a novel low-voltage high-speed D-latch circuit suitable for nanometer CMOS technologies and its advantages are demonstrated both by simulations, under different performance/power consumption tradeoffs with a 40-nm CMOS technology, and theoretically, thanks to a simple model of the propagation delay derived for both low-voltage topologies.
Abstract: This paper presents the design of a novel low-voltage high-speed D-latch circuit suitable for nanometer CMOS technologies. The proposed topology is compared against the low-voltage triple-tail D-latch and its advantages are demonstrated both by simulations, under different performance/power consumption tradeoffs with a 40-nm CMOS technology, and theoretically, thanks to a simple model of the propagation delay derived for both low-voltage topologies. In order to further demonstrate the advantages of the proposed topology, it has also been used to design a D flip-flop (DFF), where, because only one clock differential pair is needed, a further speed improvement is achieved over the conventional triple-tail topology. Indeed, by comparing a two-stage frequency divider designed using both the triple-tail DFF and the proposed folded DFF, a 54% improvement in the maximum operating frequency is found when using the proposed folded DFF.

27 citations


Journal ArticleDOI
TL;DR: A novel framework is introduced to estimate the max-delay variability in logic paths due to variations in a back-of-the-envelope fashion, thus allowing quick evaluation of the additional cycle time margin imposed by random (local) variations.
Abstract: In this paper, a novel framework is introduced to estimate the max-delay variability in logic paths due to variations in a back-of-the-envelope fashion, thus allowing quick evaluation of the additional cycle time margin imposed by random (local) variations. The framework provides the designer with a deep insight into the main variability contributions, and the improvements allowed by prospective design modifications (e.g., logic restructuring and cell up-sizing). The proposed framework is applicable to a wide voltage range, from sub-threshold to nominal. Our analysis shows that the popular fan-out-of-4 metric (FO4) fully captures the impact of technology, supply voltage, die-to-die, and voltage and temperature variations on the delay variability. On the other hand, the variability contribution due to random variations is accounted for by cell-specific coefficients having a clear physical meaning, and depending only on circuit-level knobs (i.e., cell topology and transistor size). Accordingly, the proposed method completely decouples the effect of random variations from the impact of the process/voltage/temperature corner. The proposed approach has been validated in a range of technology generations (28, 40, and 65 nm) and voltages (from 0.3 to 1.2 V) through Monte Carlo simulations and silicon measurements (28 and 65 nm). Because it is adequately accurate compared with Monte Carlo simulations and silicon measurements, this framework eliminates time-consuming Monte Carlo simulations from the design loop, thus drastically facilitating design closure. Since its accuracy is comparable to that of other state-of-the-art methods, the proposed framework can also be used for efficient automated statistical timing analysis of VLSI circuits.

24 citations


Journal ArticleDOI
TL;DR: A new calibration technique for time-interleaved analog-to-digital converters is proposed, based on Hermitianity-preserving complex Taylor approximations of the frequency response of the correction filters, shown to be accurate and to require few hardware resources.
Abstract: A new calibration technique for time-interleaved analog-to-digital converters is proposed, based on Hermitianity-preserving complex Taylor approximations of the frequency response of the correction filters. Calibration is interpreted as approximating these filters with linear combinations of base filters obtained by the proposed Taylor expansion. Known calibration techniques are reinterpreted in this way and compared in terms of accuracy, computational complexity, numerical stability, and convergence time. The new technique is shown to be accurate and to require few hardware resources. The limited number of parameters to estimate enables good performance in fixed-point arithmetic and fast convergence. This is important in background calibration schemes in which parameters need to be estimated in real time.

21 citations


Journal ArticleDOI
TL;DR: A new class of template attacks aiming at recovering the secret key of a cryptographic core from measurements of its static power consumption is presented, and it is shown that using just a few different temperatures to build multivariate templates strongly increases the effectiveness of the attack.
Abstract: A new class of template attacks aiming at recovering the secret key of a cryptographic core from measurements of its static power consumption is presented in this paper. These attacks exploit the dependence of the static current of complementary metal–oxide–semiconductor (CMOS) integrated circuits on the input vector, and use the maximum likelihood decision rule as a statistical distinguisher. In the proposed Template Attacks Exploiting Static Power (TAESP), we take advantage of the temperature dependence of static currents in order to build a new multivariate approach able to extract relevant information from cryptographic devices. As a validation case study, we consider the PRESENT-80 block cipher algorithm and its implementation on a 40 nm CMOS process. Monte Carlo and corner simulations at transistor level are used to show the effectiveness of the TAESP in the presence of die-to-die and intra-die process variations. A real attack scenario is then built by adding Gaussian noise to current samples extracted from transistor-level simulations. The univariate TAESP, in which just one temperature is considered to build the templates, is compared against the multivariate TAESP, in which measurements at different controlled temperatures are exploited. This comparison shows that using just a few different temperatures to build multivariate templates strongly increases the effectiveness of the attack. Copyright © 2016 John Wiley & Sons, Ltd.

17 citations


Proceedings ArticleDOI
01 Sep 2017
TL;DR: A novel measurement setup is presented, which aims to overcome several issues in measuring static currents, such as extremely low SNR and temperature dependency, providing a low-cost solution to carry out Attacks Exploiting Static Power (AESP).
Abstract: Static power consumption in modern integrated circuits has become a critical concern in side-channel analysis. As has been widely demonstrated in the technical literature, it is possible to extract secret information from a cryptographic circuit by means of static current measurements. Static and dynamic power analysis require different measurement procedures, due to physical reasons. In this work, we present a novel measurement setup, which aims to overcome several issues in measuring static currents, such as extremely low SNR and temperature dependency, providing a low-cost solution to carry out Attacks Exploiting Static Power (AESP). The proposed measurement setup is based on a DC pico-ammeter, which acquires DC currents after a long integration time, and on a thermal feedback loop exploiting a commercial Peltier cell to set and control the working temperature of the cryptographic processor. To verify the effectiveness of the proposed setup, AESP have been successfully implemented on a 4×4 bit crypto-core, extracted from a bit slice implementation of the PRESENT-80 algorithm and implemented on a 45-nm Xilinx Spartan-6 FPGA.

10 citations


Proceedings ArticleDOI
01 Sep 2017
TL;DR: A mixed-signal Y-matrix synthesizer using VCIIs is proposed, which is the dual of a similar one using CCIIs, and can also be used as an N-port analyzer, with an advantage with respect to the CCII-based version related to the possibility of sensing low-impedance (voltage) inputs.
Abstract: The Voltage Conveyor (VCII) is the dual of the second generation Current Conveyor (CCII), and has received only cursory attention in the literature, probably for lack of interesting applications. The VCII has a current buffer between Y and X terminals, and a voltage buffer between X and Z terminals. In this way, it makes it easier to sum (current) signals at the Y node, whereas CCIIs make it easier to sum (current) signals at the X node. Exploiting this difference between the two dual circuits, a very simple N-port synthesizer can be obtained using only N VCIIs. A mixed-signal Y-matrix synthesizer using VCIIs is also proposed, which is the dual of a similar one using CCIIs, and can also be used as an N-port analyzer, with an advantage with respect to the CCII-based version related to the possibility of sensing low-impedance (voltage) inputs. An inductor emulator and a lowpass/bandpass biquad filter are also simulated, showing the versatility of the VCII.

9 citations


Journal ArticleDOI
TL;DR: A novel recursive least squares (RLS) algorithm that exploits the Frisch–Waugh–Lovell theorem to reduce digital complexity and improve convergence speed and algorithmic stability in fixed-point arithmetic is proposed.
Abstract: We propose a novel recursive least squares (RLS) algorithm that exploits the Frisch–Waugh–Lovell theorem to reduce digital complexity and improve convergence speed and algorithmic stability in fixed-point arithmetic. We tested the new algorithm in the digital background calibration section of a four-channel time-interleaved analog-to-digital converter, obtaining better stability and faster convergence. The digital complexity of the new algorithm in terms of multiplications and divisions is 33% lower asymptotically than that of the conventional Bierman algorithm if the model parameters need not be computed at each update; otherwise, it is the same. Memory requirements are also the same. Because, in calibration, the distance between the ideal and calibrated outputs of the system is to be minimized, the actual value of the model parameters is usually not of interest. Convergence time can be up to 10 or 20 times better in fixed-point arithmetic, and stability for large models is also better in our simulations. In our simulations, when the conventional Bierman RLS algorithm is stable, the steady-state accuracy of the new algorithm is either comparable or better, depending on the simulation setup.

9 citations
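
For context on the baseline that the abstract above compares against, here is a minimal sketch of the textbook recursive least squares update. It is neither the Frisch–Waugh–Lovell variant proposed in the paper nor the UD-factorized Bierman form; the forgetting factor `lam`, the initial scale `delta`, and the synthetic two-parameter data are all assumptions made for illustration.

```python
import random

def rls_update(w, P, x, d, lam=1.0):
    """One textbook RLS step: weights w, inverse-correlation matrix P."""
    n = len(w)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    k = [v / denom for v in Px]                     # gain vector
    e = d - sum(w[i] * x[i] for i in range(n))      # a priori error
    w = [w[i] + k[i] * e for i in range(n)]
    # P <- (P - k x^T P) / lam; x^T P equals Px^T because P is symmetric
    P = [[(P[i][j] - k[i] * Px[j]) / lam for j in range(n)] for i in range(n)]
    return w, P

def fit(samples, n, delta=1e6):
    """Run RLS over (x, d) pairs starting from w = 0 and P = delta * I."""
    w = [0.0] * n
    P = [[delta if i == j else 0.0 for j in range(n)] for i in range(n)]
    for x, d in samples:
        w, P = rls_update(w, P, x, d)
    return w

# Hypothetical noiseless model d = 2*x1 - 3*x2.
rng = random.Random(0)
true_w = [2.0, -3.0]
samples = []
for _ in range(200):
    x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    samples.append((x, sum(a * b for a, b in zip(true_w, x))))

w_hat = fit(samples, 2)
```

The matrix update on `P` is the step whose cost and fixed-point behavior motivate alternative formulations; this floating-point sketch does not reproduce those fixed-point effects.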


Book ChapterDOI
24 May 2017
TL;DR: A multivariate analysis exploiting static power consumption is presented in which the temperature domain is used to extract more information, and the perceived information shows that, by taking advantage of more than one temperature, the security level can be decreased.
Abstract: The latest nanometer CMOS technology nodes have highlighted new issues in the security of cryptographic hardware implementations. The constant growth of the static power consumption has led to a new class of side-channel attacks. Common attacks exploiting static power use a univariate approach to recover information from cryptographic engines. In our work, a multivariate approach based on information theoretic security metrics is presented. The temperature dependence helps to exploit more information leakage from the hardware implementation. Starting from a univariate analysis, mutual information reveals that increasing the working temperature also increases the information leaked through the static power side channel. In this work a multivariate analysis exploiting static power consumption is presented in which the temperature domain is used to extract more information. The use of an information theoretic approach allows precise quantification of the amount of information that can be leaked from a cryptographic hardware implementation. The perceived information shows that, by taking advantage of more than one temperature, the security level can be decreased. The improvement achieved using the presented approach is demonstrated on a 40 nm CMOS implementation of the PRESENT-80 crypto core.

8 citations


Proceedings ArticleDOI
01 Sep 2017
TL;DR: A closed-form representation of the reconstruction filters for a 4-channel Time-Interleaved ADC affected by gain mismatches and timing skew is derived solving the Papoulis equations to enable linear estimation methods to be used in foreground and background calibration techniques.
Abstract: A closed-form representation of the reconstruction filters for a 4-channel Time-Interleaved ADC affected by gain mismatches and timing skew is derived solving the Papoulis equations. First-order Taylor expansions of the filters are then computed to enable linear estimation methods to be used in foreground and background calibration techniques. The results are validated with behavioral simulations and compared to the reconstruction filters' responses obtained with numerical methods.
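
The idea of linearizing in the mismatch parameters can be illustrated with a small sketch. It uses a deliberately simplified single-channel model, a relative gain error times a pure delay, rather than the paper's reconstruction filters; `omega`, `gain_err`, and `skew` are hypothetical normalized values.

```python
import cmath

def exact_response(omega, gain_err, skew):
    """Response of a channel with relative gain error and timing skew."""
    return (1.0 + gain_err) * cmath.exp(1j * omega * skew)

def first_order_response(omega, gain_err, skew):
    """First-order Taylor expansion around gain_err = 0 and skew = 0."""
    return 1.0 + gain_err + 1j * omega * skew

# Hypothetical normalized operating point.
omega, gain_err, skew = 1.0, 0.01, 0.005

approx_error = abs(exact_response(omega, gain_err, skew)
                   - first_order_response(omega, gain_err, skew))

# The linearized response keeps Hermitian symmetry: H(-w) = conj(H(w)).
hermitian_gap = abs(first_order_response(-omega, gain_err, skew)
                    - first_order_response(omega, gain_err, skew).conjugate())
```

Because the approximation is linear in `gain_err` and `skew`, the unknowns enter the model linearly, which is what makes linear estimators applicable; the residual error is second order in the mismatches.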

Proceedings ArticleDOI
01 Sep 2017
TL;DR: A novel class-AB second-generation Current Conveyor based on the class-AB Flipped Voltage Follower (FVF) topology is proposed, and it can drive the Z output with currents larger than the biasing ones, improving power efficiency.
Abstract: We propose a novel class-AB second-generation Current Conveyor (CCII) based on the class-AB Flipped Voltage Follower (FVF) topology, and compare it with a class-A CCII based on the conventional FVF. The AB-FVF is capable of driving larger capacitive loads, showing faster settling. Furthermore, it can drive the Z output with currents larger than the biasing ones, improving power efficiency. A modification of a previously published FVF is also introduced to improve the compensation of the frequency response.

Proceedings ArticleDOI
01 May 2017
TL;DR: An analytical model to evaluate the hybrid architecture of a wide bandwidth high-speed digitizer is proposed based on the robust approach of multi-rate signal processing theory and can be used at design stage to identify viable solutions for counteracting the effects of the impairments.
Abstract: An analytical model to evaluate the hybrid architecture of a wide bandwidth high-speed digitizer is proposed. The model is based on the robust approach of multi-rate signal processing theory and allows analyzing the effects of the impairments that can affect the digitizer, and consequently evaluating achievable performance. The proposed model can also be used at design stage to identify viable solutions for counteracting the effects of the impairments. More specifically, it can be used to identify the correction filters that provide a digital representation of the input signal that minimizes spurious terms and distortions.

Proceedings ArticleDOI
01 Oct 2017
TL;DR: This paper demonstrates that, by the use of multiple channels and suitable correlation algorithms, the above ambiguities can be cancelled and the final spectrum estimation is correct.
Abstract: The need to estimate in real time a large frequency spectrum, in order to detect the presence of narrowband emitters within a certain frequency range, is relevant in many applications, in particular those related to cognitive radar. The goal of spectrum monitoring is to analyze the whole wide spectrum in real time. This paper starts from previously proposed solutions, based on correlating the undersampled signals of only two asynchronous channels. The drawback of this solution is that, in specific cases, some ambiguities may occur along the estimated spectrum, due to the accidental overlapping of some aliased tones produced by undersampling. This paper demonstrates that, by the use of multiple channels and suitable correlation algorithms, the above ambiguities can be cancelled and the final spectrum estimation is correct.
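
A toy calculation shows why two undersampled channels can leave ambiguities that an additional channel removes. This is a simplified sketch based on alias folding of a single tone, not the correlation algorithm of the paper; the sampling rates, band limit, and tone frequency (all in MHz) are hypothetical.

```python
def alias(f, fs):
    """Apparent frequency of a real tone at f when sampled at rate fs."""
    r = f % fs
    return min(r, fs - r)

def candidates(true_f, rates, f_max):
    """Frequencies in [0, f_max] that alias like true_f at every rate."""
    observed = [alias(true_f, fs) for fs in rates]
    return [f for f in range(f_max + 1)
            if all(alias(f, fs) == a for fs, a in zip(rates, observed))]

# Hypothetical setup: a 730 MHz tone in a 0-2000 MHz band.
true_f = 730
two_channels = candidates(true_f, [100, 110], 2000)
three_channels = candidates(true_f, [100, 110, 130], 2000)
```

With the two rates chosen here several frequencies produce the same pair of aliases, whereas adding the third rate leaves only the true tone in this band.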

Proceedings ArticleDOI
01 Sep 2017
TL;DR: A local feedback loop is proposed that exploits internal nodes and triode-biased transistors to improve the CMRR with a limited power and area penalty.
Abstract: The fully differential class-AB OTA topology by Peluso presents a poor Common-Mode Rejection Ratio (CMRR) and could become unusable for a common-mode gain larger than 1. We propose a local feedback loop that exploits internal nodes and triode-biased transistors to improve the CMRR with a limited power and area penalty. Simulations in 40-nm CMOS technology show a net improvement of the CMRR without affecting the differential-mode behavior; simulations of a sample-and-hold exploiting the proposed OTA topology are also presented.

Proceedings ArticleDOI
01 Jun 2017
TL;DR: In this paper, a very low voltage sample-and-hold (SHA) circuit based on this opamp is presented to test the feasibility of a reconfigurable pipeline ADC, and the opamp gain remains quite constant up to a supply voltage of 0.5 V, and can still be used with a supply as low as 0.3 V.
Abstract: In this paper, a voltage-scalable inverter-based operational amplifier suitable to be used in a reconfigurable ADC is optimized. A very low voltage sample-and-hold (SHA) circuit based on this opamp is presented to test the feasibility of a reconfigurable pipeline ADC. Simulations using STMicroelectronics 45-nm device models show that the opamp gain remains quite constant up to a supply voltage of 0.5 V, and the amplifier can still be used with a supply as low as 0.3 V. The SHA maximum sampling frequency decreases with supply voltage, and good performance with low power consumption is achieved.

Proceedings ArticleDOI
01 Jun 2017
TL;DR: A novel current-mode feedback suppressor as an on-chip analog-level CPA countermeasure is proposed, which aims to suppress differences in power consumption due to data-dependency of CMOS cryptographic devices, in order to counteract CPA attacks.
Abstract: Security of sensitive data for ultraconstrained IoT smart devices is one of the most challenging tasks in modern design. The need for CPA-resistant cryptographic devices must be reconciled with the demanding requirements of small area and small impact on the overall power consumption. In this work, a novel current-mode feedback suppressor as an on-chip analog-level CPA countermeasure is proposed. It aims to suppress differences in power consumption due to data-dependency of CMOS cryptographic devices, in order to counteract CPA attacks. The novel countermeasure is able to improve the MTD of an unprotected CMOS implementation by at least three orders of magnitude, with a ×1.1 area and ×1.7 power overhead.

Proceedings ArticleDOI
01 Sep 2017
TL;DR: A VHDL implementation that has been successfully implemented on a Xilinx Virtex-7 FPGA is proposed that is stable and often faster in fixed-point arithmetic than conventional RLS.
Abstract: The Frisch-Waugh-Lovell (FWL) Recursive Least Squares (RLS) algorithm has been recently proposed as an RLS algorithm with lower computational cost and better numerical properties. We propose a VHDL implementation that has been successfully implemented on a Xilinx Virtex-7 FPGA. The FWL RLS algorithm has a complexity of $L^2 + O(L)$ products, instead of $1.5L^2 + O(L)$ as in conventional RLS algorithms. Because it removes all matrix operations, separating an $L$-input vector problem into $L$ separate scalar problems, it is stable and often faster in fixed-point arithmetic than conventional RLS. An RLS filter with $L$ inputs is composed of $L$ stages, and the $i$-th stage ($i \in \{1, 2, \ldots, L\}$) has $L + 2 - i$ inputs and $L + 1 - i$ outputs. The implementation is based on two blocks: a scalar estimation block (EB), which is instantiated once for every layer, and $L + 1 - i$ identical filtering blocks (FB). For an $L$-input RLS model, there are $L$ EBs and $L(L + 1)/2$ FBs. Adding an input involves instantiating one additional EB and $L + 1$ FBs. Removing one input requires the removal of the first layer. The VHDL structure is modular and can be easily adjusted for different values of $L$. We also present estimated hardware costs over a wide range of $L$ values.
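
The block counts quoted in the abstract above can be checked with a short sketch that tallies one EB and L + 1 − i FBs per stage. The function name `fwl_rls_block_counts` is hypothetical, and the code only counts blocks; it does not model the VHDL design.

```python
def fwl_rls_block_counts(L):
    """Return (number of EBs, number of FBs) for an L-input model."""
    ebs = 0
    fbs = 0
    for i in range(1, L + 1):
        ebs += 1             # one estimation block per stage
        fbs += L + 1 - i     # filtering blocks in stage i
    return ebs, fbs

# Tabulate the counts for a few model sizes.
counts = {L: fwl_rls_block_counts(L) for L in range(1, 9)}
```

Summing L + 1 − i over the L stages gives L(L + 1)/2, and going from L to L + 1 inputs adds one EB and L + 1 FBs, in agreement with the figures stated in the abstract.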

Proceedings ArticleDOI
01 Sep 2017
TL;DR: An algorithm which has a complexity between $5L^2/6$ and $L^2/2$, which is in theory as fast and accurate as the other RLS ones, but employs a batch approach, waiting for $K \geq L$ consecutive samples and processing them together.
Abstract: Conventional Recursive Least Squares (RLS) filters have a complexity of $1.5L^2$ products per sample, where $L$ is the number of parameters in the least squares model. The recently published FWL RLS algorithm has a complexity of $L^2$, about 33% lower. We present an algorithm which has a complexity between $5L^2/6$ and $L^2/2$. The algorithm is in theory as fast and accurate as the other RLS ones, but employs a batch approach, waiting for $K \geq L$ consecutive samples and processing them together. When $K = L$, complexity is highest, but still lower than in the conventional and FWL RLS algorithms. When $K \gg L$, complexity converges to one third of that of conventional RLS algorithms, or one half of that of the FWL RLS one. The algorithm may have stability problems in fixed-point arithmetic because of accumulation of numerical errors, and it can only be effectively implemented in floating-point arithmetic. Some DSP processors and advanced FPGAs are capable of using floating-point arithmetic: the algorithm may thus be employed on many advanced DSP hardware platforms. We test it in a C++ implementation.
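
The ratios stated in the abstract above follow directly from the leading-order product counts, as this small sketch shows. It keeps only the quadratic terms quoted in the text and ignores lower-order terms; the dictionary keys are hypothetical labels.

```python
def products_per_sample(L):
    """Leading-order product counts per sample, as quoted in the abstract."""
    return {
        "conventional": 1.5 * L**2,
        "fwl": L**2,
        "batch_K_equals_L": 5 * L**2 / 6,   # highest batch cost (K = L)
        "batch_K_large": L**2 / 2,          # limit for K much larger than L
    }

c = products_per_sample(32)
fwl_saving = 1 - c["fwl"] / c["conventional"]            # about 33%
batch_vs_conventional = c["batch_K_large"] / c["conventional"]
batch_vs_fwl = c["batch_K_large"] / c["fwl"]
```

The ratios do not depend on the chosen L because every term scales as L squared; lower-order terms would matter only for small models.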

Journal ArticleDOI
TL;DR: This paper presents a fully differential class-AB current mirror OTA that improves the common-mode behavior of a topology that presents very good differential-mode performance but poor common-mode rejection ratio (CMRR).
Abstract: This paper presents a fully differential class-AB current mirror OTA that improves the common-mode behavior of a topology that presents very good differential-mode performance but poor common-mode rejection ratio (CMRR). The proposed solution requires a low-current auxiliary circuit driven by the input signal, to compensate the effect of the common-mode input component. Simulations in 40-nm CMOS technology show a net reduction of common-mode gain of more than 90 dB without affecting the differential-mode behavior; a sample-and-hold amplifier exploiting the proposed amplifier has also been simulated.

Proceedings ArticleDOI
01 May 2017
TL;DR: A novel modeling framework is proposed to quickly estimate the delay variability of logic paths due to random variations, and evaluate the related design margin, and shows that the popular fan-out-of-4 metric (FO4) can capture the impact of technology and voltage on the delay variations of logic paths.
Abstract: In this paper, a novel modeling framework is proposed to quickly estimate the delay variability of logic paths due to random variations, and evaluate the related design margin. The analysis shows that the popular fan-out-of-4 metric (FO4) can capture the impact of technology and voltage on the delay variations of logic paths. Once those contributions are isolated, the impact of random variations on standard cells' delay is accounted for by means of cell-specific coefficients that are evaluated in a preliminary library characterization phase. The proposed framework is very general and applicable from sub-threshold to nominal voltage, and provides the designer with a deep insight into the main delay variability contributions in a path. It also predicts the impact of design modifications (e.g., logic restructuring, cell up-sizing), and is well suited for pencil-and-paper calculations. Case studies involving three critical paths extracted from designs ranging from microprocessors to specialized hardware show adequate accuracy, with a delay variability error being typically less than 10%.