Author

Shien-Chun Luo

Other affiliations: National Cheng Kung University
Bio: Shien-Chun Luo is an academic researcher at the Industrial Technology Research Institute. The author has contributed to research on subthreshold conduction and dynamic voltage scaling. The author has an h-index of 4 and has co-authored 12 publications receiving 143 citations. Previous affiliations of Shien-Chun Luo include National Cheng Kung University.

Papers
Journal Article
TL;DR: A novel level shifter whose operating range extends from a deep subthreshold voltage to the standard supply voltage, which supports both upward and downward level conversion and is designed for practical applications.
Abstract: Wide-range level shifters play critical roles in ultra-low-voltage circuits and systems. Although state-of-the-art level shifters can convert a subthreshold voltage to the standard supply voltage, they may have limited operating ranges, which restrict the flexibility of dynamic voltage scaling. Therefore, this paper presents a novel level shifter whose operating range extends from a deep subthreshold voltage to the standard supply voltage and which supports both upward and downward level conversion. The proposed level shifter is a hybrid structure comprising a modified Wilson current mirror and generic CMOS logic gates. The simulation and measurement results were verified using a 65-nm technology. The minimal operating voltage of the proposed level shifter was less than 200 mV based on the measurement results. In addition to the operating range, the delay, power consumption, and duty cycle of the proposed level shifter were designed for practical applications.

105 citations

Journal Article
TL;DR: This brief proposes an adaptive pulse-generating method to fit the transparent window required in situ to improve the robustness of pulse-triggered flip-flops and promises this high-speed clocked element for wide-voltage-range operations.
Abstract: Pulse-triggered flip-flops are candidates to improve pipeline speed, although flip-flop robustness and system timing closure are challenging in a wide range of supply voltages. Pulse-triggered flip-flops usually have specific structures and transistor sizes to optimize performance. The topology, transistor size, and threshold voltage of the flip-flop make the timing characteristics sensitive to the supply voltage. The transparent windows generated and required in a pulse-triggered flip-flop may have mismatch under supply voltage scaling, which is likely to result in functional and system timing failures. Therefore, this brief proposes an adaptive pulse-generating method to fit the transparent window required in situ. Process variations and intrinsic transistor driving-strength mismatches are considered. The proposed structure improves the robustness of pulse-triggered flip-flops and promises this high-speed clocked element for wide-voltage-range operations. A normalized timing metric is also introduced to characterize flip-flops and help timing tradeoffs in wide-voltage-range operations.

18 citations

Journal Article
TL;DR: A novel SA-activation scheme by sensing differential bitlines locally and concurrently is proposed that effectively tolerates the WID variations and supports dynamic voltage scaling down to the subthreshold supply voltage.
Abstract: The access timing control of low-voltage static random access memory cells encounters crucial challenges in the presence of within-die (WID) variations, which induce severe delay mismatches between the timing-reference circuit and the bitlines. Prevention of early activation of sense amplifiers (SAs) is thus required to improve the yield. This brief proposes a novel SA-activation scheme by sensing differential bitlines locally and concurrently. The proposed structure effectively tolerates the WID variations and supports dynamic voltage scaling down to the subthreshold supply voltage. Measurement results show that the fabricated 8-kb test chips using 90-nm technology can be operated at the supply voltage range from 1 V (nominal Vdd) to 0.16 V. The maximum operating frequency at 0.16 V is up to 200 kHz.

17 citations

Proceedings Article
28 Mar 2013
TL;DR: This paper presents a video recording SoC fabricated in 65nm low-power technology, which integrates a complexity and bandwidth-effective H.264 encoder, an ultra-low-power (ULP) MPU, with timing-optimized ROM and 8T SRAM macros for ultra-low-voltage (ULV) operation, a 512Kb ULV and leakage-aware 8T SRAM for the frame buffer (FB), and various on-chip peripherals, such as external memory interfaces.
Abstract: This paper presents a video recording SoC fabricated in 65nm low-power technology, which integrates a complexity and bandwidth-effective H.264 encoder, an ultra-low-power (ULP) MPU, with timing-optimized ROM and 8T SRAM macros for ultra-low-voltage (ULV) operation, a 512Kb ULV and leakage-aware 8T SRAM for the frame buffer (FB), and various on-chip peripherals, such as external memory interfaces (Fig. 9.3.1). Utilizing ULV cell libraries with custom-pulsed D flip-flops (PFF) for wide-range voltage scaling, ROM/SRAM macros optimized simultaneously for timing and leakage, and advanced energy management (AEM), the SoC achieves 32fps HD720 H.264 encoding at 1.0V, down to 0.57nJ/pixel ultra-low energy dissipation at 0.48V (30fps QQVGA H.264 encoding for preview through ANT+).

14 citations
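
As a rough illustration of the energy figure quoted in the abstract above, the sketch below converts energy per pixel into average encoder power. It assumes that the 0.57 nJ/pixel figure applies to the 30 fps QQVGA preview mode and that QQVGA means 160×120 pixels; the function name `encoding_power_watts` is hypothetical, and the result is back-of-envelope arithmetic rather than a measured value from the paper.

```python
# Back-of-envelope conversion from energy per pixel to average power.
# Assumptions (not stated explicitly in the abstract): QQVGA is 160x120
# pixels, and 0.57 nJ/pixel applies to the 30 fps QQVGA preview mode.

QQVGA_WIDTH = 160
QQVGA_HEIGHT = 120


def encoding_power_watts(energy_per_pixel_j, width, height, fps):
    """Average power = energy per pixel * pixels per frame * frames per second."""
    pixels_per_second = width * height * fps
    return energy_per_pixel_j * pixels_per_second


power_w = encoding_power_watts(0.57e-9, QQVGA_WIDTH, QQVGA_HEIGHT, 30)
print(f"Implied average encoding power: {power_w * 1e3:.3f} mW")
```

Under these assumptions the implied encoding power is on the order of a third of a milliwatt.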

Proceedings Article
22 Apr 2019
TL;DR: This presentation introduces an example of a custom deep-learning accelerating system developed from the open-source NVIDIA deep-learning accelerator (DLA), and supplements an environment for developing the system, from model quantization, model compilation, and test generation to driving tools.
Abstract: Deep-learning algorithms require large and parallel multiplication and accumulation (MAC) operations, which suit hardware accelerators consisting of parallel processing elements (PEs). As the number of PEs increases, distributing data in time becomes the key problem. Improving accelerator performance requires balancing the computation power against the data communication bandwidth, which forms a roofline model of the throughput. In addition to the hardware setup, the combination of neural network operators and layers also causes the roofline curve to shift. Optimizing the performance, power, and cost of accelerators requires linking the neural network models to the physical hardware setup, which indicates that custom, model-specific designs are essential. This presentation introduces an example of a custom deep-learning accelerating system developed from the open-source NVIDIA deep-learning accelerator (DLA). We supplement an environment for developing the system, from model quantization, model compilation, and test generation to driving tools. FPGA prototypes and test chips are designed, running an object detection application and offering 70% MAC network utilization.

7 citations
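
The roofline model mentioned in the abstract above can be sketched in a few lines. This is a generic illustration of the roofline bound, not the authors' tooling: the function names and all numeric values (peak MAC rate, memory bandwidth, per-layer operational intensity) are hypothetical and chosen only to show how a layer becomes bandwidth-bound or compute-bound.

```python
# Minimal roofline sketch. All numbers are hypothetical, for illustration only.

def attainable_throughput(peak_macs_per_s, bandwidth_bytes_per_s, macs_per_byte):
    """Roofline bound: throughput is capped by the compute roof or by
    memory bandwidth times operational intensity, whichever is lower."""
    return min(peak_macs_per_s, bandwidth_bytes_per_s * macs_per_byte)


def ridge_point(peak_macs_per_s, bandwidth_bytes_per_s):
    """Operational intensity (MAC/byte) at which a layer stops being
    bandwidth-bound and becomes compute-bound."""
    return peak_macs_per_s / bandwidth_bytes_per_s


PEAK = 1e12       # hypothetical peak compute rate, MAC/s
BANDWIDTH = 1e10  # hypothetical memory bandwidth, bytes/s

# Hypothetical layers with different operational intensities (MAC/byte).
layers = {"low_reuse_layer": 10.0, "high_reuse_layer": 500.0}

for name, intensity in layers.items():
    throughput = attainable_throughput(PEAK, BANDWIDTH, intensity)
    utilization = throughput / PEAK
    bound = "bandwidth" if intensity < ridge_point(PEAK, BANDWIDTH) else "compute"
    print(f"{name}: {throughput:.2e} MAC/s, {utilization:.0%} of peak, {bound}-bound")
```

Changing the mix of layers changes the intensities, which is one way to read the abstract's remark that the combination of operators and layers shifts where a network sits on the roofline curve.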


Cited by
Journal Article
01 Dec 2010
TL;DR: An asymmetrical write-assist cell virtual ground biasing scheme and positive feedback sensing keeper schemes are proposed to improve the read static noise margin (RSNM), write margin (WM), and operation speed of a single-ended read/write 8T SRAM cell.
Abstract: In this paper, an asymmetrical write-assist cell virtual ground biasing scheme and positive feedback sensing keeper schemes are proposed to improve the read static noise margin (RSNM), write margin (WM), and operation speed of a single-ended read/write 8T SRAM cell. A 4 Kbit SRAM test chip is implemented in 90 nm CMOS technology. The test chip measurement results show that at 0.2 V VDD, an operation frequency of 6.0 MHz can be achieved with power consumption of 10.4 μW.

113 citations

Journal Article
TL;DR: A novel level shifter whose operating range extends from a deep subthreshold voltage to the standard supply voltage, which supports both upward and downward level conversion and is designed for practical applications.
Abstract: Wide-range level shifters play critical roles in ultra-low-voltage circuits and systems. Although state-of-the-art level shifters can convert a subthreshold voltage to the standard supply voltage, they may have limited operating ranges, which restrict the flexibility of dynamic voltage scaling. Therefore, this paper presents a novel level shifter whose operating range extends from a deep subthreshold voltage to the standard supply voltage and which supports both upward and downward level conversion. The proposed level shifter is a hybrid structure comprising a modified Wilson current mirror and generic CMOS logic gates. The simulation and measurement results were verified using a 65-nm technology. The minimal operating voltage of the proposed level shifter was less than 200 mV based on the measurement results. In addition to the operating range, the delay, power consumption, and duty cycle of the proposed level shifter were designed for practical applications.

105 citations

Journal Article
TL;DR: A novel ultra-low voltage level shifter for fast and energy-efficient wide-range voltage conversion from sub-threshold to I/O voltage with good delay scalability with supply voltage scaling and low sensitivity to process and temperature variations is presented.
Abstract: This paper presents a novel ultra-low voltage level shifter for fast and energy-efficient wide-range voltage conversion from sub-threshold to I/O voltage. By addressing the voltage drop and non-optimal feedback control in a state-of-the-art level shifter based on the Wilson current mirror, the proposed level shifter with a revised Wilson current mirror significantly improves the delay and power consumption while achieving a wide voltage conversion range. It also employs mixed-Vt devices and device sizing that accounts for the inverse narrow width effect to further improve the delay and power consumption. Measurement results at 0.18 μm show that, compared with the Wilson current mirror based level shifter, the proposed level shifter improves the delay, switching energy, and leakage power by up to 3×, 19×, and 29×, respectively, when converting 0.3 V to a voltage between 0.6 V and 3.3 V. More specifically, it achieves 1.03 (or 1.15) FO4 delay, 39 (or 954) fJ/transition, and 160 (or 970) pW leakage power when converting 0.3 V to 1.8 V (or 3.3 V), which is better than several state-of-the-art level shifters for similar range voltage conversion. The measurement results also show that the proposed level shifter has good delay scalability with supply voltage scaling and low sensitivity to process and temperature variations.

82 citations

Journal Article
TL;DR: The reliable operation at the energy-minimum voltage of the various SCM architectures in a 65-nm CMOS technology considering within-die process parameter variations is demonstrated by means of Monte Carlo circuit simulation and the area of the best SCM architecture is compared to recent sub-VT SRAM designs.
Abstract: In this paper, standard-cell based memories (SCMs) are proposed as an alternative to full-custom sub-VT SRAM macros for ultra-low-power systems requiring small memory blocks. The energy per memory access as well as the maximum achievable throughput in the sub-VT domain of various SCM architectures are evaluated by means of a gate-level sub-VT characterization model, building on data extracted from fully placed, routed, and back-annotated netlists. The reliable operation at the energy-minimum voltage of the various SCM architectures in a 65-nm CMOS technology considering within-die process parameter variations is demonstrated by means of Monte Carlo circuit simulation. Finally, the energy per memory access, the achievable throughput, and the area of the best SCM architecture are compared to recent sub-VT SRAM designs.

80 citations

Journal Article
TL;DR: An energy-efficient level shifter able to convert extremely low level input voltages to the nominal voltage domain based on the single-stage differential-cascode-voltage-switch scheme that exploits self-adapting pull-up networks to increase the switching speed and to reduce the dynamic energy consumption.
Abstract: This brief presents an energy-efficient level shifter (LS) able to convert extremely low level input voltages to the nominal voltage domain. To obtain low static power consumption, the proposed architecture is based on the single-stage differential-cascode-voltage-switch scheme. Moreover, it exploits self-adapting pull-up networks to increase the switching speed and to reduce the dynamic energy consumption, while a split input inverting buffer is used as the output stage to further improve energy efficiency. When implemented in a commercial 180-nm CMOS process, the proposed design can up-convert from the deep subthreshold regime (sub-100 mV) to the nominal supply voltage (1.8 V). For the target voltage level conversion from 0.4 to 1.8 V, our LS exhibits an average propagation delay of 31.7 ns, an average static power of less than 60 pW, and an energy per transition of 173 fJ, as experimentally measured across the test chips.

73 citations