Showing papers by "Xilinx" published in 2013


Journal ArticleDOI
TL;DR: The root cause of performance bottlenecks in current full-duplex systems is investigated and signal models for wideband and multiple-input-multiple-output (MIMO) full-duplex systems are proposed, capturing all the salient design parameters, thus allowing future analytical development of advanced coding and signal design for full-duplex systems.
Abstract: Recent experimental results have shown that full-duplex communication is possible for short-range communications. However, extending full-duplex to long-range communication remains a challenge, primarily due to residual self-interference, even with a combination of passive suppression and active cancelation methods. In this paper, we investigate the root cause of performance bottlenecks in current full-duplex systems. We first classify all known full-duplex architectures based on how they compute their canceling signal and where the canceling signal is injected to cancel self-interference. Based on the classification, we analytically explain several published experimental results. The key bottleneck in current systems turns out to be the phase noise in the local oscillators in the transmit-and-receive chain of the full-duplex node. As a key by-product of our analysis, we propose signal models for wideband and multiple-input-multiple-output (MIMO) full-duplex systems, capturing all the salient design parameters, thus allowing future analytical development of advanced coding and signal design for full-duplex systems.

251 citations


Proceedings ArticleDOI
19 May 2013
TL;DR: This paper proposes a novel VLSI architecture to efficiently compute the approximate inverse using a systolic array and shows reference FPGA implementation results for various system configurations.
Abstract: The high processing complexity of data detection in the large-scale multiple-input multiple-output (MIMO) uplink necessitates high-throughput VLSI implementations. In this paper, we propose, to the best of our knowledge, the first matrix inversion implementation suitable for data detection in systems having hundreds of antennas at the base station (BS). The underlying idea is to carry out an approximate matrix inversion using a small number of Neumann-series terms, which allows one to achieve near-optimal performance at low complexity. We propose a novel VLSI architecture to efficiently compute the approximate inverse using a systolic array and show reference FPGA implementation results for various system configurations. For a system where 128 BS antennas receive data from 8 single-antenna users, a single instance of our design processes 19M matrices/s on a Xilinx Virtex-7 FPGA, while using only 39% of the available slices and 36% of the available DSP48 units.

158 citations
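
The Neumann-series idea in the abstract above can be illustrated with a short sketch. It assumes the common splitting A = D + E (diagonal plus off-diagonal part) and sums the terms (-D^-1 E)^n D^-1; the abstract does not spell out the splitting, so this is an assumption, and the 3x3 real matrix `A` is a made-up stand-in for a diagonally dominant Gram matrix. It is plain software, not the systolic-array architecture.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def neumann_inverse(a, terms):
    """Approximate inv(a) with `terms` Neumann-series terms.

    Assumed splitting: a = D + E (D diagonal, E off-diagonal), so that
    inv(a) is approximated by the sum of (-D^-1 E)^n D^-1, n = 0..terms-1.
    """
    n = len(a)
    d_inv = [[(1.0 / a[i][i]) if i == j else 0.0 for j in range(n)]
             for i in range(n)]
    e = [[0.0 if i == j else a[i][j] for j in range(n)] for i in range(n)]
    # theta = -D^-1 E is the matrix whose powers form the series.
    theta = [[-x for x in row] for row in mat_mul(d_inv, e)]
    term = d_inv                          # n = 0 term
    total = [row[:] for row in d_inv]
    for _ in range(1, terms):
        term = mat_mul(theta, term)       # theta^n D^-1
        total = [[total[i][j] + term[i][j] for j in range(n)]
                 for i in range(n)]
    return total


def max_identity_error(a, a_inv):
    """Largest absolute entry of (a * a_inv - I)."""
    n = len(a)
    prod = mat_mul(a, a_inv)
    return max(abs(prod[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))


# Hypothetical diagonally dominant matrix standing in for the Gram matrix.
A = [[10.0, 1.0, 0.5],
     [1.0, 12.0, 1.5],
     [0.5, 1.5, 9.0]]

for k in (1, 2, 3, 4):
    print(k, max_identity_error(A, neumann_inverse(A, k)))
```

The printed error shrinks as terms are added because the off-diagonal part is small relative to the diagonal, which is the regime the abstract targets (many more BS antennas than users).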


Proceedings Article
Michaela Blott1, Kimon Karras1, Ling Liu1, Kees Vissers1, Jeremia Bar2, Zsolt István2 
01 Jan 2013
TL;DR: The design of a novel memcached architecture implemented on Field Programmable Gate Arrays (FPGAs) which is the first in literature to achieve 10Gbps line rate processing for all packet sizes and can not only provide significant speed-up but also operate at a lower power consumption than any x86.
Abstract: Distributed in-memory key-value stores such as memcached have become a critical middleware application within current web infrastructure. However, typical x86-based systems yield limited performance scalability and high power consumption, as their architecture, with its optimization for single-thread performance, is not well matched to the memory-intensive and parallel nature of this application. In this paper we present the design of a novel memcached architecture implemented on Field Programmable Gate Arrays (FPGAs) which is the first in literature to achieve 10Gbps line rate processing for all packet sizes. By transformation of the functionality into a dataflow architecture, the implementation can not only provide significant speed-up but also operate at a lower power consumption than any x86. More specifically, with our prototype we have measured an increase of up to a factor of 36x in requests per second per Watt that can be serviced in comparison to the best published numbers for regular servers with optimized software. Additionally, we show that through the tight integration of network interface, memory and compute, round trip latency can be reduced down to below 45 microseconds.

97 citations


Proceedings ArticleDOI
01 Dec 2013
TL;DR: Analysis and simulation of the iterative HDD of tightly-braided block codes with BCH component codes for high-speed optical communication shows that these codes are competitive with the best schemes based on HDD.
Abstract: Designing error-correcting codes for optical communication is challenging mainly because of the high data rates (e.g., 100 Gbps) required and the expectation of low latency, low overhead (e.g., 7% redundancy), and large coding gain (e.g., >9dB). Although soft-decision decoding (SDD) of low-density parity-check (LDPC) codes is an active area of research, the mainstay of optical transport systems is still the iterative hard-decision decoding (HDD) of generalized product codes with algebraic syndrome decoding of the component codes. This is because iterative HDD allows many simplifications and SDD of LDPC codes results in much higher implementation complexity. In this paper, we use analysis and simulation to evaluate tightly-braided block codes with BCH component codes for high-speed optical communication. Simulation of the iterative HDD shows that these codes are competitive with the best schemes based on HDD. Finally, we suggest a specific design that is compatible with the G.709 framing structure and exhibits a coding gain of >9.35 dB at 7% redundancy under iterative HDD with a latency of approximately 1 million bits.

86 citations


Proceedings ArticleDOI
Weirong Jiang1
21 Oct 2013
TL;DR: This paper presents a scalable random access memory (RAM)-based TCAM architecture aiming for efficient implementation on state-of-the-art FPGAs, and is the first FPGA design that implements a TCAM larger than 1 Mbits.
Abstract: Ternary Content Addressable Memory (TCAM) is widely used in network infrastructure for various search functions. There has been a growing interest in implementing TCAM using reconfigurable hardware such as Field Programmable Gate Array (FPGA). Most of existing FPGA-based TCAM designs are based on brute-force implementations, which result in inefficient on-chip resource usage. As a result, existing designs support only a small TCAM size even with large FPGA devices. They also suffer from significant throughput degradation in implementing a large TCAM, mainly caused by deep priority encoding. This paper presents a scalable random access memory (RAM)-based TCAM architecture aiming for efficient implementation on state-of-the-art FPGAs. We give a formal study on RAM-based TCAM to unveil the ideas and the algorithms behind it. To conquer the timing challenge, we propose a modular architecture consisting of arrays of small-size RAM-based TCAM units. After decoupling the update logic from each unit, the modular architecture allows us to share each update engine among multiple units. This leads to resource saving. The capability of explicit range matching is also offered to avoid range-to-ternary conversion for search functions that require range matching. Implementation on a Xilinx Virtex 7 FPGA shows that our design can support a large TCAM of up to 2.4 Mbits while sustaining high throughput of 150 million packets per second. The resource usage scales linearly with the TCAM size. The architecture is configurable, allowing various performance trade-offs to be exploited. To the best of our knowledge, this is the first FPGA design that implements a TCAM larger than 1 Mbits.

76 citations
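
As a rough illustration of what "RAM-based TCAM" can mean, the sketch below uses a generic lookup-table scheme: the key is split into chunks, each chunk addresses a small RAM that returns a bit-vector of matching rules, the vectors are ANDed, and a priority encoder picks the lowest index. The class name `RamTcam` and its parameters are hypothetical, and the paper's modular units, shared update engines, and range matching are not modelled.

```python
class RamTcam:
    """Emulate a ternary CAM with per-chunk RAM lookup tables."""

    def __init__(self, key_bits, chunk_bits, capacity):
        assert key_bits % chunk_bits == 0
        self.chunk_bits = chunk_bits
        self.num_chunks = key_bits // chunk_bits
        self.capacity = capacity
        # rams[c][v] is an int used as a bit-vector: bit r is set when
        # rule r matches value v in chunk c.
        self.rams = [[0] * (1 << chunk_bits) for _ in range(self.num_chunks)]

    def write(self, rule_index, pattern):
        """Store a ternary pattern such as '10xx01x0' (MSB first)."""
        assert 0 <= rule_index < self.capacity
        assert len(pattern) == self.num_chunks * self.chunk_bits
        for c in range(self.num_chunks):
            chunk = pattern[c * self.chunk_bits:(c + 1) * self.chunk_bits]
            for v in range(1 << self.chunk_bits):
                bits = format(v, "0{}b".format(self.chunk_bits))
                hit = all(p in ("x", b) for p, b in zip(chunk, bits))
                if hit:
                    self.rams[c][v] |= 1 << rule_index
                else:
                    self.rams[c][v] &= ~(1 << rule_index)

    def search(self, key):
        """Return the lowest matching rule index, or None."""
        total_bits = self.num_chunks * self.chunk_bits
        match = (1 << self.capacity) - 1
        for c in range(self.num_chunks):
            shift = total_bits - (c + 1) * self.chunk_bits
            v = (key >> shift) & ((1 << self.chunk_bits) - 1)
            match &= self.rams[c][v]      # one RAM read per chunk
        if match == 0:
            return None
        # Priority encoding: index of the least-significant set bit.
        return (match & -match).bit_length() - 1


tcam = RamTcam(key_bits=8, chunk_bits=4, capacity=4)
tcam.write(0, "1010xxxx")
tcam.write(1, "10xxxxxx")
print(tcam.search(0b10101111), tcam.search(0b10011111), tcam.search(0))
```

In this scheme the RAM cost is (key_bits / chunk_bits) x 2^chunk_bits x capacity bits, so the chunk width trades memory against the number of parallel lookups.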


Proceedings ArticleDOI
26 May 2013
TL;DR: This paper proposes a Cholesky-based reference architecture for exact matrix inversion and shows corresponding implementation results on a Virtex-7 FPGA, which reveals that the inversion circuit of choice is determined by the antenna configuration of large-scale MIMO systems.
Abstract: In this paper, we analyze the VLSI implementation tradeoffs for linear data detection in the uplink of large-scale multiple-input multiple-output (MIMO) wireless systems. Specifically, we analyze the error incurred by using the sub-optimal, low-complexity matrix inverse proposed in Wu et al., 2013, ISCAS, and compare its performance and complexity to an exact matrix inversion algorithm. We propose a Cholesky-based reference architecture for exact matrix inversion and show corresponding implementation results on a Virtex-7 FPGA. Using this reference design, we perform a performance/complexity trade-off comparison with an FPGA implementation for the proposed approximate matrix inversion, which reveals that the inversion circuit of choice is determined by the antenna configuration (base-station antennas vs. number of users) of large-scale MIMO systems.

75 citations


Proceedings ArticleDOI
24 Oct 2013
TL;DR: This paper presents the design of a novel hash table which forms the centre piece of this dataflow architecture which can sustain consistent 10Gbps line-rate performance by deploying a concurrent mechanism to handle hash collisions and addresses problems such as support for a broad range of key sizes without stalling the pipeline.
Abstract: Common web infrastructure relies on distributed main memory key-value stores to reduce access load on databases, thereby improving both performance and scalability of web sites. As standard cloud servers provide sub-linear scalability and reduced power efficiency to these kinds of scale-out workloads, we have investigated a novel dataflow architecture for key-value stores with the aid of FPGAs which can deliver consistent 10Gbps throughput. In this paper, we present the design of a novel hash table which forms the centre piece of this dataflow architecture. The fully pipelined design can sustain consistent 10Gbps line-rate performance by deploying a concurrent mechanism to handle hash collisions. We address problems such as support for a broad range of key sizes without stalling the pipeline through careful matching of lookup time with packet reception time. Finally, the design is based on a scalable architecture that can be easily parametrized to work with different memory types operating at different access speeds and latencies. We deployed this hash table in a memcached prototype to index 2 million entries in 24GBytes of external DDR3 DRAM while sustaining 13 million requests per second for UDP binary encoded memcached packets which is the maximum packet rate that can be achieved with memcached on a 10Gbps link.

59 citations


Proceedings ArticleDOI
28 May 2013
TL;DR: In this paper, a 3D finite element method (FEM) was used to study the thermo-mechanical response of the interposer-based package during thermal cycle reliability stressing.
Abstract: TSV (Through Silicon Via)-based interposer has been proposed as a multi-die package solution to meet the rapidly increasing demand in inter-component (e.g. CPU, GPU and DRAM) communication bandwidth in an electronic system. The stacked-silicon die package configuration may give rise to package reliability concerns not observed in conventional monolithic flip-chip packages. 3D finite element method (FEM) was used to study the thermo-mechanical response of the interposer-based package during thermal cycle reliability stressing. Fatigue failures of the C4 and BGA joints are the two primary reliability focuses in the present study. Experimental data collected on the CoWoS™-enabled test vehicles were used to validate the FEM models. Parametric study of key package material and geometric parameters was performed to analyze their effects on C4 bump thermal cycle reliability. Package materials of interest include UF (underfill), lid and substrate, and the geometric parameters include lid thickness and C4 bump scheme. The results showed that the CoWoS package using AlSiC lid has better C4 bump life than the CoWoS package using Cu lid, and when the Tg of the underfill of C4 bump is higher, the C4 bump has better reliability. Furthermore, 3D thermo-mechanical and reliability study of BGA balls is presented for organic and ceramic substrates. Several DOEs have been constructed for ceramic substrate to increase BGA reliability by optimizing C4 underfill material and package design. The effect of board layer count and design is detailed. Finally, the reliability of BGA balls, C4 bumps and micro-bumps is compared for a part that is mounted on a PCB.

51 citations


Patent
23 Feb 2013
TL;DR: In this article, a system generally relating to an SoC, which may be a field programmable SoC (FPSoC), is disclosed, which includes a processing unit, a first internal memory, an authentication engine, and a decryption engine.
Abstract: A system generally relating to an SoC, which may be a field programmable SoC (“FPSoC”), is disclosed. In this SoC, dedicated hardware includes a processing unit, a first internal memory, a second internal memory, an authentication engine, and a decryption engine. A storage device is coupled to the SoC. The storage device has access to a boot image. The first internal memory has boot code stored therein. The boot code is for a secure boot of the SoC. The boot code is configured to cause the processing unit to control the secure boot.

47 citations


Proceedings ArticleDOI
Michaela Blott1, Kees Vissers1
01 Aug 2013
TL;DR: A collection of slides covering the following topics: key-value stores; TCP-IP stack; synchronization overhead; L3 cache; FPGA-based dataflow architecture; hash table architecture; memcached evaluation; and code complexity.
Abstract: Presents a collection of slides covering the following topics: key-value stores; TCP-IP stack; synchronization overhead; L3 cache; FPGA-based dataflow architecture; hash table architecture; memcached evaluation; and code complexity.

41 citations


Patent
15 Mar 2013
TL;DR: In this paper, a method relating generally to loading a boot image is disclosed, in which a header of a boot file is read by boot code executed by a system-on-chip.
Abstract: A method relating generally to loading a boot image is disclosed. In such a method, a header of a boot image file is read by boot code executed by a system-on-chip. It is determined whether the header read has an authentication certificate. If the header has the authentication certificate, authenticity of the header is verified with the first authentication certificate. It is determined whether the header is encrypted. If the header is encrypted, the header is decrypted.

Journal ArticleDOI
01 Jan 2013
TL;DR: In this article, a 28nm FPGA with stacked silicon interconnect technology (SSIT) platform is used to develop and optimize the seamless integration of the processes, structures, and parameters, as well as to evaluate their yield, reliability and device performance.
Abstract: Technology challenges and solutions in the development and manufacturing of Stacked Silicon Interconnect (SSI) Technology have been investigated with the established foundry and OSAT ecosystem. Key enabling technologies, such as TSV processing, interposer backside manufacturing yield enhancement, new stacking technology, interposer warpage control, micro-bump (μ-bump) processes and joining, that comprise the building blocks for SSI technology were developed. Xilinx 28nm FPGA with stacked silicon interconnect technology (SSIT) platform is used to develop and optimize the seamless integration of the processes, structures, and parameters, as well as to evaluate their yield, reliability and device performance.

Patent
06 Sep 2013
TL;DR: In this article, a method of generating a digital image is described, which comprises detecting light from a scene to form an image, identifying an aberration in the image, and implementing a color filter array interpolator based upon the detected aberration.
Abstract: A method of generating a digital image is described. The method comprises detecting light from a scene to form an image; identifying an aberration in the image; and implementing a color filter array interpolator based upon the detected aberration in the image. A device for generating a digital image is also described.

Patent
Samskrut J. Konduru1
13 Mar 2013
TL;DR: In this paper, a power off state for a selected power domain of a circuit design is emulated by partially reconfiguring the reconfigurable region of the integrated circuit in which the corresponding partial reconfiguration partition is implemented.
Abstract: Testing power domains of a circuit design includes correlating, using a processor, a selected power domain of a circuit design having a plurality of power domains with a partial reconfiguration partition and implementing the circuit design within an integrated circuit. The partial reconfiguration partition is implemented within a reconfigurable region of the integrated circuit. A power off state for the selected power domain of the circuit design is emulated by partially reconfiguring the reconfigurable region of the integrated circuit.

Journal ArticleDOI
Roy D. Cideciyan1, Mark A. Gustlin2, Mike Peng Li, John Wang3, Zhongfeng Wang3 
TL;DR: The background of the market drivers for this technology, the various technologies that are used to solve the challenging problems of running across the various mediums at a data rate of 25 Gb/s and the details of the forward error correction, transcoding and physical layer coding that are employed to achieve robust links.
Abstract: This article provides an overview of some of the work that is ongoing in the IEEE P802.3bj 100 Gb/s Backplane and Copper Cable Task Force. The task force is standardizing Ethernet at 100 Gb/s over a 4-lane backplane channel as well as across a 4-lane copper cable. We first describe the background of the market drivers for this technology, and then give an overview of the various technologies that are used to solve the challenging problems of running across the various mediums at a data rate of 25 Gb/s. Also discussed are the details of the forward error correction, transcoding and physical layer coding that are employed to achieve robust links.
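
The relation between the 100 Gb/s MAC rate and the per-lane rate can be checked with a little exact arithmetic. The coding parameters below are not stated in the abstract: they are the commonly cited values for the 4-lane NRZ backplane and copper cable variants (64b/66b coding, 256b/257b transcoding, RS(528,514) FEC) and are assumptions to be verified against the standard itself.

```python
from fractions import Fraction

mac_rate = Fraction(100)          # Gb/s, from the abstract
lanes = 4                         # from the abstract

# Assumed coding parameters (not given in the abstract).
pcs_64b66b = Fraction(66, 64)     # 64b/66b line coding overhead
transcode = Fraction(257, 256)    # 256b/257b transcoding overhead
rs_fec = Fraction(528, 514)       # RS(528,514) parity overhead

# Per-lane rate with plain 64b/66b coding and no FEC.
without_fec = mac_rate * pcs_64b66b / lanes

# Per-lane rate when the 66b blocks are transcoded to 257b and FEC is added.
with_fec = mac_rate * transcode * rs_fec / lanes

print(float(without_fec), float(with_fec))
```

Under these assumptions the two products are identical, which is the point of transcoding: the bits saved by the denser 256b/257b format pay for the FEC parity, so the lane rate stays in the "25 Gb/s" class mentioned in the abstract.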

Patent
Jun-Chau Chien1, Wayne Fang1, Parag Upadhyaya1, Jafar Savoj1, Kun-Yung Chang1 
14 Mar 2013
TL;DR: In this paper, a delay-locked loop is coupled to an injection-locked phase-locked loop in an ICL-PLL, and an injection-locked oscillator of that phase-locked loop is in a feedback loop path of the delay-locked loop.
Abstract: An apparatus relating generally to an injection-controlled-locked phase-locked loop (“ICL-PLL”) is disclosed. In this apparatus, a delay-locked loop is coupled to an injection-locked phase-locked loop. An injection-locked oscillator of the injection-locked phase-locked loop is in a feedback loop path of the delay-locked loop.

Proceedings ArticleDOI
11 Feb 2013
TL;DR: This tutorial will present a detailed introduction to Vivado HLS, which is capable of synthesizing optimized FPGA circuits from algorithmic descriptions in C, C++ and SystemC and show how interesting system architectures can be constructed using High Level Synthesis and the programmable logic portion of these devices.
Abstract: Engineering complex systems inevitably requires a designer to balance many conflicting design requirements including performance, cost, power, and design time. In many cases, FPGAs enable engineers to balance these design requirements in ways not possible with other technologies like ASICs, ASSPs, GPUs or general purpose processors. This tutorial will focus on two of the newest commercial FPGA-related technologies, High Level Synthesis (HLS) and Programmable Logic integrated tightly with high performance embedded processors. In particular, we will present a detailed introduction to Vivado HLS, which is capable of synthesizing optimized FPGA circuits from algorithmic descriptions in C, C++ and SystemC. We will also present an introduction to the architecture of Zynq devices and show how interesting system architectures can be constructed using High Level Synthesis and the programmable logic portion of these devices.

Patent
Yatharth K. Kochar1
15 Mar 2013
TL;DR: In this article, a boot of a system-on-chip is initiated from boot code stored in nonvolatile memory responsive to a power-on-reset, and a first register value and a name string are converted to a first string value, which is provided as a first filename.
Abstract: A method includes initiating a boot of a system-on-chip coupled to a boot device. The boot is initiated from boot code stored in nonvolatile memory responsive to a power-on-reset. Under control of the boot code: a first register value is loaded into a register; a name string from the boot code is accessed; the first register value is obtained from the register; and the first register value and name string are converted to a first string value, which is provided as a first filename. The boot device is searched for a boot image file with the first filename. If the first filename is not found in the boot device, the first register value is incremented to provide a second register value. The obtaining, converting, and searching are repeated using a second filename generated using the second register value, and a valid filename for the boot image file is iteratively generated.
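
The search loop described in the abstract above might look roughly like the following sketch. The filename format in `make_filename`, the `max_value` bound, and the use of a set of names to model the boot device are all hypothetical; the abstract only says that the register value and a name string are converted to a filename and that the register is incremented on a miss.

```python
def make_filename(name_string, register_value):
    """Hypothetical format: name string, zero-padded counter, '.bin'."""
    return "{}{:04d}.bin".format(name_string, register_value)


def find_boot_image(files, name_string, start_value, max_value):
    """Return (filename, register_value) of the first image found, else None.

    `files` models the boot device directory as a set of filenames.
    """
    register = start_value
    while register <= max_value:
        filename = make_filename(name_string, register)
        if filename in files:
            return filename, register
        register += 1                 # try the next candidate filename
    return None


device = {"boot0003.bin", "boot0007.bin"}
print(find_boot_image(device, "boot", 0, 9))
```

Returning the register value alongside the filename lets a caller resume the search from the next value if the image that was found later proves unusable.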

Proceedings ArticleDOI
16 Jun 2013
TL;DR: In this paper, the authors proposed a relay-based integrated circuit with Ru contact material, which is a promising alternative contacting electrode material because it forms an electrically conductive native oxide.
Abstract: Micro-electro-mechanical relays are of interest for low-power digital integrated circuits, due to their ideally zero OFF-state leakage current. The first relay-based integrated circuits utilized tungsten (W) as the contacting material because of its excellent resistance to physical wear. However, W oxidizes easily so that the contact resistance increases to an unacceptably high level over the device operating lifetime. Ruthenium (Ru) is a promising alternative contacting electrode material because it forms an electrically conductive native oxide. In this work, challenges and solutions for integrating Ru into a relay fabrication process are described. Ru-contact relays with good switching behavior and more stable ON-state resistance than W-contact relays are demonstrated.

Journal ArticleDOI
TL;DR: A head-body finite automaton (HBFA) is proposed, which implements SPM in two parts: a head DFA (H-DFA) and a body NFA (B-NFA), which achieves 3x to 8x throughput when matching real-life large dictionaries against inputs with high match ratios.
Abstract: Conventionally, dictionary-based string pattern matching (SPM) has been implemented as Aho-Corasick deterministic finite automaton (AC-DFA). Due to its large memory footprint, a large-dictionary AC-DFA can experience poor cache performance when matching against inputs with high match ratio on multicore processors. We propose a head-body finite automaton (HBFA), which implements SPM in two parts: a head DFA (H-DFA) and a body NFA (B-NFA). The H-DFA matches the dictionary up to a predefined prefix length in the same way as AC-DFA, but with a much smaller memory footprint. The B-NFA extends the matching to full dictionary lengths in a compact variable-stride branch data structure, accelerated by single-instruction multiple-data (SIMD) operations. A branch grafting mechanism is proposed to opportunistically advance the state of the H-DFA with the matching progress in the B-NFA. Compared with a fully populated AC-DFA, our HBFA prototype has <1/5 construction time, requires <1/20 runtime memory, and achieves 3x to 8x throughput when matching real-life large dictionaries against inputs with high match ratios. The throughput scales up 27x to over 34 Gbps on a 32-core Intel Manycore Testing Lab machine based on the Intel Xeon X7560 processors.

Patent
Stephen M. Trimberger1
22 Apr 2013
TL;DR: In this article, a method, non-transitory computer readable medium, and apparatus for performing physically unclonable function (PUF) burn-in are disclosed, and the method identifies, by a processor, a natural output of an integrated circuit before the integrated circuit is initialized.
Abstract: A method, non-transitory computer readable medium, and apparatus for performing physically unclonable function (PUF) burn-in are disclosed. For example, the method identifies, by a processor, a natural output of an integrated circuit before the integrated circuit is initialized, identifies, by the processor, a physical characteristic of the integrated circuit associated with the physically unclonable function, and ages, by the processor, the physical characteristic of the integrated circuit to burn-in the natural output of the integrated circuit.

Patent
11 Dec 2013
TL;DR: In this article, a boot loader circuit is configured to search a second boot device of the plurality of boot devices for an uncorrupt boot image, in response to failing to find an uncorrupt boot image in the first boot device.
Abstract: In some disclosed implementations, a system-on-chip on a first IC die includes a boot loader circuit configured to search a first boot device, of a plurality of boot devices coupled to and external to the first IC die, for an uncorrupt boot image. The boot loader circuit is configured to search a second boot device of the plurality of boot devices for an uncorrupt boot image, in response to failing to find an uncorrupt boot image in the first boot device. The boot loader is also configured to load a set of instructions included in the uncorrupt boot image into a memory circuit of the SOC, in response to finding an uncorrupt boot image.
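
The fallback behaviour in the abstract above can be sketched in software as follows. The integrity check (a CRC-32 appended to the payload), the device names, and the helper names are assumptions for illustration; the abstract does not say how an image is judged uncorrupt, and the disclosed boot loader is a circuit, not Python.

```python
import zlib


def with_crc(payload):
    """Build a test image: payload followed by its CRC-32 (assumed format)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def image_is_uncorrupt(image):
    """Hypothetical check: the last 4 bytes hold a CRC-32 of the payload."""
    if len(image) < 4:
        return False
    payload, stored = image[:-4], image[-4:]
    return zlib.crc32(payload) == int.from_bytes(stored, "big")


def load_boot_image(boot_devices):
    """Search devices in order; return (device_name, payload) or None.

    `boot_devices` is a list of (name, [image_bytes, ...]) pairs.
    """
    for name, images in boot_devices:
        for image in images:
            if image_is_uncorrupt(image):
                return name, image[:-4]   # instructions to load into memory
    return None


good = with_crc(b"instructions")
bad = bytes([good[0] ^ 1]) + good[1:]      # flip one bit to corrupt it
print(load_boot_image([("first", [bad]), ("second", [good])]))
```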

Patent
Brendan Farley1, James Hudner1, Ivan Bogue1, Declan Carey1, Darragh Walsh1, Marc Erett1 
27 Jun 2013
TL;DR: In this paper, an analog-to-digital converter (ADC) with a bank of comparators and a window controller is described, where the window controller can selectively activate first comparators associated with a window size.
Abstract: An analog-to-digital converter ("ADC") (300) is disclosed. The ADC includes a bank (310) of comparators (312) and a window controller (315). The window controller is coupled to the bank of comparators to selectively activate first comparators of the bank of comparators associated with a window size and to selectively inactivate second comparators of the bank of comparators.

Proceedings ArticleDOI
09 Mar 2013
TL;DR: In this article, the authors investigated various failure modes for logic relays and showed that structural fatigue, dielectric charging, and contact stiction are not reliability-limiting issues.
Abstract: Micro-electro-mechanical (MEM) relays are an intriguing alternative to transistors for ultra-low-power digital logic applications [1]. This paper investigates various failure modes for logic relays. Experimental results are presented to show that structural fatigue, dielectric charging, and contact stiction are not reliability-limiting issues. Contact resistance instability caused by surface oxidation and contamination is the primary challenge, and can be influenced by device design and operating conditions.

Patent
05 Sep 2013
TL;DR: In this paper, a method relating generally to generating a boot image, as performed by an information handling system, for an embedded device is disclosed, which includes a public key obtained by a boot-image generator.
Abstract: A method relating generally to generating a boot image, as performed by an information handling system, for an embedded device is disclosed. This method includes a public key obtained by a boot image generator. A first hash for the public key is generated by the boot image generator. The first hash is provided to a signature generator. A first signature for the first hash is generated by the signature generator. A first partition for the boot image is obtained by the boot image generator. A second hash for the first partition is generated by the boot image generator. The second hash is provided to the signature generator. A second signature for the second hash is generated by the signature generator. The boot image generator and the signature generator are programmed into the information handling system. The boot image includes the public key, the first signature, and the second signature. The boot image is output from the information handling system.
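
The hash-then-sign flow in the abstract above can be walked through with the standard library. HMAC-SHA256 is used here purely as a stand-in for the signature generator (it is a keyed MAC, not a public-key signature), and SHA-256, the shared `signing_key`, and the dictionary layout of the image are assumptions that the abstract does not specify.

```python
import hashlib
import hmac


def sign(digest, signing_key):
    """Stand-in signature generator: HMAC-SHA256 over a digest.

    A real flow would presumably use a public-key signature scheme.
    """
    return hmac.new(signing_key, digest, hashlib.sha256).digest()


def build_boot_image(public_key, partition, signing_key):
    """Follow the abstract's steps: hash and sign the key, then the partition."""
    first_hash = hashlib.sha256(public_key).digest()
    first_signature = sign(first_hash, signing_key)
    second_hash = hashlib.sha256(partition).digest()
    second_signature = sign(second_hash, signing_key)
    # The abstract lists the key and both signatures as image contents;
    # carrying the partition itself in the image is an assumption.
    return {
        "public_key": public_key,
        "first_signature": first_signature,
        "second_signature": second_signature,
        "partition": partition,
    }


image = build_boot_image(b"example-public-key", b"partition-bytes", b"secret")
print(sorted(image), len(image["first_signature"]))
```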

Journal ArticleDOI
TL;DR: Integration techniques enable the utilization of the transceiver in FPGAs with both wire-bond and flip-chip packages and resolve significant challenges with receiver input and transmitter output insertion loss, power integrity, ESD, and reliability.
Abstract: This paper describes the design of a 0.5-6.6 Gb/s fully-adaptive low-power quad transceiver embedded in low-leakage 28 nm CMOS FPGAs. Integration techniques enable the utilization of the transceiver in FPGAs with both wire-bond and flip-chip packages and resolve significant challenges with receiver input and transmitter output insertion loss, power integrity, ESD, and reliability. The transceiver clocking network provides continuous operation range up to the maximum speed and incorporates two wide-range ring-based PLLs for enhanced clocking flexibility. The receiver front-end utilizes a 3-stage CTLE with wide input common-mode to remove the post-cursor ISI. The CTLE is fully adaptive using an LMS algorithm and edge-based equalization. The transmitter utilizes a 3-tap FIR. The transceiver achieves a BER of 10^-15 at 6.6 Gb/s over a 20 dB loss channel. Power consumption is 129 mW from 1.2 V and 1 V supplies.

Patent
Wojciech A. Koszek1
16 Oct 2013
TL;DR: In this article, methods, computer-readable media and devices for executing a plurality of startup instructions are disclosed, where a first processor of a device performs a first task itself and instructs a second processor of the device to perform a second task, so that at least portions of the two tasks are performed at the same time.
Abstract: Methods, computer-readable media and devices for executing a plurality of startup instructions are disclosed. For example, a method includes a first processor of a device accessing a plurality of startup instructions in response to a startup of the device. The first processor then executes a first startup instruction of the plurality of startup instructions to perform a first task and executes a second startup instruction of the plurality of startup instructions. The executing the second startup instruction causes the first processor to send a further instruction to a second processor of the device to perform a second task. At least a portion of the first task and at least a portion of the second task are performed at a same time.
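
A software analogy for the overlapping startup tasks described above is sketched below, with two threads standing in for the two processors and a queue standing in for the channel between them. The names, the order in which the two instructions are issued, and the use of a barrier to make the overlap observable are all assumptions; the abstract concerns processors of a device, not threads.

```python
import queue
import threading


def run_startup():
    """Run two startup tasks so that they overlap in time."""
    log = []
    both_running = threading.Barrier(2)   # released only when both tasks run
    mailbox = queue.Queue()               # first -> second processor channel

    def second_processor():
        task = mailbox.get()              # wait for the further instruction
        task()

    def first_task():
        both_running.wait(timeout=5)      # passes only while second_task runs
        log.append("first task done")

    def second_task():
        both_running.wait(timeout=5)
        log.append("second task done")

    worker = threading.Thread(target=second_processor)
    worker.start()
    # Startup instructions as executed by the first processor.
    mailbox.put(second_task)              # delegate a task to processor two
    first_task()                          # perform a task locally
    worker.join()
    return sorted(log)


print(run_startup())
```

The barrier would time out and raise if the two tasks ran strictly one after the other, so a normal return shows that at least a portion of each task overlapped.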

Patent
Gaurav Malhotra1
14 Oct 2013
TL;DR: In this article, a receiver comprising a phase interpolator, a detector and a slicer is described, in which the detector is coupled to the slicer to receive a sampling signal and iteratively adjusts a code of the phase interpolator to tune the sampling position.
Abstract: An apparatus generally relating to a receiver is disclosed. In this apparatus, the receiver includes a phase interpolator, a detector and a slicer. The slicer is coupled to the phase interpolator to provide a sampling signal for a sampling position of the phase interpolator. The detector is coupled to the slicer to receive the sampling signal. The detector is configured to adjust a code of the phase interpolator to adjust the sampling position iteratively in response to the sampling signal to tune the sampling position of the receiver toward an optimum therefor.

Patent
22 Apr 2013
TL;DR: In this paper, a method, non-transitory computer readable medium, and apparatus for preventing accelerated aging of a physically unclonable function (PUF) circuit are disclosed.
Abstract: A method, non-transitory computer readable medium, and apparatus for preventing accelerated aging of a physically unclonable function (PUF) circuit are disclosed. For example, the method monitors an environmental condition associated with the physically unclonable function circuit, detects a change in the environmental condition associated with the physically unclonable function circuit, and, in response to the change in the environmental condition, implements a security function for preventing the accelerated aging of the physically unclonable function circuit.

Patent
25 Feb 2013
TL;DR: In this article, a method for synthesizing an HLL program is described, in which code segments added for one or more variables to observe and/or control in a function of the HLL program cause a memory and respective interface circuits to be instantiated in the synthesized design.
Abstract: A method is provided for synthesizing an HLL program. For one or more variables to observe and/or control in a function of the HLL program, a first code segment is added to the function in the HLL program. For each of the one or more variables a respective second code segment is also added to the HLL program. In response to encountering the first code segment during synthesis of the HLL program, a memory is instantiated in a synthesized design. In response to encountering the second code segment during synthesis of the HLL program, a respective interface circuit is instantiated in the synthesized design. Each interface circuit is configured to replicate a state of the corresponding variable in the memory during operation of the synthesized design. A table is generated that maps names of the one or more variables to respective memory addresses in the memory.