Showing papers presented at "Southern Conference Programmable Logic in 2019"


Proceedings Article
10 Apr 2019
TL;DR: A measurement method is presented and applied over the five available interfaces of Zynq-7000 devices, considering the most used alternatives, providing a better understanding of system performance.
Abstract: Zynq-7000 devices from Xilinx have gained strong popularity in recent years. Several documents and examples on interface usage and on how to connect the programmable logic with the processor are available, but some of them are not properly explained and, in particular, the maximum throughput is not clearly specified. To address this, a measurement method is presented in this work and applied to the five available interfaces, considering the most used alternatives. Tests were carried out on a Zybo board, but the results can easily be used to estimate the performance of other system setups. Special hardware features and functionality are also discussed, providing a better understanding of system performance. Related papers were studied, but none of them presents information comparable enough to allow a fair comparison.

6 citations


Proceedings Article
10 Apr 2019
TL;DR: A concrete communication block has been successfully implemented and utilized for a quick implementation of a data acquisition system based on a Xilinx Zynq-7030 FPGA Mezzanine Card and a custom FMC module with an 8-bit 500 MSPS ADC.
Abstract: A portable architectural design strategy is described for the implementation of reconfigurable virtual instrumentation based on programmable Systems-on-Chip integrating microprocessors and FPGA in the same physical device. The key role is played by a general-purpose communication block as a means to efficiently separate the activities carried out in the microprocessor and in the FPGA. Both parts interact according to simple logic protocols by reading and writing data on the common memory resources of the communication block. The architecture of the proposed communication system can be easily implemented in practically any modern programmable System-on-Chip. With the proposed strategy, the porting of embedded software programs and associated FPGA designs among different device families and vendors is facilitated. A structured methodology is proposed for handling complex real-time systems based on these programmable Systems-on-Chip. We describe a concrete communication block that has been successfully implemented and used for the quick implementation of a data acquisition system based on a Xilinx Zynq-7030 FPGA Mezzanine Card (FMC) and a custom FMC module with an 8-bit 500 MSPS ADC.

6 citations


Proceedings Article
01 Apr 2019
TL;DR: In this paper, an FPGA-based system that implements Campbell mode operation was successfully tested in the CNEA RA-3 research and production reactor at Ezeiza Atomic Center.
Abstract: Neutron flux monitoring in research reactors can range from shutdown to full power over 10 to 12 decades. At low power, neutron flux is usually measured with proportional counters in pulse mode. The detector is moved away from high flux zones to avoid pulse saturation until ionization chambers are in range. Campbell mode allows the measurement to be made with a single detector that covers the entire operating range. This work presents an FPGA-based system that implements Campbell mode operation. The system prototype was successfully tested in the CNEA RA-3 research and production reactor at Ezeiza Atomic Center.

3 citations


Proceedings Article
10 Apr 2019
TL;DR: This work presents the development, simulation and testing of GPS search and tracking modules on an FPGA, following a low-resource, low-level portable implementation approach that can be easily scaled up.
Abstract: GPS receivers are a topic of great importance since they have applications in many fields of science and industry, such as geodesy, aviation, security and defense, to name a few. Understanding their internals, including the nature of the signals and the algorithms involved in their processing, is crucial in the development of customized receivers. In this work, the development, simulation and testing of search and tracking modules on a field programmable gate array (FPGA) are presented. Additionally, the architecture of the proposed system is presented along with the front-end design and implementation. This work presents a low-resource, low-level portable implementation approach that can be easily scaled up.

2 citations


Proceedings Article
01 Apr 2019
TL;DR: This paper explores the problem of flow metering in 100 GbE links, presenting a flow exporter architecture based on an FPGA acceleration card using only on-chip memory, and considers that FPGA fabric offers adequate flexibility and performance for this task and is capable of reducing overall system cost.
Abstract: This paper explores the problem of flow metering in 100 GbE links, presenting a flow exporter architecture based on an FPGA acceleration card using only on-chip memory. Peak performance without packet sampling is assured even at the maximum packet rate, and means to avoid data loss are provided, since only a low level of aggregation is achieved. This is the first approach in a series of architectures, each built upon the previous one, where the resources of the custom hardware are gradually increased, improving the aggregation level, while the commodity hardware resources required for subsequent stages are consequently lowered. We consider that FPGA fabric offers adequate flexibility and performance for this task and is capable of reducing overall system cost. A functional prototype of the system has been implemented on the Xilinx VCU118 development board, configured to export TCP session records. This achievement represents a cornerstone of a 100 GbE FPGA flow exporter design that aims to support on the order of tens of millions of concurrent flows.

2 citations


Proceedings Article
01 Apr 2019
TL;DR: The design was validated at both logical and functional level and it was implemented in three different FPGAs, from different families and manufacturers: 5CEBA4 (Cyclone V family) by Intel, M2GL025 (IGLOO2 family) by Microsemi and HX-8K (iCE40 family) by Lattice.
Abstract: This paper presents the design of an FPGA-based control unit for an implantable neuromodulation circuit. The design was validated at both logical and functional level and it was implemented in three different FPGAs, from different families and manufacturers: 5CEBA4 (Cyclone V family) by Intel, M2GL025 (IGLOO2 family) by Microsemi and HX-8K (iCE40 family) by Lattice. The design uses several parameterized constants, such as the number of output electrodes, the number of therapies and the system clock frequency, to create different sorts of instances at synthesis time. For a given configuration with 4 therapies and 16 electrodes, the design required fewer than 3900 logic elements. For that configuration, the FPGAs' total core power consumption was measured at 37°C and 7.8 MHz. The results were as low as 3.6 mW when delivering a continuous stimulation burst at a frequency of 10 kHz.

1 citation


Proceedings Article
10 Apr 2019
TL;DR: This work shows an implementation of a full-bridge converter model using HFP cores. The HFP-based model achieves a simulation step of around 10 ns in this case; however, when the integration step is decreased, numerical resolution can become an issue.
Abstract: Programmable logic is becoming common in Hardware-In-the-Loop (HIL) emulation due to its acceleration capabilities. The HIL technique is especially useful for verifying power electronics. But even using programmable logic, if integration steps below 100 ns are required and floating-point is the chosen representation, it has not been possible to reach real-time simulation. With the release of devices with HFP (Hardened Floating-Point) cores, which are dedicated floating-point blocks implemented in silicon, the minimum achievable simulation step decreases significantly. This work shows an implementation of a full-bridge converter model using HFP cores. Results show that the HFP-based model achieves a simulation step of around 10 ns in this case. However, when the integration step is decreased, numerical resolution can become an issue. Thus, designers face a trade-off before selecting a 32-bit floating-point representation for a model: smaller integration steps vs. accuracy limits. For this reason, resolution and accuracy are also studied.

1 citation
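
The resolution issue mentioned in the abstract above can be illustrated with a small sketch. It is not the authors' converter model: it assumes a plain forward-Euler update, a hypothetical state value near 100 and a hypothetical slope of 100 units per second, and it emulates 32-bit floating point by rounding through `struct`. Under these assumptions, a 10 ns step produces an increment smaller than half the float32 spacing around 100, so the state never moves.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def euler_f32(x0: float, dxdt: float, dt: float, steps: int) -> float:
    """Forward-Euler integration of a constant slope, rounding every result to binary32."""
    x = to_f32(x0)
    inc = to_f32(to_f32(dxdt) * to_f32(dt))  # per-step increment
    for _ in range(steps):
        x = to_f32(x + inc)
    return x


# Both runs cover 1 ms of simulated time; the exact result would be 100.1.
# All numbers are hypothetical and chosen only to expose the effect.
coarse = euler_f32(100.0, 100.0, 1e-6, 1_000)    # 1 us step, increment 1e-4
fine = euler_f32(100.0, 100.0, 1e-8, 100_000)    # 10 ns step, increment 1e-6

print(coarse)  # close to 100.1, with a small accumulated rounding error
print(fine)    # stays at 100.0: every increment is rounded away
```

The smaller step is therefore not automatically more accurate in single precision, which is the trade-off the abstract refers to.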


Proceedings Article
10 Apr 2019
TL;DR: This paper presents the implementation of the search process of a content-based image retrieval system, using metric spaces to perform the search and recovery of the image.
Abstract: The amount of multimedia information on digital platforms has been increasing over the years. Social networks and the advancement of technology have been determining factors in this growth. As a result, the organization, qualification and handling of this type of information has become indispensable, as has assuring the user of the quality of service in terms of content and execution time when retrieving information. This paper presents the implementation of the search process of a content-based image retrieval system, using metric spaces to perform the search and recovery of the image. High-level synthesis is used to develop the IP block that carries out the recovery process in the programmable logic. The experiments are performed on a PYNQ-Z1 board from Xilinx© and on a 7th-generation Intel© Core i5 CPU. The effectiveness of the implementation is supported by the results obtained.

1 citation


Proceedings Article
01 Apr 2019
TL;DR: Two hardware schemes for calculating luminosity histograms using FPGAs are described: one based on embedded RAM blocks and one based on a parallel structure of accumulators that can be easily adapted to any input bus width.
Abstract: This paper describes two hardware schemes for calculating luminosity histograms using FPGAs. The first circuit makes extensive use of the embedded RAM blocks present in many FPGA models. The second alternative is a parallel structure of accumulators that can be easily adapted to any input bus width. During the histogram computation, each processed pixel increments the value of the register corresponding to its luminance level. Therefore, if several pixels are evaluated at the same time, write conflicts can arise when a specific luminosity register is updated by more than one pixel. In the two proposed architectures these collision problems are eliminated. The calculation is made directly from the DC coefficients of the compressed video. This minimizes data bandwidth per frame, allowing a fast determination of similarity. The presented histogram circuits are part of an FPGA-based custom processor that calculates the similarity between two video frames by cross-correlating their histograms.

1 citation
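
The write-conflict problem described in the abstract above can be sketched in software. This is only a behavioural model under assumptions, not either of the paper's circuits: `lanes` (pixels arriving per clock cycle) and `levels` are hypothetical parameters, and collisions are avoided by merging equal luminance values within a cycle so that each register receives at most one update per cycle.

```python
def histogram_parallel(pixels, lanes=4, levels=256):
    """Behavioural model of a multi-pixel-per-cycle histogram.

    Pixels are consumed `lanes` at a time (one simulated clock cycle).
    Equal luminance values in the same cycle are merged into a single
    (level, count) pair, so no register is written twice in one cycle.
    """
    bins = [0] * levels
    for start in range(0, len(pixels), lanes):
        group = pixels[start:start + lanes]  # pixels of one cycle
        merged = {}
        for p in group:
            merged[p] = merged.get(p, 0) + 1
        for level, count in merged.items():
            bins[level] += count  # one update per register per cycle
    return bins


# The first cycle holds four pixels with the same level (a 4-way collision).
sample = [3, 3, 3, 3, 7, 7, 0, 255, 3]
hist = histogram_parallel(sample)
print(hist[3], hist[7], hist[0], hist[255])
```

In hardware, the merge step would correspond to comparing the lanes against each bin index and adding the number of matches, which is one way a parallel accumulator structure can scale with the input bus width.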


Proceedings Article
10 Apr 2019
TL;DR: This paper proposes a software-to-hardware migration methodology for the DUTILS environment that allows a seamless integration between software and hardware design and the verification process flow of the whole system.
Abstract: Modern FPGA developments require flexible and Agile methodologies to support complex designs that meet today's highly demanding time-to-market metrics. Traditional hardware development processes based on waterfall flows are not adequate to get the most out of the new reconfigurable FPGA technologies. Co-design and co-verification techniques allow handling both software and hardware development in a highly integrated process. However, such integration requires a deep knowledge of both hardware and software development. DUTILS is a Python/Cocotb-based environment for concurrent development suitable for modern software development technologies. This paper proposes a software-to-hardware migration methodology for the DUTILS environment that allows a seamless integration between software and hardware design and the verification process flow of the whole system.

1 citation


Proceedings Article
10 Apr 2019
TL;DR: A high-speed communication link based on a custom point-to-point protocol is presented for a high-resolution camera used in a low earth orbit (LEO) satellite; the protocol was implemented and has worked successfully in space.
Abstract: This work presents a high-speed communication link implementation for a high-resolution camera used in a low earth orbit (LEO) satellite that has already flown in space. The whole acquisition system involves three Microsemi FPGAs that move a large amount of data over a multi-link high-speed serial channel. The implementation of a custom protocol is proposed to perform point-to-point communication. To validate the proposal, tests on hardware are presented, reaching an average speed of up to 2.96 Gbps. Finally, the proposed design is compared with a proprietary solution from Microsemi, highlighting the pros and cons of each one. The protocol presented in this work was implemented and has worked successfully in space.

Proceedings Article
01 Apr 2019
TL;DR: The design is the first step in the development of a low power, wireless recording system for the acquisition of EEG signals and the validation was fulfilled using simulations, comparing the compressed output against one obtained with the software version of the algorithm written in C.
Abstract: This paper presents a hardware implementation of a multi-channel EEG lossless compression algorithm. The design is the first step in the development of a low-power, wireless recording system for the acquisition of EEG signals. It was written in VHDL and tested on a Cyclone V FPGA. The validation was carried out using simulations, comparing the compressed output against the one obtained with the software version of the algorithm written in C. For 21 channels, 16 bits per sample and a 50 MHz clock, it achieved an average compression time per sample of 0.52 µs and an average power consumption of 10 mW per channel.

Proceedings Article
01 Apr 2019
TL;DR: This paper presents high-performance architectures for performing finite field inversion using Gaussian Normal Bases (GNB) and a digit-level serial-in parallel-out multiplier (DL-SIPO) over GF$(2^{163})$.
Abstract: Inversion is the most computationally expensive finite field operation in public-key cryptography such as elliptic curve cryptography (ECC). This paper presents high-performance architectures for performing finite field inversion using Gaussian Normal Bases (GNB) and a digit-level serial-in parallel-out multiplier (DL-SIPO) over GF$(2^{163})$. We propose three architectures to carry out the inversion operation. The first one is based on the classic Itoh-Tsujii Algorithm (ITA), the second one carries out the inversion operation according to the NIST binary fields over GF$(2^{163})$ and, finally, the last one is based on Fermat's Little Theorem (FLT). The architectures were designed in VHDL, synthesized on the Stratix IV FPGA using Quartus Prime 17.0, and verified in ModelSim and Matlab. The synthesis results show that the designed architectures achieve very good performance using low area. In this case, the processing time and area resources to compute the inversion operation were 114.2, 115.9 and 114.5 ns using 11624, 11558 and 11690 LUTs, respectively.
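
As a rough software illustration of two of the inversion strategies named above, the sketch below computes inverses in GF$(2^{163})$ with Fermat's Little Theorem and with an Itoh-Tsujii addition chain. It makes assumptions that differ from the paper: elements are held in a polynomial basis (not a Gaussian normal basis) as Python integers, the reduction polynomial is the NIST one for 163-bit binary curves, $x^{163}+x^7+x^6+x^3+1$, and multiplication is a simple bit-serial loop, not a DL-SIPO multiplier. The function names are hypothetical.

```python
M = 163
# NIST reduction polynomial for 163-bit binary fields: x^163 + x^7 + x^6 + x^3 + 1
POLY = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1


def gf_mul(a: int, b: int) -> int:
    """Bit-serial multiplication in GF(2^163), polynomial basis."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:          # degree reached 163: reduce
            a ^= POLY
    return r


def gf_sqr_n(a: int, n: int) -> int:
    """Square `a` n times, i.e. compute a^(2^n)."""
    for _ in range(n):
        a = gf_mul(a, a)
    return a


def inv_flt(a: int) -> int:
    """Inverse as a^(2^M - 2), using 2^M - 2 = 2 + 4 + ... + 2^(M-1)."""
    if a == 0:
        raise ZeroDivisionError("zero has no inverse")
    result, power = 1, a
    for _ in range(M - 1):
        power = gf_mul(power, power)      # a^(2^i)
        result = gf_mul(result, power)
    return result


def inv_ita(a: int) -> int:
    """Itoh-Tsujii: build beta_k = a^(2^k - 1) up to k = M-1, then square once."""
    if a == 0:
        raise ZeroDivisionError("zero has no inverse")
    beta, k = a, 1
    for bit in bin(M - 1)[3:]:            # bits of M-1 after the leading 1
        beta = gf_mul(gf_sqr_n(beta, k), beta)   # beta_2k
        k *= 2
        if bit == "1":
            beta = gf_mul(gf_mul(beta, beta), a)  # beta_(k+1)
            k += 1
    return gf_mul(beta, beta)             # a^(2^M - 2)


a = (1 << 160) | 0xDEADBEEF               # arbitrary nonzero test element
print(gf_mul(a, inv_ita(a)) == 1, inv_ita(a) == inv_flt(a))
```

For this field the Itoh-Tsujii chain needs far fewer general multiplications than the straightforward Fermat exponentiation, since most of its work is repeated squaring. Squaring is a cyclic shift in a normal basis, which helps explain why a normal-basis representation suits this algorithm in hardware.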

Proceedings Article
01 Apr 2019
TL;DR: An Adaptive Template Matching algorithm for tracking marks in videos is described; it identifies the mark through the Normalized Cross-Correlation metric and achieves real-time operation for a UAV navigation application.
Abstract: Computer vision techniques based on convolution are widely used to identify objects and patterns in images. This work describes an Adaptive Template Matching algorithm for tracking marks in videos. A SoC implementation is also presented for a non-adaptive case. The window used for searching for a template is reduced, on the assumption that the template should not be far from its location in the previous frame. The proposed algorithm performs the identification through the Normalized Cross-Correlation metric. The initial proposal is a computer-based implementation using computer vision libraries. Because of its long processing time, an alternative SoC implementation is presented. The results show that real-time operation is achieved for a UAV navigation application.
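
The reduced search window combined with Normalized Cross-Correlation can be sketched as follows. This is a minimal pure-Python illustration under assumptions, not the authors' implementation: it uses the plain (non-zero-mean) NCC formula, images are lists of rows, and `track`, `prev` and `radius` are hypothetical names for the search routine, the previous match position and the half-width of the search window. The adaptive template update is not modelled.

```python
from math import sqrt


def ncc(image, template, top, left):
    """Normalized cross-correlation between the template and one image patch."""
    num = sum_i2 = sum_t2 = 0
    for r, trow in enumerate(template):
        for c, t in enumerate(trow):
            p = image[top + r][left + c]
            num += p * t
            sum_i2 += p * p
            sum_t2 += t * t
    den = sqrt(sum_i2 * sum_t2)
    return num / den if den else 0.0   # empty (all-zero) patch scores 0


def track(image, template, prev, radius):
    """Search only within `radius` pixels of the previous match position."""
    th, tw = len(template), len(template[0])
    max_top, max_left = len(image) - th, len(image[0]) - tw
    best_score, best_pos = -1.0, prev
    for top in range(max(0, prev[0] - radius), min(max_top, prev[0] + radius) + 1):
        for left in range(max(0, prev[1] - radius), min(max_left, prev[1] + radius) + 1):
            score = ncc(image, template, top, left)
            if score > best_score:
                best_score, best_pos = score, (top, left)
    return best_pos, best_score


# Synthetic 8x8 frame with the template pasted at row 3, column 4.
template = [[1, 2], [3, 4]]
frame = [[0] * 8 for _ in range(8)]
for r in range(2):
    for c in range(2):
        frame[3 + r][4 + c] = template[r][c]

pos, score = track(frame, template, prev=(2, 3), radius=2)
print(pos, score)
```

With `radius=2` only 25 candidate positions are evaluated instead of all 49 valid positions in the frame, which is the kind of saving the reduced window is meant to provide.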

Proceedings Article
01 Apr 2019
TL;DR: In this article, the authors describe a digital count-rate meter and flux-change rate meter implemented using Field Programmable Gate Array (FPGA) technology with proprietary algorithms developed for measuring pulse-mode flux and its variations at startup of nuclear research reactors.
Abstract: This paper describes a digital count-rate meter and flux-change-rate meter implemented using Field Programmable Gate Array (FPGA) technology with proprietary algorithms developed for measuring pulse-mode flux and its variations at startup of nuclear research reactors. Thanks to its auto-adjusting counting time, which optimizes the trade-off between statistical precision and response time, it covers a wide count-rate range from 0.1 to $10^{6}$ pulses per second, with a response time inversely proportional to the actual count rate.
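
The paper's algorithms are proprietary and are not reproduced here. The sketch below only shows the textbook Poisson relation that makes such a trade-off possible, assuming the meter counts until a fixed number of pulses has been collected: the relative standard deviation of $N$ counts is $1/\sqrt{N}$, so the required counting time is $N$ divided by the count rate. The function name and the 10 % target are hypothetical.

```python
def counting_time(rate_cps: float, target_rel_error: float) -> float:
    """Seconds needed to collect enough Poisson counts for a target relative error.

    For N counts the relative standard deviation is 1/sqrt(N), so
    N = 1 / target_rel_error**2 and the time is N / rate.
    """
    counts_needed = 1.0 / target_rel_error ** 2
    return counts_needed / rate_cps


# Same 10 % precision at both ends of the range quoted in the abstract.
slow = counting_time(0.1, 0.10)   # 0.1 pulses per second
fast = counting_time(1e6, 0.10)   # 10^6 pulses per second
print(slow, fast)
```

Under this assumption the counting time scales as the inverse of the count rate, which matches the behaviour stated in the abstract.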

Proceedings Article
10 Apr 2019
TL;DR: This tutorial reviews historical milestones and main concepts regarding the pipelining of electronic circuits, and analyses by examples aspects such as construction hints, pipeline metrics, effects of registering, preferential pipeline directions, and synchronization failures.
Abstract: This tutorial reviews historical milestones and main concepts regarding the pipelining of electronic circuits. Although the technique emerged in the 1960s, it remains a direct way to simultaneously increase throughput and reduce power in FPGA-based systems. However, the efficacy of pipelining is limited by the dominance of register and routing delays. This work focuses on bit-level pipelining. It analyses by example key aspects such as construction hints, pipeline metrics, effects of registering, preferential pipeline directions, and synchronization failures. The text condenses the first section of the invited tutorial lecture at the 2019 Southern Conference on Programmable Logic (SPL). Whenever possible, numeric examples are particularized to FPGA technology, but in some cases data from cell-based ASICs are deemed more convenient. The ideas should be useful for students of an advanced course on digital electronics, or for PhD candidates interested in the details of the design of integrated circuits.
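
The limit imposed by register and routing delays can be seen in a very simple timing model. The sketch below is not taken from the tutorial: it assumes perfectly balanced stages and lumps register setup, clock-to-output and routing delay into one hypothetical per-stage overhead `reg_overhead_ns`. All delay figures are illustrative.

```python
def pipeline_metrics(logic_delay_ns: float, stages: int, reg_overhead_ns: float):
    """Return (clock period in ns, throughput in MHz, latency in ns).

    Assumes the combinational delay splits evenly across the stages and
    every stage pays the same fixed register/routing overhead.
    """
    period = logic_delay_ns / stages + reg_overhead_ns
    return period, 1000.0 / period, stages * period


base = pipeline_metrics(20.0, 1, 1.0)    # unpipelined: 21 ns period
four = pipeline_metrics(20.0, 4, 1.0)    # 4 stages: 6 ns period
deep = pipeline_metrics(20.0, 20, 1.0)   # 20 stages: 2 ns period

for name, (period, mhz, latency) in (("1", base), ("4", four), ("20", deep)):
    print(f"{name} stage(s): period {period} ns, {mhz:.1f} MHz, latency {latency} ns")
```

In this model four stages give a 3.5x speed-up and twenty stages only 10.5x, while latency keeps growing, because the fixed overhead becomes the dominant part of the clock period.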