Showing papers on "PowerPC published in 2014"

Proceedings Article

Anatomy of High-Performance Many-Threaded Matrix Multiplication

Tyler M. Smith¹, Robert A. van de Geijn¹, Mikhail Smelyanskiy², Jeff R. Hammond³, Field G. Van Zee¹

University of Texas at Austin¹, Intel², Argonne National Laboratory³

19 May 2014

TL;DR: This work describes how BLIS extends the "GotoBLAS approach" to implementing matrix multiplication (GEMM), and shows that with the advent of many-core architectures such as the IBM PowerPC A2 processor and the Intel Xeon Phi processor, parallelizing both within and around the inner kernel, as the BLIS approach supports, is not only convenient, but also necessary for scalability.

Abstract: BLIS is a new framework for rapid instantiation of the BLAS. We describe how BLIS extends the "GotoBLAS approach" to implementing matrix multiplication (GEMM). While GEMM was previously implemented as three loops around an inner kernel, BLIS exposes two additional loops within that inner kernel, casting the computation in terms of the BLIS micro-kernel so that porting GEMM becomes a matter of customizing this micro-kernel for a given architecture. We discuss how this facilitates a finer level of parallelism that greatly simplifies the multithreading of GEMM as well as additional opportunities for parallelizing multiple loops. Specifically, we show that with the advent of many-core architectures such as the IBM PowerPC A2 processor (used by Blue Gene/Q) and the Intel Xeon Phi processor, parallelizing both within and around the inner kernel, as the BLIS approach supports, is not only convenient, but also necessary for scalability. The resulting implementations deliver what we believe to be the best open source performance for these architectures, achieving both impressive performance and excellent scalability.
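
The loop structure this abstract describes (three loops around an inner kernel, plus two more loops inside it that call a micro-kernel) can be illustrated with a minimal pure-Python sketch. This is not the BLIS implementation: the block sizes `mc`, `kc`, `nc`, `mr`, `nr` and the function names are assumptions chosen for illustration, and the packing of A and B into contiguous buffers, as well as any threading, is omitted.

```python
def micro_kernel(A, B, C, i0, i1, j0, j1, p0, p1):
    """Update the small block C[i0:i1][j0:j1] with A[i0:i1][p0:p1] times B[p0:p1][j0:j1]."""
    for p in range(p0, p1):
        for i in range(i0, i1):
            a = A[i][p]
            for j in range(j0, j1):
                C[i][j] += a * B[p][j]


def blocked_gemm(A, B, C, mc=4, kc=4, nc=8, mr=2, nr=2):
    """Compute C += A * B on list-of-lists matrices with five loops around a micro-kernel.

    Block sizes are illustrative assumptions, not tuned values.
    """
    m, k, n = len(A), len(B), len(B[0])
    for jc in range(0, n, nc):              # outer loop: column panels of B and C
        for pc in range(0, k, kc):          # outer loop: slices of the shared dimension
            for ic in range(0, m, mc):      # outer loop: row blocks of A and C
                # The two loops below are the ones exposed inside the former inner kernel.
                for jr in range(jc, min(jc + nc, n), nr):
                    for ir in range(ic, min(ic + mc, m), mr):
                        micro_kernel(
                            A, B, C,
                            ir, min(ir + mr, ic + mc, m),
                            jr, min(jr + nr, jc + nc, n),
                            pc, min(pc + kc, k),
                        )
    return C
```

In a sketch like this, any of the five loops whose iterations write to disjoint parts of C (all except the `pc` loop) is a candidate for parallelization, which is the flexibility the abstract refers to.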

115 citations

Proceedings Article

On the in-field functional testing of decode units in pipelined RISC processors

Paolo Bernardi¹, Riccardo Cantoro¹, L. Ciganda¹, Ernesto Sanchez¹, M. Sonza Reorda¹, S. De Luca², R. Meregalli², A. Sansonetti²

Polytechnic University of Turin¹, STMicroelectronics²

01 Oct 2014

TL;DR: The paper details a strategy based on instruction classification and manipulation, and signature collection, based on the instruction set of the processor, that reaches over 90% stuck-at fault coverage, while an instruction-coverage-based approach does not exceed 70%.

Abstract: The paper deals with the in-field test of the decode unit of RISC processors through functional test programs following the SBST approach. The paper details a strategy based on instruction classification and manipulation, and signature collection. The method does not require knowledge of detailed implementation information (e.g., the netlist), but is based on the instruction set of the processor. The proposed method is evaluated on an industrial SoC device, which includes a PowerPC-derived processor. Results demonstrate the efficiency and effectiveness of the strategy; the proposed solution reaches over 90% stuck-at fault coverage, while an instruction-coverage-based approach does not exceed 70%.

23 citations

Proceedings Article

MPSoCBench: A toolset for MPSoC system level evaluation

Liana Duenha¹, Marcelo Guedes¹, Henrique Wakimoto de Almeida¹, Matheus Boy¹, Rodolfo Azevedo¹

State University of Campinas¹

14 Jul 2014

TL;DR: MPSoCBench is a scalable set of MPSoCs organized in platforms with 1, 2, 4, 8, 16, 32, or 64 cores, cross-compilers, IPs, interconnections, and 17 parallel versions of software from well-known benchmarks.

Abstract: Recent design methodologies and tools aim at enhancing design productivity by providing a software development platform before the final MPSoC architecture details are defined. However, the simulation can only be performed efficiently when using a modeling and simulation engine that supports describing the system behavior at a high abstraction level. The lack of MPSoC virtual platform prototyping that integrates both scalable hardware and software in order to create and evaluate new methodologies and tools motivated us to develop MPSoCBench. This toolset is a scalable set of MPSoCs including four different ISAs (PowerPC, MIPS, SPARC, and ARM) organized in platforms with 1, 2, 4, 8, 16, 32, or 64 cores, cross-compilers, IPs, interconnections, and 17 parallel versions of software from well-known benchmarks. The tool also provides power consumption estimation for MIPS and SPARC processors. MPSoCBench comprises 864 different configurations, automated through scripts.

22 citations

Journal Article

FPGA-based module for SURF extraction

Tomas Krajnik¹, Jan Svab¹, Sol Pedre, Petr Čížek¹, Libor Přeučil¹

Czech Technical University in Prague¹

01 Apr 2014

TL;DR: A complete hardware and software solution for an FPGA-based embedded computer vision module capable of carrying out the SURF image feature extraction algorithm, which allows SURF to be used in applications with power and spatial constraints, such as autonomous navigation of small mobile robots.

Abstract: We present a complete hardware and software solution for an FPGA-based embedded computer vision module capable of carrying out the SURF image feature extraction algorithm. Aside from image analysis, the module embeds a Linux distribution that can run programs specifically tailored for particular applications. The module is based on a Virtex-5 FXT FPGA, which features powerful configurable logic and an embedded PowerPC processor. We describe the module hardware as well as the custom FPGA image processing cores that implement the algorithm's most computationally expensive process, the interest point detection. The module's overall performance is evaluated and compared to CPU- and GPU-based solutions. Results show that the embedded module achieves distinctiveness comparable to the SURF software implementation running on a standard CPU while being faster and consuming significantly less power and space. Thus, it allows the SURF algorithm to be used in applications with power and spatial constraints, such as autonomous navigation of small mobile robots.

20 citations

Proceedings Article

Dynamic partial reconfiguration manager

Jimmy Tarrillo¹, Fernando A. Escobar², Fernanda Lima Kastensmidt¹, Carlos Valderrama²

Universidade Federal do Rio Grande do Sul¹, Faculté polytechnique de Mons²

26 May 2014

TL;DR: This article presents a generic DPR manager IP core, whose versatility allows the use of either any embedded processor or simple control logic, whose advantages and interest over traditional solutions are shown.

Abstract: Dynamic partial reconfiguration (DPR) is a technique that optimizes resource utilization of SRAM-based FPGAs, since it allows changing, on the fly, the functionality of a portion of its logic. A common DPR development flow requires the use of, at least, a microprocessor and several development tools (EDK, XSDK, PlanAhead); moreover, proposals are mainly based on MicroBlaze, ARM or PowerPC embedded processors, which also require extra memory control blocks. This article presents a generic DPR manager IP core (Intellectual Property), whose versatility allows the use of either any embedded processor or simple control logic. Results in terms of reconfiguration time and resources for Virtex 5 and Virtex 6 SRAM-FPGAs show its advantages and interest over traditional solutions.

20 citations

Patent

General data collection module based on OPC UA

Wang Linkun, Yan Xiaofeng

23 Jul 2014

TL;DR: The utility model provides a general data collection module based on OPC UA, which includes an on-site side protection circuit, an on-site bus transceiver, an opto-isolator, an FPGA, an NOR FLASH chip, a 50MHz crystal oscillator, a PowerPC processor, an EEPROM chip, and a watchdog circuit.

Abstract: The utility model provides a general data collection module based on OPC UA, and the data collection module comprises an on-site side protection circuit (1), an on-site bus transceiver (2), an opto-isolator (3), an FPGA (4), an NOR FLASH chip (5), a 50MHz crystal oscillator (6), a PowerPC processor (7), an EEPROM chip (8), a 24MHz crystal oscillator (9), a watchdog circuit (10), a first 802.3 physical-layer transceiver (11) which is used for OPC UA communication, a second 802.3 physical-layer transceiver (12) which is used for configuration, and a power supply circuit part (13). Through a configurable data model and the technology of OPC UA communication under an embedded platform, the data collection module provided by the utility model can achieve the standardization of the integration of underlying data with an upper monitoring and management system under a plurality of communication architectures in the field of industrial automation.

8 citations

Journal Article

Embedded Linux platform for data acquisition systems

Jigneshkumar J. Patel, Nagaraj Reddy, Praveena Kumari, Rachana Rajpal, Harshad Pujara, R. K. Jha, Praveen Kalappurakkal

01 May 2014 - Fusion Engineering and Design

TL;DR: This scalable hardware–software system is designed and developed to explore the emerging open standards for the data acquisition requirements of Tokamak experiments and offers a single-chip solution with a processor and peripherals such as an ADC interface controller, a Gigabit Ethernet controller, and a memory controller, among others.

7 citations

Journal Article

From distributed to multicore architecture in the RFX-mod real time control system

Gabriele Manduchi¹, Adriano Luchetta¹, Anton Soppelsa¹, Cesare Taliercio¹

European Atomic Energy Community¹

01 Mar 2014 - Fusion Engineering and Design

TL;DR: The whole system has been finally commissioned in RFX in only two weeks, with the usage of MARTe allowing a rapid development of the control system and, in particular, its intrinsic simulation ability gave the possibility of carrying out most debugging in simulation, without affecting machine operation.

7 citations

Proceedings Article

DRuiD: Designing reconfigurable architectures with decision-making support

Giovanni Mariani, Gianluca Palermo, Roel Meeuws¹, Vlad-Mihai Sima¹, Cristina Silvano, Koen Bertels¹

Delft University of Technology¹

01 Jan 2014

TL;DR: A framework capable of learning application characteristics that make them suitable for certain computing elements that is composed of an expert system that supports the designer in the mapping decision and gives hints on possible code modifications to be applied to make the functionality more suitable for a computing element.

Abstract: Application development for heterogeneous platforms requires coding and mapping functionalities onto a set of different computing elements. As a consequence, the development process needs a clear understanding of both application requirements and heterogeneous computing technologies. To support the development process, we propose a framework called DRuiD capable of learning the application characteristics that make functionalities suitable for certain computing elements. The framework is composed of an expert system that supports the designer in the mapping decision and gives hints on possible code modifications that would make a functionality more suitable for a computing element. The experimental results are tailored for a heterogeneous and reconfigurable platform (the Xilinx ML510) including two computational elements, i.e. a Virtex-5 FPGA and a PowerPC. The expert system correctly identifies, 88.9% of the time, the functionalities that are accelerated efficiently by the FPGA, without requiring the kernel to be ported. Additionally, we present two case studies demonstrating the potential of the framework to give hints on high-level code modifications for an efficient kernel mapping on the FPGA.

7 citations

Patent

System and method for loading PowerPC system guide file through serial port

Xie Chaowen, Lei Chunhua

05 Nov 2014

TL;DR: A simple method for loading a PowerPC system guide file through a serial port is presented, which comprises circuit board design of a loading scheme, logic implementation on a CPLD, and design of loading software on the PC end.

Abstract: The invention discloses a system and method for loading a PowerPC system guide file through a serial port. The method includes the steps of circuit board design of a loading scheme, logic implementation on a CPLD, and design of loading software on the PC end; after the requirements of the three modules are met, the file is sent to the CPLD through the serial port by the loading software on the PC end, and the file is programmed to a NOR flash through the logic in the CPLD so that the system guide file loading can be completed. The invention provides a simple system loading method. The logic of NOR flash programming is implemented on the CPLD instead of implementing a programming algorithm in the software on the PC end, and therefore the programming speed is ensured; meanwhile, due to this scheme, the software on the PC end can be developed quite easily: it reads the file to be loaded and then sends the file through the serial port according to the well-defined data frame format. In addition, the system replaces a special simulator and a special programming environment, the loading process of the system guide file is simplified, and the debugging efficiency in the early period is improved.

6 citations

Journal Article

Design of an FPGA-based embedded system for the ATLAS Tile Calorimeter front-end electronics test-bench

F. Carrio¹, Hyun-Chul Kim², P Moreno³, Robert Reed³, C Sandrock³, Vinicius Schettino⁴, A. Shalyugin⁵, C. A. Solans⁶, J. Souza⁴, G. Usai², Alberto Valero¹

Spanish National Research Council¹, University of Texas at Arlington², University of the Witwatersrand³, Universidade Federal de Juiz de Fora⁴, Joint Institute for Nuclear Research⁵, CERN⁶

18 Mar 2014 - Journal of Instrumentation

TL;DR: A new test-bench based on a Xilinx Virtex-5 FPGA that implements an embedded system using a PowerPC 440 microprocessor hard core and custom IP cores and a light Linux version runs on the PowerPC microprocessor and handles the IP cores which implement the different functionalities needed to perform the desired tests.

Abstract: The portable test-bench for the certification of the ATLAS tile hadronic calorimeter front-end electronics has been redesigned for the present Long Shutdown (LS1) of LHC, improving its portability and expanding its functionalities. This paper presents a new test-bench based on a Xilinx Virtex-5 FPGA that implements an embedded system using a PowerPC 440 microprocessor hard core and custom IP cores. A light Linux version runs on the PowerPC microprocessor and handles the IP cores which implement the different functionalities needed to perform the desired tests such as TTCvi emulation, G-Link decoding, ADC control and data reception.

Patent

Transient characteristic test system and method of mutual inductor based on digital simulation of Rogowski coil

Xu Changbao, Gao Jipu, Wang Yu, Dai Yu, Luo Qiang

06 Aug 2014

TL;DR: In this article, a transient characteristic test system of a mutual inductor based on digital simulation of a Rogowski coil is presented, which consists of a signal conditioning module, an optical fiber sending module, optical fiber Ethernet module, a digital analog converter, an FPGA (Field Programmable Gate Array), a POWERPC, a crystal oscillator, a human-computer interface and a storage module.

Abstract: The invention discloses a transient characteristic test system of a mutual inductor based on digital simulation of a Rogowski coil. The system comprises a signal conditioning module, an optical fiber sending module, an optical fiber Ethernet module, a digital analog converter, an FPGA (Field Programmable Gate Array), a POWERPC, a crystal oscillator, a human-computer interface and a storage module, wherein the crystal oscillator is connected with the FPGA and the POWERPC, the FPGA is connected with the digital analog converter, the optical fiber sending module, the optical fiber Ethernet module and the POWERPC, the POWERPC is connected with the storage module and the human-computer interface, and the digital analog converter is connected with the signal conditioning module. The FPGA comprises a synchronization module, a data receiving module and a DAC (Digital-to-Analog Converter) control module. The POWERPC comprises a digital simulation model module of the Rogowski coil, a parameter configuration module, a test data processing module and a transient characteristic analyzing module. Based on an ideal Rogowski coil model, a differential signal output is established for all-digital simulation test to be not limited in individual professional laboratories. Advantages of hardware integration and software integration can be compared, and all-around simulation test of transient characteristics of the electronic mutual inductor can be realized.

Patent

Multifunctional communication interface machine device based on PowerPC embedded system

Niu Zhenbo, Zhang Chengbin, Xu Dagui

26 Feb 2014

TL;DR: In this article, a multifunctional communication interface machine device based on the PowerPC embedded system is described, which consists of a CPU core board, a mother board, interface boards and a power module.

Abstract: The invention relates to a multifunctional communication interface machine device, in particular to a multifunctional communication interface machine device based on a PowerPC embedded system. The multifunctional communication interface machine device based on the PowerPC embedded system comprises a CPU core board, a mother board, interface boards and a power module. A processor of the PowerPC embedded system is carried by the CPU core board. The mother board serves as a main board of the multifunctional communication interface machine device, and mainly comprises a CPLD circuit, an extension serial port circuit, a bus level conversion circuit, an RTC clock circuit, a CAN interface circuit and a 422/485 interface circuit. The interface boards comprise CAN interface boards and 422/485 interface boards. The CPU core board is connected with the bus level conversion circuit on the mother board, and is connected with the CPLD circuit, the extension serial port circuit and the RTC clock circuit after level conversion. The CPLD circuit is connected with the CAN interface circuit through a bus of the address and data time division multiplexing mode. Four paths of 232 serial interfaces are extended by the extension serial port circuit through a four-channel asynchronous transceiver STC16C554 chip to be connected with the 422/485 interface circuit. The CAN interface boards and the 422/485 interface boards are connected with the CAN interface circuit and the 422/485 interface circuit on the mother board respectively.

Journal Article

Synthesis of optimal digital shapers with arbitrary noise using simulated annealing

Alberto Regadío¹, Alberto Regadío², Sebastián Sánchez-Prieto¹, Jesús Tabero²

University of Alcalá¹, Instituto Nacional de Técnica Aeroespacial²

21 Feb 2014 - Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment

TL;DR: In this article, the authors present the structure, design and implementation of a new way of determining the optimal shaping in time-domain for spectrometers by means of simulated annealing.

Abstract: This paper presents the structure, design and implementation of a new way of determining the optimal shaping in time-domain for spectrometers by means of simulated annealing. The proposed algorithm is able to adjust automatically and in real-time the coefficients for shaping an input signal. A practical prototype was designed, implemented and tested on a PowerPC 405 embedded in a Field Programmable Gate Array (FPGA). Lastly, its performance and capabilities were measured using simulations and a neutron monitor.
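
As a rough illustration of how simulated annealing can tune a vector of coefficients, here is a generic sketch. It is not the algorithm from the paper: the cost function, the Gaussian perturbation, the geometric cooling schedule, and every parameter name and default are assumptions made for this example.

```python
import math
import random


def anneal(cost, start, steps=2000, t0=1.0, cooling=0.995, step_size=0.1, rng=None):
    """Generic simulated annealing over a list of real coefficients.

    `cost` maps a coefficient list to a number to minimize. All defaults
    are illustrative assumptions.
    """
    rng = rng or random.Random(0)  # fixed seed so runs are repeatable
    current = list(start)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    temp = t0
    for _ in range(steps):
        # Propose a neighbour by perturbing every coefficient slightly.
        candidate = [c + rng.gauss(0.0, step_size) for c in current]
        cand_cost = cost(candidate)
        delta = cand_cost - current_cost
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        temp *= cooling
    return best, best_cost
```

For a shaper, `cost` would be some noise or resolution measure evaluated on the shaped signal; here it is left as a caller-supplied function because the abstract does not specify it.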

Proceedings Article

Implementation of an improved connected component labeling algorithm using FPGA-based platform

Jai Gopal Pandey¹, Abhijit Karmakar¹, Amit Kumar Mishra¹, C. Shekhar¹, S. Gurunarayanan²

Council of Scientific and Industrial Research¹, Birla Institute of Technology and Science²

22 Jul 2014

TL;DR: The equivalence handling mechanism of Stefano-Bulgarelli (SB) algorithm is improved to achieve complete merger for all the possible cases and the results demonstrate that the improved algorithm handles equivalences efficiently and gives accurate count of connected components.

Abstract: Labeling of connected components is one of the most fundamental operations in the area of image and video processing. This paper presents a field-programmable gate array (FPGA) platform-based approach for implementing an efficient and improved two-scan equivalence-based connected component labeling algorithm. The implementation utilizes standard intellectual-property (IP) elements, FPGA off-the-shelf components, and peripherals available on the Xilinx ML-507 FPGA platform, and runs on an embedded PowerPC 440 processor available in the Xilinx Virtex-5 xc5vfx70t FPGA device. In this work, the equivalence handling mechanism of the Stefano-Bulgarelli (SB) algorithm is improved to achieve complete merger for all possible cases. The improved algorithm is tested using binary test patterns and standard images. The results demonstrate that the improved algorithm handles equivalences efficiently and gives an accurate count of connected components. The proposed FPGA-based system arrangement can be efficiently utilized in many practical image and video processing applications that use a connected component labeling algorithm.
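
For readers unfamiliar with two-scan, equivalence-based labeling, the following is a minimal sketch of the classic scheme using a union-find table to record label equivalences. It is not the SB algorithm or the improved variant from the paper; 4-connectivity, the function name, and the list-of-rows image format are assumptions for illustration.

```python
def label_components(image):
    """Two-scan labeling of a binary image given as a list of rows of 0/1.

    Uses 4-connectivity and a union-find table for label equivalences.
    Returns (label image, number of components).
    """
    parent = [0]  # parent[label]; index 0 is reserved for background

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]

    # First scan: assign provisional labels and record equivalences.
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else 0
            left = labels[r][c - 1] if c > 0 else 0
            if up and left:
                labels[r][c] = min(up, left)
                union(up, left)
            elif up or left:
                labels[r][c] = up or left
            else:
                parent.append(len(parent))  # new label is its own root
                labels[r][c] = len(parent) - 1

    # Second scan: replace each provisional label by a consecutive final label.
    final = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = final.setdefault(root, len(final) + 1)
    return labels, len(final)
```

A U-shaped region is the typical case that needs equivalence handling: its two arms receive different provisional labels in the first scan and are merged only when the scan reaches the row that joins them.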

Proceedings Article

Architectures and algorithms for image and video processing using FPGA-based platform

Jai Gopal Pandey, Arindam Karmakar, S. Gurunarayanan¹

Birla Institute of Technology and Science¹

16 Jul 2014

TL;DR: By identifying, building and integrating all necessary hardware and software components, an embedded implementation of a kernel-based mean shift (KBMS) object tracking algorithm has been proposed.

Abstract: The work illustrates the use of platform-based design to achieve efficiently-configured hardware-software system solution that can meet the conflicting demands of high performance, low power and quick turnaround time for embedded system development. It presents embedded system design techniques using field-programmable gate arrays (FPGAs) for image and video processing application. Here, by identifying, building and integrating all necessary hardware and software components, an embedded implementation of a kernel-based mean shift (KBMS) object tracking algorithm has been proposed [1]. To fulfill the specific needs of hardware/software implementation Virtex-5 FXT FPGA device (which has an embedded PowerPC processor) available on Xilinx ML-507 platform has been used [2].

Advancing Deductive Program-Level Verification for Real-World Application: Lessons Learned from an Industrial Case Study

Thorsten Bormer

01 Jan 2014

TL;DR: The PikeOS system is multi-platform and the kernel’s specification has to hide any PowerPC implementation details to ensure proper encapsulation, so the state of the PowerPC hardware on the machine level is reified with the help of a specification structure PPC_c.

Abstract: Kernel and Hardware Model Because system calls are at the user’s interface to the kernel and the PikeOS system is multi-platform, the kernel’s specification has to hide any PowerPC implementation details to ensure proper encapsulation. The state of the PowerPC hardware on the machine level is reified with the help of a specification structure PPC_c containing, e.g., the contents of all relevant registers usually invisible on the C level – this corresponds to the abstract version of the proc_t C data structure introduced for the baby hypervisor

Proceedings Article

A Compiler Optimization to Increase the Efficiency of WCET Analysis

Mohamed Abdel Maksoud¹, Jan Reineke¹

Saarland University¹

08 Oct 2014

TL;DR: A parameterized compiler optimization to reduce analysis time and memory consumption during the two steps of worst-case execution time (WCET) analysis makes use of a synchronization instruction, which flushes queues in the memory subsystem.

Abstract: For complex microprocessors, micro-architectural analysis and precise path analysis constitute the most expensive steps in worst-case execution time (WCET) analysis. We introduce a parameterized compiler optimization to reduce analysis time and memory consumption during the two steps. The optimization makes use of a synchronization instruction, which flushes queues in the memory subsystem. By injecting this instruction at selected program points, analysis uncertainty about the state of the pipeline and the memory subsystem can be drastically reduced, at the cost of an increase in execution time. A parameter allows the user to control the trade-off between increased analysis efficiency and decreased worst-case performance. We have developed a prototype implementation of the optimization for the PowerPC instruction set architecture, and evaluate it using a version of AbsInt's WCET analyzer aiT for the PowerPC 7448, a high-performance microprocessor used in safety-critical real-time systems. On a set of Malardalen benchmarks, we observe an analysis speedup of around 635% at the cost of an increase in the WCET bound of 6%. Moreover, under a traditional ILP-based path analysis, the WCET bound is decreased by 5% while the analysis is sped up by 350%.

Design of Digital SoC for Operation at High Temperatures

Radisav Cojbasic

01 Jan 2014

TL;DR: The essential part of this thesis focuses on the design, implementation, fabrication and high-temperature measurements of on-chip Latch based SRAM, PowerPC e200 based microcontroller, digital temperature sensor and Flash A/D converter.

Abstract: There is a growing demand for Systems-on-Chip, integrating microprocessors, on-chip memories, data converters and a variety of sensors, which are capable of reliable operation at high temperatures. For instance, the modern aircraft industry demands microcontrollers and electric motors that operate at high temperatures, in order to replace present hydraulic structures. This thesis explains how to design digital SoCs which are capable of reliable operation at high temperatures. The essential part of this thesis focuses on the design, implementation, fabrication and high-temperature measurements of on-chip Latch based SRAM, PowerPC e200 based microcontroller, digital temperature sensor and Flash A/D converter. Embedded on-chip SRAM modules are one of the most important components in the modern SoC. We analyze thermally-caused failures in the 6T SRAM cell and elaborate on its reliability. Further, we show that Latch based SRAM modules are preferable to 6T SRAM for reliable operation beyond 150 °C, by comparing two 1 kB SRAM modules implemented in a standard 0.18 µm SOI CMOS process. We demonstrate reliable SRAM operation at 275 °C (fmax = 10 MHz, Ptot = 400 mW), which is by far the highest reported operating temperature for a digital on-chip SRAM module. Designing SoCs for reliable operation at elevated temperatures is a significant challenge, due to increased static leakage current, reduced carrier mobility, and increased electromigration. We propose to customize a PowerPC e200 based SoC by using a dynamically reconfigurable clock frequency, exhaustive clock gating, and an electromigration-resistant power distribution network. We fabricated a 20 × 9 mm² chip implementing this design in a 0.35 µm Bulk CMOS process. We present the world's first PowerPC based SoC for reliable operation at 225 °C (fmax = 30 MHz, Ptot = 1.2 W). This design outperforms previously reported PowerPC based SoCs, which are not operational at temperatures beyond 125 °C.
The on-chip measurements of the p-n junction temperature allow reliability improvements for the SoC that operates at high temperatures. Low-resolution temperature measurements are efficiently used for adjusting the optimal operation frequency and supply voltage. We used the Time-to-Digital conversion technique to design a fully-digital temperature sensor. We designed and simulated a fully-digital 5-bit temperature sensor for 10 °C resolution temperature measurements between Tj,min = -45 °C and Tj,max = 125 °C. Further, using a single clock cycle Time-to-Digital conversion technique, we present a fully-digital 5-bit Pulse based Flash ADC implemented in a 0.18 µm Bulk CMOS process. Measurement results demonstrate a state-of-the-art power efficiency of 450 fJ/conv (fmax = 83 MHz, Ptot = 900 µW).

Proceedings Article

Harnessing Unreliable Cores in Heterogeneous Architecture: The PyDac Programming Model and Runtime

Bin Huang¹, Ron Sass¹, Nathan DeBardeleben², Sean Blanchard²

University of North Carolina at Charlotte¹, Los Alamos National Laboratory²

23 Jun 2014

TL;DR: This work proposes a Python-based task parallel programming model called PyDac, which provides a two-level programming model based on the divide-and-conquer strategy and shows that through the use of double and triple modular redundancy it is able to complete the benchmarks with the correct results while only incurring a proportional performance penalty.

Abstract: Heterogeneous many-core architectures combined with scratch-pad memories are attractive because they promise better energy efficiency than conventional architectures and a good balance between single-thread performance and multi-thread throughput. However, programmers will need an environment for finding and managing the large degree of parallelism, locality, and system resilience. We propose a Python-based task parallel programming model called PyDac to support these objectives. PyDac provides a two-level programming model based on the divide-and-conquer strategy. The PyDac runtime system allows threads to be run on unreliable hardware by dynamically checking the results without involvement from the programmer. To test this programming model and runtime, an unconventional heterogeneous architecture consisting of PowerPC and ARM cores was developed and emulated on an FPGA device. We inject faults during the execution of micro-benchmarks and show that through the use of double and triple modular redundancy we are able to complete the benchmarks with the correct results while only incurring a proportional performance penalty.
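
The idea of checking results through double and triple modular redundancy can be sketched in a few lines of Python. This is not the PyDac API or runtime: the function name `run_redundant`, the sequential execution of replicas, and the majority-vote rule are assumptions made for illustration, and results are assumed to be hashable.

```python
from collections import Counter


def run_redundant(task, args, replicas=3):
    """Run `task(*args)` several times and return the majority result.

    With replicas=3 a single faulty run is outvoted (triple modular
    redundancy). With replicas=2 a mismatch can only be detected, not
    corrected, so an error is raised (double modular redundancy).
    """
    results = [task(*args) for _ in range(replicas)]
    value, votes = Counter(results).most_common(1)[0]
    if votes * 2 <= replicas:
        raise RuntimeError("no majority among replicas")
    return value
```

Running every task two or three times multiplies the work by roughly the number of replicas, which matches the "proportional performance penalty" the abstract mentions.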

Patent•

Method of realizing FC (Fibre Channel) communication by POWERPC cloud storage platform adopting SCST

Li Gongchen, Gao Ming, Jin Changxin, Liu Qiang

19 Nov 2014

TL;DR: In this article, a method of realizing FC (Fibre Channel) communication by a POWERPC cloud storage platform adopting SCST is proposed, where related SCST software is ported to a Yocto system, modules are compiled into the kernel and the file system by compiling a recipe file of a source code package in the Yocto system, and smooth porting of the modules can be realized; then, the compiled SCST kernel and file system are downloaded onto a specific single board of the PowerPC; and finally, the configuration

Abstract: The invention discloses a method of realizing FC (Fibre Channel) communication by a POWERPC cloud storage platform adopting SCST. First, the related SCST software is ported to a Yocto system: the modules are compiled into the kernel and the file system by compiling a recipe file of the source code package in the Yocto system, and the source code package is modified for the PowerPC embedded platform so that the modules can be ported smoothly. Then, the compiled SCST kernel and file system are downloaded onto a specific single board of the PowerPC. Finally, the SCST configuration file scst.conf is modified, the drivers for SCST and the QLogic FC card generated through compilation are loaded, and the SCST administration tool scstadmin is used to realize FC communication for the cloud storage platform. The method has the advantages that the design is reasonable and reliable, the cost is low, and the performance is excellent, and it enables wide application of SCST to FC communication on PowerPC cloud storage platforms.

Integration of embedded data processing algorithms inside PAMELA devices

Mariano González, Eduardo Barrera Lopez de Turiso, Nestor Fernández Bernardo, Raúl Meléndez de Francisco, Ángel Alcaide Pardo, Gerardo Aranguren Aramendia, P. M. Monje

08 Jul 2014

TL;DR: The developed software architecture is presented and the necessary steps to add new data processing algorithms to SMA in order to increase the processing capabilities of PAMELA devices are described.

Abstract: The PAMELA (Phased Array Monitoring for Enhanced Life Assessment) SHM™ System is an integrated embedded system based on ultrasonic guided waves, consisting of several electronic devices and one system manager controller. The data collected by all PAMELA devices in the system must be transmitted to the controller, which is responsible for carrying out the advanced signal processing to obtain SHM maps. PAMELA devices consist of hardware based on a Virtex 5 FPGA with a PowerPC 440 running an embedded Linux distribution. Therefore, in addition to performing tests and transmitting the collected data to the controller, PAMELA devices are capable of performing local data processing or pre-processing (reduction, normalization, pattern recognition, feature extraction, etc.). Local data processing decreases the data traffic over the network and reduces the CPU load of the external computer. PAMELA devices can even run autonomously, performing scheduled tests and communicating with the controller only when structural damage is detected or when programmed to do so. Each PAMELA device integrates a software management application (SMA) that allows the developer to download their own algorithm code and add the new data processing algorithm to the device. The SMA is developed in a virtual machine with an Ubuntu Linux distribution that includes all the software tools needed for the entire development cycle. The Eclipse IDE (Integrated Development Environment) is used to develop the SMA project and to write the code of each data processing algorithm. This paper presents the developed software architecture and describes the necessary steps to add new data processing algorithms to the SMA in order to increase the processing capabilities of PAMELA devices. An example of basic damage index estimation using the delay-and-sum algorithm is provided.
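The abstract names the delay-and-sum algorithm without giving details. The generic sketch below shows only the core idea: each channel is shifted by its own delay so that echoes from a chosen point line up, and the aligned channels are then averaged. The function name, the integer sample delays, and the averaging step are assumptions; the paper's damage index computation is not reproduced here.

```python
def delay_and_sum(signals, delays):
    """Align each channel by its integer sample delay and average the result.

    signals: list of equal-length lists of samples, one list per transducer.
    delays:  list of non-negative integer delays (in samples), one per channel.
    """
    if len(signals) != len(delays):
        raise ValueError("one delay is required per channel")
    n = len(signals[0])
    out = [0.0] * n
    for channel, delay in zip(signals, delays):
        for i in range(n):
            j = i + delay  # sample of this channel that lines up with output i
            if 0 <= j < n:
                out[i] += channel[j]
    return [value / len(signals) for value in out]
```

When the delays match the true propagation differences, the echoes add coherently and the averaged peak keeps its full amplitude; with mismatched delays the same echoes are spread over several output samples and the peak drops.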

Proceedings Article•DOI•

FPGA-based hardware/software implementation for MIMO wireless communications

Korkeart Boonyi¹, Jukkrit Tagapanij¹, Akkarat Boonpoonga²•Institutions (2)

Mahanakorn University of Technology¹, King Mongkut's University of Technology North Bangkok²

19 Mar 2014

TL;DR: In this paper, resource utilization and operating performance are reported in terms of equivalent gates and operating cycles.

Abstract: This paper proposes an efficient architecture for FPGA implementation of MGS-QRD in MIMO wireless communication systems. The proposed architecture is based on a hardware/software (HW/SW) design. To achieve an efficient architecture, a systolic architecture is applied to MGS-QRD, and the conventional QR triangular array of (2m² + 2m + 1) cells is then mapped onto a linear architecture of m + 1 cells to reduce the number of required QR processors. The reduced cells are constructed from a number of basic processing elements such as multipliers and adders. These basic elements are implemented in hardware. Software running on the PowerPC core controls the hardware to carry out the QR decomposition. Resource utilization and operating performance are reported in terms of equivalent gates and operating cycles.
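For readers unfamiliar with MGS-QRD, the following is a minimal software sketch of QR decomposition by modified Gram-Schmidt. It is a plain reference version, not the systolic or linear-array hardware mapping the paper proposes, and it uses real-valued matrices for brevity even though MIMO channel matrices are usually complex. The function name `mgs_qr` is an assumption.

```python
import math


def mgs_qr(a):
    """QR decomposition by modified Gram-Schmidt.

    a: matrix as a list of n rows with m columns, assumed to have full column rank.
    Returns (q, r) with q an n x m matrix with orthonormal columns and
    r an m x m upper-triangular matrix such that a = q r.
    """
    n, m = len(a), len(a[0])
    # Work on a copy of the columns of a.
    v = [[float(a[i][j]) for i in range(n)] for j in range(m)]
    q_cols = []
    r = [[0.0] * m for _ in range(m)]
    for j in range(m):
        # Normalize the current column.
        r[j][j] = math.sqrt(sum(x * x for x in v[j]))
        q_j = [x / r[j][j] for x in v[j]]
        q_cols.append(q_j)
        # Immediately remove the q_j component from every remaining column;
        # this update order is what distinguishes modified from classical Gram-Schmidt.
        for k in range(j + 1, m):
            r[j][k] = sum(qx * vx for qx, vx in zip(q_j, v[k]))
            v[k] = [vx - r[j][k] * qx for qx, vx in zip(q_j, v[k])]
    q = [[q_cols[j][i] for j in range(m)] for i in range(n)]
    return q, r
```

The inner loops consist only of multiply-accumulate, subtract, divide, and square-root operations, which is why the algorithm maps naturally onto arrays of small hardware cells built from multipliers and adders.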

Patent•

PowerPC (Personal Computer) module with radiation resisting performance

Wang Hao, Zhu Xinzhong, Chen Jing, Shi Hongxin, Tian Wenbo

05 Feb 2014

TL;DR: In this article, a PowerPC module with radiation resisting performance is presented, which adopts a multi-chip laminated structure; unpackaged chips (PowerPC, FPGA (Field Programmable Gate Array), Flash, SRAM (Static Random Access Memory) and the like) with space radiation resisting indexes are selected to form a small-size PowerPC module with the radiation resisting performance.

Abstract: The invention discloses a PowerPC module with radiation resisting performance. The module adopts a multi-chip laminated structure; unpackaged chips (PowerPC, FPGA (Field Programmable Gate Array), Flash, SRAM (Static Random Access Memory) and the like) with space radiation resisting indexes are selected to form a small-size PowerPC module with the radiation resisting performance, making the module suitable for a space orbit environment. The FPGA is used inside the module as a peripheral bridge for PowerPC access, so that the PowerPC module has the advantages of flexibility, tailorability, expandability and the like. Because the unpackaged chips are interconnected directly, the interconnection length between chips becomes shorter; compared with a traditional computer, the PowerPC module therefore has improved signal transmission characteristics.

D Flip Flop with Different Technologies

Amit Grover, Sumer Singh

01 Jan 2014

TL;DR: A new implementation of efficient D-Flip-Flop (DFF) using Gate-Diffusion-Input (GDI) technique, PowerPC, DSTC, and HLFF is explained, showing advantages and drawbacks of GDI DFF as compared to other methods.

Abstract: This article explains a new implementation of an efficient D-Flip-Flop (DFF) using the Gate-Diffusion-Input (GDI) technique, PowerPC, DSTC, and HLFF. This DFF design reduces the power-delay product and the area of the circuit while maintaining low complexity of logic design. A performance comparison with other DFF design techniques is presented with respect to gate area, number of devices, delay and power dissipation, showing advantages and drawbacks of the GDI DFF as compared to other methods. Performance is evaluated by HSPICE simulation in 180 nm and 90 nm CMOS technology.

Proceedings Article•DOI•

Porting and systematic testing of an embedded RTOS

Anil K Vishwakarma¹, K.V Suresh¹, U.K Singh¹•Institutions (1)

Defence Research and Development Organisation¹

01 Dec 2014

TL;DR: The details of the port of the popular MicroC/OS-II RTOS to an avionics platform based on a PowerPC 7410 processor with multiple peripherals including RS422, MIL1553, digital IO and timers are presented.

Abstract: The use of a Real-Time Operating System (RTOS) is now quite common in embedded systems because of the requirement for multi-tasking. We present the details of our port of the popular MicroC/OS-II RTOS to an avionics platform based on a PowerPC 7410 processor with multiple peripherals including RS422, MIL1553, digital IO and timers. The systematic testing that we carried out to validate the port is also presented. Further, detailed performance measurements have been carried out, and we compare the performance of our MicroC/OS-II port with that of RT-Linux on the same platform.

Patent•

Digital phase position checking device by using identical clock source for calibrating sampling time

Zhang Mingfu, Li Shaofei, Mei Houxi, Jin Mingshuan, Li He, Geng Zhendong, Ren Xingqiang, An Ning, Wang Wei, Zhang Zhansheng

10 Dec 2014

TL;DR: In this article, a digital phase position checking device by using an identical clock source for calibrating sampling time has been presented, where the clock source is used to calibrate the sampling time.

Abstract: The invention discloses a digital phase position checking device by using an identical clock source for calibrating sampling time. The digital phase position checking device by using the identical clock source for calibrating the sampling time comprises an FPGA (field-programmable gate array) hardware unit used for front-end signal acquisition, a PowerPC microprocessor used for back-end data processing, an SDRAM (synchronous dynamic random access memory) and a FLASH, wherein the FPGA hardware unit is connected with the PowerPC microprocessor through a bus, and the SDRAM and the FLASH are connected with the bus. A synchronous pulse signal receiving interface, an A/D (analog/digital) sampling interface, two parallel FT3 data receiving interfaces and two parallel optical Ethernet data receiving interfaces are arranged at an input end of the FPGA hardware unit. The digital phase position checking device by using the identical clock source for calibrating the sampling time has the advantages of simple structure, convenience in manufacture and high practicality, phase position checking error caused by crystal oscillator error from different clock sources in the prior art can be eliminated, response consistency is guaranteed, and urgent demands of intelligent substations and digital substations on high precision of electrical quantity phase position checking can be satisfied.

Patent•

Method for achieving data synchronization on POWERPC framework cloud storage platform by adopting DRBD and RAPIDIO

Gao Ming, Jin Changxin, Liu Qiang

17 Sep 2014

TL;DR: In this paper, a method for achieving data synchronization on a POWERPC framework cloud storage platform by adopting DRBD and RapidIO is presented, where DRBD is ported to a Yocto system and, after reasonable configuration, the data synchronization function of the cloud storage platform is achieved through a RapidIO bus.

Abstract: The invention discloses a method for achieving data synchronization on a POWERPC framework cloud storage platform by adopting DRBD and RapidIO, and belongs to the technical field of cloud storage. The method comprises the following steps: DRBD is ported to a Yocto system, wherein DRBD is compiled into a file system through the recipe files of the DRBD source code package, and smooth porting of the module is achieved by modifying part of the files of the source code package for a PowerPC embedded platform; the file system with the compiled DRBD is downloaded onto a specific single board of the PowerPC framework; and DRBD is configured, wherein the DRBD configuration file drbd.conf is modified, and after reasonable configuration, the data synchronization function of the cloud storage platform is achieved through a RapidIO bus. The method has the advantages that the structure is reliable, the cost is low, and the performance is superior. The method can be widely applied to PowerPC framework cloud storage platforms.

Journal Article•DOI•

Intensive computing on a large data volume with a short-vector single instruction multiple data processor

Ioan Ungurean¹, Vasile Gheorghita Gaitan¹, Nicoleta-Cristina Gaitan¹•Institutions (1)

Ştefan cel Mare University of Suceava¹

25 Aug 2014 - IET Computers and Digital Techniques

TL;DR: The authors conclude that the PowerXCell 8i processor can be efficiently used for the execution of algorithms that require intensive computations on huge data volume.

Abstract: In this study, the authors evaluate the performance of the PowerXCell 8i processor, which is based on the Cell Broadband Engine architecture. For this purpose, the authors chose an algorithm for the k-nearest neighbour problem and optimised it to exploit the facilities provided by this architecture efficiently. The authors evaluated the PowerXCell 8i performance by executing the algorithm with single- and double-precision calculations. For both cases, the performance was evaluated with and without SIMDisation. For single-precision calculations, the authors achieved a maximum speed-up of 43.85 with SIMDisation by activating 6 synergistic processor elements (SPEs) and 39.73 without SIMDisation by activating 16 SPEs. For double-precision calculations, the authors achieved a maximum speed-up of 34.79 with SIMDisation by activating 9 SPEs and 32.71 without SIMDisation by activating 12 SPEs. These values are relative to execution on the PowerPC processor element and are due to the way the SPE cores access main memory, through DMA transfers that are performed in parallel with the computing operations. The authors conclude that this processor can be efficiently used for the execution of algorithms that require intensive computations on huge data volumes.
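For context, the k-nearest neighbour problem benchmarked in this study can be stated as a short brute-force search. The sketch below is a scalar Python reference, not the SIMD or SPE implementation from the paper; the function name `k_nearest` and the use of squared Euclidean distance are assumptions.

```python
import heapq


def k_nearest(points, query, k):
    """Return the k points closest to query, nearest first.

    points: iterable of coordinate tuples; query: a tuple of the same dimension.
    """
    def squared_distance(point):
        # The square root is skipped because it does not change the ordering.
        return sum((a - b) ** 2 for a, b in zip(point, query))

    return heapq.nsmallest(k, points, key=squared_distance)
```

Every distance computation is independent of the others, which is the property that makes the problem a natural fit for SIMD units and for distributing blocks of points across several processing elements.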

Patent•

Network data package filtering method based on Power PC hardware frame

Li Yan

10 Dec 2014

TL;DR: In this paper, a network data package filtering method based on a Power PC hardware frame is presented, which mainly solves the problems of poor efficiency, proneness to system crashes, inconvenient loading of filtering strategies and inconvenient debugging.

Abstract: The invention discloses a network data package filtering method based on a Power PC hardware frame and belongs to the technical field of network data package filtering. The method mainly solves the problems of poor efficiency, proneness to system crashes, inconvenient loading of filtering strategies and inconvenient debugging. In the technical scheme, the Power PC hardware frame adopts a PowerPC multi-core network processor, and the PowerPC multi-core network processor adopts DPAA technology. The method includes the following steps: acquiring a network data package in the user space of an operating system, and filtering the network data package according to a filtering matching list. An administrator can modify the content of the filtering matching list through a filtering rule configuration module, which keeps the filtering matching content up to date.
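The two steps named in this abstract, acquiring a packet in user space and checking it against a matching list, can be sketched as a first-match rule table. Everything here is hypothetical: the `Rule` fields, the `filter_packet` function, and the dictionary packet representation are illustrative assumptions, and the sketch does not use DPAA or any real packet-capture API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One entry of a filtering matching list (hypothetical layout)."""
    action: str                      # "accept" or "drop"
    src_ip: Optional[str] = None     # None means "match any value"
    dst_port: Optional[int] = None
    protocol: Optional[str] = None

    def matches(self, packet):
        # A rule matches when every field it specifies equals the packet's value.
        wanted = (
            ("src_ip", self.src_ip),
            ("dst_port", self.dst_port),
            ("protocol", self.protocol),
        )
        return all(want is None or packet.get(name) == want for name, want in wanted)


def filter_packet(packet, rules, default="accept"):
    """Return the action of the first matching rule, or the default action."""
    for rule in rules:
        if rule.matches(packet):
            return rule.action
    return default
```

Because the rule list is ordinary data, a configuration module can insert, remove, or reorder entries while the filter is running, which corresponds to the run-time modification of the matching list described in the abstract.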
