Showing papers on "Program counter" published in 2011

Open Access

Proceedings Article•DOI•

Aveksha: a hardware-software approach for non-intrusive tracing and profiling of wireless embedded systems

Matthew Tancreti¹, Mohammad Sajjad Hossain¹, Saurabh Bagchi¹, Vijay Raghunathan¹•Institutions (1)

01 Nov 2011

TL;DR: This paper designs an operating system-agnostic debug board that interfaces with the on-chip debug module of an embedded node's processor through the JTAG port and provides three modes of event logging and tracing: breakpoint, watchpoint, and program counter polling.

Abstract: It is important to get an idea of the events occurring in an embedded wireless node when it is deployed in the field, away from the convenience of an interactive debugger. Such visibility can be useful for post-deployment testing, replay-based debugging, and for performance and energy profiling of various software components. Prior software-based solutions to address this problem have incurred high execution overhead and intrusiveness. The intrusiveness changes the intrinsic timing behavior of the application, thereby reducing the fidelity of the collected profile. Prior hardware-based solutions have involved the use of dedicated ASICs or other tightly coupled changes to the embedded node's processor, which significantly limits their applicability. In this paper, we present Aveksha, a hardware-software approach for achieving the above goals in a non-intrusive manner. Our approach is based on the key insight that most embedded processors have an on-chip debug module (which has traditionally been used for interactive debugging) that provides significant visibility into the internal state of the processor. We design a debug board that interfaces with the on-chip debug module of an embedded node's processor through the JTAG port and provides three modes of event logging and tracing: breakpoint, watchpoint, and program counter polling. Using expressive triggers that the on-chip debug module supports, Aveksha can watch for, and record, a variety of programmable events of interest. A key feature of Aveksha is that the target processor does not have to be stopped during event logging (in the last two of the three modes), subject to a limit on the rate at which logged events occur. Aveksha also performs power monitoring of the embedded wireless node and, importantly, enables power consumption data to be correlated to events of interest. Aveksha is an operating system-agnostic solution. We demonstrate its functionality and performance using three applications running on Telos motes; two in TinyOS and one in Contiki. We show that Aveksha can trace tasks and other generic events at the function and task-level granularity. We also describe how we used Aveksha to find a subtle bug in the TinyOS low power listening protocol.

51 citations

Book Chapter•DOI•

A reversible processor architecture and its reversible logic design

Michael Kirkedal Thomsen¹, Holger Bock Axelsen¹, Robert Glück¹•Institutions (1)

University of Copenhagen¹

04 Jul 2011

TL;DR: All-in-all, this paper demonstrates that the design of a complete reversible computing architecture is possible and can serve as the core of a programmable reversible computing system.

Abstract: We describe the design of a purely reversible computing architecture, Bob, and its instruction set, BobISA. The special features of the design include a simple, yet expressive, locally-invertible instruction set, and fully reversible control logic and address calculation. We have designed an architecture with an ISA that is expressive enough to serve as the target for a compiler from a high-level structured reversible programming language. All-in-all, this paper demonstrates that the design of a complete reversible computing architecture is possible and can serve as the core of a programmable reversible computing system.

34 citations

Patent•

Processor operable to ensure code integrity

Andrew F. Glew, Daniel A. Gerrity, Clarence T. Tegreene

04 Aug 2011

TL;DR: In this paper, a processor can be used to ensure that program code can only be used for a designed purpose and not exploited by malware, using logic operable to execute a program instruction and to distinguish whether the program instruction is a legitimate branch instruction or a nonlegitimate branch instruction.

Abstract: A processor can be used to ensure that program code can only be used for a designed purpose and not exploited by malware. Embodiments of an illustrative processor can comprise logic operable to execute a program instruction and to distinguish whether the program instruction is a legitimate branch instruction or a non-legitimate branch instruction.

32 citations

Proceedings Article•DOI•

16-Bit RISC processor design for convolution application

Samiappa Sakthikumaran¹, S. Salivahanan¹, V. S. Kanchana Bhaaskaran¹•Institutions (1)

Sri Sivasubramaniya Nadar College of Engineering¹

03 Jun 2011

TL;DR: The utility of the RISC processor is extended towards the convolution application, one of the most important signal processing applications, and the total power dissipated by the processor is shown to be approximately 329.3 μW.

Abstract: In this paper, we propose a 16-bit non-pipelined RISC processor, which is used for signal processing applications. The processor consists of the following blocks: program counter, clock control unit, ALU, IDU and registers. Advantageous architectural modifications have been made in the incrementer circuit used in the program counter and the carry select adder unit of the ALU in the RISC CPU core. Furthermore, a high-speed and low-power modified Wallace tree multiplier has been designed and introduced in the design of the ALU. The RISC processor has been designed to execute a 27-instruction set. It is expandable up to 32 instructions, based on the user requirements. The processor has been realized using Verilog HDL, simulated using Modelsim 6.2 and synthesized using Synopsys. Power estimation and area estimation are done using Synopsys Design Vision with SAED 90nm CMOS technology, and timing estimation is done using Synopsys Primetime. In this paper, we have extended the utility of the processor towards the convolution application, which is one of the most important signal processing applications. The simulations show the total power dissipated by the processor to be approximately 329.3 μW with a total area of 65012 nm².

27 citations

Patent•

Techniques for handling divergent threads in a multi-threaded processing system

Lin Chen¹, David Rigel Garcia Garcia¹, Andrew Evan Gruber¹, Guofang Jiao¹•Institutions (1)

Qualcomm¹

07 Sep 2011

TL;DR: In this paper, the authors describe techniques for handling divergent thread conditions in a multi-threaded processing system, where the control flow unit may select one of the target program counter values and a minimum resume counter value as a value to load into the program counter register.

Abstract: This disclosure describes techniques for handling divergent thread conditions in a multi-threaded processing system. In some examples, a control flow unit may obtain a control flow instruction identified by a program counter value stored in a program counter register. The control flow instruction may include a target value indicative of a target program counter value for the control flow instruction. The control flow unit may select one of the target program counter value and a minimum resume counter value as a value to load into the program counter register. The minimum resume counter value may be indicative of a smallest resume counter value from a set of one or more resume counter values associated with one or more inactive threads. Each of the one or more resume counter values may be indicative of a program counter value at which a respective inactive thread should be activated.
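
The selection step described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract says the control flow unit selects one of the two values but does not say how, so the policy of taking the smaller address is an assumption, and the function names `min_resume_counter` and `select_next_pc` are hypothetical.

```python
from typing import Iterable, Optional


def min_resume_counter(resume_counters: Iterable[int]) -> Optional[int]:
    """Smallest resume counter among the inactive threads, or None if there are none."""
    counters = list(resume_counters)
    return min(counters) if counters else None


def select_next_pc(target_pc: int, resume_counters: Iterable[int]) -> int:
    """Choose the value to load into the program counter register.

    Assumed policy (not stated in the abstract): take the smaller of the
    branch target and the minimum resume counter, so that no inactive
    thread's resume point is skipped over.
    """
    mrc = min_resume_counter(resume_counters)
    if mrc is None:
        # No inactive threads: simply follow the control flow instruction.
        return target_pc
    return min(target_pc, mrc)
```

With inactive threads waiting at addresses 24 and 56, a branch to address 40 would load 24 under this assumed policy, while a branch to address 16 would be taken directly.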

15 citations

Patent•

Instruction predication using unused datapath facilities

Adam J. Muff¹, Paul E. Schardt¹, Robert A. Shearer¹, Matthew R. Tubbs¹•Institutions (1)

IBM¹

19 Dec 2011

TL;DR: In this paper, a method and circuit arrangement for selectively predicating an instruction in an instruction stream based upon a value corresponding to a predication register address indicated by a portion of an operand associated with the instruction is presented.

Abstract: A method and circuit arrangement for selectively predicating an instruction in an instruction stream based upon a value corresponding to a predication register address indicated by a portion of an operand associated with the instruction. A first compare instruction in an instruction stream stores a compare result at a register address of a predication register. The register address of the predication register is stored in a portion of an operand associated with a second instruction, and during decoding of the second instruction, the predication register is accessed to determine a value stored at the register address of the predication register, and the second instruction is selectively predicated based on the value stored at the register address of the predication register.

14 citations

Patent•

Multi-threaded instruction buffer design

Jama I. Barreh¹, Robert T. Golla, Manish K. Shah•Institutions (1)

Business International Corporation¹

07 Mar 2011

TL;DR: In this article, an instruction buffer for a processor configured to execute multiple threads is described, where the instruction buffer is configured to receive instructions from a fetch unit and provide instructions to a selection unit.

Abstract: An instruction buffer for a processor configured to execute multiple threads is disclosed. The instruction buffer is configured to receive instructions from a fetch unit and provide instructions to a selection unit. The instruction buffer includes one or more memory arrays comprising a plurality of entries configured to store instructions and/or other information (e.g., program counter addresses). One or more indicators are maintained by the processor and correspond to the plurality of threads. The one or more indicators are usable such that for instructions received by the instruction buffer, one or more of the plurality of entries of a memory array can be determined as a write destination for the received instructions, and for instructions to be read from the instruction buffer (and sent to a selection unit), one or more entries can be determined as the correct source location from which to read.

13 citations

Patent•

Controlling simulation systems

Hideaki Komatsu¹, Shingo Nagai¹, Fumitomo Ohsawa¹•Institutions (1)

IBM¹

27 Sep 2011

TL;DR: In this paper, a method for controlling a simulation system includes storing first-stage and second-stage tables in which a value of a predicted time until arrival of an I/O instruction and a type of the instruction are included as entries for each program counter of an instruction set simulator, and in which the value of an earliest time until an output event from a peripheral simulator is included as an entry for each type of instruction.

Abstract: A method for controlling a simulation system includes storing first-stage and second-stage tables in which a value of a predicted time until arrival of an I/O instruction and a type of the instruction are included as entries for each program counter of an instruction set simulator, and in which a value of an earliest time until an output event from a peripheral simulator is included as an entry for each type of instruction; looking up the first-stage table to obtain the type of the instruction and the value of the predicted time until arrival of the instruction; looking up the second-stage table with reference to the obtained type of the instruction to obtain the value of the earliest time until the output event from the peripheral simulator; and returning the predicted time until arrival of the instruction and the earliest time until the output event from the peripheral simulator.
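
The two-stage lookup described in this abstract can be sketched as two dictionaries. The sketch is illustrative only: the table contents, the instruction type names, the time units and the function name `lookup_times` are all invented for the example and do not come from the patent.

```python
from typing import Dict, Tuple

# First-stage table (hypothetical contents): program counter value ->
# (type of the upcoming I/O instruction, predicted time until it arrives).
FIRST_STAGE: Dict[int, Tuple[str, int]] = {
    0x100: ("uart_write", 120),
    0x140: ("timer_read", 40),
}

# Second-stage table (hypothetical contents): I/O instruction type ->
# earliest time until an output event from the peripheral simulator.
SECOND_STAGE: Dict[str, int] = {
    "uart_write": 300,
    "timer_read": 15,
}


def lookup_times(pc: int) -> Tuple[int, int]:
    """Return (predicted time until the I/O instruction, earliest output event time)."""
    io_type, predicted_time = FIRST_STAGE[pc]   # first-stage lookup, keyed by PC
    earliest_event = SECOND_STAGE[io_type]      # second-stage lookup, keyed by type
    return predicted_time, earliest_event
```

Keying the second table by instruction type rather than by program counter keeps it small, since many program counter values can share one entry.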

8 citations

Patent•

Processor architecture special for high-performance programmable logic controller (PLC)

Shuting Zeng, Zhijia Yang, Yan Lu, Chuang Xie, Zhifeng Liu, Duan Maoqiang

19 Jan 2011

TL;DR: A processor architecture in which a PLC application specific instruction processor (ASIP) is connected to a general processor via an interface; the ASIP includes an instruction memory, an instruction counter, an instruction register, an instruction decoder, a control unit, a functional block unit, a functional block register group, a data memory, a register group, a bits processor, a jump & call instruction and access instruction processing unit, an I/O data memory and a state register.

Abstract: The utility model relates to a processor architecture special for a high-performance programmable logic controller (PLC). The processor architecture includes a PLC application specific instruction processor (ASIP) and a general processor, wherein the PLC ASIP is connected to the general processor via the interface of the PLC ASIP and the general processor; the PLC ASIP includes an instruction memory, an instruction counter, an instruction register, an instruction decoder, a control unit, a functional block unit, a functional block register group, a data memory, a register group, a bits processor, a jump & call instruction and access instruction processing unit, an I/O data memory and a state register. The processor architecture reduces the number of instructions executed by the PLC processor and accelerates the execution speed of the PLC program by designing a PLC-specific instruction set that accords with the PLC instruction characteristics, thereby improving the processing performance of the PLC processor on functional block instructions.

7 citations

Patent•

Storing a target address of a control transfer instruction in an instruction field

Jama I. Barreh, Manish K. Shah, Christopher H. Olson

30 Nov 2011

TL;DR: A control transfer instruction (CTI) is modified to include at least a portion of the computed target address, and information indicating that this modification has been performed is stored, for example, in a pre-decode bit; in some cases, CTI modification may be performed only when a target address is a "near" target, rather than a "far" target.

Abstract: A control transfer instruction (CTI), such as a branch, jump, etc., may have an offset value for a control transfer that is to be performed. The offset value may be usable to compute a target address for the CTI (e.g., the address of a next instruction to be executed for a thread or instruction stream). The offset may be specified relative to a program counter. In response to detecting a specified offset value, the CTI may be modified to include at least a portion of a computed target address. Information indicating this modification has been performed may be stored, for example, in a pre-decode bit. In some cases, CTI modification may be performed only when a target address is a “near” target, rather than a “far” target. Modifying CTIs as described herein may eliminate redundant address calculations and produce a savings of power and/or time in some embodiments.
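
A rough Python sketch of the idea follows. Several details are assumptions made for illustration and are not taken from the patent: the 16-bit field width (`OFFSET_BITS`), the definition of a "near" target as one that shares its upper address bits with the program counter, and the names `CTI`, `rewrite_cti` and `resolve_target`.

```python
from dataclasses import dataclass

OFFSET_BITS = 16                    # hypothetical width of the CTI's offset field
FIELD_MASK = (1 << OFFSET_BITS) - 1


@dataclass(frozen=True)
class CTI:
    field: int                # PC-relative offset, or low target bits once rewritten
    predecoded: bool = False  # models the pre-decode bit marking a rewritten CTI


def rewrite_cti(cti: CTI, pc: int) -> CTI:
    """Replace the offset with low target-address bits when the target is "near"."""
    if cti.predecoded:
        return cti
    target = pc + cti.field
    # Assumed meaning of "near": the target shares its upper bits with the PC,
    # so the low OFFSET_BITS bits are enough to rebuild the full address.
    if (target >> OFFSET_BITS) == (pc >> OFFSET_BITS):
        return CTI(field=target & FIELD_MASK, predecoded=True)
    return cti  # "far" target: leave the instruction unmodified


def resolve_target(cti: CTI, pc: int) -> int:
    """Compute the target, skipping the addition for rewritten CTIs."""
    if cti.predecoded:
        return ((pc >> OFFSET_BITS) << OFFSET_BITS) | cti.field
    return pc + cti.field
```

Once a CTI has been rewritten, resolving its target needs only a bit concatenation rather than an addition, which is the kind of redundant address calculation the abstract says can be eliminated.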

5 citations

Patent•

Methods and apparatus for changing a sequential flow of a program using advance notice techniques

James Norris Dieffenderfer, Michael William Morrow

28 Jun 2011

TL;DR: In this article, a processor implements an apparatus and a method for providing advance notice of an indirect branch address, where a target address generated by an instruction is automatically identified and a next program address is prepared based on a most current target address before a branch instruction utilizing the most current indirect target address is speculatively executed.

Abstract: A processor implements an apparatus and a method for providing advance notice of an indirect branch address. A target address generated by an instruction is automatically identified. A next program address is prepared based on a most current target address before an indirect branch instruction utilizing the most current target address is speculatively executed. The apparatus suitably employs a register for holding an instruction memory address that is specified by a program as a most current indirect address of an indirect branch instruction. The apparatus also employs a next program address selector that selects the most current indirect address from the register as the next program address for use in speculatively executing the indirect branch instruction.

Patent•

Instruction predication using instruction address pattern matching

Mark J. Hickey¹, Adam J. Muff¹, Matthew R. Tubbs¹, Charles D. Wait¹•Institutions (1)

IBM¹

16 Dec 2011

TL;DR: A method that prevents execution of an instruction based at least in part on determining that the address of the instruction is within a range of addresses.

Abstract: A particular method includes receiving, at a processor, an instruction and an address of the instruction. The method also includes preventing execution of the instruction based at least in part on determining that the address is within a range of addresses.

Patent•

Method and device for recombining runtime instruction

Jiaxiang Wang

29 Apr 2011

TL;DR: An instruction running environment is buffered, the machine instruction segment to be scheduled is obtained, and a second jump instruction, which directs to an entry address of an instruction recombining platform, is inserted before the last instruction of the obtained machine instruction segment to generate the recombined instruction segment comprising the address A″; the value A of the address register of the buffered instruction running environment is then modified to the address A″.

Abstract: A method for recombining runtime instruction comprising: an instruction running environment is buffered; the machine instruction segment to be scheduled is obtained; the second jump instruction which directs an entry address of an instruction recombining platform is inserted before the last instruction of the obtained machine instruction segment to generate the recombined instruction segment comprising the address A″; the value A of the address register of the buffered instruction running environment is modified to the address A″; the instruction running environment is recovered. A device for recombining the runtime instruction comprising: an instruction running environment buffering and recovering unit suitable for buffering and recovering the instruction running environment; an instruction obtaining unit suitable for obtaining the machine instruction segment to be scheduled; an instruction recombining unit suitable for generating the recombined instruction segment comprising the address A″; and an instruction replacing unit suitable for modifying the value of the address register of the buffered instruction running environment to the address of the recombined instruction segment. The monitoring and control of the runtime instruction of the computing device is completed.

Posted Content•

Deterministic Real-time Thread Scheduling

Heechul Yun, Cheolgi Kim, Lui Sha

12 Apr 2011-arXiv: Operating Systems

TL;DR: By introducing the concept of Worst Case Executable Instructions (WCEI), the main idea is to use timing insensitive deterministic events, e.g., an instruction counter, in conjunction with a real-time clock to schedule threads.

Abstract: A race condition is a timing-sensitive problem. A significant source of timing variation comes from nondeterministic hardware interactions such as cache misses. While data race detectors and model checkers can check races, the enormous state space of complex software makes it difficult to identify all of the races, and the residual implementation errors remain a big challenge. In this paper, we propose deterministic real-time scheduling methods to address scheduling nondeterminism in uniprocessor systems. The main idea is to use timing-insensitive deterministic events, e.g., an instruction counter, in conjunction with a real-time clock to schedule threads. By introducing the concept of Worst Case Executable Instructions (WCEI), we guarantee both determinism and real-time performance.

Patent•

Program generating device, program generating program, and program generating method

Makoto Katsukura¹, Masanori Nakata¹•Institutions (1)

Mitsubishi Electric¹

26 Jan 2011

TL;DR: In this paper, a terminal device that is this program generating device generates program area specifying information that specifies the placement area of an operating program executed by a remote control device, and then the terminal device appends a program specifying process to a measuring program that measures the execution state of the operating program.

Abstract: A terminal device that is this program generating device generates program area specifying information that specifies the placement area of an operating program executed by a remote control device. Also, on the basis of the program area specifying information and a program counter value of the remote control device, the terminal device appends a program specifying process that specifies the operating program executed by the remote control device to a measuring program that measures the execution state of the operating program. As a result, there is generated a measuring program that measures changes in the operating state of software in real time while reducing the effect on the operation of the software.

Patent•

Multi-thread processors and methods for instruction execution and synchronization therein and computer program products thereof

Yangang Zhang¹•Institutions (1)

VIA Technologies¹

08 Mar 2011

TL;DR: In this article, methods for instruction execution and synchronization in a multi-thread processor are provided, where each thread is provided a counter value pointing to one of the instructions in the instruction sequence.

Abstract: Methods for instruction execution and synchronization in a multi-thread processor are provided, wherein in the multi-thread processor, multiple threads are running and each of the threads can simultaneously execute a same instruction sequence. A source code or an object code is received and then compiled to generate the instruction sequence. Instructions for all of function calls within the instruction sequence are sorted according to a calling order. Each thread is provided a counter value pointing to one of the instructions in the instruction sequence. A main counter value is determined according to the counter values of the threads such that all of the threads simultaneously execute an instruction of the instruction sequence that the main counter value points to.

Patent•

System and method to evaluate a data value as an instruction

Lucian Codrescu¹, Erich James Plondke¹, Suresh K. Venkumahanti¹•Institutions (1)

Qualcomm¹

23 May 2011

TL;DR: In this paper, a system and method to evaluate a data value as an instruction is disclosed, and an apparatus configured to execute program code includes an execute unit that can execute a first instruction associated with a location of a second instruction.

Abstract: A system and method to evaluate a data value as an instruction is disclosed. For example, an apparatus configured to execute program code includes an execute unit configured to execute a first instruction associated with a location of a second instruction. The first instruction is identified by a program counter. The apparatus also includes a decode unit configured to receive the second instruction from the location and to decode the second instruction to generate a decoded second instruction without changing the program counter to point to the second instruction. The first and second instructions are virtual machine instructions and the execute unit is adapted to interpret these virtual machine instructions.

Patent•

Fault-tolerant processor

Tsar Kov Aleksej Nikolaevich, Arjashev Sergej Ivanovich, Bobkov Sergej Genad Evich, Borodaj Vladimir Ehrnestovich, Vasilegin Boris Vladimirovich, Nagaev Konstantin Dmitrievich, Osipenko Pavel Nikolaevich, Pavlov Aleksandr Alekseevich, Khoruzhenko Oleg Vladimirovich

27 Apr 2011

TL;DR: In this article, a fault-tolerant processor includes two basic devices: a master node and an operating node, which includes an operation code decoder, a clock pulse generator, a control unit, an instruction counter, an address register and a correlation unit.

Abstract: FIELD: information technology. SUBSTANCE: fault-tolerant processor includes two basic devices: a master node and an operating node. The master node includes an operation code decoder, a clock pulse generator, a control unit, an instruction counter, an address register and a correlation unit. The operating node includes a shift counter, a number register, an accumulator register, an extra register, an extra code register, an adder and a control unit. The technical result is achieved by including a correlation unit for detecting and correcting errors of the processor control memory, as well as by including a control unit which makes it possible to detect and correct errors of the arithmetic logic unit when performing arithmetic and logic operations. EFFECT: high failure-tolerance of the processor due to detection and correction of errors. 8 cl, 8 dwg

Patent•

Instruction fetch apparatus, processor and program counter addition control method

Hitoshi Kai¹, Hiroaki Sakaguchi¹, Hiroshi Kobayashi¹, Katsuhiko Metsugi¹, Haruhisa Yamamoto¹, Yousuke Morita¹, Hasegawa Koichi¹, Taichi Hirao¹•Institutions (1)

Sony Broadcast & Professional Research Laboratories¹

10 Feb 2011

TL;DR: In this paper, an instruction fetch apparatus is disclosed which includes a program counter configured to manage the address of an instruction targeted to be executed in a program in which instructions belonging to a plurality of instruction sequences are placed sequentially.

Abstract: An instruction fetch apparatus is disclosed which includes: a program counter configured to manage the address of an instruction targeted to be executed in a program in which instructions belonging to a plurality of instruction sequences are placed sequentially; a change designation register configured to designate a change of an increment value on the program counter; an increment value register configured to hold the changed increment value; and an addition control section configured such that if the change designation register designates the change of the increment value on the program counter, then the addition control section increments the program counter based on the changed increment value held in the increment value register, the addition control section further incrementing the program counter by an instruction word length if the change designation register does not designate any change of the increment value on the program counter.
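
The addition control rule in this abstract reduces to a single conditional, sketched below in Python. The class and field names and the 4-byte default instruction word length are assumptions chosen for the example, not details from the patent.

```python
from dataclasses import dataclass


@dataclass
class FetchState:
    pc: int = 0                      # program counter
    change_designated: bool = False  # models the change designation register
    increment_value: int = 0         # models the increment value register


def advance_pc(state: FetchState, word_length: int = 4) -> int:
    """Advance the PC by the changed increment if designated, else by one word."""
    step = state.increment_value if state.change_designated else word_length
    state.pc += step
    return state.pc
```

If, for instance, two instruction sequences were interleaved word by word, designating an increment of twice the word length would let the fetch unit follow only one of them. That scenario is an illustration of why such a register pair might be useful, not a claim about the patent's embodiments.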

Patent•

Methods and apparatus for saving conditions prior to a reset for post reset evaluation

Thomas Andrew Sartorius¹, Subodh Singh¹•Institutions (1)

Qualcomm¹

02 Dec 2011

TL;DR: A processor reset control circuit is configured to automatically capture a pre-reset value of processor information stored in one or more hardware registers, as part of a reset operation state machine and prior to changing the processor information to its architecturally required post reset value.

Abstract: A processor reset control circuit is configured to automatically capture a pre-reset value of processor information stored in one or more hardware registers, as part of a reset operation state machine and prior to changing the processor information to its architecturally required post reset value. Such pre-reset processor information includes, for example one or more pre-reset values of the processor program counter (PC) and one or more pre-reset values of an operating-state mode register, both of which may be captured in one or more pre-reset capture storage devices which are then made available for evaluation purposes. Such pre-reset capture storage devices store pre-reset information in response to the reset and maintain the stored pre-reset information until another reset occurs.
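
The capture-before-overwrite ordering described here can be modeled in software as follows. This is a behavioral sketch only; the reset values `RESET_PC` and `RESET_MODE` and the `Processor` class are hypothetical stand-ins for the hardware registers and the reset state machine.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

RESET_PC = 0x0000   # hypothetical architecturally required post-reset PC
RESET_MODE = 0      # hypothetical post-reset operating-state mode


@dataclass
class Processor:
    pc: int = RESET_PC
    mode: int = RESET_MODE
    # Pre-reset capture storage: holds (pc, mode) from just before the last reset.
    pre_reset_capture: Optional[Tuple[int, int]] = None

    def reset(self) -> None:
        # Step 1: capture the pre-reset values before they are overwritten.
        self.pre_reset_capture = (self.pc, self.mode)
        # Step 2: only then force the registers to their post-reset values.
        self.pc = RESET_PC
        self.mode = RESET_MODE
```

The captured pair stays available for post-reset evaluation until the next reset replaces it, which matches the retention behavior the abstract describes.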

Patent•

Instruction address adjustment in response to logically non-significant operations

Mark J. Hickey¹, Adam J. Muff¹, Matthew R. Tubbs¹, Charles D. Wait¹•Institutions (1)

IBM¹

26 Oct 2011

TL;DR: In this paper, a method, apparatus, and program product execute instructions of an instruction stream and detect logically non-significant operations in the instruction stream, then, based on that detection, a target or source address of a subsequent instruction is adjusted.

Abstract: A method, apparatus, and program product execute instructions of an instruction stream and detect logically non-significant operations in the instruction stream. Then, based on that detection, a target or source address of a subsequent instruction is adjusted. In some instances, doing so enables a greater number of addresses, e.g., registers, to be accessed in a given number of bit positions within an instruction format.

Patent•

4-bit RISC (Reduced Instruction-Set Computer) microcontroller

Chen Qinxue, Ding Dongmin, Jin Xiang

19 Oct 2011

TL;DR: In this paper, a 4-bit RISC (Reduced Instruction-Set Computer) microcontroller with a control module, a program memory, a register file, a reset module and at least one peripheral function module is presented.

Abstract: The invention provides a 4-bit RISC (Reduced Instruction-Set Computer) microcontroller comprising a control module, a program memory, a register file, a reset module, a clock module and at least one peripheral function module; the control module comprises an instruction register, an instruction encoder, a stack, an ALU (Arithmetic Logic Unit) and a program counter; a two-stage, two-phase pipeline architecture is adopted; each instruction cycle is divided into a first phase and a second phase; in the first phase, the control module completes the operations of stack push, program memory reading, register reading, instruction encoding and ALU arithmetic; and in the second phase, the control module completes the operations of stack pop, instruction register latching, register writing and program counter rewriting. The invention further provides an intelligent toy comprising a control chip. The control chip has improved voice playing effect and lower production cost.

Book Chapter•DOI•

Managing complexity through abstraction: a refinement-based approach to formalize instruction set architectures

Fangfang Yuan¹, Stephen Wright¹, Kerstin Eder¹, David May•Institutions (1)

University of Bristol¹

26 Oct 2011

TL;DR: This paper presents a method to specify, design and construct sound and complete ISAs by stepwise refinement and formal proof using the formal method Event-B, and develops a generic ISA modeling template in Event-B to facilitate reuse.

Abstract: Verifying the functional correctness of a processor requires a sound and complete specification of its Instruction Set Architecture (ISA). Current industrial practice is to describe a processor's ISA informally using natural language often with added semi-formal notation to capture the functional intent of the instructions. This leaves scope for errors and inconsistencies. In this paper we present a method to specify, design and construct sound and complete ISAs by stepwise refinement and formal proof using the formal method Event-B. We discuss how the automatically generated Proof Obligations help to ensure self-consistency of the formal ISA model, and how desirable properties of ISAs can be enforced within this modeling framework. We have developed a generic ISA modeling template in Event-B to facilitate reuse. The key value of reusing such a template is increased model integrity. Our method is now being used to formalize the ISA of the XMOS XCore processor with the aim to guarantee that the documentation of the XCore matches the silicon and the silicon matches the architectural intent.

Patent•

Semiconductor circuit and design device

Masayuki Tsuji

01 Dec 2011

TL;DR: A semiconductor circuit that enables a processor to quickly start a hardware accelerator: when the program counter reaches the hardware accelerator start address, the processor writes the function's argument data at the stack pointer address in memory, and the accelerator reads the arguments from that address.

Abstract: PROBLEM TO BE SOLVED: To provide a semiconductor circuit that enables a processor to quickly start a hardware accelerator. SOLUTION: The semiconductor circuit is provided with: a memory (214) for storing data; a processor (212) that executes a program and, when the value of the program counter indicating the address of the program under execution becomes a hardware accelerator start address, writes the argument data of a function of the program to the stack pointer address in the memory and outputs the address of the stack pointer; and a hardware accelerator (213) that, when the value of the program counter of the processor becomes the hardware accelerator start address, receives the address of the stack pointer from the processor, reads the argument data of the function from the memory based on that address, and performs the processing of the function, implemented in hardware, using the argument data.
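
As a rough illustration of the start mechanism this abstract describes, the following Python sketch models a processor that, on reaching a designated program counter value, writes the function arguments at the stack pointer address and passes that address to an accelerator. The names `ACCEL_START_ADDR`, `Accelerator` and `processor_step`, and the use of summation as the hardwired function, are hypothetical and not taken from the patent.

```python
ACCEL_START_ADDR = 0x200  # hypothetical accelerator start address


class Accelerator:
    """Stands in for the hardware accelerator (213)."""

    def run(self, memory, sp, argc):
        # Read the argument data from memory, starting at the stack pointer.
        args = memory[sp:sp + argc]
        # Hypothetical hardwired function: sum of the arguments.
        return sum(args)


def processor_step(pc, memory, sp, args, accel):
    """Model one step of the processor (212).

    Returns the accelerator result when the program counter equals the
    start address, otherwise None (normal software execution, not modeled).
    """
    if pc != ACCEL_START_ADDR:
        return None
    # Write the argument data at the stack pointer address in memory.
    memory[sp:sp + len(args)] = args
    # Output the stack pointer address; the accelerator takes it from here.
    return accel.run(memory, sp, len(args))
```

In this model the only hand-off between processor and accelerator is the stack pointer address, which is the property the abstract credits for the quick start.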

Patent•

Byte-oriented microcontroller having wider program memory bus supporting macro instruction execution, accessing return address in one clock cycle, storage accessing operation via pointer combination, and increased pointer adjustment amount

Hsiao-Ming Huang

06 Jul 2011

TL;DR: A byte-oriented microcontroller includes a program memory, a program memory bus wider than one instruction byte, and a core circuit that executes at least one instruction by processing a plurality of instruction bytes fetched from the program memory.

Abstract: An exemplary byte-oriented microcontroller includes a program memory, a program memory bus, and a core circuit. The program memory bus has a bus width wider than one instruction byte, and the core circuit is coupled to the program memory through the program memory bus for executing at least one instruction by processing a plurality of instruction bytes fetched from the program memory. The core circuit includes a fetch unit, for fetching the instruction bytes through the program memory bus and re-ordering the fetched instruction bytes to form a complete instruction.

Proceedings Article•DOI•

Microblaze: an application-independent fpga-based profiler (abstract only)

Fadi Obeidat¹, Robert H. Klenke¹•Institutions (1)

Virginia Commonwealth University¹

27 Feb 2011

TL;DR: This research proposes an application-independent profiling technique for the MicroBlaze/FPGA platform in which library or user-defined functions are profiled by tracing the unique instruction flow that distinguishes functions from each other, rather than by monitoring the program counter value (the addresses of the functions).

Abstract: Monitoring the functional behavior of an application is an important capability that assists in exploring the performance of a target application against different SW/HW implementations. Recently, there have been efforts to exploit the ability to trace the internal signals of soft-core processors for developing FPGA-based profiling tools to monitor programs running on these processors. However, these previously developed techniques are application-dependent, i.e., they require the designer to edit either the HDL code or the application code to obtain the desired trace information when targeting new applications. In this research, we propose an application-independent profiling technique for the MicroBlaze/FPGA platform in which library or user-defined functions are profiled by tracing the unique instruction flow that distinguishes functions from each other, rather than by monitoring the program counter value (the addresses of the functions). Hence, modifying the application code or targeting a new application does not require reconfiguring the FPGA or editing the application code in order to profile the same functions. This technique can be used to analyze the target application at the source code level, observing the dominant operations and demanded resources that characterize the system behavior. In addition, this technique can assist in selecting the appropriate processor architecture for a given application by considering MicroBlaze as a reference architecture from which the functional behavior of the target application can be mapped to the performance of other architectures.

Patent•

Method for hiding texture latency and managing registers on a processor

Simon Moy, Shihao Wang, Zhengqian Qiu

14 Dec 2011

TL;DR: A method for hiding texture latency in a multi-thread virtual pipeline (MVP) processor: the registers of the MVP kernel instances are segmented into register sets of the same length, and a shader thread gives up its processing time slot after sending a texture detail request, with a Program Counter (PC) value set for its return, and restarts once the texture detail is returned.

Abstract: A method for hiding texture latency in a multi-thread virtual pipeline (MVP) processor including the steps of: allowing the MVP processor to start running a main rendering program; segmenting registers of various MVP kernel instances in the MVP processor according to the length set, acquiring a plurality of register sets with the same length, binding the register sets to chipsets of the processor at the beginning of the running of the kernel instance; allowing a shader thread to give up a processing time slot occupied by the shader thread after sending a texture detail request, and setting a Program Counter (PC) value in the case of return; and returning texture detail and allowing the shader thread to restart running.

Patent•

Next-instruction-type field

Jorn Nystad

24 Aug 2011

TL;DR: A next-instruction-type field in each program instruction is used to select the processing pipeline to which the next instruction is issued, before that next instruction has been fetched and decoded.

Abstract: A graphics processing unit core (26) includes a plurality of processing pipelines (38, 40, 42, 44). A program instruction of a thread of program instructions being executed by a processing pipeline includes a next-instruction-type field (36) indicating an instruction type of a next program instruction following the current program instruction within the processing thread concerned. This next-instruction-type field is used to control selection of to which processing pipeline the next instruction is issued before that next instruction has been fetched and decoded. The next-instruction-type field may be passed along the processing pipeline as the least significant four bits within a program counter value associated with a current program instruction (32). The next-instruction-type field may also be used to control the forwarding of thread state variables between processing pipelines when a thread migrates between processing pipelines prior to the next program instruction being fetched or decoded.
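
The idea of carrying the next-instruction-type field in the least significant four bits of a program counter value can be sketched in Python as below. This is only an illustration under stated assumptions: it assumes instruction addresses are 16-byte aligned so that the low four bits are free, and the type codes and pipeline names in `PIPELINES` are hypothetical, not taken from the patent.

```python
TYPE_BITS = 4
TYPE_MASK = (1 << TYPE_BITS) - 1  # 0b1111

# Hypothetical mapping from instruction-type code to processing pipeline.
PIPELINES = {0: "arithmetic", 1: "load_store", 2: "texture", 3: "branch"}


def pack(pc, next_type):
    """Store the next-instruction type in the low four bits of the PC value."""
    if pc & TYPE_MASK:
        raise ValueError("pc must be 16-byte aligned (assumption of this sketch)")
    if not 0 <= next_type <= TYPE_MASK:
        raise ValueError("next_type must fit in four bits")
    return pc | next_type


def unpack(value):
    """Split a packed value back into (pc, next_type)."""
    return value & ~TYPE_MASK, value & TYPE_MASK


def select_pipeline(value):
    """Choose the pipeline for the next instruction without decoding it."""
    _, next_type = unpack(value)
    return PIPELINES[next_type]
```

For example, `select_pipeline(pack(0x1000, 2))` picks the hypothetical texture pipeline using only the packed value, which mirrors how the selection can happen before the next instruction is fetched.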

Proceedings Article•DOI•

A 5-stage pipelined embedded processor with optimized handling exception

Wenjiang Li, Song Zhang, Xiong Jiang, Yaohui Zhang

01 Dec 2011

TL;DR: This paper presents the design and implementation of an embedded processor, xCore_AHB, featuring precise interrupts and exceptions and compatible with the ARMv4 architecture; its precise exception mechanism provides both quick entry into the interrupt handlers and the correct return address for them, by means of an additional program counter in the write-back stage of the pipeline and its support circuits.

Abstract: This paper presents the design and implementation of an embedded processor, xCore_AHB, featuring precise interrupts and exceptions, which is compatible with the ARMv4 architecture. The precise exception mechanism of this design provides not only quick entry into the interrupt handlers but also the correct return address for them, by means of an additional program counter in the write-back stage of the pipeline and its support circuits. The proposed exception controller saves about 30% area compared with the traditional exception controller. The proposed design has been implemented in SMIC's 0.18 µm 1P6M CMOS process. The chip delivers 1.2 DMIPS at a frequency of 100 MHz with 33 mW power dissipation.

Patent•

Information processing device and emulation processing program and method

Nakayama Takashi¹, Kazuyoshi Watanabe¹•Institutions (1)

Fujitsu¹

05 Oct 2011

TL;DR: An emulation processing method for a computer including a first and a second processor, in which the first processor calculates the instruction address following a received instruction address and sends it, with the corresponding instruction information, to the second processor, which executes that instruction information if the address matches the one from its own execution result and otherwise reads out and executes the instruction information itself.

Abstract: An emulation processing method causes a computer including a first and a second processor to execute emulation processing. The method includes: calculating a next instruction address following a received instruction address, and transmitting, to the second processor, the calculated instruction address and instruction information read out on the basis of the calculated instruction address; transmitting, to the first processor, a first instruction address that is an instruction address included in an execution result of executed processing; executing processing based on the instruction information received from the first processor when a second instruction address, that is, the instruction address received from the first processor, is identical to the first instruction address; and reading out instruction information on the basis of the first instruction address and executing processing based on the instruction information read out when the second instruction address is not identical to the first instruction address.
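
The address comparison at the heart of this method can be sketched as a single-threaded Python model. This is an illustration under assumptions, not the patented implementation: the toy guest instruction set (`nop`, `jmp`, `halt`), the fixed instruction length of one address unit, and the names `prefetch`, `execute` and `run` are all hypothetical, and the two processors are simulated sequentially rather than run concurrently.

```python
def prefetch(program, addr):
    """First processor: calculate the next address and read its instruction."""
    next_addr = addr + 1  # assumes fixed-length instructions
    return next_addr, program.get(next_addr)


def execute(addr, info):
    """Second processor: run one instruction, return the actual next address."""
    op, arg = info
    if op == "halt":
        return None
    if op == "jmp":
        return arg
    return addr + 1  # "nop" falls through


def run(program, start, max_steps=100):
    """Emulate the guest program; returns (trace, hits, misses)."""
    pc, info = start, program[start]
    trace, hits, misses = [], 0, 0
    for _ in range(max_steps):
        trace.append(pc)
        predicted_addr, predicted_info = prefetch(program, pc)
        actual_addr = execute(pc, info)
        if actual_addr is None:
            break
        if actual_addr == predicted_addr:
            # Addresses identical: use the instruction information received.
            hits += 1
            info = predicted_info
        else:
            # Addresses differ: read out the instruction information directly.
            misses += 1
            info = program[actual_addr]
        pc = actual_addr
    return trace, hits, misses
```

With the program `{0: ("nop", None), 1: ("jmp", 3), 2: ("nop", None), 3: ("nop", None), 4: ("halt", None)}`, only the jump causes a mismatch, so the second processor falls back to its own read once and otherwise uses the prefetched information.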
