Showing papers on "Program counter published in 2006"


Patent
31 Oct 2006
TL;DR: A system and method for profiling a software application may include means for capturing profiling information corresponding to an instruction identified as having executed coincident with the occurrence of a runtime event, and for associating the profiling information with the event in an event set.
Abstract: A system and method for profiling a software application may include means for capturing profiling information corresponding to an instruction identified as having executed coincident with the occurrence of a runtime event, and for associating the profiling information with the event in an event set. In some embodiments, the identified instruction, which may have triggered the event, may be located in the program code sequence at a predetermined position relative to the current program counter value at the time the event was detected. The predetermined relative position may be fixed dependent on the processor architecture and may also be dependent on the event type. The predetermined relative position may be zero, indicating that when the event was detected, the program counter value corresponded to an instruction associated with the event. If the identified instruction is an ambiguity-creating instruction, an indication of ambiguity may be associated with the event.

78 citations
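
The attribution step described in the abstract above can be sketched in Python. This is only an illustration under stated assumptions: the offset table `PC_OFFSET_BY_EVENT`, the event names, the `AMBIGUOUS_OPCODES` set, and the list-of-opcodes program model are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass

# Hypothetical per-event offsets (in instructions) from the program counter at
# detection time to the instruction assumed to have triggered the event.
PC_OFFSET_BY_EVENT = {"dcache_miss": -1, "itlb_miss": 0}

# Hypothetical opcodes treated as ambiguity-creating in this sketch.
AMBIGUOUS_OPCODES = {"branch", "jump"}


@dataclass
class ProfileRecord:
    event: str
    instruction_index: int
    opcode: str
    ambiguous: bool


def attribute_event(event, pc, program):
    """Associate an event with the instruction at a fixed offset from pc."""
    index = pc + PC_OFFSET_BY_EVENT[event]
    if not 0 <= index < len(program):
        raise IndexError("offset points outside the program")
    opcode = program[index]
    # Flag the record when the identified instruction is ambiguity-creating.
    return ProfileRecord(event, index, opcode, opcode in AMBIGUOUS_OPCODES)
```

An offset of zero models the case in which the program counter already corresponds to the instruction associated with the event.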


Journal ArticleDOI
TL;DR: A microarchitectural implementation of Minos that achieves negligible impact on cycle time with a small investment in die area is presented, along with minor changes to the Linux kernel to handle the tag bits and perform virtual memory swapping.
Abstract: We present Minos, a microarchitecture that implements Biba's low water-mark integrity policy on individual words of data. Minos stops attacks that corrupt control data to hijack program control flow, but is orthogonal to the memory model. Control data is any data that is loaded into the program counter on control-flow transfer, or any data used to calculate such data. The key is that Minos tracks the integrity of all data, but protects control flow by checking this integrity when a program uses the data for control transfer. Existing policies, in contrast, need to differentiate between control and noncontrol data a priori, a task made impossible by coercions between pointers and other data types, such as integers in the C language. Our implementation of Minos for Red Hat Linux 6.2 on a Pentium-based emulator is a stable, usable Linux system on the network on which we are currently running a web server (http://minos.cs.ucdavis.edu). Our emulated Minos systems running Linux and Windows have stopped ten actual attacks. Extensive full-system testing and real-world attacks have given us a unique perspective on the policy tradeoffs that must be made in any system, such as Minos; this paper details and discusses these. We also present a microarchitectural implementation of Minos that achieves negligible impact on cycle time with a small investment in die area, as well as minor changes to the Linux kernel to handle the tag bits and perform virtual memory swapping.

69 citations


Journal Article
TL;DR: In this paper, the authors introduce new methods for detecting control-flow side channel attacks, transforming C source code to eliminate such attacks, and checking that the transformed code is free of control-flow side channels.
Abstract: We introduce new methods for detecting control-flow side channel attacks, transforming C source code to eliminate such attacks, and checking that the transformed code is free of control-flow side channels. We model control-flow side channels with a program counter transcript, in which the value of the program counter at each step is leaked to an adversary. The program counter transcript model captures a class of side channel attacks that includes timing attacks and error disclosure attacks. Further, we propose a generic source-to-source transformation that produces programs provably secure against control-flow side channel attacks. We implemented this transform for C together with a static checker that conservatively checks x86 assembly for violations of program counter security; our checker allows us to compile with optimizations while retaining assurance the resulting code is secure. We then measured our technique's effect on the performance of binary modular exponentiation and real-world implementations in C of RC5 and IDEA: we found it has a performance overhead of at most 5x and a stack space overhead of at most 2x. Our approach to side channel security is practical, generally applicable, and provably secure against an interesting class of side channel attacks.

65 citations
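
The idea behind the program counter transcript model above can be illustrated with binary modular exponentiation. The sketch below is not the authors' transformation, which operates on C source; it only shows, in Python, the general technique of replacing a secret-dependent branch with a masked selection so that the sequence of executed statements does not depend on the exponent bits. The helper names `select` and `modexp_fixed_control_flow` are hypothetical, and Python integer arithmetic is not constant-time, so this demonstrates control flow only.

```python
def select(bit, if_one, if_zero, width):
    """Return if_one when bit == 1, else if_zero, without branching on bit."""
    full = (1 << width) - 1
    mask = (-bit) & full  # all ones when bit == 1, all zeros when bit == 0
    return (if_one & mask) | (if_zero & (full ^ mask))


def modexp_fixed_control_flow(base, exponent, modulus, bits=32):
    """Square-and-multiply that performs the multiply on every iteration."""
    width = modulus.bit_length()
    result = 1 % modulus
    base %= modulus
    for i in reversed(range(bits)):
        result = (result * result) % modulus
        bit = (exponent >> i) & 1
        multiplied = (result * base) % modulus
        # The same statements run whether the secret bit is 0 or 1.
        result = select(bit, multiplied, result, width)
    return result
```

Doing the multiplication unconditionally costs extra work on zero bits, which is consistent with the kind of overhead the abstract reports for transformed code.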


Patent
18 Jan 2006
TL;DR: An apparatus for scheduling instruction dispatch among concurrently executing threads includes an instruction decoder that generates register usage information for an instruction from each of the threads, a priority generator that generates a priority for each instruction based on the register usage information and state information of instructions currently executing in an execution pipeline, and selection logic that dispatches at least one instruction from at least one thread based on the priority of the instructions.
Abstract: An apparatus for scheduling dispatch of instructions among a plurality of threads being concurrently executed in a multithreading processor is provided. The apparatus includes an instruction decoder that generates register usage information for an instruction from each of the threads, a priority generator that generates a priority for each instruction based on the register usage information and state information of instructions currently executing in an execution pipeline, and selection logic that dispatches at least one instruction from at least one thread based on the priority of the instructions. The priority indicates the likelihood the instruction will execute in the execution pipeline without stalling. For example, an instruction may have a high priority if it has little or no register dependencies or its data is known to be available; or may have a low priority if it has strong register dependencies or is an uncacheable or synchronized storage space load instruction.

46 citations


Proceedings ArticleDOI
21 Oct 2006
TL;DR: This paper presents a generic instruction-level runtime taint checking architecture for handling non-control data attacks and demonstrates effective usages of the architecture to detect buffer overflow and format string attacks.
Abstract: Current taint checking architectures monitor tainted data usage mainly with control transfer instructions. An alarm is raised once the program counter becomes tainted. However, such architectures are not effective against non-control data attacks. In this paper we present a generic instruction-level runtime taint checking architecture for handling non-control data attacks. Under our architecture, instructions are classified as either Taintless-Instructions or Tainted-Instructions prior to program execution. An instruction is called a Tainted-Instruction if it is supposed to deal with tainted data. Otherwise it is called a Taintless-Instruction. A security alert is raised whenever a Taintless-Instruction encounters tainted data at runtime. The proposed architecture is implemented on the SimpleScalar simulator. The preliminary results from experiments on SPEC CPU 2000 benchmarks show that there is a significant number of Taintless-Instructions. We also demonstrate effective uses of our architecture to detect buffer overflow and format string attacks.

45 citations
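
The checking rule in the abstract above (an alert whenever a Taintless-Instruction encounters tainted data) can be sketched as a toy interpreter. Everything concrete here is an assumption made for illustration: the `Instruction` record, the register-name model, and in particular the propagation rule, which the abstract does not specify.

```python
from dataclasses import dataclass


class TaintAlert(Exception):
    """Raised when a Taintless-Instruction reads tainted data."""


@dataclass(frozen=True)
class Instruction:
    name: str
    dest: str
    sources: tuple
    tainted_ok: bool  # True: Tainted-Instruction, False: Taintless-Instruction


def run_with_taint_check(program, initially_tainted):
    """Track taint per register name and check each instruction's class."""
    tainted = set(initially_tainted)
    for index, instr in enumerate(program):
        reads_taint = any(src in tainted for src in instr.sources)
        if reads_taint and not instr.tainted_ok:
            raise TaintAlert(f"instruction {index} ({instr.name}) read tainted data")
        # Assumed propagation rule: the destination is tainted exactly when
        # at least one source is tainted.
        if reads_taint:
            tainted.add(instr.dest)
        else:
            tainted.discard(instr.dest)
    return tainted
```

The classification itself (`tainted_ok`) is supplied as input, mirroring the abstract's statement that instructions are classified prior to program execution.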


Patent
06 Nov 2006
TL;DR: A system and method for program counter and data tracing is described, which enables increased visibility into the hardware and software state of the processor core.
Abstract: A system and method for program counter and data tracing is disclosed. The tracing mechanism of the present invention enables increased visibility into the hardware and software state of the processor core.

41 citations


Patent
02 Jul 2006
TL;DR: In this paper, the authors present a method, device and/or system of verifying that a secure code is executed by a processor, using a verifier to verify that the processor had executed the gating code only if the processor performed at least one operation.
Abstract: Some demonstrative embodiments of the invention include a method, device and/or system of verifying that a secure code is executed by a processor. The device may include, for example, a memory to store a secure code; a processor intended to execute a gating code, wherein the gating code, when executed by the processor, causes the processor to perform at least one operation and set a program counter of the processor to point to an entry point of the secure code; and a verifier to verify that the processor had executed the gating code only if the processor performs the at least one operation. Other embodiments are described and claimed.

23 citations


Patent
Lei Wang1
08 Aug 2006
TL;DR: A backward branch prediction queue (BBQ) is proposed to help an embedded processor overcome the control hazard that a conditional branch instruction causes in pipelined execution; the average prediction accuracy is up to 82%, and some applications may even reach an accuracy of 99%.
Abstract: A programmable backward jump instruction prediction mechanism includes a backward branch prediction queue (BBQ) for assisting an embedded processor to overcome an inevitable control hazard caused in a pipeline execution for a conditional branch instruction. A large percentage of nested loops exists in an application program executed by the embedded processor, and thus when the backward branch encounters a nested loop, the behavior of branch of a nested loop is similar to a queue that will automatically restore its original status; the whole nested loop iterates at a center and repeats the execution of innermost loops (Queue Front) and leaves the prediction miss to the next backward branch (an outer loop, Queue Next); once an outer loop hits a branch, the inner loop will repeat the branch (and returns to the innermost loop, Queue Front). Since the program counter (PC) and the branch address of the queue can be used for determining whether or not the program execution is still in a nested loop or whether or not a jump is from a backward branch by the target address of the branch instruction, it is only necessary to predict an execution and compare a specific branch address in the queue each time, and thus the queue structure need not store too many instructions or quickly compare a large number of data by the associative memory technique. The hardware is very simple, but the effect is excellent. According to the simulation analysis of the application program, it is discovered that the average prediction accuracy is up to 82% and some applications may even have an accuracy of 99%. The hardware mechanism of the invention features a low cost and a low level of complexity, and thus fully satisfies the requirements for low cost, low power consumption, and high performance/cost ratio of an embedded processor.

21 citations


Patent
08 Dec 2006
TL;DR: In this paper, a system and method for program counter and data tracing in a multi-issue processor is described, which enables increased visibility into the hardware and software state of the processor core.
Abstract: A system and method for program counter and data tracing in a multi-issue processor is disclosed. Instructions are traced in program sequence order. In one embodiment instructions are traced in graduation order from a reorder buffer. The tracing mechanism of the present invention enables increased visibility into the hardware and software state of the processor core.

17 citations


Patent
Yun Du1, Guofang Jiao1, Chun Yu1
29 Aug 2006
TL;DR: In this paper, a thread scheduler includes context units for managing the execution of threads where each context unit includes a load reference counter for maintaining a counter value indicative of a difference between a number of data requests and the data returns associated with the particular context unit.
Abstract: A thread scheduler includes context units for managing the execution of threads where each context unit includes a load reference counter for maintaining a counter value indicative of a difference between a number of data requests and a number of data returns associated with the particular context unit. A context controller of the thread context unit is configured to refrain from forwarding an instruction of a thread when the counter value is nonzero and the instruction includes a data dependency indicator indicating the instruction requires data returned by a previous instruction.

15 citations
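
The forwarding rule in the abstract above reduces to a small piece of bookkeeping, sketched here in Python. The class and method names are hypothetical; only the rule itself (hold back a dependent instruction while the counter is nonzero) comes from the abstract.

```python
class ContextUnit:
    """Toy model of one thread context with a load reference counter."""

    def __init__(self):
        # Number of data requests minus number of data returns.
        self.load_reference_counter = 0

    def data_requested(self):
        self.load_reference_counter += 1

    def data_returned(self):
        if self.load_reference_counter == 0:
            raise RuntimeError("data return without an outstanding request")
        self.load_reference_counter -= 1

    def may_forward(self, has_data_dependency):
        """Refrain from forwarding a dependent instruction while loads are pending."""
        return not (has_data_dependency and self.load_reference_counter != 0)
```

Instructions without the data dependency indicator are forwarded regardless of the counter value, so independent work in the same thread is not held back.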


Patent
14 Feb 2006
TL;DR: In this paper, methods and systems for debugging a program executing on a processor are described. In a first implementation, a processing system includes a processor configured for switching to a debug mode from a non-debug mode upon executing a software breakpoint.
Abstract: Methods and systems are provided for debugging a program executing on a processor. In a first implementation, a processing system includes a processor configured for switching to a debug mode from a non-debug mode upon executing a software breakpoint. The system may include a program memory configured to hold instructions for a program, where the software breakpoint replaces at least one of the instructions. The system may also include an instruction replacement register separate from the program memory that is configured to receive the replaced instruction from any of the processor and an external debugger. The system may further include a control component that controls whether the processor fetches a next instruction for execution from the program memory or from the instruction replacement register.

Patent
20 Sep 2006
TL;DR: A compiler for compiling program instructions in dependence upon a predetermined decoder input instruction alignment is presented; it reorders at least one program instruction within a storage region of program memory so that achieving the alignment requires less complex manipulations of instruction units.
Abstract: A compiler is provided for compiling program instructions in dependence upon a predetermined decoder input instruction alignment. The compiler comprises a program instruction sequence generator operable to process source code to produce a sequence comprising a plurality of program instructions for input to a decoder. At least one program instruction is reordered within a storage region of program memory. The storage region has an associated memory address and an offset value. The offset value gives a starting location of said program instruction within the memory address. The reordering of the program instruction is such that manipulations of instruction units of the plurality of program instructions required to achieve the predetermined decoder input instruction alignment are less complex than manipulations that would be required if no reordering had been performed. According to a further aspect, a program instruction aligner is provided to shift at least one portion of the reordered (reformatted) program instruction to produce the predetermined decoder-input instruction alignment. The offset value and an instruction length are supplied as control inputs to the program instruction aligner. A plurality of connections between register fields and shifter fields of the program instruction aligner is such that at least one of said plurality of register fields is connected to only a subset of said plurality of shifter fields.

Proceedings ArticleDOI
01 Oct 2006
TL;DR: A micro-architectural mechanism to validate control flow transfer at run-time at machine instruction level using a hardware table consisting of legitimate indirect branches and their target pairs (IBPs) to aid the validation.
Abstract: Current micro-architecture blindly uses the address in the program counter to fetch and execute instructions without validating its legitimacy. Whenever this blind-folded instruction sequencing is not properly addressed at a higher level by the system, it becomes a vulnerability to control data attacks, today's dominant and most critical security threats. To remedy it, this paper proposes a micro-architectural mechanism to validate control flow transfer at run-time at machine instruction level. It is proposed to have a hardware table consisting of legitimate indirect branches and their target pairs (IBPs) to aid the validation. The IBP table is implemented in the form of a cascading Bloom filter to store the security information as well as to enable fast validation. Based on a key observation that the branch prediction unit existing in most speculative-execution processors already provides a portion of the control flow validation, our scheme activates the validation only on indirect branch mis-predictions. Because of the Bloom filter and the rarity of mis-predictions of indirect branches, the validation incurs moderate storage overhead and little performance penalty.
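
The validation scheme in the abstract above can be approximated in software as follows. This sketch uses a single plain Bloom filter rather than the cascading Bloom filter the paper describes, and the class name, sizing parameters, and SHA-256-based hashing are all assumptions made for illustration. A Bloom filter never rejects a pair that was inserted, but it can occasionally accept a pair that was not.

```python
import hashlib


class IndirectBranchFilter:
    """Plain Bloom filter over (branch address, target address) pairs."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, branch_addr, target_addr):
        for seed in range(self.num_hashes):
            data = f"{seed}:{branch_addr:x}:{target_addr:x}".encode()
            digest = hashlib.sha256(data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, branch_addr, target_addr):
        for pos in self._positions(branch_addr, target_addr):
            self.bits[pos] = 1

    def probably_legitimate(self, branch_addr, target_addr):
        return all(self.bits[pos] for pos in self._positions(branch_addr, target_addr))


def validate_transfer(ibp_filter, branch_addr, actual_target, predicted_target):
    """Consult the filter only when the indirect branch was mispredicted."""
    if predicted_target == actual_target:
        return True  # correct prediction: validation is not activated
    return ibp_filter.probably_legitimate(branch_addr, actual_target)
```

Skipping the lookup on correct predictions is what keeps the common case cheap in this model, matching the abstract's observation about the rarity of indirect branch mis-predictions.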

Patent
Hiroyuki Yamashita1
15 Jun 2006
TL;DR: An instruction-set-simulator generating device is described that generates a program for simulating the instruction execution process of a real central processing unit on a host central processing unit that differs from the real central processing unit.
Abstract: An instruction-set-simulator generating device that generates an instruction-set-simulator program for simulating an instruction execution process of a real central processing unit on a host central processing unit that differs from the real central processing unit, the instruction-set-simulator generating device comprises: an application-program reading unit that reads an application program that is executable on the real central processing unit; an execution-stage instruction conversion unit that converts a function of an instruction in the application program into at least one instruction (execution-stage instruction) for simulation on the host central processing unit; a fetch-stage instruction generating unit that generates at least one instruction (fetch-stage instruction) that simulates operation timing of an instruction fetch stage among pipeline stages of the real central processing unit prior to the execution-stage instruction; and an instruction-set-simulator program output unit that generates the instruction-set-simulator program based on the execution-stage instruction and the fetch-stage instruction; at least one of the execution-stage instruction conversion unit and the fetch-stage instruction generating unit generating a counter instruction for simulating an execution time of the real central processing unit.

Patent
25 Aug 2006
TL;DR: In this article, a system that includes a memory, a memory card, a processor, and a power supply is described, where a program instruction driver comprises program instructions for providing a selection of the program instruction, reading program instruction associated with the selection, and executing the program instructions.
Abstract: A system that includes a memory, a memory card, a processor, and a power supply is provided. The memory is configured to store a program instruction driver and the memory card is configured to store a program instruction. The power supply is configured to generate a voltage and is connected to the processor. The processor, which is in communication with the memory and the memory card, is configured to execute the program instruction driver stored in the memory. The program instruction driver comprises program instructions for providing a selection of the program instruction, reading the program instruction associated with the selection, and executing the program instruction.

Patent
31 Jan 2006
TL;DR: A method, storage medium, processor instruction, and processor are provided for specifying a value in a first portion of a conditional pre-fetch instruction associated with a branch instruction used for effectuating a branch operation, specifying a target instruction address in a second portion of the instruction, evaluating the value to determine whether a condition is met, and pre-fetching one or more instructions starting at the target address into an instruction buffer of the processor when the condition is met.
Abstract: A method, storage medium, processor instruction, and processor are provided for specifying a value in a first portion of a conditional pre-fetch instruction associated with a branch instruction used for effectuating a branch operation, specifying a target instruction address in a second portion of the instruction, evaluating the value to determine whether a condition is met, and pre-fetching one or more instructions starting at the target instruction address into an instruction buffer of the processor when the condition is met.

Patent
Hitoshi Suzuki1
12 Jul 2006
TL;DR: The processor system includes an instruction cache for storing a prefetched instruction, an instruction execution section for executing the instruction stored in the instruction cache, a branch target address register for storing the address of the branch target instruction, a register write detector for detecting writing to the branch target address register by the instruction execution section, and a prefetch controller for starting prefetch of the branch target instruction in response to a detection result of the register write detector.
Abstract: The processor system includes an instruction cache for storing a prefetched instruction, an instruction execution section for executing the instruction stored in the instruction cache, a branch target address register for storing the address of the branch target instruction, a register write detector for detecting writing to the branch target address register by the instruction execution section, and a prefetch controller for starting prefetch of the branch target instruction in response to a detection result of the register write detector.

Patent
David A. Luick1
03 Feb 2006
TL;DR: In this article, the authors present a method and apparatus for prefetching instruction lines from a level 2 cache by identifying a branch instruction targeting an instruction that is outside of the first instruction line.
Abstract: Embodiments of the present invention provide a method and apparatus for prefetching instruction lines. In one embodiment, the method includes fetching a first instruction line from a level 2 cache, identifying, in the first instruction line, a branch instruction targeting an instruction that is outside of the first instruction line, extracting an address from the identified branch instruction, and prefetching, from the level 2 cache, a second instruction line containing the targeted instruction using the extracted address.

Patent
28 Aug 2006
TL;DR: In this article, a method of scheduling trace packets in an integrated circuit generating trace packets of plural types stores trace data in respective first-in-first-out buffers is presented.
Abstract: A method of scheduling trace packets in an integrated circuit generating trace packets of plural types stores trace data in respective first-in-first-out buffers. If a timing trace data first-in-first-out buffer is not empty, a timing trace data packet is transmitted. If a program counter overall data first-in-first-out buffer is not empty and the processor is at a data interruptible boundary, a program counter data packet is transmitted. If a data first-in-first-out buffer is not empty, a data packet is transmitted. The program counter data packets include program counter sync data, program counter exception data, program counter relative branch data and program counter absolute branch data.

Patent
06 Apr 2006
TL;DR: In this paper, a secure bit may be generated within a single mobile multimedia processor chip, based on the received first, second and third indicators, and on other internal state, such as instruction cache, interrupt, and program counter value associated with the current instruction.
Abstract: Methods and systems for processing video data are disclosed herein and may comprise receiving in a single mobile multimedia processor chip at least one indicator relating to how input multimedia data is processed. A further indicator may be generated within the single mobile multimedia processor chip, based on the at least one indicator, which identifies whether output data generated from the input multimedia data is secure. The at least one indicator may comprise a first indicator, which identifies whether an instruction cache is used to process the current instruction, a second indicator, which identifies whether an interrupt is used to process the current instruction, and a third indicator, which specifies a program counter value associated with the current instruction. A secure bit may be generated within the single mobile multimedia processor chip, based on the received first, second and third indicators, and on other internal state.

Patent
18 Jan 2006
TL;DR: In this paper, an irregular software pipelined loop conditioned upon data in a condition register in a compiler scheduled very long instruction word data processor is modified to prevent over-execution upon loop exit.
Abstract: This invention modifies an irregular software pipelined loop conditioned upon data in a condition register in a compiler scheduled very long instruction word data processor to prevent over-execution upon loop exit. The method replaces a register modifying instruction with an instruction conditional upon the inverse condition register if possible. The method inserts a conditional register move instruction to a previously unused register within the loop if possible without disturbing the schedule. Then a restoring instruction is added after the loop. Alternatively, both these two functions can be performed by a delayed register move instruction. Instruction insertion is into a previously unused instruction slot of an execute packet. These changes can be performed manually or automatically by the compiler.

Journal Article
TL;DR: This paper describes a monitoring system that uses a sequence of program counter values to monitor program progress, and compiler techniques that automatically generate the monitoring code, which both simplifies and reduces the overhead of monitoring.
Abstract: The increased popularity of grid systems and cycle sharing across organizations requires scalable systems that provide facilities to locate resources, to be fair in the use of those resources, and to monitor jobs executing on remote systems. This paper presents a novel and lightweight approach to monitoring the progress and correctness of a parallel computation on a remote, and potentially fraudulent, host system. We describe a monitoring system that uses a sequence of program counter values to monitor program progress, and compiler techniques that automatically generate the monitoring code. This approach improves on earlier work by omitting the need to duplicate computation, which both simplifies and reduces the overhead of monitoring. Our approach allows dynamic and accountable cycle-sharing across the Internet. Experimental results show that the overhead of our system is negligible and our monitoring approach is scalable.

Book ChapterDOI
22 May 2006
TL;DR: This chapter describes the ACL2 theorem proving system and shows how it can be used to model and verify hardware using refinement, motivated by functional verification having become a bottleneck in the microprocessor design cycle.
Abstract: In this chapter, we describe the ACL2 theorem proving system and show how it can be used to model and verify hardware using refinement. This is a timely problem, as the ever-increasing complexity of microprocessor designs and the potentially devastating economic consequences of shipping defective products has made functional verification a bottleneck in the microprocessor design cycle, requiring a large amount of time, human effort, and resources [1, 58]. For example, the 1994 Pentium FDIV bug cost Intel $475 million and it is estimated that a similar bug in the current generation Intel Pentium processor would cost Intel $12 billion [2].

Patent
24 Mar 2006
TL;DR: In this article, a processor and processing method for reusing arbitrary sections of program code provides improved upgrade capability for systems with non-alterable read only memory (ROM) and a more flexible instruction set in general.
Abstract: A processor and processing method for reusing arbitrary sections of program code provides improved upgrade capability for systems with non-alterable read only memory (ROM) and a more flexible instruction set in general. A specific program instruction is provided in the processor instruction set for directing program execution to a particular start address, where the start address is specified in conjunction with the specific program instruction. An end address is also specified in conjunction with the specific program instruction and the processor re-directs control upon completion of code execution between the start and end address to either another specified address, or to a stored program counter value corresponding to the next instruction in sequence after the specific program instruction. A loop count may also be supplied for repeatedly executing the code between the start and end address until the count has expired.
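
The code-reuse instruction in the abstract above can be modeled with a toy interpreter. The encoding is entirely hypothetical: plain operations are strings, `"HALT"` stops execution, and the reuse instruction is a tuple `("REUSE", start, end, count, return_to)` in which `return_to` is either another address or `None` for the instruction following the reuse instruction. The sketch also assumes that reused sections contain only plain operations.

```python
def execute(program):
    """Run a toy program and return the plain operations in execution order."""
    executed = []
    pc = 0
    while pc < len(program):
        instr = program[pc]
        if instr == "HALT":
            break
        if isinstance(instr, tuple) and instr[0] == "REUSE":
            _, start, end, count, return_to = instr
            # Execute the section [start, end] once per loop count.
            for _ in range(count):
                for addr in range(start, end + 1):
                    executed.append(program[addr])
            # Resume at the stored next-instruction address unless another
            # address was specified with the instruction.
            pc = pc + 1 if return_to is None else return_to
        else:
            executed.append(instr)
            pc += 1
    return executed
```

For example, with the program `["a", ("REUSE", 4, 5, 2, None), "b", "HALT", "x", "y"]`, the section at addresses 4 to 5 runs twice before control returns to `"b"`.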

Book ChapterDOI
15 May 2006
TL;DR: This work presents additional hardware mechanisms, eliminating inconsistencies in counter interrupt delivery, based on standard processor debugging facilities, and at the expense of a small number of additionally generated exceptions.
Abstract: Log-based recovery protocols enable process replicas in distributed systems to replay a computation up to the point where a previous computation failed. One fundamental assumption underlying these protocols is the piecewise deterministic (PWD) execution model, stating that recovery must not execute, but simulate the execution of nondeterministic events in order to maintain consistency. One such source of nondeterminism is asynchronous events triggering software signal handlers, an issue known to be solved by instruction counters. Efficient implementations in software have been shown to be practical, but require significant changes to applications and system software. Hardware counters, in contrast, allow running software unmodified. A number of processors implementing the Intel x86 instruction set architecture provide monitoring registers with properties similar to a true instruction counter. Designed for application profiling, these facilities reveal a number of issues to be resolved when utilized for applications like the PWD model, which demands maximum precision during replay. We discuss some of the most prominent problems faced when using performance counters for protocols satisfying the PWD model. We present additional hardware mechanisms, eliminating inconsistencies in counter interrupt delivery, based on standard processor debugging facilities, and at the expense of a small number of additionally generated exceptions.

Patent
14 Jun 2006
TL;DR: In this article, a system for determining the target address of a branch instruction is described, which includes a branch target buffer (BTB) and a comparator, coupled with a PC register and a BTB, for comparing the PC value of the current instruction with an output of the BTB corresponding to a previous instruction.
Abstract: A system for determining the target address of a branch instruction is disclosed. The system includes: a branch target buffer (BTB), containing at least an entry storing the target address of the branch instruction, the entry being indexed according to a program counter (PC) value of an instruction prior to the branch instruction; a PC register, containing a PC value of a current instruction; and a comparator, coupled to the PC register and the BTB, for comparing the PC value of the current instruction with an output of the BTB corresponding to a previous instruction.

Proceedings ArticleDOI
01 Sep 2006
TL;DR: A low-power body sensor network (BSN) control processor for human body communication (HBC) that manages 254 nodes; the proposed `instantaneous program execution with external program counter' scheme provides up to 10-MIPS performance only when it is needed, and the `TCAM-based period scheduler' manages the nodes with 21.6-μW power consumption.
Abstract: This paper presents a low-power body sensor network (BSN) control processor for human body communication (HBC) capable of managing 254 nodes. The proposed `instantaneous program execution with external program counter' scheme provides up to 10-MIPS performance only when it is needed, and the `TCAM-based period scheduler' manages 254 HBC nodes with 21.6-μW power consumption. Both are verified by an implementation of the BSN controller, which shows 254-node management with 4.2-MIPS performance.

Patent
27 Jun 2006
TL;DR: In this paper, a coprocessor is used to perform one or more specialized operations that can be offloaded from a primary or general purpose processor (12), and it is important to allow efficient communication and interfacing between the processor and the coprocessor (14).
Abstract: A coprocessor (14) may be used to perform one or more specialized operations that can be off-loaded from a primary or general purpose processor (12). It is important to allow efficient communication and interfacing between the processor (12) and the coprocessor (14). In one embodiment, a coprocessor (14) generates and provides instructions (200, 220) to an instruction pipe (20) in the processor (12). Because the coprocessor (14) generated instructions are part of the standard instruction set of the processor (12), cache (70) coherency is easy to maintain. Also, circuitry (102) in coprocessor (14) may perform an operation on data while circuitry (106) in coprocessor (14) is concurrently generating processor instructions (200, 220).

Patent
11 Apr 2006
TL;DR: In this paper, the interleaved multi-threading instruction pipeline utilizes a number of clock cycles that is less than an instruction issue rate for each of a plurality of program threads that are stored within the memory unit.
Abstract: A processor device is disclosed and includes a memory unit and at least one interleaved multi-threading instruction pipeline. The interleaved multi-threading instruction pipeline utilizes a number of clock cycles that is less than an instruction issue rate for each of a plurality of program threads that are stored within the memory unit. The memory unit includes six instruction caches. Further, the processor device includes six register files and each of the six register files is associated with one of the six instruction caches. Each of the plurality of program threads is associated with one of the six register files. Further, each of the six program threads includes a plurality of instructions and each of the plurality of instructions is stored within one of the six instruction caches of the memory.

Patent
12 Jan 2006
TL;DR: In this article, a microcomputer that can process plural tasks time-divisionally and in parallel is described, where one of the plural programs, described by one of the tasks, is written as a looped specific task in which the increment of program addresses is fixed, a program counter is usable as a timer counter, and a peripheral function instruction is set so as to indicate one or more general-purpose registers as an operand.
Abstract: A microcomputer that can process plural tasks time-divisionally and in parallel, wherein one of the plural programs, described by one of the tasks, is written as a looped specific task in which the increment of program addresses is fixed, a program counter is usable as a timer counter, a peripheral function instruction is described in the specific task, and the peripheral function instruction is set so as to indicate one or more general-purpose registers as an operand. The CPU executes the peripheral function instruction as one instruction, obtains the information needed to execute the instruction from a general-purpose register, and stores the execution result into the general-purpose registers. An instruction code encoding system includes an operation code and plural operands for indicating operation targets of an instruction in an instruction code, and executes the instruction indicated by the operation code on the operation targets. When the operation targets indicated by the plural operands are set to a combination in which an execution result does not vary, processing corresponding to a different instruction is executed.
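
The encoding idea in the last sentence can be sketched in Python under loudly stated assumptions: the abstract names no concrete instructions, so the `MOV` opcode, the choice of "source equals destination" as the result-invariant operand combination, and the `ALT` instruction it is remapped to are all hypothetical.

```python
from typing import Tuple, Union

Decoded = Union[Tuple[str, int, int], Tuple[str, int]]


def decode(opcode: str, rd: int, rs: int) -> Decoded:
    """Decode a two-operand register instruction.

    Assumption for illustration: `MOV rd, rs` with rd == rs would leave the
    register file unchanged, so that operand combination is reinterpreted as
    a different (hypothetical) single-operand instruction `ALT rd`.
    """
    if opcode == "MOV" and rd == rs:
        return ("ALT", rd)      # otherwise-useless encoding reused
    return (opcode, rd, rs)     # ordinary decoding


ordinary = decode("MOV", 1, 2)
remapped = decode("MOV", 3, 3)
```

Reusing such encodings adds an instruction without widening the opcode field, at the cost of a special case in the decoder.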