
Showing papers on "Pipeline (computing) published in 1989"


Journal ArticleDOI
TL;DR: Based on the scattered look-ahead technique, fully pipelined and fully hardware efficient linear bidirectional systolic arrays for recursive digital filters are presented and the decomposition technique is extended to time-varying recursive systems.
Abstract: A look-ahead approach (referred to as scattered look-ahead) to pipeline recursive loops is introduced in a way that guarantees stability. A decomposition technique is proposed to implement the nonrecursive portion (generated by the scattered look-ahead process) in a decomposed manner to obtain concurrent stable pipelined realizations of logarithmic implementation complexity with respect to the number of loop pipeline stages (as opposed to linear). The upper bound on the roundoff error in these pipelined filters is shown to improve with an increase in the number of loop pipeline stages. Efficient pipelined realizations of both direct-form and state-space-form recursive digital filters are studied. Based on the scattered look-ahead technique, fully pipelined and fully hardware-efficient linear bidirectional systolic arrays for recursive digital filters are presented. The decomposition technique is extended to time-varying recursive systems.

373 citations
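
The look-ahead idea above can be illustrated on the simplest possible case. Below is a minimal Python sketch, assuming a first-order recursion y[n] = a·y[n-1] + x[n] (the paper treats general direct-form and state-space filters): M-stage scattered look-ahead rewrites the loop as y[n] = a^M·y[n-M] + Σ a^i·x[n-i], so the feedback loop spans M delays and can absorb M pipeline stages, and for M a power of two the added FIR part factors into log2(M) short sections, which is the logarithmic-complexity decomposition the abstract refers to.

```python
# Minimal sketch (not the paper's implementation) of M-stage scattered
# look-ahead applied to the first-order recursion y[n] = a*y[n-1] + x[n].
# The transformed loop is y[n] = a**M * y[n-M] + sum_{i<M} a**i * x[n-i].
# When M is a power of two, the FIR part factors into log2(M) sections
# (1 + a z^-1)(1 + a^2 z^-2)(1 + a^4 z^-4)..., keeping added hardware
# logarithmic in M.

def direct(a, x):
    y, out = 0.0, []
    for xn in x:
        y = a * y + xn
        out.append(y)
    return out

def scattered_lookahead(a, x, M):
    # FIR preprocessing: u[n] = sum_{i<M} a^i x[n-i] (zero initial conditions)
    u = [sum((a ** i) * x[n - i] for i in range(M) if n - i >= 0)
         for n in range(len(x))]
    # Pipelined loop: y[n] = a^M y[n-M] + u[n]
    y = [0.0] * len(x)
    for n in range(len(x)):
        y[n] = (a ** M) * (y[n - M] if n - M >= 0 else 0.0) + u[n]
    return y

x = [1.0, 0.5, -0.25, 2.0, 0.0, 1.5, -1.0, 0.75]
assert all(abs(p - q) < 1e-12
           for p, q in zip(direct(0.9, x), scattered_lookahead(0.9, x, 4)))
```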


Journal ArticleDOI
01 Apr 1989
TL;DR: A parameterizable code reorganization and simulation system was developed and used to measure instruction-level parallelism; the average degree of superpipelining metric is introduced, and simulations suggest that this metric is already high for many machines.
Abstract: Superscalar machines can issue several instructions per cycle. Superpipelined machines can issue only one instruction per cycle, but they have cycle times shorter than the latency of any functional unit. In this paper these two techniques are shown to be roughly equivalent ways of exploiting instruction-level parallelism. A parameterizable code reorganization and simulation system was developed and used to measure instruction-level parallelism for a series of benchmarks. Results of these simulations in the presence of various compiler optimizations are presented. The average degree of superpipelining metric is introduced. Our simulations suggest that this metric is already high for many machines. These machines already exploit all of the instruction-level parallelism available in many non-numeric applications, even without parallel instruction issue or higher degrees of pipelining.

316 citations
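
The "average degree of superpipelining" is described above only informally. As a hedged illustration (the paper's exact definition may differ), one natural reading is a dynamic-frequency-weighted average operation latency in cycles, which is what this sketch computes from a made-up instruction mix.

```python
# Hedged sketch: an "average degree of superpipelining" style metric computed
# as the dynamic-frequency-weighted operation latency (in cycles). The
# latencies and the instruction mix below are illustrative assumptions,
# not measurements from the paper.

op_latency = {"alu": 1, "load": 2, "store": 1, "branch": 2, "fp": 3}

def avg_degree_of_superpipelining(dynamic_mix):
    """dynamic_mix maps op class -> fraction of executed instructions."""
    assert abs(sum(dynamic_mix.values()) - 1.0) < 1e-9
    return sum(freq * op_latency[op] for op, freq in dynamic_mix.items())

mix = {"alu": 0.45, "load": 0.25, "store": 0.10, "branch": 0.15, "fp": 0.05}
print(avg_degree_of_superpipelining(mix))   # 1.5 for this made-up mix
```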


Proceedings ArticleDOI
01 Apr 1989
TL;DR: The architecture of the EMC-R, the single-chip processing element of the highly parallel dataflow machine EM-4 now under development, features a strongly connected arc dataflow model; a direct matching scheme; a RISC-based design; a deadlock-free on-chip packet switch; and an integration of a packet-based circular pipeline and a register-based advanced control pipeline.
Abstract: A highly parallel (more than a thousand processing elements) dataflow machine, the EM-4, is now under development. The EM-4 design principle is to construct a high-performance computer using a compact architecture by overcoming several defects of dataflow machines. In constructing the EM-4, it is essential to fabricate a processing element (PE) on a single chip for reducing operation speed, system size, design complexity and cost. In the EM-4, the PE, called the EMC-R, has been specially designed using a 50,000-gate gate array chip. This paper focuses on the architecture of the EMC-R. Its distinctive features are: a strongly connected arc dataflow model; a direct matching scheme; a RISC-based design; a deadlock-free on-chip packet switch; and an integration of a packet-based circular pipeline and a register-based advanced control pipeline. These features are intensively examined, and the instruction set architecture and the configuration architecture which exploit them are described.

178 citations
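
As a hedged illustration of the "direct matching scheme" named above: the arriving token's destination address selects the operand slot directly, with no associative search, and the node fires once its partner operand is present. The sketch below shows only that firing rule; packet formats and the EMC-R's actual operand memory layout are not modeled.

```python
# Hedged sketch of a "direct matching" store for two-operand dataflow nodes:
# the token's destination address indexes the operand memory slot directly
# (no associative search). If the slot is empty the token waits; if the
# partner is already there, the instruction fires.

class DirectMatchingUnit:
    def __init__(self, num_slots):
        self.slots = [None] * num_slots           # one operand slot per node

    def receive(self, dest, port, value, fire):
        waiting = self.slots[dest]
        if waiting is None:                       # first operand arrives
            self.slots[dest] = (port, value)
            return None
        self.slots[dest] = None                   # partner found: fire node
        left, right = ((value, waiting[1]) if port == "L"
                       else (waiting[1], value))
        return fire(dest, left, right)

mu = DirectMatchingUnit(num_slots=8)
fire = lambda dest, a, b: a + b                   # node 3 is an "add" node
assert mu.receive(3, "L", 40, fire) is None       # left operand waits
assert mu.receive(3, "R", 2, fire) == 42          # right arrives, node fires
```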


Journal ArticleDOI
TL;DR: Based on the clustered look-ahead and incremental output computation approaches, an incremental block-state structure is derived for block implementation of state-space filters with multiplication complexity linear in block size, and is also extended to the multirate recursive filtering case.
Abstract: For pt. I see ibid., vol. 37, no. 7, p. 1099 (1989). Block implementation and fine-grain pipelined block implementation of recursive digital filters are discussed. A new technique of incremental output computation is introduced which requires a linear complexity in block size. Based on the clustered look-ahead and incremental output computation approaches, an incremental block-state structure is derived for block implementation of state-space filters with multiplication complexity linear in block size. The incremental block-state structure is also extended to the multirate recursive filtering case. The techniques of scattered look-ahead, clustered look-ahead, decomposition, and incremental output computation are combined to introduce several pipeline stages inside the recursive loop of the block filter. Deeply pipelined block filter structures are derived for implementation of direct-form and state-space-form recursive digital filters. The multiplication complexity of these pipelined block filters is linear with respect to the block size and logarithmic with respect to the number of loop pipeline stages, and the complexities due to pipelining and block processing are additive.

168 citations
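
A small sketch may help fix the block-processing setup the abstract builds on. The code below implements only the plain block-state formulation of a state-space filter, advancing the state one block of L samples at a time, and checks it against sample-by-sample evaluation; the paper's incremental output computation, which is what brings the multiplication count down to linear in L, is not reproduced here.

```python
# Hedged sketch of plain block-state processing of a state-space filter
#   s[n+1] = A s[n] + B u[n],   y[n] = C s[n] + D u[n],
# advanced one block (L samples) at a time via precomputed powers of A.
import numpy as np

def block_state_filter(A, B, C, D, u, L):
    n = A.shape[0]
    A_pow = [np.linalg.matrix_power(A, k) for k in range(L + 1)]
    s, y = np.zeros(n), []
    for blk in range(0, len(u), L):
        ub = u[blk:blk + L]
        for j, uj in enumerate(ub):               # outputs inside the block
            sj = A_pow[j] @ s + sum(A_pow[j - 1 - i] @ B * ub[i]
                                    for i in range(j))
            y.append(C @ sj + D * uj)
        s = A_pow[len(ub)] @ s + sum(A_pow[len(ub) - 1 - i] @ B * ub[i]
                                     for i in range(len(ub)))
    return np.array(y)

A = np.array([[0.5, 0.1], [0.0, 0.3]]); B = np.array([1.0, 0.5])
C = np.array([1.0, 2.0]); D = 0.25
u = np.random.default_rng(0).normal(size=16)

ref, s = [], np.zeros(2)
for un in u:                                      # sample-by-sample reference
    ref.append(C @ s + D * un)
    s = A @ s + B * un
assert np.allclose(block_state_filter(A, B, C, D, u, 4), ref)
```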


Proceedings ArticleDOI
01 Jul 1989
TL;DR: The system architecture and the programming environment of the Pixel Machine, a parallel image computer with a distributed frame buffer, are described; the architecture is based on an array of asynchronous MIMD nodes with parallel access to a large frame buffer.
Abstract: We describe the system architecture and the programming environment of the Pixel Machine - a parallel image computer with a distributed frame buffer. The architecture of the computer is based on an array of asynchronous MIMD nodes with parallel access to a large frame buffer. The machine consists of a pipeline of pipe nodes which execute sequential algorithms and an array of m × n pixel nodes which execute parallel algorithms. A pixel node directly accesses every m-th pixel on every n-th scan line of an interleaved frame buffer. Each processing node is based on a high-speed, floating-point programmable processor. The programmability of the computer allows all algorithms to be implemented in software. We present the mappings of a number of geometry and image-computing algorithms onto the machine and analyze their performance.

165 citations
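
The interleaved frame-buffer ownership described above is easy to state as code. A minimal sketch follows, with an assumed 4×4 node array and image width.

```python
# Minimal sketch of the interleaved frame-buffer ownership: in an m x n array
# of pixel nodes, the node at (i, j) owns every m-th pixel on every n-th scan
# line. The node-array dimensions and image width are assumptions.

def owner_node(x, y, m=4, n=4):
    """Return (i, j) of the pixel node that owns pixel (x, y)."""
    return x % m, y % n

def local_address(x, y, m=4, n=4, width=1024):
    """Address of pixel (x, y) inside its owning node's local memory."""
    local_width = width // m
    return (y // n) * local_width + (x // m)

# Neighbouring pixels land on different nodes, so an m x n tile of the image
# is touched by all m*n nodes in parallel.
assert owner_node(0, 0) != owner_node(1, 0) != owner_node(0, 1)
print(owner_node(13, 7), local_address(13, 7))    # (1, 3) and local slot 259
```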


Patent
08 Nov 1989
TL;DR: In this article, a data processor for executing instructions realized by wired logic by means of a pipeline system includes a plurality of instruction registers and an equal number of arithmetic operation units.
Abstract: The data processor, which executes instructions realized by wired logic using a pipeline system, includes a plurality of instruction registers and an equal number of arithmetic operation units. A plurality of instructions, read into the instruction registers one machine cycle at a time, are processed in parallel by the plurality of arithmetic operation units.

86 citations


Proceedings ArticleDOI
01 Apr 1989
TL;DR: In this article, three schemes to reduce the cost of branches are presented in the context of a general pipeline model and compared with the best hardware scheme for a moderately pipelined processor.
Abstract: Pipelining has become a common technique to increase throughput of the instruction fetch, instruction decode, and instruction execution portions of modern computers. Branch instructions disrupt the flow of instructions through the pipeline, increasing the overall execution cost of branch instructions. Three schemes to reduce the cost of branches are presented in the context of a general pipeline model. Ten realistic Unix domain programs are used to directly compare the cost and performance of the three schemes and the results are in favor of the software-based scheme. For example, the software-based scheme has a cost of 1.65 cycles/branch vs. a cost of 1.68 cycles/branch of the best hardware scheme for a highly pipelined processor (11-stage pipeline). The results are 1.19 (software scheme) vs. 1.23 cycles/branch (best hardware scheme) for a moderately pipelined processor (5-stage pipeline).

73 citations
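
For readers unfamiliar with cycles-per-branch figures like those quoted above, a hedged sketch of the underlying cost model follows; the penalties, prediction accuracies, and branch mix below are illustrative assumptions, not the paper's measured parameters.

```python
# Hedged sketch of a cycles-per-branch cost model: 1 cycle for the branch
# itself plus the expected pipeline slots lost when the branch outcome is
# handled incorrectly. All numbers are made up for illustration.

def cycles_per_branch(taken_frac, predict_taken_acc, predict_ntaken_acc,
                      mispredict_penalty):
    correct = (taken_frac * predict_taken_acc
               + (1 - taken_frac) * predict_ntaken_acc)
    return 1 + (1 - correct) * mispredict_penalty

# Deeper pipelines pay a larger penalty for the same prediction accuracy.
print(cycles_per_branch(0.6, 0.9, 0.85, mispredict_penalty=2))   # shallow pipe
print(cycles_per_branch(0.6, 0.9, 0.85, mispredict_penalty=8))   # deep pipe
```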


Proceedings ArticleDOI
05 Nov 1989
TL;DR: The authors present algorithms that equalize delays automatically by inserting a minimal number of active delay elements to lengthen short paths, enabling the design of wave-pipelined circuits.
Abstract: Wave pipelining is a technique for pipelining digital systems that can increase the clock frequency without increasing the number of storage elements. Due to limits and variations in fabrication, the clock frequency can be increased by a factor of 2 to 3 by using the best available design methods. The authors present algorithms that will equalize delays automatically by inserting a minimal number of active delay elements to lengthen short paths. This method can be combined with delay balancing by adjusting gate speeds to design wave-pipelined circuits.

66 citations
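
A toy illustration of the delay-equalization objective may help: wave pipelining requires the spread between the longest and shortest combinational paths to fit within a clock period, so short paths are padded with delay elements. The sketch below uses a made-up gate-level DAG and hand-picked padding; the paper's contribution is choosing such insertions automatically and minimally.

```python
# Toy delay-equalization example on a made-up combinational DAG.
# Edge weights are gate delays; padding is extra delay inserted on an edge.
from math import inf

edges = {                                          # node -> [(successor, delay)]
    "in": [("g1", 2), ("g2", 5)],
    "g1": [("g3", 1)],
    "g2": [("g3", 4)],
    "g3": [("out", 1)],
    "out": [],
}
topo = ["in", "g1", "g2", "g3", "out"]             # topological order

def arrival_times(extra=None):
    """Earliest/latest arrival at 'out', with optional per-edge padding."""
    extra = extra or {}
    early = {n: inf for n in topo}
    late = {n: 0 for n in topo}
    early["in"] = 0
    for n in topo:
        for succ, d in edges[n]:
            d += extra.get((n, succ), 0)           # inserted delay elements
            early[succ] = min(early[succ], early[n] + d)
            late[succ] = max(late[succ], late[n] + d)
    return early["out"], late["out"]

print(arrival_times())                             # (4, 10): spread of 6
# Lengthening the short path in->g1->g3 by 6 delay units equalizes the paths.
print(arrival_times({("in", "g1"): 6}))            # (10, 10): spread of 0
```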


Patent
19 Oct 1989
TL;DR: In this paper, the authors present a method and apparatus for sequentially executing instruction words comprising parallel first and second sequences of program instructions, each instruction having independently selectable execution cycle count latencies.
Abstract: A method and apparatus are disclosed for sequentially executing instruction words comprising parallel first and second sequences of program instructions, each instruction having independently selectable execution cycle count latencies. After the occurrence of an exception, such as may arise from a page fault, the method and apparatus of the present invention identify any instruction, begun after the instruction which caused the exception, that has completed its execution before execution of the exception-provoking instruction was inhibited. The detection of an exception causes the processor to inhibit further execution of the exception-provoking instruction. Instructions which have yet to complete their execution prior to the inhibition of the exception-provoking instruction are called pending instructions, and are similarly inhibited from further execution. Subsequently, the exception is serviced and the exception-inducing instruction is restarted for re-execution in the processor. According to the present invention, the pending instructions are subsequently restarted and executed in the sequence of their occurrence at the time the exception-provoking instruction caused the processor to inhibit further instruction execution, without re-execution of the aforementioned completed instructions. The method and apparatus according to the present invention are further applicable to computing systems having a plurality of processors, of either the same or different types, such as floating-point and integer processors. When applied to systems having a plurality of processors, the method and apparatus of the present invention inhibit all further execution in the plural processors upon the detection of an exception in one of the processors. In processors other than the processor servicing the exception, the present invention envisions a method and apparatus for the generation and execution of no-op instructions until the processor servicing the exception causes the execution of the pending instructions to be restarted, at which time the other processors also restart execution of the instructions which were pending at the time further execution was inhibited.

65 citations
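
A hedged toy model of the restart bookkeeping described in this patent abstract follows; it illustrates the ordering rule only, not the hardware.

```python
# Toy model: when instruction E takes an exception, any later-issued
# instruction that already completed is NOT re-executed; E and the
# still-pending instructions are restarted in their original order.

def restart_schedule(program_order, completed, excepting):
    """program_order: instruction ids in issue order; completed: finished ids."""
    exc_pos = program_order.index(excepting)
    restart = [excepting]                          # the excepting op runs again
    for pos, ins in enumerate(program_order):
        if pos > exc_pos and ins not in completed:
            restart.append(ins)                    # pending ops are re-run ...
    return restart                                 # ... completed ops are not

order = ["i1", "i2", "i3", "i4", "i5"]
# i2 faults; i3 (a short-latency op issued after i2) already finished,
# while i4 and i5 were still in flight when execution was inhibited.
assert restart_schedule(order, completed={"i1", "i3"},
                        excepting="i2") == ["i2", "i4", "i5"]
```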


Proceedings ArticleDOI
01 Apr 1989
TL;DR: The architectural and organizational tradeoffs made during the design of the MultiTitan covered the entire space of processor design, from the instruction set and virtual memory architecture through the pipeline and organization of the machine.
Abstract: This paper describes the architectural and organizational tradeoffs made during the design of the MultiTitan, and provides data supporting the decisions made. These decisions covered the entire space of processor design, from the instruction set and virtual memory architecture through the pipeline and organization of the machine. In particular, some of the tradeoffs involved the use of an on-chip instruction cache with off-chip TLB and floating-point unit, the use of direct-mapped instead of associative caches, the use of 64-bit vs. 32-bit data bus, and the implementation of hardware pipeline interlocks.

63 citations


Patent
13 Sep 1989
TL;DR: In this article, an apparatus is provided for selectively switching the buffer between an asynchronous and a synchronous mode of operation, and the switching apparatus includes circuits for alternately opening and closing the first and second pass gates when the buffer is in its synchronous and asynchronous modes, respectively.
Abstract: A pipeline memory access circuit has a memory address buffer for buffering memory addresses. The buffer has a first and a second pass gate, and each of the pass gates has a pair of complementary metal-oxide-semiconductor (CMOS) transistors. An apparatus is provided for selectively switching the buffer between an asynchronous and a synchronous mode of operation. The switching apparatus includes circuits for alternately opening and closing the first and second pass gates when the buffer is in its synchronous mode of operation and for simultaneously opening both of the pass gates when the buffer is in its asynchronous mode of operation.

Book ChapterDOI
TL;DR: This chapter examines the impact of current technology on the design of special-purpose database machines (DBMs) and provides a survey of the DBMs that focuses on the adaptability of the designs to the current technology.
Abstract: This chapter examines the impact of current technology on the design of special-purpose database machines (DBMs) and provides a survey of the DBMs that focuses on the adaptability of the designs to the current technology. The computer architectures fall into four groups—namely, single instruction stream-single data stream (SISD), single instruction stream-multiple data stream (SIMD), multiple instruction stream-single data stream (MISD), and multiple instruction stream-multiple data stream (MIMD). The database computer (DBC) uses two forms of parallelism: an entire cylinder is processed in parallel, and the system performs queries in a pipeline fashion by separate units around two rings. The systems in the category of highly very-large-scale-integration (VLSI)-compatible database machines are designed based on the constraints imposed by technology. In general, they are highly parallel with regular and simple architectures. Future database machine designers ought to concentrate on two important issues: (1) the investigation of the effect and benefit of a specialized database operating system on their DBM designs, and (2) the optimization of the system throughput rather than improving the response time of a single request.

Patent
20 Jul 1989
TL;DR: Bubble compression in a pipelined central processing unit (CPU) of a computer system is provided in this paper, where a bubble represents a stage in the pipeline that cannot perform any useful work due to the lack of data from an earlier pipeline stage.
Abstract: Bubble compression in a pipelined central processing unit (CPU) of a computer system is provided. A bubble represents a stage in the pipeline that cannot perform any useful work due to the lack of data from an earlier pipeline stage. When a particular pipeline stage has stalled, the CPU instructions that have already passed through the stage continue to move ahead and leave behind vacant stages or bubbles. If a bubble is introduced into a pipeline and the pipeline subsequently stalls, the disclosed CPU takes advantage of this stalled condition to compress the previously introduced bubble.
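
As a hedged illustration of bubble compression, the toy model below represents the pipeline as a list of stages with None marking a bubble; when the head of the pipe stalls, instructions behind a bubble still advance into it instead of stalling as well.

```python
# Toy model of bubble compression (illustration only, not the patented CPU).

def advance(pipe, head_stalled, incoming=None):
    """pipe[0] is the youngest stage, pipe[-1] the head (oldest); None = bubble."""
    if not head_stalled:
        return [incoming] + pipe[:-1]              # head retires, all advance
    for k in range(len(pipe) - 1, -1, -1):         # oldest bubble, if any
        if pipe[k] is None:
            return [incoming] + pipe[:k] + pipe[k + 1:]   # squeeze it out
    return list(pipe)                              # no bubble: genuine stall

pipe = ["i3", None, "i2", "i1"]                    # a bubble sits behind i2
pipe = advance(pipe, head_stalled=True, incoming="i4")
assert pipe == ["i4", "i3", "i2", "i1"]            # the bubble was compressed
```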

Patent
10 Jul 1989
TL;DR: In this article, an image processing system is disclosed in which the various image processing circuits are arranged in a pipeline such that the output of each circuit is passed on to the next circuit in the pipeline without storing the data between circuits of the pipeline.
Abstract: An image processing system is disclosed in which the various image processing circuits are arranged in a pipeline such that the output of each circuit is passed on to the next circuit in the pipeline without storing the data between circuits of the pipeline. The individual processing circuits are program-controlled by a common controller in order to properly synchronize various operations such as scanning, compressing, expanding, rescaling, windowing and rotating. The rescaling operation is carried out by executing a sequence of program instructions applying to single pixels, or to single lines of the image. These instructions include deletion, duplication and passing through of the image element. Windowing is provided by inserting start-of-window and end-of-window instructions in the appropriate places in the sequence of rescaling instructions.
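
A minimal sketch of the per-pixel rescaling instruction stream described above, with assumed opcode names (PASS, DEL, DUP for pass-through, deletion, and duplication; SOW/EOW for the start-of-window and end-of-window markers):

```python
# Hedged sketch of a rescaling/windowing instruction interpreter; the opcode
# names are assumptions, and real hardware applies the same idea per line.

def rescale_line(pixels, program):
    out, in_window, i = [], False, 0
    for op in program:
        if op == "SOW":  in_window = True;  continue
        if op == "EOW":  in_window = False; continue
        px = pixels[i]; i += 1                     # consume one input pixel
        if not in_window or op == "DEL":
            continue                               # outside window / deleted
        out.append(px)
        if op == "DUP":
            out.append(px)                         # duplicated pixel
    return out

line = [10, 20, 30, 40, 50]
# Keep pixels 2..4, duplicating pixel 3: a crude 3-pixel -> 4-pixel rescale.
prog = ["DEL", "SOW", "PASS", "DUP", "PASS", "EOW"]
assert rescale_line(line, prog) == [20, 30, 30, 40]
```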

Patent
13 Jul 1989
TL;DR: In this article, a digital signal processor is used for motion compensation in reducing a required amount of calculations when an amount of distortion between a last frame block and a current frame block is calculated.
Abstract: The present invention improves a digital signal processor, and more particularly its calculation methods for motion compensation: in reducing the required amount of calculation when the amount of distortion between a last frame block and a current frame block is calculated; in processing a direct memory access at a higher efficiency; in processing a subdivided data calculation at a higher speed; in processing a branch instruction occurring in the pipeline process at a higher efficiency; in processing an interruption occurring in a repeat process operation at greater convenience; and furthermore in reducing the required amount of calculation through hierarchized minimum-distortion searching processes.

Journal ArticleDOI
TL;DR: This paper describes the HARP architectural model and discusses those features which support parallel instruction execution; examples are given which illustrate the effectiveness of these techniques in increasing the performance of HARP.

Patent
28 Feb 1989
TL;DR: In this paper, a data processor in accordance with the present invention makes it possible to perform pre-branch processing of the return address in the initial stage of pipeline processing, even for a subroutine return instruction, by providing a stack memory (PC stack) dedicated to the program counter (PC) that stores only the return addresses of subroutine return instructions.
Abstract: A data processor in accordance with the present invention makes it possible to perform pre-branch processing of the return address in the initial stage of pipeline processing, even for a subroutine return instruction. To this end, a stack memory (PC stack) dedicated to the program counter (PC) is provided for storing only the return addresses of subroutine return instructions: when a subroutine call instruction is executed in the execution stage of the pipeline processing mechanism, the return address from the subroutine is pushed onto the PC stack, and when a subroutine return instruction is decoded in the instruction decoding stage, pre-branch processing is performed to the address popped from the PC stack.
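
The mechanism reduces to a small amount of state. A hedged sketch with illustrative names, not the patent's implementation:

```python
# Sketch of a dedicated return-address (PC) stack: push in the execute stage
# of a CALL, pop in the decode stage of a RET so the return can pre-branch
# before it reaches the execute stage.

class PCStack:
    def __init__(self):
        self._stack = []

    def on_call_executed(self, return_address):
        self._stack.append(return_address)          # push in the execute stage

    def on_return_decoded(self):
        # Pop in the decode stage and pre-branch to this address immediately.
        return self._stack.pop()

pc_stack = PCStack()
pc_stack.on_call_executed(return_address=0x1004)     # CALL at 0x1000 executes
assert pc_stack.on_return_decoded() == 0x1004        # RET pre-branches early
```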

Journal ArticleDOI
TL;DR: In this paper, a novel pipeline digital-to-analog converter (DAC) configuration, based on switched-capacitor techniques, is described, where an n-bit D/A conversion can be implemented by cascading n+1 unit cells.
Abstract: A novel pipeline digital-to-analog converter (DAC) configuration, based on switched-capacitor techniques, is described. An n-bit D/A conversion can be implemented by cascading n+1 unit cells. The device count of the circuit increases linearly, not exponentially, with the conversion accuracy. The new configuration can be pipelined. Hence, the conversion rate can be increased without requiring a higher clock rate. An experimental 10-b DAC prototype has been fabricated using a 3-µm CMOS process. The results show that high-speed, high-accuracy, and low-power operation can be achieved without special process or postprocess trimming.
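
An idealized behavioral model shows why cascading identical divide-by-two unit cells yields an n-bit conversion; this sketch ignores the switched-capacitor circuit details and the extra unit cell the abstract counts.

```python
# Idealized unit-cell chain: feeding the bits LSB-first through cells that
# each compute v_out = (v_in + b * Vref) / 2 yields Vref * code / 2**n.

def pipeline_dac(code, nbits, vref=1.0):
    v = 0.0
    for k in range(nbits):                         # LSB first
        bit = (code >> k) & 1
        v = (v + bit * vref) / 2.0                 # one unit cell per bit
    return v

nbits, vref = 10, 1.0
for code in (0, 1, 512, 681, 1023):
    assert abs(pipeline_dac(code, nbits, vref) - vref * code / 2**nbits) < 1e-12
```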

Patent
06 Nov 1989
TL;DR: The laying of a pipeline on a surface located under water is performed largely continuously because the pipes are arranged in line with a pipe string by means of at least two alternately used pipe carriers.
Abstract: The laying of a pipeline on a surface located under water is to a great extent continuously performed because the pipes are arranged in line with a pipe string by means of at least two alternately used pipe carriers.

Journal ArticleDOI
TL;DR: By testing two of the most widely used computers with vector facilities, the IBM 3090 and the Cray X-MP, suggestions are given for machine users and designers about vectorization on these two machines.
Abstract: Vector pipelining and chaining are clarified through the use of timing and pipeline diagrams of the instruction execution process. The technique for evaluating the performance of the concurrent vector operations of vector processors is demonstrated by testing two of the most widely used computers with vector facilities: the IBM 3090 and the Cray X-MP. On the basis of the testing results analyzed at the assembler level, suggestions are given for machine users and designers about vectorization on these two machines. The ideas presented can be applied to other vector processors. The actual implementations, however, may differ, depending on individual machine architecture.

Journal ArticleDOI
K. Kikuchi, Y. Nukada, Y. Aoki, T. Kanou, Y. Endo, Takao Nishitani
15 Feb 1989
TL;DR: A three-input adder implemented in complementary CMOS reduced-swing logic, which is twice as fast as conventional CMOS logic and achieves a 25-ns instruction cycle, is shown.
Abstract: A single-chip real-time video/image processor (VISP) has been developed that integrates functions based on a variable seven-stage pipeline arithmetic architecture in a 16-bit fixed-point data format. A three-input adder implemented in complementary CMOS reduced-swing logic, which is twice as fast as conventional CMOS logic, achieving a 25-ns instruction cycle, is shown. Single-VISP processing times are: edge detection (3×3 Laplacian), 14.8 ms; distance calculation, 1.7 ms; temporal filtering (1-tap IR), 5.0 ms; linear quantization, 3.3 ms; and 3/5×3/5 picture reduction (separate 5-tap FIR), 5.9 ms. An example is shown of a two-dimensional discrete cosine transformation which requires 26.3 ms to execute with one VISP when 256×256-pixel processing at a 25-ns instruction cycle is employed.
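
As a small aside, the 3×3 Laplacian edge detection cited in the benchmark list is the following kind of neighbourhood operation; the exact mask VISP microprograms may differ from the common 4-neighbour kernel used here.

```python
# 3x3 Laplacian on a tiny made-up image (borders left at zero for brevity).
KERNEL = [[0, 1, 0],
          [1, -4, 1],
          [0, 1, 0]]

def laplacian3x3(img):
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(KERNEL[j][i] * img[y + j - 1][x + i - 1]
                            for j in range(3) for i in range(3))
    return out

img = [[0, 0, 0, 0],
       [0, 9, 9, 0],
       [0, 9, 9, 0],
       [0, 0, 0, 0]]
print(laplacian3x3(img))   # strong responses along the edge of the square
```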

Patent
27 Apr 1989
TL;DR: In this paper, the wave form of one of the pulses which has passed through the point of intersection is analyzed to determine the possibility of an anomaly being present at the location of intersection, which can be correlated to half-cell readings which could be taken along the pipeline as a means of analyzing pipeline conditions.
Abstract: A system for ascertaining the existence and location of anomalies along the length of a member, such as an underground pipeline. Electrical pulses are imparted to the pipeline at opposite ends thereof, with these pulses being synchronized so that they meet at predetermined locations along the length of the pipeline. The wave form of one of the pulses which has passed through the point of intersection is analyzed to determine the possibility of an anomaly being present at the location of intersection. These readings can be correlated to half-cell readings which could be taken along the pipeline as a means of analyzing pipeline conditions.

Journal ArticleDOI
TL;DR: The VLSI discussed by the authors is a microprogrammable, high-resolution, real-time geometrical mapping processor with a 50-MHz throughput and an accuracy of 20 bits, using 1.2-µm CMOS technology.
Abstract: A recently developed microprogrammable, high-resolution, real-time geometrical mapping processor VLSI is presented. The processor computes pixel addresses within the frame-buffer memory according to user-specified geometrical mapping functions. Its architecture permits high-speed operations and library extensions through a combination of elementary functions. It includes a CORDIC function generator, consisting of a one-dimensional pipeline array with high-speed parallel arithmetic circuits, and a pipeline control method. This results in a 50-MHz throughput rate with an accuracy of 20 bits using 1.2-µm CMOS technology. The processor will be useful in high-definition television (HDTV) systems.
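
The CORDIC function generator mentioned above evaluates rotations with shift-and-add iterations. A minimal floating-point sketch of circular rotation mode follows; real hardware would use fixed-point arithmetic with one pipelined stage per iteration.

```python
# Minimal CORDIC rotation sketch (circular, rotation mode).
import math

def cordic_rotate(x, y, angle, iterations=20):
    """Rotate (x, y) by `angle` radians using only shifts, adds, and a table."""
    z = angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * math.atan(2.0 ** -i)
    gain = math.prod(math.sqrt(1 + 2.0 ** (-2 * i)) for i in range(iterations))
    return x / gain, y / gain                      # compensate the CORDIC gain

x, y = cordic_rotate(1.0, 0.0, math.pi / 3)
assert abs(x - math.cos(math.pi / 3)) < 1e-4
assert abs(y - math.sin(math.pi / 3)) < 1e-4
```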

Journal ArticleDOI
TL;DR: The authors investigated the tradeoff between space and time complexity of the proposed technique and achieved a fast, space-efficient implementation of the approach.
Abstract: The authors describe a scheme for performing fuzzy-reasoning and fuzzy-inference operations in parallel on mesh-connected array processors such as the geometric arithmetic parallel processor (GAPP) and the advanced systolic array processor (ASAP). The reasoning mechanism is a general, fuzzy, rule-based inference network. The array is configured into stages, and the scheme operates in a pipeline configuration, where different sets of data move sequentially between the stages. The authors investigated the tradeoff between space and time complexity of the proposed technique and achieved a fast, space-efficient implementation of the approach. Before presenting the proposed system, they review briefly the concepts of inexact reasoning, fuzzy inference, and mesh-connected fine-grain parallel computer architecture.

Patent
30 Oct 1989
TL;DR: In this paper, a transport pipeline has a thermal insulator mounted on an outside surface thereof, and a pair of parallel extending electric wires without metallic protection are mounted on the outside surface of the thermal insulator.
Abstract: A method and apparatus for heating a transport pipeline by inductive heating. The transport pipeline has a thermal insulator mounted on an outside surface thereof. A pair of parallel extending electric wires without metallic protection, are mounted on an outside surface of the thermal insulator.

Proceedings ArticleDOI
08 May 1989
TL;DR: The author shows that the ACS loop operation (although nonlinear) can exploit look-ahead, and proposes fine-grain pipelined and parallel architectures for area-efficient high-speed VLSI implementation of DP problems.
Abstract: The add-compare-select (ACS) loop operation in serial dynamic programming (DP) problems inherently limits the speed or the iteration period. The author shows that the ACS loop operation (although nonlinear) can exploit look-ahead. He uses techniques of look-ahead, decomposition, and incremental computation, and proposes fine-grain pipelined and parallel architectures for area-efficient high-speed VLSI implementation of DP problems. The word-level arithmetic implementation complexity of the architecture is proportional to the cube of the number of states of the DP problem, logarithmic in number of loop pipeline levels, linear in block size, and additive with respect to pipelining and block processing. The data-dependent nature of the quantization operation limits the opportunities to pipeline the quantizer loops. The author proposes an approach to transform the quantizer loop into an equivalent form that can exploit look-ahead. The transformed quantizer loop recursion is shown to be similar to the DP recursion, and variations of the DP processor architectures can be used for high-speed implementation of simple quantizer loops.
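
The reason the nonlinear ACS recursion still admits look-ahead can be seen in (min, +) algebra: one ACS step is a matrix-vector product, so consecutive branch-metric matrices can be pre-combined outside the loop. The sketch below verifies that on a two-state toy trellis; it illustrates the principle only, not the author's decomposed architectures.

```python
# One ACS step:  s[n][i] = min_j ( s[n-1][j] + B[n][j][i] ).
# Two steps collapse into one via C = B[n] (min,+) B[n+1], computed outside
# the recursive loop, which halves the number of loop iterations.

def acs_step(s, B):
    m = len(s)
    return [min(s[j] + B[j][i] for j in range(m)) for i in range(m)]

def minplus_matmul(B1, B2):
    m = len(B1)
    return [[min(B1[j][k] + B2[k][i] for k in range(m)) for i in range(m)]
            for j in range(m)]

B1 = [[1, 4], [2, 0]]                              # branch metrics, step n
B2 = [[0, 3], [5, 1]]                              # branch metrics, step n+1
s = [0, 2]                                         # path metrics at step n-1

two_steps = acs_step(acs_step(s, B1), B2)          # serial ACS, two iterations
one_lookahead_step = acs_step(s, minplus_matmul(B1, B2))
assert two_steps == one_lookahead_step
```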

Journal ArticleDOI
TL;DR: A novel computer architecture, called a cyclic pipeline computer (CPC), is especially suited for Josephson technologies; it supports multiple instruction/multiple data stream (MIMD) operation by time-sharing the processor and the main memory among multiple instruction streams.
Abstract: Describes a novel computer architecture, called a cyclic pipeline computer (CPC), which is especially suited for Josephson technologies. Since each Josephson logic device acts as a latch, it is possible to use high-pitch and shallow logic pipelining without any increase in delay time and cost. Hence, both the processor and the main memory can be built from the Josephson devices and can be pipelined with the same pipeline pitch time. The CPC supports multiple instruction/multiple data stream (MIMD) by time-sharing the processor and the main memory among multiple instruction streams. In addition, it employs advanced control to speed up the computation for each instruction stream.

Patent
30 Jun 1989
TL;DR: In this article, a pipelined processing unit which includes an instruction unit stage containing logic management apparatus for processing a set of complex instructions is described. But it is not shown how the logic management mechanism can be used to track different types of instructions of the complex instruction set being processed.
Abstract: A pipelined processing unit which includes an instruction unit stage containing logic management apparatus for processing a set of complex instructions. The logic management apparatus includes state control circuits which produce a series or sequence of control states used in tracking the different types of instructions of the complex instruction set being processed. Different ones of the states are used for different types of instructions so as to enable the different pipeline stages to operate both independently and jointly to complete the execution of different instructions of the complex instruction set.

Journal ArticleDOI
TL;DR: A dynamic programming processor with parallel and pipeline architecture is described, which has given very good results on a wide variety of applications and 0.48% error rate on tests with standard NATO tapes.
Abstract: A dynamic programming processor with parallel and pipeline architecture is described. A 2-µm CMOS technology was applied to the DP processor, which is composed of 127,309 transistors on a 7.17×8.62-mm² die and is housed in an 84-pin PLCC (plastic leaded chip carrier) or PGA (pin grid array) package. The clock frequency is 20 MHz, and the instruction cycle time is 100 ns. Precise electrical simulations permitted the safe use of nonstandard logic and area and power reduction. Implementation of a direct access to all internal registers has proven useful for chip test and software development. A system using one DP processor has given very good results on a wide variety of applications and a 0.48% error rate on tests with standard NATO tapes. These results are significantly better than those published for other systems on the same tests.

Patent
10 Aug 1989
TL;DR: In this paper, the cross-correlation of two complex sampled digital data signals X and Y uses a first N-stage CORDIC rotator of pipeline sequential form for rotating each of the real and imaginary data portions of the first (X) complex sampled signal sequentially through a summation of angles θ=ξi αi where ξi =+1 or -1, α1 =90° and αn-2 =tan-1 (2-n) for n=0, 1, 2, 3,... N-
Abstract: Apparatus for the cross-correlation of two complex sampled digital data signals X and Y uses a first N-stage CORDIC rotator of pipeline sequential form for rotating each of the real and imaginary data portions of the first (X) complex sampled signal sequentially through a summation of angles θ = Σ ξᵢαᵢ, where ξᵢ = +1 or −1, α₁ = 90°, and αₙ₊₂ = tan⁻¹(2⁻ⁿ) for n = 0, 1, 2, 3, ..., N−2, until XIm is approximately zero and a substantially zero phase angle is reached. The sign from each i-th stage of this first pipeline is also utilized to determine the sign of rotation in each like-positioned i-th stage of a plurality M of additional CORDIC pipeline rotators, where M is the total number of time delays at which the cross-correlation function is evaluated. The real and imaginary portions of the complete M-th interval cross-correlation product are each obtained by multiplying the associated complex output of each of the Y pipeline rotators by the first pipeline magnitude signal output; N samples are then summed to provide at the j-th rotator output the appropriate pair of the real and imaginary parts of the j-th complex digital data output sample C(j).