
Pipeline (computing)

About: Pipeline (computing) is a research topic. Over its lifetime, 26,760 publications have been published on this topic, receiving 204,305 citations. The topic is also known as data pipeline and computational pipeline.


Papers
Proceedings ArticleDOI
13 Aug 2016
TL;DR: A highly accurate SMART-based analysis pipeline that predicts the need for a disk replacement 10-15 days in advance, using statistical techniques to automatically detect which SMART parameters correlate with disk replacement.
Abstract: Disks are among the most frequently failing components in today's IT environments. Despite defense mechanisms such as RAID, system availability and reliability are still often severely impacted. In this paper, we present a highly accurate SMART-based analysis pipeline that can correctly predict the necessity of a disk replacement even 10-15 days in advance. Our method has been built and evaluated on more than 30,000 disks from two major manufacturers, monitored over 17 months. Our approach employs statistical techniques to automatically detect which SMART parameters correlate with disk replacement and uses them to predict the replacement of a disk with up to 98% accuracy.
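The statistical core of such a pipeline is easy to sketch. Below is a minimal, hypothetical Python illustration on synthetic data (the attribute indices, the number of attributes kept, and the random-forest classifier are assumptions for illustration, not details from the paper): rank SMART attributes by how strongly they correlate with the replacement label, then train a classifier on the top-ranked attributes.

```python
import numpy as np
from scipy.stats import pointbiserialr
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in data: rows are disks, columns are SMART attributes.
n_disks, n_attrs = 1000, 10
smart = rng.normal(size=(n_disks, n_attrs))
# Hypothetical ground truth: replacement driven by attributes 3 and 7 plus noise.
replaced = ((smart[:, 3] + 0.5 * smart[:, 7] + rng.normal(size=n_disks)) > 1.5).astype(int)

# Step 1: detect which attributes correlate with replacement
# (point-biserial correlation between each attribute and the binary label).
scores = [abs(pointbiserialr(replaced, smart[:, j])[0]) for j in range(n_attrs)]
top = np.argsort(scores)[::-1][:4]  # keep the 4 strongest attributes
print("selected SMART attribute indices:", top)

# Step 2: predict replacement from the selected attributes only.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(smart[:, top], replaced)
print("training accuracy: %.3f" % clf.score(smart[:, top], replaced))
```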

134 citations

Proceedings ArticleDOI
27 Feb 2011
TL;DR: This paper compares the delay and area of a comprehensive set of processor building-block circuits implemented on custom CMOS and FPGA substrates, and uses the comparison to infer how the microarchitecture of soft processors on FPGAs should differ from that of hard processors on custom CMOS.
Abstract: As soft processors are increasingly used in diverse applications, there is a need to evolve their microarchitectures in a way that suits the FPGA implementation substrate. This paper compares the delay and area of a comprehensive set of processor building block circuits when implemented on custom CMOS and FPGA substrates. We then use the results of these comparisons to infer how the microarchitecture of soft processors on FPGAs should be different from hard processors on custom CMOS. We find that the ratios of the area required by an FPGA to that of custom CMOS for different building blocks vary significantly more than the speed ratios. As area is often a key design constraint in FPGA circuits, area ratios have the most impact on microarchitecture choices. Complete processor cores have area ratios of 17-27x and delay ratios of 18-26x. Building blocks that have dedicated hardware support on FPGAs such as SRAMs, adders, and multipliers are particularly area-efficient (2-7x area ratio), while multiplexers and CAMs are particularly area-inefficient (>100x area ratio), leading to cheaper ALUs, larger caches of low associativity, and more expensive bypass networks than on similar hard processors. We also find that a low delay ratio for pipeline latches (12-19x) suggests soft processors should have pipeline depths 20% greater than hard processors of similar complexity.
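To see why the paper weights area ratios so heavily, it helps to put the quoted ranges side by side. The sketch below uses illustrative midpoints of the ranges given in the abstract; the exact per-block values (in particular any delay ratio for muxes/CAMs) are assumptions, not measurements from the paper.

```python
# (FPGA area) / (custom CMOS area) and delay ratios per building block.
# Midpoints of the abstract's ranges; the mux/CAM entries are assumed.
blocks = {
    "processor core":   (22.0, 22.0),   # 17-27x area, 18-26x delay
    "SRAM/adder/mult":  (4.5,  20.0),   # 2-7x area: dedicated FPGA hardware
    "multiplexer/CAM":  (150.0, 20.0),  # >100x area (assumed ~150x here)
    "pipeline latch":   (22.0, 15.5),   # 12-19x delay
}

area = [a for a, _ in blocks.values()]
delay = [d for _, d in blocks.values()]
print("area ratio spread : %.0fx to %.0fx" % (min(area), max(area)))
print("delay ratio spread: %.1fx to %.1fx" % (min(delay), max(delay)))
# The area spread (~4x to ~150x) dwarfs the delay spread (~15x to ~22x),
# which is why muxes and CAMs are disproportionately expensive on FPGAs and
# soft processors favor cheap ALUs, low-associativity caches, and thin
# bypass networks.
```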

133 citations

Patent
30 Jul 2004
TL;DR: A pipeline processing system in which a decoder/encoder circuit accesses a first and a second memory in parallel according to status information to perform decoding, stores the decoded data in a tracking memory, and transfers it to a host apparatus on request; for encoding, user data transferred in units of blocks from the host apparatus is written into a third memory serving as a tracking buffer, a plurality of memories are accessed in parallel according to the status information, and the result is output to a clock generation circuit.
Abstract: A pipeline processing system capable of high-speed operation and reduced power consumption, and an information processing apparatus to which it is applied. For decoding, a decoder/encoder circuit accesses a first memory and a second memory in parallel in accordance with status information, stores the decoded data in a tracking memory, and then transfers that data to a host apparatus in response to a request from the host apparatus. For encoding, the circuit writes the user data transferred in units of blocks from the host apparatus into a third memory serving as a tracking buffer, accesses a plurality of memories in parallel in accordance with the status information to perform the encoding, and outputs the result to a clock generation circuit.
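As a rough illustration of the decode path described above, the hypothetical Python sketch below overlaps two parallel memory reads and stages decoded blocks in a tracking buffer until the host drains them. All names (read_mem, decode, the thread-pool model) are assumptions for illustration, not the patent's hardware design.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue

# Two stand-in memories holding pieces of each block.
memory_a = {0: b"hdr0", 1: b"hdr1"}
memory_b = {0: b"pay0", 1: b"pay1"}

def read_mem(mem, addr):
    return mem[addr]

def decode(header, payload):
    return header + b"|" + payload  # stand-in for the real decoding step

tracking_buffer = Queue()  # decoded blocks wait here for the host

with ThreadPoolExecutor(max_workers=2) as pool:
    for addr in range(2):
        # Access both memories in parallel (the status-driven parallel
        # access in the patent), then decode and buffer the block.
        fa = pool.submit(read_mem, memory_a, addr)
        fb = pool.submit(read_mem, memory_b, addr)
        tracking_buffer.put(decode(fa.result(), fb.result()))

# Host side: drain the tracking buffer on request.
while not tracking_buffer.empty():
    print(tracking_buffer.get())
```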

132 citations

Journal ArticleDOI
TL;DR: Logic-embedded memory is an emerging technology that combines high transfer rates with computing power; Texram implements this technology, together with a new filtering algorithm, to achieve high-speed, high-quality texture mapping.
Abstract: Logic-embedded memory is an emerging technology that combines high transfer rates and computing power. Texram implements this technology and a new filtering algorithm to achieve high-speed, high-quality texture mapping. Integrating arithmetic units and large memory arrays on the same chip, and thus exploiting the enormous internal transfer rates, provides an elegant solution to the memory-access bottleneck of high-quality texture mapping. Using this technology, we can not only achieve higher texturing speed at lower system cost but also incorporate new functionalities such as detail mapping and footprint assembly to produce higher-quality images at real-time rendering speeds. Environment and video mapping are also integrated on the Texram, which therefore represents an autonomous and versatile texturing coprocessor. Logic-enhanced memories might become the computing paradigm of the future, not just in graphics applications. Technological advances will foster this trend by providing an ever-increasing amount of memory capacity and chip space for arithmetic units. As the ultimate solution, we can expect a complete 3D graphics pipeline, including all memory systems, integrated on a single chip.
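The memory-bandwidth argument is easiest to see in the inner loop of texture filtering: even plain bilinear filtering needs four texel fetches per output sample, which is exactly the traffic that on-chip memory arrays make cheap. Below is a minimal NumPy sketch of bilinear sampling as a baseline (Texram's footprint-assembly filter is more elaborate; this is only illustrative).

```python
import numpy as np

def bilinear(tex, u, v):
    """Sample texture `tex` (H x W) at continuous coordinates (u, v)."""
    h, w = tex.shape
    x0, y0 = int(np.floor(u)), int(np.floor(v))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = u - x0, v - y0
    # Four texel fetches per sample -- the memory accesses that dominate
    # texture-mapping bandwidth.
    t00, t10 = tex[y0, x0], tex[y0, x1]
    t01, t11 = tex[y1, x0], tex[y1, x1]
    top = t00 * (1 - fx) + t10 * fx
    bot = t01 * (1 - fx) + t11 * fx
    return top * (1 - fy) + bot * fy

tex = np.arange(16, dtype=float).reshape(4, 4)
print(bilinear(tex, 1.5, 2.25))  # sample between texel centers -> 10.5
```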

131 citations

Journal ArticleDOI
TL;DR: This paper presents a high-throughput decoder design for Quasi-Cyclic (QC) Low-Density Parity-Check (LDPC) codes, proposing two new techniques: parallel layered decoding architecture (PLDA) and critical-path splitting.
Abstract: This paper presents a high-throughput decoder design for Quasi-Cyclic (QC) Low-Density Parity-Check (LDPC) codes. Two new techniques are proposed: parallel layered decoding architecture (PLDA) and critical-path splitting. PLDA enables parallel processing for all layers by establishing dedicated message-passing paths among them, allowing the decoder to avoid a large crossbar-based interconnect network. The critical-path-splitting technique carefully adjusts the starting point of each layer to maximize the time intervals between adjacent layers, so that the critical-path delay can be split into pipeline stages. Furthermore, min-sum and loosely coupled algorithms are employed for area efficiency. As a case study, a rate-1/2, 2304-bit irregular LDPC decoder is implemented in a 90 nm CMOS ASIC process. The decoder achieves a maximum decoding throughput of 2.2 Gbps at 10 iterations; the operating frequency is 950 MHz after synthesis and the chip area is 2.9 mm².
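For readers unfamiliar with min-sum decoding, the check-node update it relies on can be sketched in a few lines. The Python below shows the arithmetic that makes min-sum hardware-friendly (sign logic and magnitude compares instead of sum-product's transcendental functions); it illustrates the standard algorithm only, not the paper's PLDA architecture, and assumes nonzero input LLRs.

```python
import numpy as np

def min_sum_check_node(msgs):
    """msgs: incoming LLRs at one check node; returns the outgoing LLRs."""
    msgs = np.asarray(msgs, dtype=float)
    signs = np.sign(msgs)
    total_sign = np.prod(signs)
    mags = np.abs(msgs)
    # Each outgoing message excludes its own edge: its sign is the product
    # of the other signs, its magnitude the minimum of the other magnitudes.
    order = np.argsort(mags)
    min1, min2 = mags[order[0]], mags[order[1]]
    # Edges use the overall minimum, except the minimum edge itself,
    # which uses the second minimum.
    out_mags = np.where(np.arange(len(msgs)) == order[0], min2, min1)
    return total_sign * signs * out_mags  # total_sign * sign_i = product of other signs

print(min_sum_check_node([0.8, -0.3, 1.5, -2.0]))  # -> [ 0.3 -0.8  0.3 -0.3]
```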

130 citations


Network Information
Related Topics (5)
- Cache: 59.1K papers, 976.6K citations (86% related)
- Scalability: 50.9K papers, 931.6K citations (85% related)
- Server: 79.5K papers, 1.4M citations (82% related)
- Electronic circuit: 114.2K papers, 971.5K citations (82% related)
- CMOS: 81.3K papers, 1.1M citations (81% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2022    18
2021    1,066
2020    1,556
2019    1,793
2018    1,754
2017    1,548