
Showing papers on "Redundancy (engineering)" published in 2007


Book
15 Mar 2007
TL;DR: This is the first book on fault tolerance design with a systems approach, offering comprehensive coverage of both hardware and software fault tolerance, as well as information and time redundancy.
Abstract: There are many applications in which the reliability of the overall system must be far higher than the reliability of its individual components. In such cases, designers devise mechanisms and architectures that allow the system to either completely mask the effects of a component failure or recover from it so quickly that the application is not seriously affected. This is the work of fault-tolerant designers, and their work is increasingly important and complex not only because of the increasing number of mission-critical applications, but also because the diminishing reliability of hardware means that even systems for non-critical applications will need to be designed with fault tolerance in mind. Reflecting the real-world challenges faced by designers of these systems, this book addresses fault tolerance design with a systems approach to both hardware and software. No other text on the market takes this approach, nor offers the comprehensive and up-to-date treatment Koren and Krishna provide. Students, designers, and architects of high performance processors will value this comprehensive overview of the field.
* The first book on fault tolerance design with a systems approach
* Comprehensive coverage of both hardware and software fault tolerance, as well as information and time redundancy
* Incorporated case studies highlight six different computer systems with fault-tolerance techniques implemented in their design
* Available to lecturers is a complete ancillary package including online solutions manual for instructors and PowerPoint slides

670 citations


Journal ArticleDOI
TL;DR: A refined concept of synergy as a neural organization that ensures a one-to-many mapping of variables providing for both stability of important performance variables and flexibility of motor patterns to deal with possible perturbations and/or secondary tasks is proposed.
Abstract: Driven by recent empirical studies, we offer a new understanding of the degrees of freedom problem, and propose a refined concept of synergy as a neural organization that ensures a one-to-many mapping of variables providing for both stability of important performance variables and flexibility of motor patterns to deal with possible perturbations and/or secondary tasks. Empirical evidence is reviewed, including a discussion of the operationalization of stability/flexibility through the method of the uncontrolled manifold. We show how this concept establishes links between the various accounts for how movement is organized in redundant effector systems.

659 citations


Proceedings ArticleDOI
01 May 2007
TL;DR: This paper shows how to optimally generate MDS fragments directly from existing fragments in the system, and introduces a new scheme called regenerating codes which use slightly larger fragments than MDS but have lower overall bandwidth use.
Abstract: Peer-to-peer distributed storage systems provide reliable access to data through redundancy spread over nodes across the Internet. A key goal is to minimize the amount of bandwidth used to maintain that redundancy. Storing a file using an erasure code, in fragments spread across nodes, promises to require less redundancy and hence less maintenance bandwidth than simple replication to provide the same level of reliability. However, since fragments must be periodically replaced as nodes fail, a key question is how to generate a new fragment in a distributed way while transferring as little data as possible across the network. In this paper, we introduce a general technique to analyze storage architectures that combine any form of coding and replication, as well as presenting two new schemes for maintaining redundancy using erasure codes. First, we show how to optimally generate MDS fragments directly from existing fragments in the system. Second, we introduce a new scheme called regenerating codes which use slightly larger fragments than MDS but have lower overall bandwidth use. We also show through simulation that in realistic environments, regenerating codes can reduce maintenance bandwidth use by 25% or more compared with the best previous design - a hybrid of replication and erasure codes - while simplifying system architecture.

457 citations
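
The storage-versus-repair trade-off described above can be made concrete with a toy (3,2) MDS code: two data fragments plus one XOR parity fragment, so any two of the three fragments recover the file at 1.5x storage instead of the 2x of one extra replica. The sketch below is a minimal illustration of MDS fragment repair, not the paper's regenerating-code construction:

import os

def encode_3_2(data: bytes):
    """Split data into two halves and add an XOR parity fragment.
    This is a (3,2) MDS code: any 2 of the 3 fragments recover the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to even length
    half = len(data) // 2
    a, b = data[:half], data[half:]
    parity = bytes(x ^ y for x, y in zip(a, b))
    return [a, b, parity]

def decode_3_2(fragments):
    """fragments: list [a, b, parity] with exactly one entry set to None."""
    a, b, p = fragments
    if a is None:
        a = bytes(x ^ y for x, y in zip(b, p))
    elif b is None:
        b = bytes(x ^ y for x, y in zip(a, p))
    return a + b

data = os.urandom(1024)
frags = encode_3_2(data)
frags[1] = None              # simulate a failed node losing fragment b
assert decode_3_2(frags) == data
# Repairing one lost fragment here requires downloading both survivors
# (2x the fragment size); regenerating codes aim to cut that repair traffic.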


Journal ArticleDOI
TL;DR: It is proved that warped oscillatory functions, a toy model for texture, have a significantly sparser expansion in wave atoms than in other fixed standard representations like wavelets, Gabor atoms, or curvelets.

302 citations


Journal ArticleDOI
TL;DR: It is suggested that community assembly is influenced by the traits of species and that observed changes in functional diversity provide no reason to believe that the functioning of natural systems is buffered against change by ecological redundancy.
Abstract: Spatial and temporal patterns in functional diversity can reveal the patterns and processes behind community assembly and whether ecological redundancy exists. Here, we analyse functional diversity in British avian assemblages over a period of about 20 years. Functional diversity is generally lower than expected by chance, indicating that assemblages contain species with relatively similar functional traits. One potential explanation is filtering for traits suitable to particular habitats, though other explanations exist. There was no evidence of ecological redundancy over the 20 years. In fact, changes in functional diversity were almost exactly proportional to changes in species richness. The absence of functional redundancy results from little redundancy intrinsic to the species’ functional relationships and also because compositional change was nonrandom. Observed extinction and colonization events caused greater changes in functional diversity than if these events were random. Our findings suggest that community assembly is influenced by the traits of species and that observed changes in functional diversity provide no reason to believe that the functioning of natural systems is buffered against change by ecological redundancy.

301 citations


Journal ArticleDOI
TL;DR: An analytical redundancy method using neural network modeling of the induction motor in vibration spectra is proposed for machine fault detection and diagnosis, and it is shown that a robust and automatic induction machine condition monitoring system has been produced.
Abstract: Condition monitoring is desirable for increasing machinery availability, reducing consequential damage, and improving operational efficiency. Model-based methods are efficient monitoring systems for providing warning and predicting certain faults at early stages. However, the conventional methods must work with explicit motor models, and they cannot be applied effectively to vibration signal diagnosis due to their lack of adaptation and the random nature of vibration signals. In this paper, an analytical redundancy method using neural network modeling of the induction motor in vibration spectra is proposed for machine fault detection and diagnosis. The short-time Fourier transform is used to process the quasi-steady vibration signals into continuous spectra for training the neural network model. The faults are detected from changes in the expectation of the vibration spectra modeling error. The effectiveness of the proposed method is demonstrated through experimental results, and it is shown that a robust and automatic induction machine condition monitoring system has been produced.

260 citations
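
As a rough illustration of the analytical-redundancy idea, the sketch below models healthy vibration spectra and flags a fault when the expectation of the modeling error shifts. A small scikit-learn regressor stands in for the paper's neural network model, and the signals are synthetic placeholders:

import numpy as np
from scipy.signal import stft
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
fs = 2000.0
t = np.arange(0, 20, 1 / fs)

def spectra(x):
    # Short-time Fourier transform -> magnitude spectra (frames x bins)
    _, _, Z = stft(x, fs=fs, nperseg=256)
    return np.abs(Z).T

healthy = np.sin(2 * np.pi * 50 * t) + 0.1 * rng.standard_normal(t.size)
faulty = healthy + 0.4 * np.sin(2 * np.pi * 137 * t)  # extra fault harmonic

X = spectra(healthy)
# Train the model to reproduce healthy spectra (an autoencoder-style regressor)
model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
model.fit(X, X)

def mean_residual(x):
    S = spectra(x)
    return np.mean(np.abs(S - model.predict(S)))

print("healthy residual:", mean_residual(healthy))
print("faulty residual :", mean_residual(faulty))  # expect a clear upward shift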


Journal ArticleDOI
TL;DR: In this article, Sen's capability approach was used to measure two components of well-being, namely standard of living and quality of life, for 170 countries, based on two multidimensional analyses, Totally Fuzzy Analysis and Factorial Analysis of Correspondences.

241 citations


Proceedings ArticleDOI
18 Jun 2007
TL;DR: A 65nm 256kb 8T SRAM operates in the sub-VT regime at 350mV, and for a given area, sense-amplifier redundancy reduces read errors from offsets by a factor of five compared with device upsizing.
Abstract: A 65nm 256kb 8T SRAM operates in the sub-VT regime at 350mV. Peripheral assists eliminate sub-VT bitline leakage without limiting read current, and for a given area, sense-amplifier redundancy reduces read errors from offsets by a factor of five compared with device upsizing.

203 citations
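
The benefit of sense-amplifier redundancy over upsizing can be illustrated with a toy Monte Carlo: treat input offset as Gaussian, let redundancy select the lower-offset of two amplifiers per column, and compare against one amplifier of twice the area (offset sigma scaling roughly as 1/sqrt(area)). The sigma and swing values below are invented for illustration, not the paper's silicon data:

import numpy as np

rng = np.random.default_rng(1)
N = 1_000_000          # simulated columns
sigma = 20e-3          # assumed offset std dev, 20 mV
vsense = 40e-3         # assumed bitline swing at sensing time, 40 mV

# Baseline: one sense amp per column; read fails if |offset| exceeds the swing
single = np.abs(rng.normal(0, sigma, N)) > vsense

# Upsizing: 2x area -> offset sigma reduced by sqrt(2)
upsized = np.abs(rng.normal(0, sigma / np.sqrt(2), N)) > vsense

# Redundancy: two amps per column, calibration selects the lower-offset one
pair = np.abs(rng.normal(0, sigma, (N, 2))).min(axis=1)
redundant = pair > vsense

for name, fails in [("single", single), ("2x upsized", upsized),
                    ("2-way redundant", redundant)]:
    print(f"{name:16s} error rate: {fails.mean():.2e}")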


Proceedings ArticleDOI
09 Jun 2007
TL;DR: This paper analyzes the SPEC CPU2006 benchmarks using performance-counter-based experimentation from several state-of-the-art systems, and uses statistical techniques such as principal component analysis and clustering to draw inferences on the similarity of the benchmarks and the redundancy in the suite and arrive at meaningful subsets.
Abstract: The recently released SPEC CPU2006 benchmark suite is expected to be used by computer designers and computer architecture researchers for pre-silicon early design analysis. Partial use of benchmark suites by researchers, due to simulation time constraints, compiler difficulties, or library or system call issues is likely to happen; but a random subset can lead to misleading results. This paper analyzes the SPEC CPU2006 benchmarks using performance-counter-based experimentation from several state-of-the-art systems, and uses statistical techniques such as principal component analysis and clustering to draw inferences on the similarity of the benchmarks and the redundancy in the suite and arrive at meaningful subsets. The SPEC CPU2006 benchmark suite contains several programs from areas such as artificial intelligence and includes none from the electronic design automation (EDA) application area. Hence there is a concern about the application balance in the suite. An analysis from the perspective of fundamental program characteristics shows that the included programs offer characteristics broader than the EDA programs' space. A subset of 6 integer programs and 8 floating point programs can yield most of the information from the entire suite.

199 citations
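
A sketch of the subsetting flow in the same spirit: standardize per-benchmark performance-counter features, project with PCA, cluster with k-means, and keep the benchmark nearest each cluster centroid. The data and benchmark names below are random placeholders:

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
benchmarks = [f"bench{i:02d}" for i in range(29)]   # placeholder names
X = rng.standard_normal((29, 12))   # 12 counter-derived features per benchmark

Xs = StandardScaler().fit_transform(X)          # zero mean, unit variance
Xp = PCA(n_components=4).fit_transform(Xs)      # keep leading components

k = 6                                           # target subset size
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(Xp)

subset = []
for c in range(k):
    members = np.where(km.labels_ == c)[0]
    d = np.linalg.norm(Xp[members] - km.cluster_centers_[c], axis=1)
    subset.append(benchmarks[members[np.argmin(d)]])  # closest to centroid
print("representative subset:", sorted(subset))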


Journal ArticleDOI
TL;DR: An optimized flooding scheme that minimizes transmission overhead in flooding is introduced and two simple and effective DFT-MSN data delivery schemes are proposed, namely, the replication-based efficient data delivery scheme (RED) and the message fault tolerance-based adaptive data Delivery scheme (FAD).
Abstract: This paper focuses on the delay/fault-tolerant mobile sensor network (DFT-MSN) for pervasive information gathering. We develop simple and efficient data delivery schemes tailored for DFT-MSN, which has several unique characteristics, such as sensor mobility, loose connectivity, fault tolerability, delay tolerability, and buffer limit. We first study two basic approaches, namely, direct transmission and flooding. We analyze their performance by using queuing theory and statistics. Based on the analytic results that show the trade-off between data delivery delay/ratio and transmission overhead, we introduce an optimized flooding scheme that minimizes transmission overhead in flooding. Then, we propose two simple and effective DFT-MSN data delivery schemes, namely, the replication-based efficient data delivery scheme (RED) and the message fault tolerance-based adaptive data delivery scheme (FAD). The RED scheme utilizes the erasure coding technology in order to achieve the desired data delivery ratio with minimum overhead. It consists of two key components for data transmission and message management. The former makes the decision on when and where to transmit data messages according to the delivery probability, which is the likelihood that a sensor can deliver data messages to the sink. The latter decides the optimal erasure coding parameters (including the number of data blocks and the needed redundancy) based on its current delivery probability. The FAD scheme employs the message fault tolerance, which indicates the importance of the messages. The decisions on message transmission and dropping are made based on fault tolerance for minimizing transmission overhead. The system parameters are carefully tuned on the basis of thorough analyses to optimize network performance. Extensive simulations are carried out for performance evaluation. Our results show that both schemes achieve a high message delivery ratio with acceptable delay. The RED scheme results in lower complexity in message and queue management, while the FAD scheme has a lower message transmission overhead.

184 citations
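
The RED scheme's choice of erasure-coding parameters can be illustrated with the standard binomial calculation: if each of n coded blocks independently reaches the sink with probability p, a message of k data blocks is recovered when at least k blocks arrive. A minimal version of that calculation (not the paper's exact optimization):

from math import comb

def delivery_ratio(n: int, k: int, p: float) -> float:
    """P(at least k of n blocks arrive), blocks assumed independent."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def min_redundancy(k: int, p: float, target: float, n_max: int = 64) -> int:
    """Smallest n meeting the target delivery ratio for k data blocks."""
    for n in range(k, n_max + 1):
        if delivery_ratio(n, k, p) >= target:
            return n
    raise ValueError("target unreachable within n_max")

k, p = 4, 0.6                     # 4 data blocks, 60% per-block delivery
n = min_redundancy(k, p, target=0.95)
print(f"send {n} blocks ({n - k} redundant): "
      f"ratio = {delivery_ratio(n, k, p):.3f}")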


Journal ArticleDOI
Philippe Goupil1
TL;DR: In this paper, failure detection in the electrical flight control system of Airbus aircraft is discussed, where a nonlinear actuator model is used to generate a residual on which the failure is detected by oscillation counting.

Journal ArticleDOI
TL;DR: Results show that the variable neighborhood search method improves the performance of VND and provides competitive solution quality at economical computational expense in comparison with the best-known heuristics, including ant colony optimization, genetic algorithm, and tabu search.

Journal ArticleDOI
TL;DR: This paper proposes to exploit sensor spatial redundancy by defining subsets of sensors that are active in different time periods, allowing sensors to save energy when inactive, and provides a modeling framework that can be extended to deal with additional features such as reliability.

Proceedings ArticleDOI
25 Jun 2007
TL;DR: The results show that DCC has the potential to significantly outperform existing static DMR schemes, and that its performance overhead stays within 5% for a set of scalable parallel scientific and data mining applications with up to eight threads (16 processors).
Abstract: Aggressive CMOS scaling will make future chip multiprocessors (CMPs) increasingly susceptible to transient faults, hard errors, manufacturing defects, and process variations. Existing fault-tolerant CMP proposals that implement dual modular redundancy (DMR) do so by statically binding pairs of adjacent cores via dedicated communication channels and buffers. This can result in unnecessary power and performance losses in cases where one core is defective (in which case the entire DMR pair must be disabled), or when cores exhibit different frequency/leakage characteristics due to process variations (in which case the pair runs at the speed of the slowest core). Static DMR also hinders power density/thermal management, as DMR pairs running code with similar power/thermal characteristics are necessarily placed next to each other on the die. We present dynamic core coupling (DCC), an architectural technique that allows arbitrary CMP cores to verify each other's execution while requiring no static core binding at design time or dedicated communication hardware. Our evaluation shows that the performance overhead of DCC over a CMP without fault tolerance is 3% on SPEC2000 benchmarks, and is within 5% for a set of scalable parallel scientific and data mining applications with up to eight threads (16 processors). Our results also show that DCC has the potential to significantly outperform existing static DMR schemes.
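
The advantage of dynamic pairing over static binding can be sketched numerically: with static DMR, a dead core also disables its healthy partner, and frequency variation drags each pair down to its slower member, whereas pairing working cores by similar frequency recovers most of the loss. The frequencies below are invented, and this illustrates the argument rather than DCC's actual mechanism:

import numpy as np

rng = np.random.default_rng(0)
n = 16
freq = rng.normal(3.0, 0.3, n)      # per-core max frequency (GHz) with variation
freq[3] = 0.0                        # one defective core

def static_pairs(f):
    """Adjacent cores are hard-wired as DMR pairs at design time."""
    return [(i, i + 1) for i in range(0, len(f), 2)]

def dynamic_pairs(f):
    """Pair working cores by similar frequency (sort, then pair neighbors)."""
    alive = [i for i in f.argsort()[::-1] if f[i] > 0]
    return [(alive[i], alive[i + 1]) for i in range(0, len(alive) - 1, 2)]

def throughput(f, pairs):
    # Each DMR pair runs at the speed of its slower member; a pair containing
    # the dead core contributes nothing (static binding loses both cores).
    return sum(min(f[a], f[b]) for a, b in pairs)

print("static :", throughput(freq, static_pairs(freq)))
print("dynamic:", throughput(freq, dynamic_pairs(freq)))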

Patent
26 Feb 2007
TL;DR: In this article, a data storage system that receives a data set from a software module (18A-D) includes a first tier storage device (240), a second tier storage device (242), a redundancy reducer (21), and a migration engine (28).
Abstract: A data storage system (10) that receives a data set from a software module (18A-D) includes a first tier storage device (240), a second tier storage device (242), a redundancy reducer (21) and a migration engine (28). The first tier storage device (240) has a first effective storage capacity and the second tier storage device (242) can have a second effective storage capacity that is greater than the first effective storage capacity. The redundancy reducer (21) subdivides the data set into a plurality of data blocks (20) and reduces the redundancy of the data blocks (20). The migration engine (28) moves one or more of the data blocks (20) between the first tier storage device (240) and the second tier storage device (242) based on a migration parameter of the data block (20).
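
A minimal sketch of the redundancy-reducer and migration-engine roles claimed above: split the data set into blocks, deduplicate by content hash, and migrate rarely referenced blocks to the larger tier. All class and parameter names are invented for illustration:

import hashlib

BLOCK = 4096

class TieredStore:
    def __init__(self):
        self.fast, self.slow = {}, {}   # tier 0 (small), tier 1 (large)
        self.access = {}                # per-block reference counter

    def put(self, data: bytes):
        """Store a data set; returns the list of block hashes (the recipe)."""
        recipe = []
        for i in range(0, len(data), BLOCK):
            blk = data[i:i + BLOCK]
            h = hashlib.sha256(blk).hexdigest()
            if h not in self.fast and h not in self.slow:
                self.fast[h] = blk      # new unique block lands in the fast tier
            self.access[h] = self.access.get(h, 0) + 1
            recipe.append(h)
        return recipe

    def migrate(self, min_hits: int = 2):
        """Move blocks referenced fewer than min_hits times to the slow tier."""
        for h in [h for h in self.fast if self.access[h] < min_hits]:
            self.slow[h] = self.fast.pop(h)

store = TieredStore()
r1 = store.put(b"A" * 8192 + b"B" * 4096)
r2 = store.put(b"A" * 8192)             # duplicate blocks are stored once
store.migrate()
print(len(store.fast), "hot blocks,", len(store.slow), "cold blocks")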

Journal ArticleDOI
TL;DR: A localized scan-based movement-assisted sensor deployment method (SMART) and several variations of it that use scan and dimension exchange to achieve a balanced state are proposed and an extended SMART is developed to address a unique problem called communication holes in sensor networks.
Abstract: The efficiency of sensor networks depends on the coverage of the monitoring area. Although, in general, a sufficient number of sensors are used to ensure a certain degree of redundancy in coverage, a good sensor deployment is still necessary to balance the workload of sensors. In a sensor network with locomotion facilities, sensors can move around to self-deploy. The movement-assisted sensor deployment deals with moving sensors from an initial unbalanced state to a balanced state. Therefore, various optimization problems can be defined to minimize different parameters, including total moving distance, total number of moves, communication/computation cost, and convergence rate. In this paper, we first propose a Hungarian-algorithm-based optimal solution, which is centralized. Then, a localized scan-based movement-assisted sensor deployment method (SMART) and several variations of it that use scan and dimension exchange to achieve a balanced state are proposed. An extended SMART is developed to address a unique problem called communication holes in sensor networks. Extensive simulations have been done to verify the effectiveness of the proposed scheme.
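
The paper's centralized baseline is a Hungarian-algorithm assignment, which scipy exposes directly. A minimal sketch with random initial sensor positions and a uniform grid of target positions, minimizing total moving distance:

import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
n = 16                                         # sensors = grid cells
sensors = rng.uniform(0, 1, (n, 2))            # unbalanced initial placement

# Balanced target: one sensor per cell of a 4x4 grid (cell centers)
g = int(np.sqrt(n))
cx = (np.arange(g) + 0.5) / g
targets = np.array([(x, y) for x in cx for y in cx])

# Cost = Euclidean move distance; the Hungarian algorithm gives the optimal match
cost = np.linalg.norm(sensors[:, None, :] - targets[None, :, :], axis=2)
rows, cols = linear_sum_assignment(cost)
print("minimal total moving distance:", cost[rows, cols].sum())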

Journal ArticleDOI
TL;DR: This work puts forward an active-learning selection criterion that minimizes redundancy between the candidate images shown to the user at every feedback round, argues that insensitivity to scale is desirable in this context, and shows how to obtain it by the use of specific kernel functions.
Abstract: As the resolution of remote-sensing imagery increases, the full complexity of the scenes becomes increasingly difficult to approach. User-defined classes in large image databases are often composed of several groups of images and span very different scales in the space of low-level visual descriptors. The interactive retrieval of such image classes is then very difficult. To address this challenge, we evaluate here, in the context of satellite image retrieval, two general improvements for relevance feedback using support vector machines (SVMs). First, to optimize the transfer of information between the user and the system, we focus on the criterion employed by the system for selecting the images presented to the user at every feedback round. We put forward an active-learning selection criterion that minimizes redundancy between the candidate images shown to the user. Second, for image classes spanning very different scales in the low-level description space, we find that a high sensitivity of the SVM to the scale of the data brings about a low retrieval performance. We argue that insensitivity to scale is desirable in this context, and we show how to obtain it by the use of specific kernel functions. Experimental evaluation of both ranking and classification performance on a ground-truth database of satellite images confirms the effectiveness of our approach.
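
A generic sketch of the selection criterion's flavor (assumed data, not the paper's exact criterion): rank unlabeled images by closeness to the SVM decision boundary, then greedily drop candidates that are too kernel-similar to ones already selected, so the batch shown to the user is non-redundant:

import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
X_lab = rng.standard_normal((40, 8))          # labeled image descriptors
y_lab = (X_lab[:, 0] > 0).astype(int)
X_pool = rng.standard_normal((500, 8))        # unlabeled pool

clf = SVC(kernel="rbf", gamma="scale").fit(X_lab, y_lab)

def select_batch(pool, k=8, max_sim=0.5):
    order = np.argsort(np.abs(clf.decision_function(pool)))  # most uncertain first
    chosen = []
    for i in order:
        # keep i only if it is dissimilar to everything already chosen
        if all(rbf_kernel(pool[[i]], pool[[j]])[0, 0] < max_sim for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    return chosen

print("images to show this round:", select_batch(X_pool))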

Proceedings ArticleDOI
09 Jun 2007
TL;DR: A proof-of-concept RMT implementation is created that demonstrates that AVF prediction can be used to maintain a low fault tolerance level without significant performance impact and creates a rigorous characterization of AVF behavior that can be easily implemented in hardware.
Abstract: Transient faults due to particle strikes are a key challenge in microprocessor design. Driven by exponentially increasing transistor counts, per-chip faults are a growing burden. To protect against soft errors, redundancy techniques such as redundant multithreading (RMT) are often used. However, these techniques assume that the probability that a structural fault will result in a soft error (i.e., the Architectural Vulnerability Factor (AVF)) is 100 percent, unnecessarily draining processor resources. Due to the high cost of redundancy, there have been efforts to throttle RMT at runtime. To date, these methods have not incorporated an AVF model and therefore tend to be ad hoc. Unfortunately, computing the AVF of complex microprocessor structures (e.g., the ISQ) can be quite involved. To provide probabilistic guarantees about fault tolerance, we have created a rigorous characterization of AVF behavior that can be easily implemented in hardware. We experimentally demonstrate AVF variability within and across the SPEC2000 benchmarks and identify strong correlations between structural AVF values and a small set of processor metrics. Using these simple indicators as predictors, we create a proof-of-concept RMT implementation that demonstrates that AVF prediction can be used to maintain a low fault tolerance level without significant performance impact.
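
A toy version of the predictor idea, with synthetic numbers in place of the paper's measured correlations: fit a linear model from simple metrics to AVF, then enable redundant execution only when the predicted AVF crosses a threshold:

import numpy as np

rng = np.random.default_rng(0)
# Synthetic training set: per-interval metrics -> measured structural AVF
occupancy = rng.uniform(0.1, 0.9, 200)   # e.g., queue occupancy fraction
ipc = rng.uniform(0.3, 2.5, 200)
avf = np.clip(0.6 * occupancy - 0.05 * ipc
              + 0.05 * rng.standard_normal(200), 0, 1)

# Fit AVF ~ w0 + w1*occupancy + w2*ipc with least squares
A = np.column_stack([np.ones_like(occupancy), occupancy, ipc])
w, *_ = np.linalg.lstsq(A, avf, rcond=None)

def throttle_rmt(occ, ipc_now, threshold=0.2):
    """Enable redundant execution only when predicted AVF is high."""
    pred = w @ np.array([1.0, occ, ipc_now])
    return pred >= threshold

print("RMT on at occ=0.8, ipc=1.0?", throttle_rmt(0.8, 1.0))
print("RMT on at occ=0.15, ipc=2.0?", throttle_rmt(0.15, 2.0))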

Journal ArticleDOI
TL;DR: Three additional mitigation techniques are evaluated, including quadded logic, state machine encoding, and temporal redundancy, all well-known techniques in custom circuit technologies; the results suggest that none of these techniques provides greater reliability than TMR, and they often require more resources.
Abstract: With growing interest in the use of SRAM-based FPGAs in space and other radiation environments, there is a greater need for efficient and effective fault-tolerant design techniques specific to FPGAs. Triple-modular redundancy (TMR) is a common fault mitigation technique for FPGAs and has been successfully demonstrated by several organizations. This technique, however, requires significant hardware resources. This paper evaluates three additional mitigation techniques and compares them to TMR. These include quadded logic, state machine encoding, and temporal redundancy, all well-known techniques in custom circuit technologies. Each of these techniques is compared to TMR in both area cost and fault tolerance. The results from this paper suggest that none of these techniques provides greater reliability than TMR, and they often require more resources.
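
For reference, the baseline in this comparison is plain triple-modular redundancy: triplicate the module and majority-vote its outputs. The bitwise voter and the textbook reliability formula are shown below:

def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: the standard TMR voter equation."""
    return (a & b) | (a & c) | (b & c)

# A single corrupted copy is masked; the voted output stays correct.
golden = 0b1011_0010
assert tmr_vote(golden, golden ^ 0b0100_0000, golden) == golden

# Reliability with a perfect voter: R_tmr = 3R^2 - 2R^3 for module reliability R
R = 0.95
print("module:", R, " TMR:", 3 * R**2 - 2 * R**3)   # 0.95 -> ~0.99275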

Proceedings ArticleDOI
25 Jun 2007
TL;DR: This paper proposes a software-based multi-core alternative for transient fault tolerance using process-level redundancy (PLR), which creates a set of redundant processes per application process and systematically compares the processes to guarantee correct execution.
Abstract: Transient faults are emerging as a critical concern in the reliability of general-purpose microprocessors. As architectural trends point towards multi-threaded multi-core designs, there is substantial interest in adapting such parallel hardware resources for transient fault tolerance. This paper proposes a software-based multi-core alternative for transient fault tolerance using process-level redundancy (PLR). PLR creates a set of redundant processes per application process and systematically compares the processes to guarantee correct execution. Redundancy at the process level allows the operating system to freely schedule the processes across all available hardware resources. PLR's software-centric approach to transient fault tolerance shifts the focus from ensuring correct hardware execution to ensuring correct software execution. As a result, PLR ignores many benign faults that do not propagate to affect program correctness. A real PLR prototype for running single-threaded applications is presented and evaluated for fault coverage and performance. On a 4-way SMP machine, PLR provides improved performance over existing software transient fault tolerance techniques with 16.9% overhead for fault detection on a set of optimized SPEC2000 binaries.

Journal ArticleDOI
TL;DR: This paper develops an exact solution method, based on the improved surrogate constraint (ISC) method, and uses this method to find optimal solutions to problems previously presented in the literature.
Abstract: When designing a system, there are two methods that can be used to improve the system's reliability without changing the nature of the system: 1) using more reliable components, and/or 2) providing redundant components within the system. The redundancy allocation problem attempts to find the appropriate mix of components & redundancies within a system in order to either minimize cost subject to a minimum level of reliability, or maximize reliability subject to a maximum cost and weight. Redundancy allocation problems can be classified into two groups; one allows the system to have a mix of components with different characteristics incorporated in the system, while the other only allows one type of each component. The former group has a much larger solution space compared to the latter, and therefore obtaining an exact optimal or even a high quality solution for this problem may be more difficult. Optimization techniques, based on meta-heuristic approaches, have recently been proposed to solve the redundancy allocation problem with a mix of components. However, an exact solution method has not been developed. In this paper, we develop an exact solution method, based on the improved surrogate constraint (ISC) method, and use this method to find optimal solutions to problems previously presented in the literature.
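
For intuition about the problem being solved exactly, here is a brute-force solver for a tiny series-parallel instance: maximize system reliability subject to a cost budget, where a subsystem with n parallel copies of a component of reliability r contributes 1 - (1-r)^n. The component data and budget are invented, and the ISC method in the paper scales far beyond this toy enumeration:

from itertools import product

# (reliability, cost) of the single component type used in each subsystem
subsystems = [(0.80, 2), (0.90, 3), (0.70, 1)]
budget = 18
max_copies = 5

def system_reliability(alloc):
    rel = 1.0
    for (r, _), n in zip(subsystems, alloc):
        rel *= 1 - (1 - r) ** n        # parallel redundancy in a series system
    return rel

best = None
for alloc in product(range(1, max_copies + 1), repeat=len(subsystems)):
    cost = sum(c * n for (_, c), n in zip(subsystems, alloc))
    if cost <= budget:
        cand = (system_reliability(alloc), alloc, cost)
        best = max(best, cand) if best else cand

rel, alloc, cost = best
print(f"optimal copies per subsystem: {alloc}, cost {cost}, reliability {rel:.4f}")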

Journal ArticleDOI
TL;DR: A randomized array transmission scheme is developed to secure wireless transmissions with inherent low-probability-of-interception (LPI) by exploiting the redundancy of transmit antenna arrays for deliberate signal randomization which, when combined with channel diversity, effectively randomizes the eavesdropper's signals but not the authorized receiver's signals.
Abstract: The use of signal processing techniques to protect wireless transmissions is proposed as a way to secure wireless networks at the physical layer. This approach addresses a unique weakness of wireless networks whereby network traffic traverses a public wireless medium making traditional boundary controls ineffective. Specifically, a randomized array transmission scheme is developed to guarantee wireless transmissions with inherent low-probability-of-interception (LPI). In contrast to conventional spread spectrum or data encryption techniques, this new method exploits the redundancy of transmit antenna arrays for deliberate signal randomization which, when combined with channel diversity, effectively randomizes the eavesdropper's signals but not the authorized receiver's signals. The LPI of this transmission scheme is analyzed via proving the indeterminacy of the eavesdropper's blind deconvolution. Extensive simulations and some preliminary experiments are conducted to demonstrate its effectiveness. The proposed method is useful for securing wireless transmissions, or for supporting upper-layer key management protocols.
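
The randomization idea can be sketched in a narrowband, simplified form (this is not the paper's full scheme): pick per-symbol transmit weights whose projection onto the legitimate channel h is always the intended symbol, plus a fresh random component in the null space of h, so an eavesdropper on a different channel g sees a randomly varying signal:

import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(0)
M = 4                                    # transmit antennas
h = rng.standard_normal(M) + 1j * rng.standard_normal(M)   # legitimate channel
g = rng.standard_normal(M) + 1j * rng.standard_normal(M)   # eavesdropper channel

N = null_space(h[None, :])               # directions invisible to the receiver
symbols = rng.choice([1 + 0j, -1 + 0j], size=8)            # BPSK symbols

for s in symbols:
    z = rng.standard_normal(M - 1) + 1j * rng.standard_normal(M - 1)
    w = s * h.conj() / np.linalg.norm(h) ** 2 + N @ z      # randomized weights
    print(f"receiver: {h @ w: .2f}   eavesdropper: {g @ w: .2f}")
# The receiver's sample equals s every symbol; the eavesdropper's varies randomly.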

Journal ArticleDOI
TL;DR: From this study, it is found that the configurable logic block's routing network is vulnerable to domain crossing errors, or TMR defeats, by even 2-bit multiple-bit upsets.
Abstract: This paper discusses the limitations of single-FPGA triple-modular redundancy in the presence of multiple-bit upsets on Xilinx Virtex-II devices. This paper presents results from both fault injection and accelerated testing. From this study we have found that the configurable logic block's routing network is vulnerable to domain crossing errors, or TMR defeats, by even 2-bit multiple-bit upsets.

Journal ArticleDOI
TL;DR: This work proposes link structures for NoC that can efficiently tolerate transient, intermittent, and permanent errors, and presents the structures, operation, and designs for the different components of the links based on self-timed signaling.
Abstract: We propose link structures for NoC that can efficiently tolerate transient, intermittent, and permanent errors. This is a necessary step to be taken in order to implement reliable systems in future nanoscale technologies. The protection against transient errors is realized using Hamming coding and interleaving for error detection, with retransmission as the recovery method. We introduce two approaches for tackling the intermittent and permanent errors. In the first approach, spare wires are introduced together with reconfiguration circuitry. The other approach uses time redundancy: the transmission is split into two parts, and the data is doubled. In both structures the presence of permanent or intermittent errors is monitored by analyzing previous error syndromes. The links are based on self-timed signaling in which the handshake signals are protected using triple modular redundancy. We present the structures, operation, and designs for the different components of the links. The fault tolerance properties are analyzed using a fault model containing temporary, intermittent, and permanent faults that occur both as bursts and as single faults. The results show a considerable enhancement in the fault tolerance at the cost of performance and area, and with only a slight increase in power consumption.
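
The transient-error layer combines a Hamming code with interleaving so that a short burst across adjacent wires becomes a single-bit error in each of several codewords. Below is a minimal Hamming(7,4) encoder/corrector with 2-way interleaving; the paper's link uses its own code parameters:

import numpy as np

G = np.array([[1,0,0,0,0,1,1],    # Hamming(7,4) generator (systematic form)
              [0,1,0,0,1,0,1],
              [0,0,1,0,1,1,0],
              [0,0,0,1,1,1,1]])
H = np.array([[0,1,1,1,1,0,0],    # parity-check matrix matching G
              [1,0,1,1,0,1,0],
              [1,1,0,1,0,0,1]])

encode = lambda d: d @ G % 2

def correct(r):
    s = H @ r % 2                         # syndrome
    for col in range(7):                  # locate the single flipped bit
        if np.array_equal(s, H[:, col]):
            r = r.copy(); r[col] ^= 1
    return r

d1, d2 = np.array([1,0,1,1]), np.array([0,1,1,0])
c1, c2 = encode(d1), encode(d2)
line = np.ravel(np.column_stack([c1, c2]))  # 2-way interleave onto the link
line[4] ^= 1; line[5] ^= 1                  # 2-bit burst on adjacent wires
r = line.reshape(7, 2)                      # de-interleave
r1, r2 = correct(r[:, 0]), correct(r[:, 1]) # one correctable error per codeword
assert np.array_equal(r1[:4], d1) and np.array_equal(r2[:4], d2)
print("burst corrected:", r1[:4], r2[:4])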

01 Jan 2007
TL;DR: pStore is a secure distributed backup system based on an adaptive peer-to-peer network that exploits unused personal hard drive space attached to the Internet to provide the distributed redundancy needed for reliable and effective data backup.
Abstract: In an effort to combine research in peer-to-peer systems with techniques for incremental backup systems, we propose pStore: a secure distributed backup system based on an adaptive peer-to-peer network. pStore exploits unused personal hard drive space attached to the Internet to provide the distributed redundancy needed for reliable and effective data backup. Experiments on a 30 node network show that 95% of the files in a 13 MB dataset can be retrieved even when 7 of the nodes have failed. On top of this reliability, pStore includes support for file encryption, versioning, and secure sharing. Its custom versioning system permits arbitrary version retrieval similar to CVS. pStore provides this functionality at less than 10% of the network bandwidth and requires 85% less storage capacity than simpler local tape backup schemes for a representative workload.

Journal ArticleDOI
TL;DR: Simulation results show that the proposed analytical redundancy-based FDI algorithm provides the same level of fault tolerance as in an SBW system with full hardware redundancy against single-point failures.
Abstract: This paper presents a novel observer-based analytical redundancy for a steer-by-wire (SBW) system. An analytical redundancy methodology was utilized to reduce the total number of redundant road-wheel angle (RWA) sensors in a triply redundant RWA-based SBW system while maintaining a high level of reliability. A full-state observer was designed using the combined model of the vehicle and SBW system to estimate the vehicle-body sideslip angle. The steering angle was then estimated from the observed and measured states of the vehicle (body sideslip angle and yaw rate) as well as the current input to the SBW electric motor(s). A fault detection and isolation (FDI) algorithm was developed using a majority voting scheme, which was then used to detect faulty sensor(s) to maintain safe drivability. The proposed analytical redundancy-based FDI algorithms and the linearized vehicle model were modeled in SIMULINK. Simulation results show that the proposed analytical redundancy-based FDI algorithm provides the same level of fault tolerance as in an SBW system with full hardware redundancy against single-point failures.
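
A sketch of the voting logic: two physical road-wheel-angle sensors plus one observer-derived analytical estimate form a triplex, and pairwise consistency checks isolate the disagreeing channel. The tolerance and signal values are invented placeholders, not the paper's tuned values:

def fdi_vote(s1, s2, s_est, tol=0.05):
    """Isolate a faulty channel among two sensors and an analytical estimate.
    Returns (voted_value, fault_label)."""
    d12, d1e, d2e = abs(s1 - s2), abs(s1 - s_est), abs(s2 - s_est)
    if d12 <= tol:                       # sensors agree: trust them
        return (s1 + s2) / 2, None
    if d1e <= tol:                       # s1 matches the model: s2 is faulty
        return s1, "sensor2"
    if d2e <= tol:                       # s2 matches the model: s1 is faulty
        return s2, "sensor1"
    return s_est, "multiple/undetermined"

# Nominal case and a stuck sensor-2 case (angles in radians)
print(fdi_vote(0.101, 0.103, 0.099))     # -> averaged value, no fault
print(fdi_vote(0.101, 0.420, 0.099))     # -> sensor2 isolated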

Journal ArticleDOI
TL;DR: In this article, reliability testing, reliability enhancement, and quality control for global navigation satellite system (GNSS) positioning are discussed, including rejection of possible outliers, and the use of a robust estimator, namely a modified Danish method.
Abstract: Monitoring the reliability of the obtained user position is of great importance, especially when using the global positioning system (GPS) as a standalone system. In the work presented here, we discuss reliability testing, reliability enhancement, and quality control for global navigation satellite system (GNSS) positioning. Reliability testing usually relies on statistical tests for receiver autonomous integrity monitoring (RAIM) and fault detection and exclusion (FDE). It is here extended by including an assessment of the redundancy and the geometry of the obtained user position solution. The reliability enhancement discussed here includes rejection of possible outliers, and the use of a robust estimator, namely a modified Danish method. We draw special attention to navigation applications in degraded signal environments such as indoors, where typically multiple errors occur simultaneously. The results of applying the discussed methods to high-sensitivity GPS data from an indoor experiment demonstrate that weighted estimation, FDE, and quality control yield a significant improvement in reliability and accuracy. The accuracy actually obtained was 40% better than with equal weights and no FDE; the rms value of horizontal errors was reduced from 15 m to 9 m, and the maximum horizontal errors were largely reduced.
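
One common variant of the modified Danish method mentioned above, shown as a generic sketch rather than the authors' exact tuning: iterate weighted least squares, and after each pass exponentially downweight observations whose standardized residuals exceed a threshold:

import numpy as np

def danish_lsq(A, y, w0=None, k=2.0, iters=5):
    """Robust least squares via Danish-method reweighting.
    A: design matrix, y: observations, k: residual threshold in sigmas."""
    w = np.ones(len(y)) if w0 is None else w0.copy()
    for _ in range(iters):
        W = np.diag(w)
        x = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)    # weighted LSQ
        v = y - A @ x                                    # residuals
        s = np.sqrt(np.average(v**2, weights=w))         # scale estimate
        big = np.abs(v) > k * s
        w[big] *= np.exp(-((np.abs(v[big]) / (k * s)) ** 2))  # downweight
    return x, w

rng = np.random.default_rng(0)
A = np.column_stack([np.ones(20), np.arange(20.0)])
y = A @ np.array([5.0, 0.5]) + 0.1 * rng.standard_normal(20)
y[7] += 4.0                      # inject one gross outlier
x, w = danish_lsq(A, y)
print("estimate:", x, " outlier weight:", w[7])   # outlier weight near zero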

Proceedings ArticleDOI
01 Dec 2007
TL;DR: The possibility of providing redundancy with an older process technology, an unexplored and especially compelling application of die heterogeneity, is evaluated; it is shown that the overhead of the second die can be reduced to a 3°C temperature increase or a 4% performance loss, while also providing higher error resilience.
Abstract: Aggressive technology scaling over the years has helped improve processor performance but has caused a reduction in processor reliability. Shrinking transistor sizes and lower supply voltages have increased the vulnerability of computer systems towards transient faults. An increase in within-die and die-to-die parameter variations has also led to a greater number of dynamic timing errors. A potential solution to mitigate the impact of such errors is redundancy via an in-order checker processor. Emerging 3D chip technology promises increased processor performance as well as reduced power consumption because of shorter on-chip wires. In this paper, we leverage the "snap-on" functionality provided by 3D integration and propose implementing the redundant checker processor on a second die. This allows manufacturers to easily create a family of "reliable processors" without significantly impacting the cost or performance for customers that care less about reliability. We comprehensively evaluate design choices for this second die, including the effects of L2 cache organization, deep pipelining, and frequency. An interesting feature made possible by 3D integration is the incorporation of heterogeneous process technologies within a single chip. We evaluate the possibility of providing redundancy with an older process technology, an unexplored and especially compelling application of die heterogeneity. We show that with the most pessimistic assumptions, the overhead of the second die can be as high as either a 7°C temperature increase or an 8% performance loss. However, with the use of an older process, this overhead can be reduced to a 3°C temperature increase or a 4% performance loss, while also providing higher error resilience. Keywords: reliability, redundant multi-threading, 3D die-stacking, parameter variation, soft errors, dynamic timing errors, power-efficient microarchitecture, on-chip temperature.

Journal ArticleDOI
TL;DR: This paper presents an efficient algorithm to solve the redundancy allocation problem using the hybridization of the ant colony meta-heuristic with the degraded ceiling, which performs well and is competitive with the best-known heuristics for redundancy allocation.

Journal ArticleDOI
TL;DR: In this article, the authors describe the control and parallel operation of two active power filters (APFs) in a combined topology in which one filter is connected in a feedback loop and the other in a feedforward loop for harmonic compensation.
Abstract: This paper describes the control and parallel operation of two active power filters (APFs). Possible parallel operation situations of two APFs are investigated, and then the proposed topology is analyzed. The filters are coupled in a combined topology in which one filter is connected in a feedback loop and the other is in a feedforward loop for harmonic compensation. Thus, both active power filters bring their own characteristic advantages, i.e., the feedback filter improves the steady-state performance of the harmonic mitigation and the feedforward filter improves the dynamic response. Another characteristic of the proposed topology is the possibility of joint operation of both filters either as frequency-sharing or load-sharing, with or without redundancy. The frequency-sharing operation is possible due to the control algorithm, which is based on selective harmonic compensation using equivalent harmonic integrators. Implementation details and a discussion on the efficiency improvement for various switching frequencies are provided. The evaluation of the proposed topology concludes that this approach is very practical for achieving both low and high order harmonic compensation and stable grid operation. This is supported by extensive measurement results on a 15-kVA laboratory setup, indicating a reduction in total harmonic current distortion from the existing 30% to less than 2% for a typical adjustable speed drive application.
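
The "equivalent harmonic integrators" behind the frequency-sharing control can be sketched as one synchronous integrator pair per selected harmonic: demodulate the error at h times the fundamental, integrate, and remodulate, which drives the selected harmonics of the error to zero at steady state. The gains and harmonic set below are invented for a single-phase toy simulation:

import numpy as np

fs, f1 = 20_000.0, 50.0                # sample rate, grid fundamental (Hz)
dt, w1 = 1 / fs, 2 * np.pi * f1
harmonics = [5, 7, 11]                 # typical rectifier-load harmonics
Ki = 200.0                             # assumed integrator gain

def run(T=0.5):
    Ic = {h: 0.0 for h in harmonics}   # cosine-channel integrators
    Is = {h: 0.0 for h in harmonics}   # sine-channel integrators
    u, errs = 0.0, []
    for n in range(int(T * fs)):
        t = n * dt
        load = sum(np.sin(h * w1 * t) / h for h in harmonics)  # distortion
        e = load - u                   # residual harmonic current
        errs.append(e)
        u = 0.0
        for h in harmonics:            # one integrator pair per harmonic
            Ic[h] += Ki * e * np.cos(h * w1 * t) * dt
            Is[h] += Ki * e * np.sin(h * w1 * t) * dt
            u += Ic[h] * np.cos(h * w1 * t) + Is[h] * np.sin(h * w1 * t)
    cycle = int(fs / f1)
    return np.sqrt(np.mean(np.square(errs[-cycle:])))

print("residual RMS over the last fundamental cycle:", run())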