Showing papers on "Redundancy (engineering)" published in 2003


01 Jan 2003
TL;DR: A success probability bound is obtained for randomized network coding in link-redundant networks with unreliable links, in terms of link failure probability and amount of redundancy.
Abstract: We consider a randomized network coding approach for multicasting from several sources over a network, in which nodes independently and randomly select linear mappings from inputs onto output links over some field. This approach was first described in [3], which gave, for acyclic delay-free networks, a bound on error probability, in terms of the number of receivers and random coding output links, that decreases exponentially with code length. The proof was based on a result in [2] relating algebraic network coding to network flows. In this paper, we generalize these results to networks with cycles and delay. We also show, for any given acyclic network, a tighter bound in terms of the probability of connection feasibility in a related network problem with unreliable links. From this we obtain a success probability bound for randomized network coding in link-redundant networks with unreliable links, in terms of link failure probability and amount of redundancy.
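
As a rough illustration of the random coding step described above, the sketch below has each coding node emit random linear combinations of its inputs over a prime field and a receiver invert the resulting transfer matrix to decode. The field GF(8191), the two-source single-relay topology, and the coding-header convention are illustrative assumptions, not details from the paper.

```python
# Sketch of randomized linear network coding over a prime field GF(P).
# Field size, packet layout, and the single-relay topology are assumptions.
import random

P = 2**13 - 1  # 8191, prime

def combine(packets, coeffs):
    """Linear combination of packets (lists of field symbols) with the given coefficients."""
    return [sum(c * pkt[i] for c, pkt in zip(coeffs, packets)) % P
            for i in range(len(packets[0]))]

def invert(matrix):
    """Gauss-Jordan inversion over GF(P); returns None if the matrix is singular."""
    n = len(matrix)
    aug = [row[:] + [1 if i == j else 0 for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv_p = pow(aug[col][col], P - 2, P)
        aug[col] = [(x * inv_p) % P for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [(a - factor * b) % P for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Two sources, each with one packet of 4 symbols.
sources = [[random.randrange(P) for _ in range(4)] for _ in range(2)]

# Each source prepends a coding header (its unit vector) so receivers learn the mapping.
headered = [[1, 0] + sources[0], [0, 1] + sources[1]]

# An intermediate node emits two independent random linear combinations of its inputs.
outputs = [combine(headered, [random.randrange(P), random.randrange(P)]) for _ in range(2)]

# The receiver reads the headers as the transfer matrix and inverts it to decode.
transfer = [out[:2] for out in outputs]
inv = invert(transfer)
if inv is None:
    print("decoding failed: the random mappings were not full rank")
else:
    decoded = [combine([out[2:] for out in outputs], row) for row in inv]
    print("decoded correctly:", decoded == sources)
```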

608 citations


Proceedings ArticleDOI
04 Nov 2003
TL;DR: A simple tree management algorithm that provides the necessary path diversity is presented, together with an adaptation framework for MDC based on scalable receiver feedback; the evaluation shows very significant benefits from using multiple distribution trees and MDC, with a 22 dB improvement in PSNR in some cases.
Abstract: We consider the problem of distributing "live" streaming media content to a potentially large and highly dynamic population of hosts. Peer-to-peer content distribution is attractive in this setting because the bandwidth available to serve content scales with demand. A key challenge, however, is making content distribution robust to peer transience. Our approach to providing robustness is to introduce redundancy, both in network paths and in data. We use multiple, diverse distribution trees to provide redundancy in network paths and multiple description coding (MDC) to provide redundancy in data. We present a simple tree management algorithm that provides the necessary path diversity and describe an adaptation framework for MDC based on scalable receiver feedback. We evaluate these using MDC applied to real video data coupled with real usage traces from a major news site that experienced a large flash crowd for live streaming content. Our results show very significant benefits in using multiple distribution trees and MDC, with a 22 dB improvement in PSNR in some cases.

484 citations


Journal ArticleDOI
TL;DR: A node-scheduling scheme that reduces overall system energy consumption, and therefore increases system lifetime, by identifying nodes that are redundant with respect to sensing coverage and assigning them an off-duty operation mode with lower energy consumption than the normal on-duty one.
Abstract: In wireless sensor networks that consist of a large number of low-power, short-lived, unreliable sensors, one of the main design challenges is to obtain long system lifetime without sacrificing the system's original performance (sensing coverage and sensing reliability). In this paper, we propose a node-scheduling scheme, which can reduce system overall energy consumption, therefore increasing system lifetime, by identifying nodes that are redundant with respect to sensing coverage and then assigning them an off-duty operation mode that has lower energy consumption than the normal on-duty one. In theory, our scheme completely preserves the original sensing coverage; in practice, the coverage degradation caused by location error, packet loss and node failure is very limited, no more than 1% as shown by our experimental results. In addition, the experimental results illustrate that certain redundancy is still guaranteed after node-scheduling, which we believe can provide enough sensing reliability in many applications. We implement the proposed scheme in NS-2 as an extension of the LEACH protocol and compare its energy consumption with the original LEACH. Simulation results exhibit noticeably longer system lifetime after introducing our scheme than before. Copyright © 2003 John Wiley & Sons, Ltd.
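
For intuition about the redundancy check at the heart of such node scheduling, here is a simple Monte Carlo coverage test: a node is eligible for off-duty if sampled points of its sensing disk are all covered by neighbours' disks. This is not the paper's exact rule; the sensing radius, sample count, and example layouts are made-up values.

```python
# Illustrative redundancy test: a sensor may switch off if its sensing disk is
# (approximately) covered by the union of its neighbours' disks.
import math
import random

SENSING_RADIUS = 10.0

def covers(center, point, radius=SENSING_RADIUS):
    return math.dist(center, point) <= radius

def is_redundant(node, neighbours, samples=200):
    """Approximate check that the node's disk lies inside the union of neighbour disks."""
    for _ in range(samples):
        # Draw a point uniformly inside the node's sensing disk.
        r = SENSING_RADIUS * math.sqrt(random.random())
        theta = random.uniform(0.0, 2.0 * math.pi)
        p = (node[0] + r * math.cos(theta), node[1] + r * math.sin(theta))
        if not any(covers(nb, p) for nb in neighbours):
            return False
    return True

node = (0.0, 0.0)
dense_neighbours = [(4.0, 0.0), (-4.0, 0.0), (0.0, 4.0), (0.0, -4.0)]
sparse_neighbours = [(9.0, 9.0)]
print("dense :", is_redundant(node, dense_neighbours))   # covered, so off-duty eligible
print("sparse:", is_redundant(node, sparse_neighbours))  # not covered, must stay on
```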

291 citations


Proceedings ArticleDOI
06 Jan 2003
TL;DR: This paper analyzes some deficiencies of the dominant pruning algorithm and proposes two better approximation algorithms, total dominant pruning and partial dominant pruning, which utilize 2-hop neighbourhood information more effectively to reduce redundant transmissions.
Abstract: Unlike in a wired network, a packet transmitted by a node in an ad hoc wireless network can reach all neighbours. Therefore, the total number of transmissions (forward nodes) is generally used as the cost criterion for broadcasting. The problem of finding the minimum number of forward nodes is NP-complete. Among various approximation approaches, dominant pruning by H. Lim and C. Kim (2001) utilizes 2-hop neighbourhood information to reduce redundant transmissions. In this paper, we analyze some deficiencies of the dominant pruning algorithm and propose two better approximation algorithms: total dominant pruning and partial dominant pruning. Both algorithms utilize 2-hop neighbourhood information more effectively to reduce redundant transmissions. Simulation results of applying these two algorithms show performance improvements compared with the original dominant pruning. In addition, two termination criteria are discussed and compared through simulation.
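
The following sketch conveys the general forward-node selection idea behind dominant pruning: greedily pick 1-hop neighbours of the sender until all 2-hop neighbours are covered. It is plain greedy set cover on a made-up adjacency list, not the exact total/partial dominant pruning rules analyzed in the paper.

```python
# Greedy forward-node selection in the spirit of dominant pruning.
adjacency = {
    "s": {"a", "b", "c"},
    "a": {"s", "b", "d", "e"},
    "b": {"s", "a", "c", "e"},
    "c": {"s", "b", "f"},
    "d": {"a"},
    "e": {"a", "b"},
    "f": {"c"},
}

def forward_nodes(sender):
    one_hop = adjacency[sender]
    # Nodes exactly two hops away that still need to receive the broadcast.
    two_hop = set().union(*(adjacency[n] for n in one_hop)) - one_hop - {sender}
    chosen, uncovered = [], set(two_hop)
    while uncovered:
        # Pick the neighbour whose own neighbourhood covers the most remaining nodes.
        best = max(one_hop, key=lambda n: len(adjacency[n] & uncovered))
        gain = adjacency[best] & uncovered
        if not gain:
            break  # remaining nodes unreachable through 1-hop neighbours
        chosen.append(best)
        uncovered -= gain
    return chosen

print(forward_nodes("s"))  # ['a', 'c'] covers the 2-hop nodes d, e, f
```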

284 citations


Journal ArticleDOI
Mark Yim, Kimon Roufas, David G. Duff, Ying Zhang, Craig Eldershaw, Sam Homans
TL;DR: PolyBot has significant potential in the space manipulation and surface mobility class of applications for space and can self-repair and adapt to changing or unanticipated conditions.
Abstract: Robots used for tasks in space have strict requirements. Modular reconfigurable robots have a variety of attributes that are well suited to these conditions, including: serving as many different tools at once (saving weight), packing into compressed forms (saving space) and having high levels of redundancy (increasing robustness). In addition, self-reconfigurable systems can self-repair and adapt to changing or unanticipated conditions. This paper will describe such a self-reconfigurable modular robot: PolyBot. PolyBot has significant potential in the space manipulation and surface mobility class of applications for space.

276 citations


Journal ArticleDOI
TL;DR: In this article, a Tabu search meta-heuristic has been developed and successfully demonstrated to provide solutions to the system reliability optimization problem of redundancy allocation, which generally involves the selection of components and redundancy levels to maximize system reliability given various system-level constraints.
Abstract: A tabu search meta-heuristic has been developed and successfully demonstrated to provide solutions to the system reliability optimization problem of redundancy allocation. Tabu search is particularly well-suited to this problem and it offers distinct advantages compared to alternative optimization methods. While there are many forms of the problem, the redundancy allocation problem generally involves the selection of components and redundancy levels to maximize system reliability given various system-level constraints. This is a common and extensively studied problem involving system design, reliability engineering and operations research. It is becoming increasingly important to develop efficient solutions to this reliability optimization problem because many telecommunications (and other) systems are becoming more complex, yet with short development schedules and very stringent reliability requirements. Tabu search can be applied to a more diverse problem domain compared to mathematical programming methods.
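
A minimal sketch of how tabu search can be applied to redundancy allocation in a series system is shown below. The component reliabilities, costs, budget, tabu tenure, and the +/-1 neighbourhood are all illustrative assumptions, not the paper's formulation.

```python
# Tabu search sketch for a series-system redundancy allocation problem:
# choose the redundancy level n_i per subsystem to maximise reliability under a cost budget.
rel  = [0.90, 0.85, 0.95]   # component reliability per subsystem (made up)
cost = [4.0, 3.0, 5.0]      # component cost per subsystem (made up)
BUDGET = 40.0

def system_reliability(n):
    r = 1.0
    for ri, ni in zip(rel, n):
        r *= 1.0 - (1.0 - ri) ** ni
    return r

def total_cost(n):
    return sum(ci * ni for ci, ni in zip(cost, n))

def tabu_search(iters=200, tenure=5):
    current, best = [1, 1, 1], [1, 1, 1]
    tabu = {}  # move -> iteration until which it stays forbidden
    for it in range(iters):
        neighbours = []
        for i in range(len(current)):
            for delta in (-1, +1):
                cand = current[:]
                cand[i] += delta
                if cand[i] < 1 or total_cost(cand) > BUDGET:
                    continue
                # Tabu moves are skipped unless they beat the best found (aspiration).
                if tabu.get((i, delta), -1) >= it and \
                   system_reliability(cand) <= system_reliability(best):
                    continue
                neighbours.append((system_reliability(cand), cand, (i, delta)))
        if not neighbours:
            break
        _, current, move = max(neighbours, key=lambda t: t[0])
        tabu[(move[0], -move[1])] = it + tenure  # forbid undoing the move for a while
        if system_reliability(current) > system_reliability(best):
            best = current[:]
    return best, system_reliability(best), total_cost(best)

print(tabu_search())
```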

257 citations


Journal ArticleDOI
TL;DR: In this paper, a model of a parallel multi-inverter system with instantaneous average current sharing is presented, where a disturbance source is introduced to represent all the sources that may cause current unbalances.
Abstract: Parallel multi-inverter systems can be designed to have the advantages of expandable output power, improved reliability, and easy N+X redundancy operation. However, a current-sharing control scheme has to be employed to enable the inverters to share the load current equally. A multi-inverter system with instantaneous average-current-sharing scheme is presented in this paper. By introducing a disturbance source to represent all the sources that may cause current unbalances, a model of the system can be built. Some key issues are discussed based on the model, including stability of the current-sharing controller, impedance characteristics and voltage regulation. Three experimental 110 VAC/1.1 kVA inverters are built and paralleled to verify the theoretical predictions.

257 citations


Proceedings Article
18 May 2003
TL;DR: This work applies a simple resource usage model to measured behavior from the Gnutella file-sharing network to argue that large-scale cooperative storage is limited by likely dynamics and cross-system bandwidth -- not by local disk space.
Abstract: Peer-to-peer storage aims to build large-scale, reliable and available storage from many small-scale unreliable, low-availability distributed hosts. Data redundancy is the key to any data guarantees. However, preserving redundancy in the face of highly dynamic membership is costly. We apply a simple resource usage model to measured behavior from the Gnutella file-sharing network to argue that large-scale cooperative storage is limited by likely dynamics and cross-system bandwidth -- not by local disk space. We examine some bandwidth optimization strategies like delayed response to failures, admission control, and load-shifting and find that they do not alter the basic problem. We conclude that when redundancy, data scale, and dynamics are all high, the needed cross-system bandwidth is unreasonable.
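
A back-of-the-envelope version of this kind of resource usage argument: the bandwidth each node must spend just to preserve redundancy grows with the redundancy factor and the data volume, and shrinks with mean membership lifetime. The formula and the numbers below are illustrative assumptions, not the paper's measured Gnutella values.

```python
# Steady-state maintenance bandwidth per node, roughly:
#   bandwidth ~ redundancy_factor * unique_data_per_node / mean_node_lifetime
# because copies held by departing nodes must continually be re-created elsewhere.

def maintenance_bandwidth(unique_data_gb, redundancy_factor, mean_lifetime_days):
    stored_gb = unique_data_gb * redundancy_factor
    return stored_gb * 1e9 / (mean_lifetime_days * 86400)  # bytes per second

# 10 GB of unique data per node, 4x redundancy, nodes stay ~3 days on average (made up).
bw = maintenance_bandwidth(unique_data_gb=10, redundancy_factor=4, mean_lifetime_days=3)
print(f"~{bw / 1e3:.0f} kB/s of continuous upload just to preserve redundancy")
```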

250 citations


Proceedings Article
09 Jun 2003
TL;DR: A novel peer-to-peer backup technique that allows computers connected to the Internet to back up their data cooperatively, which appears to be one to two orders of magnitude cheaper than existing Internet backup services.
Abstract: We present a novel peer-to-peer backup technique that allows computers connected to the Internet to back up their data cooperatively: Each computer has a set of partner computers, which collectively hold its backup data. In return, it holds a part of each partner's backup data. By adding redundancy and distributing the backup data across many partners, a highly-reliable backup can be obtained in spite of the low reliability of the average Internet machine. Because our scheme requires cooperation, it is potentially vulnerable to several novel attacks involving free riding (e.g., holding a partner's data is costly, which tempts cheating) or disruption. We defend against these attacks using a number of new methods, including the use of periodic random challenges to ensure partners continue to hold data and the use of disk-space wasting to make cheating unprofitable. Results from an initial prototype show that our technique is feasible and very inexpensive: it appears to be one to two orders of magnitude cheaper than existing Internet backup services.
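
The periodic-random-challenge idea can be sketched as follows: the data owner sends a fresh nonce, and the partner must return a hash over the nonce and the stored block. The hash choice and block layout are assumptions for illustration, and the sketch assumes the owner can recompute the expected answer.

```python
# Sketch of a periodic random challenge: proving a partner still holds a backup block.
import hashlib
import os

def make_challenge():
    return os.urandom(16)  # fresh nonce, so stale answers cannot be replayed

def prove_possession(block: bytes, nonce: bytes) -> str:
    return hashlib.sha256(nonce + block).hexdigest()

def verify(expected_block: bytes, nonce: bytes, response: str) -> bool:
    return prove_possession(expected_block, nonce) == response

block = b"backup chunk #42: " + bytes(1024)
nonce = make_challenge()

honest_partner_answer = prove_possession(block, nonce)
cheating_partner_answer = prove_possession(b"", nonce)  # partner silently dropped the data

print(verify(block, nonce, honest_partner_answer))    # True
print(verify(block, nonce, cheating_partner_answer))  # False
```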

233 citations


Journal ArticleDOI
TL;DR: The dual neural network is proposed for online redundancy resolution of kinematically redundant manipulators and is shown to be globally (exponentially) convergent to optimal solutions.
Abstract: In this paper, a recurrent neural network called the dual neural network is proposed for online redundancy resolution of kinematically redundant manipulators. Physical constraints such as joint limits and joint velocity limits, together with the drift-free criterion as a secondary task, are incorporated into the problem formulation of redundancy resolution. Compared to other recurrent neural networks, the dual neural network is piecewise linear and has much simpler architecture with only one layer of neurons. The dual neural network is shown to be globally (exponentially) convergent to optimal solutions. The dual neural network is simulated to control the PA10 robot manipulator with effectiveness demonstrated.
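
For context, redundancy resolution is commonly posed as choosing joint velocities that realize a desired end-effector velocity while spending the spare degrees of freedom on a secondary criterion. The sketch below is not the paper's dual neural network; it is a plain pseudoinverse-plus-nullspace baseline on a made-up planar 3-link arm, with joint-velocity limits enforced by crude clipping rather than by the network's constraint handling.

```python
# Pseudoinverse + nullspace redundancy resolution baseline (illustrative only).
import numpy as np

LINKS = np.array([1.0, 0.8, 0.6])   # link lengths of a planar 3R arm (made up)
VEL_LIMIT = 1.0                      # per-joint velocity bound (rad/s)

def jacobian(q):
    """2x3 position Jacobian of the planar 3R arm."""
    c = np.cumsum(q)
    J = np.zeros((2, 3))
    for i in range(3):
        J[0, i] = -np.sum(LINKS[i:] * np.sin(c[i:]))
        J[1, i] =  np.sum(LINKS[i:] * np.cos(c[i:]))
    return J

def resolve(q, q_home, x_dot, k=0.5):
    J = jacobian(q)
    J_pinv = np.linalg.pinv(J)
    # Primary task: end-effector velocity. Secondary task: drift toward the home
    # posture inside the Jacobian nullspace (a drift-free style criterion).
    q_dot = J_pinv @ x_dot + (np.eye(3) - J_pinv @ J) @ (k * (q_home - q))
    return np.clip(q_dot, -VEL_LIMIT, VEL_LIMIT)

q_home = np.array([0.3, 0.4, 0.2])
q = q_home.copy()
for _ in range(100):   # integrate while tracking a constant end-effector velocity
    q = q + 0.01 * resolve(q, q_home, x_dot=np.array([0.05, 0.02]))
print("final joint angles:", np.round(q, 3))
```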

233 citations


Journal ArticleDOI
11 Aug 2003
TL;DR: This paper develops an energy-efficient, fault-tolerant approach for collaborative signal and information processing (CSIP) among multiple sensor nodes using a mobile-agent-based computing model and takes collaborative target classification as an application example to show the effectiveness of the proposed approach.
Abstract: In this paper, we develop an energy-efficient, fault-tolerant approach for collaborative signal and information processing (CSIP) among multiple sensor nodes using a mobile-agent-based computing model. In this model, instead of each sensor node sending local information to a processing center for integration, as is typical in client/server-based computing, the integration code is moved to the sensor nodes through mobile agents. The energy efficiency objective and the fault tolerance objective always conflict with each other and present a unique challenge to the design of CSIP algorithms. In general, energy-efficient approaches try to limit the redundancy in the algorithm so that the minimum amount of energy is required for fulfilling a certain task. On the other hand, redundancy is needed for providing fault tolerance since sensors might be faulty, malfunctioning, or even malicious. A balance has to be struck between these two objectives. We discuss the potential of mobile-agent-based collaborative processing in providing progressive accuracy while maintaining a certain degree of fault tolerance. We evaluate its performance compared to the client/server-based collaboration from the perspectives of energy consumption and execution time through both simulation and analytical study. Finally, we take collaborative target classification as an application example to show the effectiveness of the proposed approach.

Journal ArticleDOI
TL;DR: The methodology presented here is specifically developed to accommodate the case where there is a choice of redundancy strategy and can result in more reliable and cost-effective engineering designs.
Abstract: Optimal solutions to the redundancy allocation problem are determined when either active or cold-standby redundancy can be selectively chosen for individual subsystems. This problem involves the selection of components and redundancy levels to maximize system reliability. Previously, solutions to the problem could only be found if analysts were restricted to a predetermined redundancy strategy for the complete system. Generally, it had been assumed that active redundancy was to be used. However, in practice both active and cold-standby redundancy may be used within a particular system design and the choice of redundancy strategy becomes an additional decision variable. Available optimization algorithms are inadequate for these design problems and better alternatives are required. The methodology presented here is specifically developed to accommodate the case where there is a choice of redundancy strategy. The problem is formulated with imperfect sensing and switching of cold-standby redundant components ...
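
The textbook contrast between the two strategies (not the paper's formulation, which also models imperfect sensing and switching) can be computed directly for n identical exponential components, as below; the failure rate and mission time are made-up values.

```python
# Active parallel vs cold-standby redundancy for n identical exponential components.
import math

def active_parallel(n, lam, t):
    """All n units powered; the subsystem works while at least one survives."""
    return 1.0 - (1.0 - math.exp(-lam * t)) ** n

def cold_standby_perfect_switch(n, lam, t):
    """Spares unpowered until switched in; assumes perfect sensing and switching."""
    return math.exp(-lam * t) * sum((lam * t) ** i / math.factorial(i) for i in range(n))

lam, t = 0.002, 1000.0   # failure rate per hour and mission length (illustrative)
for n in (1, 2, 3):
    print(n, round(active_parallel(n, lam, t), 4),
             round(cold_standby_perfect_switch(n, lam, t), 4))
# Cold standby dominates active redundancy here, which is why treating the choice of
# strategy as a decision variable matters.
```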

Proceedings ArticleDOI
20 Mar 2003
TL;DR: In this article, the authors proposed a new multipath routing algorithm that enables a trade-off between the amount of traffic and the reliability of WSNs: the data packet is split into k subpackets (k = number of disjoint paths from source to destination) and only E_k subpackets (E_k < k) are necessary to rebuild the original data packet.
Abstract: In wireless sensor networks (WSN) data produced by one or more sources usually has to be routed through several intermediate nodes to reach the destination. Problems arise when intermediate nodes fail to forward the incoming messages. The reliability of the system can be increased by providing several paths from source to destination and sending the same packet through each of them (the algorithm is known as multipath routing). Using this technique, the traffic increases significantly. In this paper, we analyze a new mechanism that enables the trade-off between the amount of traffic and the reliability. The data packet is split into k subpackets (k = number of disjoint paths from source to destination). If only E_k subpackets (E_k < k) are necessary to rebuild the original data packet (a condition obtained by adding redundancy to each subpacket), then the trade-off between traffic and reliability can be controlled.
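
A toy version of the split-and-add-redundancy mechanism: encode E_k data symbols so that any E_k of the k shares sent over disjoint paths reconstruct the packet, here via polynomial interpolation over a small prime field (a Reed-Solomon-style code). The field size and packet sizes are assumptions, not the paper's encoding.

```python
# Any E_k of k shares rebuild the packet: systematic Reed-Solomon-style toy code.
P = 257  # prime field, large enough for one byte per symbol

def lagrange_eval(points, x):
    """Evaluate, at x, the unique polynomial (mod P) through the given (xi, yi) points."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def encode(data, k):
    """data: E_k byte values. Returns k shares (x, y); shares 0..E_k-1 are the data itself."""
    e_k = len(data)
    data_points = list(enumerate(data))
    return [(x, data[x] if x < e_k else lagrange_eval(data_points, x)) for x in range(k)]

def decode(shares, e_k):
    """Rebuild the original data symbols from any e_k shares."""
    pts = shares[:e_k]
    return [lagrange_eval(pts, x) for x in range(e_k)]

data = [104, 101, 108, 108, 111]                 # "hello" as bytes, E_k = 5
shares = encode(data, k=8)                       # one share per disjoint path
survivors = [shares[1], shares[3], shares[4], shares[6], shares[7]]  # 3 paths failed
print(decode(survivors, e_k=5) == data)          # True: any 5 of 8 shares suffice
```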

Journal ArticleDOI
TL;DR: Three redundancy analysis algorithms that can be implemented on-chip are presented; two are based on the local-bitmap idea: the local repair-most approach is efficient for a general spare architecture, and the local optimization approach has the best repair rate.
Abstract: With the advance of VLSI technology, the capacity and density of memories are rapidly growing. Yield improvement and testing issues have become the most critical challenges for memory manufacturing. Conventionally, redundancy is applied so that faulty cells can be repaired. Redundancy analysis using external memory testers is becoming inefficient as the chip density continues to grow, especially for system chips with large embedded memories. This paper presents three redundancy analysis algorithms which can be implemented on-chip. Among them, two are based on the local-bitmap idea: the local repair-most approach is efficient for a general spare architecture, and the local optimization approach has the best repair rate. The essential spare pivoting technique is proposed to reduce the control complexity. Furthermore, a simulator has been developed for evaluating the repair efficiency of different algorithms. It is also used for determining certain important parameters in redundancy design. The redundancy analysis circuit can easily be integrated with the built-in self-test circuit.
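
For readers unfamiliar with repair-most style redundancy analysis, the sketch below greedily spends spare rows and columns on the lines containing the most remaining faults. It mirrors only the general repair-most idea; the paper's on-chip local-bitmap algorithms and the essential spare pivoting technique differ in detail.

```python
# Greedy "repair-most" redundancy analysis sketch for a memory fault bitmap.
def repair_most(faults, spare_rows, spare_cols):
    """faults: set of (row, col) faulty cells. Returns (repaired?, used rows, used cols)."""
    faults = set(faults)
    used_rows, used_cols = [], []
    while faults and (spare_rows or spare_cols):
        row_counts, col_counts = {}, {}
        for r, c in faults:
            row_counts[r] = row_counts.get(r, 0) + 1
            col_counts[c] = col_counts.get(c, 0) + 1
        best_row = max(row_counts, key=row_counts.get) if spare_rows else None
        best_col = max(col_counts, key=col_counts.get) if spare_cols else None
        # Spend whichever spare removes more faults in this step.
        if best_col is None or (best_row is not None and
                                row_counts[best_row] >= col_counts[best_col]):
            faults = {f for f in faults if f[0] != best_row}
            used_rows.append(best_row)
            spare_rows -= 1
        else:
            faults = {f for f in faults if f[1] != best_col}
            used_cols.append(best_col)
            spare_cols -= 1
    return not faults, used_rows, used_cols

# Example bitmap: a fault cluster on row 2, a cluster on column 5, one isolated fault.
faulty_cells = [(2, 1), (2, 3), (2, 7), (0, 5), (4, 5), (6, 6)]
print(repair_most(faulty_cells, spare_rows=1, spare_cols=2))  # (True, [2], [5, 6])
```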

Journal ArticleDOI
TL;DR: ECAY yields a less costly reliability allocation within a reasonable computing time on large systems, and optimizes the weight and space-obstruction in system design through an optimal redundancy allocation.
Abstract: This paper considers the allocation of reliability and redundancy to parallel-series systems, while minimizing the cost of the system. It is proven that under usual conditions satisfied by cost functions, a necessary condition for optimal reliability allocation of parallel-series systems is that the reliabilities of the redundant components of a given subsystem are identical. An optimal algorithm is proposed to solve this optimization problem. This paper proves that the components in each stage of a parallel-series system must have identical reliability, under some nonrestrictive condition on the component's reliability cost functions. This demonstration provides a firm grounding for what many authors have hitherto taken as a working hypothesis. Using this result, an algorithm, ECAY, is proposed for the design of systems with parallel-series architecture, which allows the allocation of both reliability and redundancy to each subsystem for a target reliability for minimizing the system cost. ECAY has the added advantage of allowing the optimal reliability allocation in a very short time. A benchmark is used to compare the ECAY performance to LM-based algorithms. For a given reliability target, ECAY produced the lowest reliability costs and the optimum redundancy levels in the successive reliability allocation for all cases studied, viz, systems of 4, 5, 6, 7, 8, 9 stages or subsystems. Thus ECAY, as compared with LM-based algorithms, yields a less costly reliability allocation within a reasonable computing time on large systems, and optimizes the weight and space-obstruction in system design through an optimal redundancy allocation.

Proceedings ArticleDOI
19 Sep 2003
TL;DR: This paper presents an interesting observation concerning the minimum and maximum number of neighbours that are required to provide complete redundancy and introduces simple methods to estimate the degree of redundancy without the knowledge of location or directional information.
Abstract: Wireless sensor networks consist of a large number of tiny sensors that have only limited energy supply. One of the major challenges in constructing such networks is to maintain long network lifetime as well as sufficient sensing area. To achieve this goal, a broadly-used method is to turn off redundant sensors. In this paper, the problem of estimating redundant sensing areas among neighbouring wireless sensors is analysed. We present an interesting observation concerning the minimum and maximum number of neighbours that are required to provide complete redundancy and introduce simple methods to estimate the degree of redundancy without the knowledge of location or directional information. We also provide tight upper and lower bounds on the probability of complete redundancy and on the average partial redundancy. With random sensor deployment, our analysis shows that partial redundancy is more realistic for real applications, as complete redundancy is expensive, requiring up to 11 neighbouring sensors to provide a 90 percent chance of complete redundancy. Our results can be utilised in designing effective sensor scheduling algorithms that reduce energy consumption while maintaining a reasonable sensing area.

Proceedings ArticleDOI
13 Oct 2003
TL;DR: A new yield metric called performance averaged yield (Y_PAV) is introduced which accounts both for fully functional chips and those that exhibit some performance degradation; microarchitectural redundancy is shown to increase the Y_PAV of a uniprocessor with only redundant rows in its caches from a base value of 85% to 98%.
Abstract: The continued increase in microprocessor clock frequency that has come from advancements in fabrication technology and reductions in feature size creates challenges in maintaining both manufacturing yield rates and long-term reliability of devices. Methods based on defect detection and reduction may not offer a scalable solution due to the cost of eliminating contaminants in the manufacturing process and increasing chip complexity. We propose to use the inherent redundancy available in existing and future chip microarchitectures to improve yield and enable graceful performance degradation in fail-in-place systems. We introduce a new yield metric called performance averaged yield (Y_PAV), which accounts both for fully functional chips and those that exhibit some performance degradation. Our results indicate that at 250nm we are able to increase the Y_PAV of a uniprocessor with only redundant rows in its caches from a base value of 85% to 98% using microarchitectural redundancy. Given constant chip area, shrinking feature sizes increase fault susceptibility and reduce the base Y_PAV to 60% at 50nm; exploiting microarchitectural redundancy raises it to 99.6%.
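
One plausible reading of a performance-averaged yield metric (the paper's exact definition may differ) is the average relative performance over all manufactured dies, so partially degraded but usable chips count fractionally. The sample lot below is made up.

```python
# Performance-averaged yield: a die running at 80% of nominal speed counts as 0.8,
# rather than being lumped in with either the fully good or the dead chips.
def performance_averaged_yield(relative_performance):
    """relative_performance: one entry per die; 1.0 fully functional, 0.0 unusable."""
    return sum(relative_performance) / len(relative_performance)

lot = [1.0] * 80 + [0.9] * 10 + [0.7] * 5 + [0.0] * 5   # 100 dies (illustrative)
print(f"Y_PAV = {performance_averaged_yield(lot):.1%}")  # 92.5%
```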

Journal ArticleDOI
TL;DR: This paper proposes a new method based on the multichannel spatial correlation matrix for time delay estimation that can take advantage of the redundancy when more than two microphones are available and can help the estimator to better cope with noise and reverberation.
Abstract: To find the position of an acoustic source in a room, typically, a set of relative delays among different microphone pairs needs to be determined. The generalized cross-correlation (GCC) method is the most popular to do so and is well explained in a landmark paper by Knapp and Carter. In this paper, the idea of cross-correlation coefficient between two random signals is generalized to the multichannel case by using the notion of spatial prediction. The multichannel spatial correlation matrix is then deduced and its properties are discussed. We then propose a new method based on the multichannel spatial correlation matrix for time delay estimation. It is shown that this new approach can take advantage of the redundancy when more than two microphones are available and this redundancy can help the estimator to better cope with noise and reverberation.
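
In the spirit of the multichannel approach, the sketch below aligns the microphone signals for each candidate delay, forms the normalised spatial correlation matrix, and picks the delay that makes the channels most correlated (smallest determinant). The equispaced far-field geometry, white source signal, and noise level are illustrative assumptions, not the paper's experimental setup.

```python
# Time delay estimation using a multichannel spatial correlation matrix (illustrative).
import numpy as np

rng = np.random.default_rng(0)

TRUE_DELAY = 3    # samples of delay between adjacent microphones
N_MICS = 4
N = 4000

source = rng.standard_normal(N + N_MICS * TRUE_DELAY + 10)

def mic_signal(m):
    """Far-field model: mic m receives the source delayed by m * TRUE_DELAY samples, plus noise."""
    start = m * TRUE_DELAY
    return source[start:start + N] + 0.3 * rng.standard_normal(N)

mics = [mic_signal(m) for m in range(N_MICS)]

def det_cost(candidate):
    """Determinant of the normalised spatial correlation matrix after aligning with `candidate`."""
    usable = N - (N_MICS - 1) * candidate
    aligned = []
    for m, sig in enumerate(mics):
        shift = (N_MICS - 1 - m) * candidate   # undo the assumed propagation delay
        aligned.append(sig[shift:shift + usable])
    R = np.corrcoef(np.vstack(aligned))        # normalised spatial correlation matrix
    return np.linalg.det(R)                    # approaches 0 when channels are highly correlated

estimate = min(range(0, 8), key=det_cost)
print("estimated inter-microphone delay:", estimate, "samples (true:", TRUE_DELAY, ")")
```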

Proceedings ArticleDOI
02 Jun 2003
TL;DR: This paper discusses high level techniques for designing fault tolerant systems in SRAM-based FPGAs, without modification in the FPGA architecture, and presents some fault coverage results and a comparison with the TMR approach.
Abstract: This paper discusses high level techniques for designing fault tolerant systems in SRAM-based FPGAs, without modification in the FPGA architecture. Triple Modular Redundancy (TMR) has been successfully applied in FPGAs to mitigate transient faults, which are likely to occur in space applications. However, TMR comes with high area and power dissipation penalties. The new technique proposed in this paper was specifically developed for FPGAs to cope with transient faults in the user combinational and sequential logic, while also reducing pin count, area and power dissipation. The methodology was validated by fault injection experiments in an emulation board. We present some fault coverage results and a comparison with the TMR approach.
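
For reference, the triple modular redundancy baseline the paper compares against reduces to bitwise 2-of-3 majority voting. The toy model below only illustrates that voting logic; the paper's technique operates on the FPGA design itself.

```python
# Minimal TMR majority voter model: a single upset replica is outvoted.
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority."""
    return (a & b) | (a & c) | (b & c)

correct = 0b1011_0010
upset   = correct ^ 0b0000_1000   # a transient fault flips one bit in one replica

assert majority_vote(correct, correct, upset) == correct
print(bin(majority_vote(correct, correct, upset)))  # the flipped bit is masked
```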

Patent
18 Sep 2003
TL;DR: A nonvolatile memory device capable of reading and writing a large number of memory cells with multiple read/write circuits in parallel has an architecture that reduces redundancy in the multiple read and write circuits to a minimum as discussed by the authors.
Abstract: A non-volatile memory device capable of reading and writing a large number of memory cells with multiple read/write circuits in parallel has an architecture that reduces redundancy in the multiple read/write circuits to a minimum. The multiple read/write circuits are organized into a bank of similar stacks of components. In one aspect, each stack of components has individual components factorizing out their common subcomponents that do not require parallel usage and sharing them as a common component serially. Other aspects include serial bus communication between the different components, compact I/O enabled data latches associated with the multiple read/write circuits, and an architecture that allows reading and programming of a contiguous row of memory cells or a segment thereof. The various aspects combine to achieve high performance, high accuracy and high compactness.

Patent
05 Feb 2003
TL;DR: In this paper, a stacked set of playing cards is used to expose an information bearing portion bearing a first machine-readable indicia and a second machine readable indicia, such that each of the first and second machines from a number of cards are read by at least one reader.
Abstract: A stacked set of playing cards (40) are supported to expose an information bearing portion bearing a first machine-readable indicia (46) and an information bearing portion bearing a second machine-readable indicia (48), such that each of the first and second machine-readable indicia from a number of cards are read by at least one reader. The second machine-readable indicia (48) may provide redundancy, or may be related to the first machine-readable indicia (46) to allow the authentication of the playing card.

Patent
14 Mar 2003
TL;DR: In this paper, a method for tracking errors in a memory system by detecting an error in a bit of a word accessed in the memory and maintaining an error history comprising a record of each of said detected errors is presented.
Abstract: A method for tracking errors in a memory system by detecting an error in a bit of a word accessed in the memory and maintaining an error history comprising a record of each of said detected errors. The error history information may be used to configure the memory, such as to add redundancy; or may be used to adjust operating parameters of the memory, such as the periodicity of refresh and/or scrub operations; or may be used to trigger a sensing operation of other parameters in an application system. In one embodiment, a counter increments each time an error is detected and decrements when no error is detected, thereby tracking error patterns.
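
A sketch of the up/down error-history counter described above: increment on detected errors, decrement on clean accesses, and act when a threshold is crossed. The threshold and the triggered action are illustrative assumptions.

```python
# Up/down error-history counter: persistent error patterns trigger a repair action.
class ErrorHistory:
    def __init__(self, threshold=4):
        self.count = 0
        self.threshold = threshold

    def record_access(self, error_detected: bool) -> bool:
        """Returns True when the error pattern looks persistent enough to act on."""
        if error_detected:
            self.count += 1
        elif self.count > 0:
            self.count -= 1
        return self.count >= self.threshold

history = ErrorHistory()
accesses = [True, True, False, True, True, True]   # a mostly failing word
for err in accesses:
    if history.record_access(err):
        print("persistent errors: map in a redundant row / raise the refresh rate")
        break
```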

Journal ArticleDOI
TL;DR: In this article, the authors developed a game-theoretic model of bureaucratic policy making in which a political principal chooses the number of agents to handle a given task, and agents have policy preferences that may be opposed to the principal's, and furthermore may choose their policy or effort levels.
Abstract: Do redundant bureaucratic arrangements represent wasteful duplication or a hedge against political uncertainty? Previous attempts at addressing this question have treated agency actions as exogenous, thus avoiding strategic issues such as collective action problems or competition. I develop a game-theoretic model of bureaucratic policy making in which a political principal chooses the number of agents to handle a given task. Importantly, agents have policy preferences that may be opposed to the principal's, and furthermore may choose their policy or effort levels. Among the results are that redundancy can help a principal achieve her policy goals when her preferences are not aligned with the agents'. But redundancy is less helpful if even a single agent has preferences relatively close to the principal's. In this environment collective action problems may cause multiple agents to be less effective than a single agent. Redundancy can also be unnecessary to the principal if the agent's jurisdiction can be terminated.

Journal ArticleDOI
TL;DR: A new algorithm for pressure-dependent modelling of water distribution systems (WDS) has been developed; it demonstrates the suitability and meaning of the redundancy measure, and it is recommended that redundancy be evaluated along with reliability when assessing system performance.

Proceedings ArticleDOI
20 Oct 2003
TL;DR: This paper presents PINCO, an in-network compression scheme for energy-constrained, distributed wireless sensor networks, introduces a PINCO scheme for single-valued sensor readings, and discusses how PINCO's parameters affect its performance and how to tune them for different performance requirements.
Abstract: In this paper, we present PINCO, an in-network compression scheme for energy constrained, distributed, wireless sensor networks. PINCO reduces redundancy in the data collected from sensors, thereby decreasing the wireless communication among the sensor nodes and saving energy. Sensor data is buffered in the network and combined through a pipelined compression scheme into groups of data, while satisfying a user-specified end-to-end latency bound. We introduce a PINCO scheme for single-valued sensor readings. In this scheme, each group of data is a highly flexible structure so that compressed data can be recompressed without decompressing, in order to reduce newly available redundancy at a different stage of the network. We discuss how PINCO parameters affect its performance, and how to tweak them for different performance requirements. We also include a performance study demonstrating the advantages of our approach over other data collection schemes based on simulation and prototype deployment results.
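
A toy stand-in for the buffering-and-combining idea: readings gathered within a latency bound are forwarded as one base value plus small deltas instead of full-width samples. The framing, the latency bound, and the integer readings are illustrative assumptions, not PINCO's actual group format.

```python
# Buffer readings up to a latency bound, then forward them as (base, deltas) groups.
def compress_group(readings):
    """Deltas are usually small for slowly varying sensor data."""
    base = readings[0]
    return base, [r - base for r in readings[1:]]

def decompress_group(group):
    base, deltas = group
    return [base] + [base + d for d in deltas]

LATENCY_BOUND = 4            # forward a group after at most 4 buffered readings
buffer, outgoing = [], []
for reading in [201, 202, 202, 204, 210, 211, 211]:   # e.g. temperature in tenths of a degree
    buffer.append(reading)
    if len(buffer) == LATENCY_BOUND:
        outgoing.append(compress_group(buffer))
        buffer = []
if buffer:                    # flush the partial group at the end of the stream
    outgoing.append(compress_group(buffer))

print(outgoing)
restored = [x for g in outgoing for x in decompress_group(g)]
print(restored)
```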

Journal ArticleDOI
TL;DR: This work introduces the cluster-based failure recovery concept which determines the best placement of slack within the FT schedule so as to minimize the resulting time overhead and provides transparent failure recovery in that a processor recovering from task failures does not disrupt the operation of other processors.
Abstract: The time-triggered model, with tasks scheduled in static (off line) fashion, provides a high degree of timing predictability in safety-critical distributed systems. Such systems must also tolerate transient and intermittent failures which occur far more frequently than permanent ones. Software-based recovery methods using temporal redundancy, such as task reexecution and primary/backup, while incurring performance overhead, are cost-effective methods of handling these failures. We present a constructive approach to integrating runtime recovery policies in a time-triggered distributed system. Furthermore, the method provides transparent failure recovery in that a processor recovering from task failures does not disrupt the operation of other processors. Given a general task graph with precedence and timing constraints and a specific fault model, the proposed method constructs the corresponding fault-tolerant (FT) schedule with sufficient slack to accommodate recovery. We introduce the cluster-based failure recovery concept which determines the best placement of slack within the FT schedule so as to minimize the resulting time overhead. Contingency schedules, also generated offline, revise this FT schedule to mask task failures on individual processors while preserving precedence and timing constraints. We present simulation results which show that, for small-scale embedded systems having task graphs of moderate complexity, the proposed approach generates FT schedules which incur about 30-40 percent performance overhead when compared to corresponding non-fault-tolerant ones.

Proceedings ArticleDOI
01 Dec 2003
TL;DR: An intrusion tolerant architecture for distributed services, especially COTS servers, is presented; it makes use of fault tolerant computing techniques, specifically redundancy, diversity, acceptance testing, and voting, as well as adaptive reconfiguration.
Abstract: This paper presents an intrusion tolerant architecture for distributed services, especially COTS servers. An intrusion tolerant system assumes that attacks will happen, and some will be successful. However, a wide range of mission critical applications need to provide continuous service despite active attacks or partial compromise. The proposed architecture emphasizes continuity of operation. It strives to mitigate the effects of both known and unknown attacks. We make use of techniques from fault tolerant computing, specifically redundancy, diversity, acceptance testing, and voting, as well as adaptive reconfiguration. Our architecture consists of five functional components that work together to extend the fault tolerance capability of COTS servers. In addition, the architecture provides mechanisms to audit the COTS servers and internal components for signs of compromise. The auditing and adaptive reconfiguration components evaluate environmental threats, identify potential sources of compromise, and adaptively generate new configurations for the system.

Journal ArticleDOI
TL;DR: This new scheme allows accuracy to be achieved through the use of redundancy and reassignment, effectively decoupling analog performance from component matching, and is compared with more traditional approaches.
Abstract: As feature size and supply voltage shrink, digital calibration incorporating redundancy of flash analog-to-digital converters is becoming attractive. This new scheme allows accuracy to be achieved through the use of redundancy and reassignment, effectively decoupling analog performance from component matching. Very large comparator offsets (several LSBs) are tolerated, allowing the comparators to be small, fast and power efficient. In this paper, we analyze this scheme and compare with it with more traditional approaches.

Patent
06 Jun 2003
TL;DR: In this article, the authors propose a shadow redundancy appliance that shadows the primary redundancy appliance and assumes its role in the event of a fault at the primary appliance, thereby increasing the fault tolerance of the data redundancy operations.
Abstract: Techniques for performing data redundancy operations in a fault-tolerant manner. In one aspect, a primary data storage facility stores a primary copy of data and a secondary facility stores data that is redundant of the primary copy of the data. The primary facility includes a first redundancy appliance that receives a sequence of write requests and stores data for the sequence of write requests in storage associated with the primary storage facility. A second redundancy appliance shadows the first redundancy appliance and assumes the role of the first redundancy appliance in the event of a fault at the first redundancy appliance. In this way, fault tolerance is increased by the presence of the second, shadow appliance.

Journal ArticleDOI
TL;DR: This paper develops a relationship between system cost and hardware redundancy levels, assuming cycle-free distributed computing systems, and proposes a hybrid heuristic which combines genetic algorithms and the steepest descent method to seek the optimal task allocation and hardware redundancy policies such that system cost is minimized.