
Showing papers in "IEEE Transactions on Computers in 2007"


Journal ArticleDOI
TL;DR: This paper proposes a general model which, under mild assumptions, will generate provably random bits with some tolerance to adversarial manipulation while running in the megabit-per-second range, and develops fault-attack models and employs the properties of resilient functions to withstand such attacks.
Abstract: This paper is a contribution to the theory of true random number generators based on sampling phase jitter in oscillator rings. After discussing several misconceptions and apparently insurmountable obstacles, we propose a general model which, under mild assumptions, will generate provably random bits with some tolerance to adversarial manipulation while running in the megabit-per-second range. A key idea throughout the paper is the fill rate, which measures the fraction of the time domain in which the analog output signal is arguably random. Our study shows that an exponential increase in the number of oscillators is required to obtain a constant factor improvement in the fill rate. Yet, we overcome this problem by introducing a postprocessing step which consists of an application of an appropriate resilient function. Such functions allow the designer to extract random samples from a signal with only a moderate fill rate and, therefore, with many fewer oscillators than in other designs. Last, we develop fault-attack models and we employ the properties of resilient functions to withstand such attacks. All of our analysis is based on rigorous methods, enabling us to develop a framework in which we accurately quantify the performance and the degree of resilience of the design.

567 citations
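Note: one standard way to instantiate the resilient-function postprocessing (in line with the linear-code constructions typically used for this purpose, though not necessarily the exact code chosen in the paper) is through a binary linear code: an [n, k, d] code with generator matrix G yields an (n, k, d-1)-resilient function f(x) = G·x over GF(2). A minimal sketch with an illustrative [8, 4, 4] generator matrix:

```python
import numpy as np

# Generator matrix of the [8, 4, 4] extended Hamming code (one standard form).
# A linear [n, k, d] code gives an (n, k, d-1)-resilient function f(x) = G.x (mod 2).
G = np.array([[1, 0, 0, 0, 0, 1, 1, 1],
              [0, 1, 0, 0, 1, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 0, 1],
              [0, 0, 0, 1, 1, 1, 1, 0]], dtype=np.uint8)

def resilient_postprocess(raw_bits):
    """Compress 8 raw sampled bits into 4 output bits via the resilient function."""
    x = np.asarray(raw_bits, dtype=np.uint8)
    return (G @ x) % 2          # matrix-vector product over GF(2)

# Example: 8 bits sampled from oscillator rings (placeholder values).
print(resilient_postprocess([1, 0, 1, 1, 0, 0, 1, 0]))
```

With this toy (8, 4, 3)-resilient function, even if an adversary fixes up to three of the eight sampled bits, the four output bits remain uniformly distributed as long as the remaining bits are random.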


Journal ArticleDOI
TL;DR: A polynomial time 7-approximation algorithm for the first problem and a polynomial time (5+ε)-approximation algorithm for the second problem, where ε > 0 can be any given constant.
Abstract: A wireless sensor network consists of many low-cost, low-power sensor nodes, which can perform sensing, simple computation, and transmission of sensed information. Long distance transmission by sensor nodes is not energy efficient since energy consumption is a superlinear function of the transmission distance. One approach to prolonging network lifetime while preserving network connectivity is to deploy a small number of costly, but more powerful, relay nodes whose main task is communication with other sensor or relay nodes. In this paper, we assume that sensor nodes have communication range r > 0, while relay nodes have communication range R ≥ r, and we study two versions of relay node placement problems. In the first version, we want to deploy the minimum number of relay nodes so that, between each pair of sensor nodes, there is a connecting path consisting of relay and/or sensor nodes. In the second version, we want to deploy the minimum number of relay nodes so that, between each pair of sensor nodes, there is a connecting path consisting solely of relay nodes. We present a polynomial time 7-approximation algorithm for the first problem and a polynomial time (5+ε)-approximation algorithm for the second problem, where ε > 0 can be any given constant.

383 citations
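Note: a classic building block in relay placement, used here only as an illustration (the paper's 7- and (5+ε)-approximation algorithms are more involved), is to steinerize a minimum spanning tree of the sensor positions, inserting ⌈d/r⌉ - 1 relays on each tree edge of length d:

```python
import math

def steinerized_mst_relays(sensors, r):
    """Place relays by steinerizing a Euclidean MST of the sensor positions:
    on each tree edge of length d > r, insert ceil(d / r) - 1 evenly spaced
    relays so no hop exceeds r. Illustrative sketch only."""
    n = len(sensors)
    in_tree = [False] * n
    dist = [math.inf] * n
    parent = [-1] * n
    dist[0] = 0.0
    edges = []
    for _ in range(n):                      # Prim's algorithm, O(n^2)
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: dist[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        for v in range(n):
            d = math.dist(sensors[u], sensors[v])
            if not in_tree[v] and d < dist[v]:
                dist[v], parent[v] = d, u
    relays = []
    for u, v in edges:
        d = math.dist(sensors[u], sensors[v])
        k = max(0, math.ceil(d / r) - 1)    # relays needed on this edge
        for i in range(1, k + 1):
            t = i / (k + 1)
            relays.append((sensors[u][0] + t * (sensors[v][0] - sensors[u][0]),
                           sensors[u][1] + t * (sensors[v][1] - sensors[u][1])))
    return relays

print(len(steinerized_mst_relays([(0, 0), (3.5, 0), (0, 2.5)], r=1.0)))
```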




Journal ArticleDOI
TL;DR: This paper presents a rigorous optimization methodology and an algorithm for minimizing the total energy expenditure of the multistage pipeline subject to soft end-to-end response-time constraints; a distributed power management service is designed and evaluated on a real three-tier server prototype.
Abstract: The energy and cooling costs of Web server farms are among their main financial expenditures. This paper explores the benefits of dynamic voltage scaling (DVS) for power management in server farms. Unlike previous work, which addressed DVS on individual servers and on load-balanced server replicas, this paper addresses DVS in multistage service pipelines. Contemporary Web server installations typically adopt a three-tier architecture in which the first tier presents a Web interface, the second executes scripts that implement business logic, and the third serves database accesses. From a user's perspective, only the end-to-end response across the entire pipeline is relevant. This paper presents a rigorous optimization methodology and an algorithm for minimizing the total energy expenditure of the multistage pipeline subject to soft end-to-end response-time constraints. A distributed power management service is designed and evaluated on a real three-tier server prototype for coordinating DVS settings in a way that minimizes global energy consumption while meeting end-to-end delay constraints. The service is shown to consume as much as 30 percent less energy compared to the default (Linux) energy saving policy.

289 citations
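Note: the flavor of the optimization can be seen in a toy numeric version with an assumed per-tier energy model E_i ∝ C_i·f_i^2 and an end-to-end delay constraint Σ_i C_i/f_i ≤ D; this is only a sketch of the formulation, not the paper's algorithm or its distributed implementation:

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative per-tier workloads (Gcycles per request) and a soft end-to-end deadline (s).
cycles = np.array([0.2, 0.5, 0.3])
deadline = 1.0
f_min, f_max = 0.6, 2.0                       # per-tier frequency range (GHz)

def energy(f):                                # assumed convex power model: E_i ~ C_i * f_i^2
    return float(np.sum(cycles * f ** 2))

def slack(f):                                 # end-to-end constraint: sum_i C_i / f_i <= deadline
    return deadline - float(np.sum(cycles / f))

res = minimize(energy, x0=np.full(3, f_max), bounds=[(f_min, f_max)] * 3,
               constraints=[{"type": "ineq", "fun": slack}])
print(np.round(res.x, 3), "GHz per tier, total delay", round(np.sum(cycles / res.x), 3), "s")
```

With the illustrative numbers above, the optimum sets all three tiers to about 1 GHz, which makes the end-to-end delay constraint tight.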


Journal ArticleDOI
TL;DR: The ATRI algorithm aims at maximizing coverage area and minimizing coverage gaps and overlaps by adjusting the deployment layout of nodes close to equilateral triangulation, which is proven to be the optimal layout to provide the maximum no-gap coverage.
Abstract: In this paper, we present a novel sensor deployment algorithm, called the adaptive triangular deployment (ATRI) algorithm, for large-scale unattended mobile sensor networks. The ATRI algorithm aims at maximizing coverage area and minimizing coverage gaps and overlaps by adjusting the deployment layout of nodes close to equilateral triangulation, which is proven to be the optimal layout to provide the maximum no-gap coverage. The algorithm only needs the location information of nearby nodes, thereby avoiding communication cost for exchanging global information. By dividing the transmission range into six sectors, each node adjusts the relative distance to its one-hop neighbors in each sector separately. The distance threshold strategy and the movement state diagram strategy are adopted to avoid the oscillation of nodes. The simulation results show that the ATRI algorithm achieves a much larger coverage area and smaller average moving distance of nodes than existing algorithms. We also show that the ATRI algorithm is applicable to practical environments and tasks such as working in both bounded and unbounded areas and avoiding irregularly shaped obstacles. In addition, the density of nodes can be adjusted adaptively to different requirements of tasks.

146 citations
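Note: the core per-node adjustment can be sketched as follows (illustrative only; the paper's distance-threshold and movement-state-diagram strategies, which prevent node oscillation, are omitted):

```python
import math

def atri_step(node, neighbors, d_target, alpha=0.5):
    """One adjustment step in the spirit of sector-based triangular deployment:
    split the one-hop neighborhood into six 60-degree sectors and, in each
    sector, move so the distance to the nearest neighbor approaches the
    equilateral-triangle spacing d_target."""
    move = [0.0, 0.0]
    nearest = {}                                    # nearest neighbor per sector
    for x, y in neighbors:
        dx, dy = x - node[0], y - node[1]
        d = math.hypot(dx, dy)
        s = int(math.atan2(dy, dx) % (2 * math.pi) // (math.pi / 3))
        if s not in nearest or d < nearest[s][0]:
            nearest[s] = (d, dx, dy)
    for d, dx, dy in nearest.values():
        err = d - d_target                          # >0: too far, <0: too close
        move[0] += alpha * err * dx / d
        move[1] += alpha * err * dy / d
    return (node[0] + move[0], node[1] + move[1])

print(atri_step((0, 0), [(1.0, 0.2), (-0.4, 0.9)], d_target=1.2))
```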


Journal ArticleDOI
TL;DR: This work presents a new scheme for subquadratic space complexity parallel multiplication in GF(2^n) using the shifted polynomial basis, based on Toeplitz matrix-vector products and coordinate transformation techniques; to the best of the authors' knowledge, this is the first time that subquadratic space complexity parallel multipliers are proposed for dual, weakly dual, and triangular bases.
Abstract: Based on Toeplitz matrix-vector products and coordinate transformation techniques, we present a new scheme for subquadratic space complexity parallel multiplication in GF(2^n) using the shifted polynomial basis. Both the space complexity and the asymptotic gate delay of the proposed multiplier are better than those of the best existing subquadratic space complexity parallel multipliers. For example, with n being a power of 2, the space complexity is about 8 percent better, while the asymptotic gate delay is about 33 percent better. Another advantage of the proposed matrix-vector product approach is that it can also be used to design subquadratic space complexity polynomial, dual, weakly dual, and triangular basis parallel multipliers. To the best of our knowledge, this is the first time that subquadratic space complexity parallel multipliers are proposed for dual, weakly dual, and triangular bases. A recursive design algorithm is also proposed for efficient construction of the proposed subquadratic space complexity multipliers. This design algorithm can be modified for the construction of most of the subquadratic space complexity multipliers previously reported in the literature.

141 citations
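Note: the subquadratic idea rests on the fact that an n×n Toeplitz matrix-vector product can be computed from three half-size Toeplitz products. A small software model of that recursion over GF(2) follows (the paper's shifted-polynomial-basis construction and its hardware mapping are far more detailed):

```python
def toeplitz_mv_gf2(t, v):
    """T*v over GF(2), where T is the n x n Toeplitz matrix T[i][j] = t[n-1+i-j]
    (t lists the 2n-1 distinct entries) and n is a power of two. The two-way
    split T = [[T1, T0], [T2, T1]] needs only three half-size products:
    P0 = T1(v0+v1), P1 = (T0+T1)v1, P2 = (T2+T1)v0, and T*v = (P0+P1, P0+P2),
    giving O(n^log2(3)) instead of O(n^2) bit operations."""
    n = len(v)
    if n == 1:
        return [t[0] & v[0]]
    m = n // 2
    s0, s1, s2 = t[:2*m - 1], t[m:3*m - 1], t[2*m:4*m - 1]   # T0, T1, T2 blocks
    v0, v1 = v[:m], v[m:]
    p0 = toeplitz_mv_gf2(s1, [a ^ b for a, b in zip(v0, v1)])
    p1 = toeplitz_mv_gf2([a ^ b for a, b in zip(s0, s1)], v1)
    p2 = toeplitz_mv_gf2([a ^ b for a, b in zip(s2, s1)], v0)
    return [a ^ b for a, b in zip(p0, p1)] + [a ^ b for a, b in zip(p0, p2)]

# Check against the naive quadratic product for a 4x4 example.
t = [1, 0, 1, 1, 0, 0, 1]
v = [1, 1, 0, 1]
n = len(v)
naive = [0] * n
for i in range(n):
    for j in range(n):
        naive[i] ^= t[n - 1 + i - j] & v[j]
print(toeplitz_mv_gf2(t, v), naive)
```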


Journal ArticleDOI
TL;DR: In this paper, the authors show that the already proposed encoding scheme is not optimal and present a new one, proving that it is optimal. Moreover, they compare the two encodings theoretically and derive a set of conditions which show that, in practical cases, the proposed encoding always offers better compression, while in terms of hardware overhead the new scheme is at least as low-demanding as the old one.
Abstract: Selective Huffman coding has recently been proposed for efficient test-data compression with low hardware overhead. In this paper, we show that the already proposed encoding scheme is not optimal and we present a new one, proving that it is optimal. Moreover, we compare the two encodings theoretically and we derive a set of conditions which show that, in practical cases, the proposed encoding always offers better compression. In terms of hardware overhead, the new scheme is at least as low-demanding as the old one. The increased compression efficiency, the resulting test-time savings, and the low hardware overhead of the proposed method are also verified experimentally.

132 citations
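Note: the general selective-coding idea is to Huffman-code only the most frequent fixed-length test-data blocks and send the rest raw behind a flag bit. The sketch below shows the classic form; the paper's provably optimal encoding changes how encoded and unencoded blocks are distinguished:

```python
import heapq
from collections import Counter

def selective_huffman(blocks, k):
    """Encode only the k most frequent fixed-length blocks with Huffman codewords
    (prefixed by '1'); any other block is sent raw, prefixed by '0'."""
    freq = Counter(blocks)
    top = dict(freq.most_common(k))
    heap = [(f, i, [b]) for i, (b, f) in enumerate(top.items())]
    heapq.heapify(heap)
    code = {b: "" for b in top}
    while len(heap) > 1:                      # standard Huffman tree construction
        f1, _, g1 = heapq.heappop(heap)
        f2, i, g2 = heapq.heappop(heap)
        for b in g1:
            code[b] = "0" + code[b]
        for b in g2:
            code[b] = "1" + code[b]
        heapq.heappush(heap, (f1 + f2, i, g1 + g2))
    bits = []
    for b in blocks:
        bits.append("1" + code[b] if b in code else "0" + b)
    return "".join(bits)

data = ["0000", "0000", "1111", "0000", "1010", "0000", "1111", "0110"]
print(selective_huffman(data, k=2))
```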


Journal ArticleDOI
TL;DR: This paper proposes a low-cost hardened latch that is able to completely filter out TFs affecting its internal feedback nodes while presenting a lower susceptibility to TFs on the other internal nodes, and proposes another version of the latch that improves the robustness of the output node, which can be higher than that of alternative hardened solutions.
Abstract: In this paper, we analyze the conditions making transient faults (TFs) affecting the nodes of conventional latch structures generate output soft errors (SEs). We investigate the susceptibility to TFs of all latch nodes and identify the most critical one(s). We show that, for standard latches using back-to-back inverters for their positive feedback, the internal nodes within their feedback path are the most critical. Such nodes will be hereafter referred to as internal feedback nodes. Based on this analysis, we first propose a low-cost hardened latch that, compared to alternative hardened solutions, is able to completely filter out TFs affecting its internal feedback nodes while presenting a lower susceptibility to TFs on the other internal nodes. This is achieved at the cost of a reduced robustness to TFs affecting the output node. To overcome this possible limitation (especially for systems for high-reliability applications), we propose another version of our latch that, at the cost of a small area and power consumption increase compared to our first solution, also improves the robustness of the output node, which can be higher than that of alternative hardened solutions. Additionally, both proposed latches present a comparable or higher robustness of the input node than alternative solutions and provide a lower or comparable power-delay product and area overhead than classical implementations and alternative hardened solutions.

123 citations


Journal ArticleDOI
TL;DR: This paper presents an analytical model of general tasks for DVS assuming job timing information is known only after a task release and establishes two relationships between computation capacity and deadline misses to provide a statistical real-time guarantee with reduced capacity.
Abstract: Dynamic voltage scaling (DVS) is a promising technique for battery-powered systems to conserve energy. Most existing DVS algorithms assume information about task periodicity or a priori knowledge about the task set to be scheduled. This paper presents an analytical model of general tasks for DVS assuming job timing information is known only after a task release. It models the voltage scaling process as a transfer function-based filtering system, which facilitates the design of two efficient scaling algorithms. The first is a time-invariant scaling policy and it is proved to be a generalization of several popular DVS algorithms for periodic, sporadic, and aperiodic tasks. A more energy efficient policy is a time-variant scaling algorithm for aperiodic tasks. It is optimal in the sense that it is online and requires no assumed information about future task releases. The algorithm turns out to be a water-filling process with a linear time complexity. It can be applied to scheduling based on worst-case execution times as well as online slack distribution when jobs complete earlier. We further establish two relationships between computation capacity and deadline misses to provide a statistical real-time guarantee with reduced capacity.

102 citations


Journal ArticleDOI
TL;DR: This paper is the first to present a hierarchical model for grid service reliability analysis and evaluation, which makes the evaluation and calculation tractable by identifying the independence among layers.
Abstract: Grid computing is a recently developed technology. Although the developmental tools and techniques for the grid have been extensively studied, grid reliability analysis is not easy because of its complexity. This paper is the first to present a hierarchical model for grid service reliability analysis and evaluation. The hierarchical modeling is mapped to the physical and logical architecture of the grid service system and makes the evaluation and calculation tractable by identifying the independence among layers. Various types of failures are interleaved in the grid computing environment, such as blocking failures, time-out failures, matchmaking failures, network failures, program failures, and resource failures. This paper investigates all of them to achieve a complete picture of grid service reliability. Markov models, queuing theory, and graph theory are mainly used to model, evaluate, and analyze the grid service reliability. Numerical examples are presented.
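Note: a toy example of why the layered decomposition helps (the paper's actual model uses Markov chains and queueing theory for blocking, time-out, and correlated failures): under an independence assumption, per-layer reliabilities simply multiply, with redundant resources combined in parallel inside their layer.

```python
def grid_service_reliability(layer_reliabilities, resource_reliabilities):
    """Illustrative only: if layers fail independently, service reliability is
    the product of per-layer reliabilities; redundant resources fail the service
    only if all of them fail, so they combine in parallel within their layer."""
    parallel = 1.0
    for r in resource_reliabilities:          # redundant resources
        parallel *= (1.0 - r)
    resource_layer = 1.0 - parallel
    total = resource_layer
    for r in layer_reliabilities:             # remaining layers in series
        total *= r
    return total

# e.g. matchmaking and network layers in series, three redundant resources
print(grid_service_reliability([0.99, 0.95], [0.9, 0.9, 0.8]))
```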

Journal ArticleDOI
TL;DR: The correlation between signals coming from multiple microphones is analyzed and an improved method for carrying out speaker diarization for meetings with multiple distant microphones is proposed, improving the Diarization Error Rate (DER) by 15% to 20% relative to previous systems.
Abstract: Human-machine interaction in meetings requires the localization and identification of the speakers interacting with the system as well as the recognition of the words spoken. A seminal step toward this goal is the field of rich transcription research, which includes speaker diarization together with the annotation of sentence boundaries and the elimination of speaker disfluencies. The sub-area of speaker diarization attempts to identify the number of participants in a meeting and create a list of speech time intervals for each such participant. In this paper, we analyze the correlation between signals coming from multiple microphones and propose an improved method for carrying out speaker diarization for meetings with multiple distant microphones. The proposed algorithm makes use of acoustic information and information from the delays between signals coming from the different sources. Using this procedure, we were able to achieve state-of-the-art performance in the NIST spring 2006 rich transcription evaluation, improving the Diarization Error Rate (DER) by 15% to 20% relative to previous systems.
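Note: the delay information referred to above is typically a time-difference-of-arrival estimate between microphone pairs. A generic GCC-PHAT estimator of that kind (not the paper's exact feature pipeline) looks like this:

```python
import numpy as np

def gcc_phat_delay(sig, ref, fs):
    """Estimate the delay (in seconds) of 'sig' relative to 'ref' with the
    phase-transform-weighted generalized cross-correlation (GCC-PHAT)."""
    n = len(sig) + len(ref)
    X = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
    cc = np.fft.irfft(X / (np.abs(X) + 1e-12), n=n)     # PHAT weighting
    cc = np.concatenate((cc[-(n // 2):], cc[: n // 2 + 1]))  # center zero lag
    return (np.argmax(np.abs(cc)) - n // 2) / fs

fs = 16000
rng = np.random.default_rng(0)
clean = rng.standard_normal(fs)             # broadband "speech-like" signal
delayed = np.roll(clean, 40)                # 40-sample delay ~ 2.5 ms
print(gcc_phat_delay(delayed, clean, fs))   # ~ 0.0025 s
```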

Journal ArticleDOI
TL;DR: Two novel scheduling algorithms for bag-of-tasks (BoT) applications in grid environments, the shared-input-data-based listing (SIL) algorithm and the multiple queues with duplication (MQD) algorithm, are proposed; results show the practicability and competitiveness of these algorithms when compared to existing methods.
Abstract: Over the past decade, the grid has emerged as an attractive platform to tackle various large-scale problems, especially in science and engineering. One primary issue associated with the efficient and effective utilization of heterogeneous resources in a grid is scheduling. Grid scheduling involves a number of challenging issues, mainly due to the dynamic nature of the grid. Only a handful of scheduling schemes for grid environments that realistically deal with this dynamic nature have been proposed in the literature. In this paper, two novel scheduling algorithms, the shared-input-data-based listing (SIL) algorithm and the multiple queues with duplication (MQD) algorithm, are proposed for bag-of-tasks (BoT) applications in grid environments. The SIL algorithm targets scheduling data-intensive BoT (DBoT) applications, whereas the MQD algorithm deals with scheduling computationally intensive BoT (CBoT) applications. Their common and primary forte is that they make scheduling decisions without fully accurate performance prediction information. Both scheduling algorithms also adopt task duplication in an attempt to reduce serious increases in schedule length. Our evaluation study employs a number of experiments with various simulation settings. The results show the practicability and competitiveness of our algorithms when compared to existing methods.

Journal ArticleDOI
TL;DR: This paper proves that two grid-point sensor deployment problems are each NP-complete: constructing a wireless sensor network that fully covers critical grids using a minimum number of sensors, and constructing one that fully covers a maximum total weight of grids using a given number of sensors.
Abstract: This paper proves that two sensor deployment problems on grid points are each NP-complete: constructing a wireless sensor network that fully covers critical grids using a minimum number of sensors (the critical-grid coverage problem) and constructing one that fully covers a maximum total weight of grids using a given number of sensors (the weighted-grid coverage problem).

Journal ArticleDOI
TL;DR: An integer linear programming formulation to find the minimum cost deployment of sensors that provides the desired coverage of a target point set is developed, along with a greedy heuristic for this problem.
Abstract: We develop an integer linear programming formulation to find the minimum cost deployment of sensors that provides the desired coverage of a target point set and propose a greedy heuristic for this problem. Our formulation permits heterogeneous multimodal sensors and is extended easily to account for nonuniform sensor detection resulting from blockages, noise, fading, and so on. A greedy algorithm for solving the proposed general ILP is developed. Additionally, ε-approximation algorithms and a polynomial-time approximation scheme are proposed for the case of grid coverage. Experiments demonstrate the superiority of our proposed algorithms over earlier algorithms for point coverage of grids by using heterogeneous sensors.
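Note: the greedy heuristic can be sketched in the style of weighted set cover; the candidate sets, costs, and tie-breaking below are illustrative, and the paper's ILP and approximation schemes are the principled versions of this idea.

```python
def greedy_deployment(candidates, targets):
    """Repeatedly place the candidate sensor that covers the most still-uncovered
    targets per unit cost. 'candidates' maps (site, sensor_type) to
    (cost, set_of_covered_targets)."""
    uncovered = set(targets)
    chosen, total_cost = [], 0.0
    while uncovered:
        best, best_ratio = None, 0.0
        for key, (cost, covers) in candidates.items():
            gain = len(covers & uncovered)
            if gain and gain / cost > best_ratio:
                best, best_ratio = key, gain / cost
        if best is None:
            raise ValueError("remaining targets cannot be covered")
        cost, covers = candidates[best]
        chosen.append(best)
        total_cost += cost
        uncovered -= covers
    return chosen, total_cost

cands = {("s1", "cheap"): (1.0, {"t1", "t2"}),
         ("s2", "strong"): (3.0, {"t1", "t2", "t3"}),
         ("s3", "cheap"): (1.0, {"t3"})}
print(greedy_deployment(cands, {"t1", "t2", "t3"}))
```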

Journal ArticleDOI
TL;DR: This paper presents two merging algorithms suitable for batch join requests and extends them to a batch balanced algorithm; simulation results show that their rekeying costs are lower than those of existing algorithms.
Abstract: Secure multicast communication is important for applications such as pay-per-view and secure videoconferencing. A key tree approach has been proposed by other authors to distribute the multicast group key in such a way that the rekeying cost scales with the logarithm of the group size for a join or depart request. The efficiency of this key tree approach critically depends on whether the key tree remains balanced over time as members join or depart. In this paper, we present two merging algorithms suitable for batch join requests. To additionally handle batch depart requests, we extend these two algorithms to a batch balanced algorithm. Simulation results show that our three algorithms not only maintain a balanced key tree but also incur lower rekeying costs than existing algorithms.
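Note: the logarithmic rekeying cost mentioned above comes from the key tree structure: a single join or depart only invalidates the keys on one leaf-to-root path. A toy computation, assuming a complete binary tree with a power-of-two number of leaves (the paper's batch algorithms amortize this cost over many requests while keeping the tree balanced):

```python
def rekey_keys_for_join(tree_size, new_leaf_index):
    """Return the heap-numbered key nodes (root = 1) on the new member's
    leaf-to-root path, i.e. the member's own leaf key plus every group key
    above it that must change: O(log n) keys. Assumes 'tree_size' leaves in a
    complete binary tree, tree_size a power of two."""
    node = tree_size + new_leaf_index      # heap index of the leaf
    path = []
    while node >= 1:
        path.append(node)
        node //= 2
    return path

print(rekey_keys_for_join(tree_size=8, new_leaf_index=5))  # 4 keys ~ log2(8) + 1
```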

Journal ArticleDOI
TL;DR: An interactive visualization tool uses a three-dimensional plot to show miss rate changes across program data sizes and cache sizes; its uses include evaluating compiler transformations and assisting machine and benchmark-set design.
Abstract: Improving cache performance requires understanding cache behavior. However, measuring cache performance for one or two data input sets provides little insight into how cache behavior varies across all data input sets and all cache configurations. This paper uses locality analysis to generate a parameterized model of program cache behavior. Given a cache size and associativity, this model predicts the miss rate for arbitrary data input set sizes. This model also identifies critical data input sizes where cache behavior exhibits marked changes. Experiments show this technique is within 2 percent of the hit rate for set associative caches on a set of floating-point and integer programs using array and pointer-based data structures. Building on the new model, this paper presents an interactive visualization tool that uses a three-dimensional plot to show miss rate changes across program data sizes and cache sizes and its use in evaluating compiler transformations. Other uses of this visualization tool include assisting machine and benchmark-set design. The tool can be accessed on the Web at http://www.cs.rochester.edu/research/locality
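Note: the locality analysis underlying the model can be illustrated with a plain stack-distance simulation of a fully associative LRU cache; the paper's contribution is the parameterized model fitted across input sizes and cache configurations, not this direct measurement:

```python
def lru_miss_rate(trace, cache_blocks):
    """Predict the miss rate of a fully associative LRU cache of 'cache_blocks'
    lines from a block-address trace: a reference hits iff the number of distinct
    blocks touched since its previous use (its stack distance) is below the cache
    size."""
    stack, seen = [], set()
    misses = 0
    for addr in trace:
        if addr in seen:
            depth = stack.index(addr)      # stack distance (O(n) simulation, fine for a sketch)
            if depth >= cache_blocks:
                misses += 1
            stack.pop(depth)
        else:                              # cold miss
            misses += 1
            seen.add(addr)
        stack.insert(0, addr)
    return misses / len(trace)

trace = [0, 1, 2, 0, 1, 3, 0, 1, 2, 3] * 100
print(lru_miss_rate(trace, cache_blocks=2), lru_miss_rate(trace, cache_blocks=4))
```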

Journal ArticleDOI
TL;DR: This work describes a password extraction attack on Class 1 Generation 1 EPC tags and shows how the privacy of Class 1 Generation 2 tags can be compromised by this attack.
Abstract: Side-channel attacks are used by cryptanalysts to compromise the implementation of secure systems. One very powerful class of side-channel attacks is power analysis, which tries to extract cryptographic keys and passwords by examining the power consumption of a device. We examine the applicability of this threat to electromagnetically coupled RFID tags. Compared to standard power analysis attacks, our attack is unique in that it requires no physical contact with the device under attack. Power analysis can be carried out even if both the tag and the attacker are passive and transmit no data, making the attack very hard to detect. As a proof of concept, we describe a password extraction attack on Class 1 Generation 1 EPC tags. We also show how the privacy of Class 1 Generation 2 tags can be compromised by this attack. Finally, we examine possible modifications to the tag and its RF front end which help protect against power analysis attacks.

Journal ArticleDOI
TL;DR: This paper presents a novel architecture for speech-based human-machine interaction inspired by recent findings in the neurobiology of living systems and blurs the distinction between the core components of a traditional spoken language dialogue system and instead focuses on a recursive hierarchical feedback control structure.
Abstract: Recent years have seen steady improvements in the quality and performance of speech-based human-machine interaction driven by a significant convergence in the methods and techniques employed. However, the quantity of training data required to improve state-of-the-art systems seems to be growing exponentially and performance appears to be asymptotic to a level that may be inadequate for many real-world applications. This suggests that there may be a fundamental flaw in the underlying architecture of contemporary systems, as well as a failure to capitalize on the combinatorial properties of human spoken language. This paper addresses these issues and presents a novel architecture for speech-based human-machine interaction inspired by recent findings in the neurobiology of living systems. Called PRESENCE-"PREdictive SENsorimotor Control and Emulation"-this new architecture blurs the distinction between the core components of a traditional spoken language dialogue system and instead focuses on a recursive hierarchical feedback control structure. Cooperative and communicative behavior emerges as a by-product of an architecture that is founded on a model of interaction in which the system has in mind the needs and intentions of a user and a user has in mind the needs and intentions of the system.

Journal ArticleDOI
TL;DR: This paper compares the system performance evaluation cooperative (SPEC) Integer and Floating-Point suites to a set of real-world applications for high-performance computing at Sandia National Laboratories, and quantitatively demonstrates the memory properties of real supercomputing applications.
Abstract: This paper compares the system performance evaluation cooperative (SPEC) Integer and Floating-Point suites to a set of real-world applications for high-performance computing at Sandia National Laboratories. These applications focus on the high-end scientific and engineering domains; however, the techniques presented in this paper are applicable to any application domain. The applications are compared in terms of three memory properties: 1) temporal locality (or reuse over time), 2) spatial locality (or the use of data "near" data that has already been accessed), and 3) data intensiveness (or the number of unique bytes the application accesses). The results show that real-world applications exhibit significantly less spatial locality, often exhibit less temporal locality, and have much larger data sets than the SPEC benchmark suite. They further quantitatively demonstrate the memory properties of real supercomputing applications.
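Note: crude versions of the three properties can be computed directly from an address trace, as in the sketch below (the block size, window, and metric definitions here are simplifications, not the paper's exact measures):

```python
def memory_profile(addresses, block=64, window=1000):
    """Rough proxies for the three properties: data intensiveness ~ unique cache
    blocks touched; spatial locality ~ fraction of accesses landing in the same
    or an adjacent block as the previous access; temporal locality ~ fraction of
    accesses whose block was seen within the last 'window' accesses."""
    blocks = [a // block for a in addresses]
    unique = len(set(blocks))
    spatial = sum(abs(b - p) <= 1 for b, p in zip(blocks[1:], blocks)) / (len(blocks) - 1)
    recent, temporal_hits = [], 0
    for b in blocks:
        if b in recent:
            temporal_hits += 1
        recent.append(b)
        if len(recent) > window:
            recent.pop(0)
    return {"unique_blocks": unique,
            "spatial_locality": spatial,
            "temporal_locality": temporal_hits / len(blocks)}

# Unit-stride streaming over blocks: spatial locality 1.0, no temporal reuse.
print(memory_profile([64 * i for i in range(10000)]))
```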

Journal ArticleDOI
TL;DR: A model that relaxes some assumptions made in prior research on distributed systems that were inappropriate for grid computing is presented, which simplifies the physical structure of a grid service, allows service performance to be efficiently evaluated, and takes into account data dependence and failure correlation.
Abstract: Grid computing is a newly emerging technology aimed at large-scale resource sharing and global-area collaboration. It is the next step in the evolution of parallel and distributed computing. Due to the scale and complexity of the grid system, its performance and reliability are difficult to model, analyze, and evaluate. This paper presents a model that relaxes some assumptions made in prior research on distributed systems that were inappropriate for grid computing. The paper proposes a virtual tree-structured model of the grid service. This model simplifies the physical structure of a grid service, allows service performance (execution time) to be efficiently evaluated, and takes into account data dependence and failure correlation. Based on the model, an algorithm for evaluating the grid service time distribution and the service reliability indices is suggested. The algorithm is based on graph theory and probability theory. Illustrative examples and a real case study of the BioGrid are presented.

Journal ArticleDOI
TL;DR: A reconfigurable curve-based cryptoprocessor that accelerates scalar multiplication of ECC and HECC of genus 2 over GF(2^n) and it can handle various curve parameters and arbitrary irreducible polynomials.
Abstract: This paper presents a reconfigurable curve-based cryptoprocessor that accelerates scalar multiplication of Elliptic Curve Cryptography (ECC) and HyperElliptic Curve Cryptography (HECC) of genus 2 over GF(2^n). By allocating a copies of processing cores that embed reconfigurable Modular Arithmetic Logic Units (MALUs) over GF(2^n), the scalar multiplication of ECC/HECC can be accelerated by exploiting Instruction-Level Parallelism (ILP). The supported field size can be arbitrary up to a(n + 1) - 1. The superscaling feature is facilitated by defining a single instruction that can be used for all field operations and point/divisor operations. In addition, the cryptoprocessor is fully programmable and it can handle various curve parameters and arbitrary irreducible polynomials. The cost, performance, and security trade-offs are thoroughly discussed for different hardware configurations and software programs. The synthesis results with a 0.13-µm CMOS technology show that the proposed reconfigurable cryptoprocessor runs at 292 MHz, whereas the field sizes can be supported up to 587 bits. The compact and fastest configuration of our design is also synthesized with a fixed field size and irreducible polynomial. The results show that the scalar multiplication of ECC over GF(2^163) and HECC over GF(2^83) can be performed in 29 and 63 µs, respectively.
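Note: the elementary operation the MALU cores perform is multiplication in GF(2^n) modulo an irreducible polynomial. A bit-level software model of that field operation (the processor's parallel instruction scheduling and the curve/divisor arithmetic sit on top of it):

```python
def gf2n_mul(a, b, irr, n):
    """Multiply in GF(2^n): field elements are integers whose bits are polynomial
    coefficients, 'irr' is the irreducible polynomial including the x^n term.
    Carry-less multiply, then reduce modulo irr."""
    r = 0
    for i in range(b.bit_length()):
        if (b >> i) & 1:
            r ^= a << i                   # XOR accumulation (no carries over GF(2))
    for i in range(r.bit_length() - 1, n - 1, -1):
        if (r >> i) & 1:
            r ^= irr << (i - n)           # cancel bit i using the irreducible polynomial
    return r

# GF(2^163) with the NIST irreducible x^163 + x^7 + x^6 + x^3 + 1
IRR = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1
print(hex(gf2n_mul(0x3, 0x5, IRR, 163)))  # (x+1)*(x^2+1) = x^3+x^2+x+1 -> 0xf
```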

Journal ArticleDOI
TL;DR: The architecture is based on the lookup table (LUT) cascade, which results in a significant reduction in circuit complexity compared to traditional approaches and is suitable for automatic synthesis; a synthesis method that converts a Matlab-like specification into an LUT cascade design is shown.
Abstract: This paper proposes an architecture and a synthesis method for high-speed computation of fixed-point numerical functions such as trigonometric, logarithmic, sigmoidal, square root, and combinations of these functions. Our architecture is based on the lookup table (LUT) cascade, which results in a significant reduction in circuit complexity compared to traditional approaches. The architecture is suitable for automatic synthesis, and we show a synthesis method that converts a Matlab-like specification into an LUT cascade design. Experimental results show the efficiency of our approach as implemented on a field-programmable gate array (FPGA).
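Note: to make the design target concrete, here is a single-table, piecewise-linear fixed-point approximation of a sigmoid (precision and segment count are illustrative); the paper's LUT cascade decomposes such a table into a chain of much smaller LUTs rather than evaluating one flat table.

```python
import math

def build_sigmoid_lut(n_frac=8, segments=16, x_max=8.0):
    """Piecewise-linear fixed-point table for sigmoid(x) on [0, x_max): each
    segment stores (slope, intercept) scaled by 2^n_frac. A flat table like this
    grows quickly with the required precision, which is what the cascade avoids."""
    width = x_max / segments
    lut = []
    for s in range(segments):
        x0, x1 = s * width, (s + 1) * width
        y0, y1 = 1 / (1 + math.exp(-x0)), 1 / (1 + math.exp(-x1))
        slope = (y1 - y0) / (x1 - x0)
        lut.append((round(slope * (1 << n_frac)), round(y0 * (1 << n_frac))))
    return lut, width

def sigmoid_fixed(x, lut, width, n_frac=8):
    """Evaluate the approximation using integer arithmetic after quantizing x."""
    s = min(int(x / width), len(lut) - 1)
    slope_q, y0_q = lut[s]
    dx_q = round((x - s * width) * (1 << n_frac))
    return (y0_q + ((slope_q * dx_q) >> n_frac)) / (1 << n_frac)

lut, width = build_sigmoid_lut()
print(sigmoid_fixed(1.0, lut, width), 1 / (1 + math.exp(-1.0)))   # approx vs exact
```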

Journal ArticleDOI
TL;DR: This paper finds that the fault location task can be reduced to that under the classical PMC* model and presents an O(n · Δ^3 · δ) time diagnosis algorithm for an n-node MM* diagnosable system, where Δ and δ denote the maximum and minimum degrees of a node, respectively.
Abstract: Diagnosis by comparison is a realistic approach to the fault diagnosis of massive multicomputers. This paper addresses the fault identification of diagnosable multicomputer systems under the MM* comparison model. We find that the fault location task can be reduced to that under the classical PMC* model. On this basis, we present an O(n · Δ^3 · δ) time diagnosis algorithm for an n-node MM* diagnosable system, where Δ and δ denote the maximum and minimum degrees of a node, respectively. The proposed algorithm is much more efficient than the fastest known diagnosis algorithm (which consumes O(n^5) time) because realistic massive multicomputers are sparsely interconnected and, hence, Δ, δ ≪ n.

Journal ArticleDOI
X. Zhuang, H.-H.S. Lee
TL;DR: A number of hardware-based prefetch pollution filtering mechanisms to differentiate good and bad prefetches dynamically based on history information are proposed and analyzed; performance improves by up to 16 percent when the filtering mechanism is combined with aggressive prefetching, as a result of reduced cache pollution and less competition for the limited number of cache ports.
Abstract: In order to bridge the growing speed disparity between processors and their memory subsystems, aggressive prefetch mechanisms, either hardware-based or compiler-assisted, are employed to hide memory latencies. As the first-level cache gets smaller in deep submicron processor design for fast cache accesses, data cache pollution caused by overly aggressive prefetch mechanisms will become a major performance concern. Ineffective prefetches not only offset the benefits of benign prefetches due to pollution but also throttle bus bandwidth, leading to an overall performance degradation. In this paper, we propose and analyze a number of hardware-based prefetch pollution filtering mechanisms to differentiate good and bad prefetches dynamically based on history information. We designed three prefetch pollution filters organized as a one-level, two-level, or gshare style. In addition, we examine two table indexing schemes: per-address (PA) based and program counter (PC) based. Our prefetch pollution filters work in tandem with both hardware and software prefetchers. As our analysis shows, the cache pollution filters can reduce the ineffective prefetches by more than 90 percent and alleviate the excessive memory bandwidth induced by them. Also, the performance can be improved by up to 16 percent when our filtering mechanism is incorporated with aggressive prefetch mechanisms, as a result of reduced cache pollution and less competition for the limited number of cache ports. In addition, a number of sensitivity studies are performed to provide more understanding of the prefetch pollution filter design.
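Note: a toy version of a one-level, PC-indexed pollution filter with 2-bit saturating counters (table size and update policy are illustrative; the paper also studies two-level and gshare organizations and per-address indexing):

```python
class PrefetchPollutionFilter:
    """Prefetches issued from program counters whose past prefetches were mostly
    evicted unused get suppressed; useful prefetches re-enable their PC."""
    def __init__(self, entries=256):
        self.table = [2] * entries          # 2-bit saturating counters, start weakly useful
        self.entries = entries

    def allow_prefetch(self, pc):
        return self.table[pc % self.entries] >= 2

    def on_prefetch_used(self, pc):         # prefetched line hit before eviction: good
        i = pc % self.entries
        self.table[i] = min(3, self.table[i] + 1)

    def on_prefetch_evicted_unused(self, pc):   # polluted the cache: bad
        i = pc % self.entries
        self.table[i] = max(0, self.table[i] - 1)

f = PrefetchPollutionFilter()
f.on_prefetch_evicted_unused(0x400123)
f.on_prefetch_evicted_unused(0x400123)
print(f.allow_prefetch(0x400123))   # False: this PC's prefetches have been polluting
```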

Journal ArticleDOI
TL;DR: The study proposes statistical methods for both the single and dual fault injection campaigns and demonstrates the fault-tolerant capability of both processors in terms of fault latencies, the probability of fault manifestation, and the behavior of latent faults.
Abstract: This paper presents a detailed analysis of the behavior of a novel fault-tolerant 32-bit embedded CPU as compared to a default (non-fault-tolerant) implementation of the same processor during a fault injection campaign of single and double faults. The fault-tolerant processor tested is characterized by per-cycle voting of microarchitectural and the flop-based architectural states, redundancy at the pipeline level, and a distributed voting scheme. Its fault-tolerant behavior is characterized for three different workloads from the automotive application domain. The study proposes statistical methods for both the single and dual fault injection campaigns and demonstrates the fault-tolerant capability of both processors in terms of fault latencies, the probability of fault manifestation, and the behavior of latent faults.

Journal ArticleDOI
TL;DR: Based on a recently proposed Toeplitz matrix-vector product approach, a subquadratic computational complexity scheme is presented for multiplications in binary extended finite fields using type I and II optimal normal bases.
Abstract: Based on a recently proposed Toeplitz matrix-vector product approach, a subquadratic computational complexity scheme is presented for multiplications in binary extended finite fields using type I and II optimal normal bases.

Journal ArticleDOI
TL;DR: This paper studies the use of pricing as an incentive mechanism for stimulating participation and collaboration in public wireless mesh networks using a non-self-enforcing but more practical "fixed-rate noninterrupted service" model and proposes an algorithm based on the Markovian decision theory to devise the optimal pricing strategy.
Abstract: Distributed wireless mesh network technology is ready for public deployment in the near future. However, without an incentive system, one should not assume that private self-interested wireless nodes would participate in such a public network and cooperate in the packet forwarding service. This paper studies the use of pricing as an incentive mechanism for stimulating participation and collaboration in public wireless mesh networks. Our focus is on the "economic behavior" of the network nodes-the pricing and purchasing strategies of the access point, wireless relaying nodes, and clients. We use a "game-theoretic approach" to analyze their interactions from one-hop to multihop networks and when the network has an unlimited or limited channel capacity. The important results that we show are that the access point and relaying wireless nodes will adopt a simple yet optimal fixed-rate pricing strategy in a multihop network with an unlimited capacity. However, the access price grows quickly with the hop distance between a client and the access point, which may limit the "scalability" of the wireless mesh network. In case where the network has limited capacity, the optimal strategy for the access point is to vary the access charge and even interrupt service to connecting clients. To this end, we focus on the access point adopting a non-self-enforcing but more practical "fixed-rate noninterrupted service" model and propose an algorithm based on the Markovian decision theory to devise the optimal pricing strategy. Results show that the scalability of a network with limited capacity is upper bounded by one with an unlimited capacity. We believe that this work will shed light on the deployment and pricing issues of distributed public wireless mesh networks.
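Note: the Markovian decision machinery behind the optimal pricing strategy is standard value iteration over states, actions (prices), and transition probabilities. A generic sketch with a made-up two-state MDP (the paper's states, actions, and rewards model clients, channel capacity, and access charges):

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, eps=1e-8):
    """Generic value iteration: P[a][s][s'] are transition probabilities and
    R[a][s] expected immediate rewards; returns the optimal value per state and
    the greedy (optimal) action per state."""
    n_actions, n_states = len(P), len(P[0])
    V = np.zeros(n_states)
    while True:
        Q = np.array([R[a] + gamma * P[a] @ V for a in range(n_actions)])
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < eps:
            return V_new, Q.argmax(axis=0)
        V = V_new

# Tiny illustrative MDP: 2 states (idle, serving), 2 actions (low price, high price)
P = np.array([[[0.5, 0.5], [0.3, 0.7]],          # low price: more demand
              [[0.8, 0.2], [0.6, 0.4]]])         # high price: less demand
R = np.array([[0.0, 1.0],                        # low price earns 1 while serving
              [0.0, 2.0]])                       # high price earns 2 while serving
print(value_iteration(P, R))
```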

Journal ArticleDOI
TL;DR: It is shown that test application cost, test data volume, and test power with the proposed scan forest architecture can be greatly reduced compared with the conventional full scan design with a single scan chain and several recent scan testing methods.
Abstract: A new scan architecture called reconfigured scan forest is proposed for cost-effective scan testing. Multiple scan flip-flops can be grouped based on structural analysis that avoids new untestable faults due to new reconvergent fanouts. The proposed new scan architecture allows only a few scan flip-flops to be connected to the XOR trees. The size of the XOR trees can be greatly reduced compared with the original scan forest; therefore, area overhead and routing complexity can be greatly reduced. It is shown that test application cost, test data volume, and test power with the proposed scan forest architecture can be greatly reduced compared with the conventional full scan design with a single scan chain and several recent scan testing methods.

Journal ArticleDOI
TL;DR: This paper presents a method for reducing the number of controllers to be designed offline, while still guaranteeing a given control performance; the approach has been integrated with elastic scheduling theory to promptly react to overload conditions.
Abstract: Transient overload conditions may cause unpredictable performance degradations in computer controlled systems if not properly handled. To prevent such problems, a common technique adopted in periodic task systems is to reduce the workload by enlarging activation periods. In a digital controller, however, the variation applied on the task period also affects the control law, which needs to be recomputed for the new activation rate. If computing a new control law requires too much time to be performed at runtime, a set of controllers has to be designed offline for different rates and the system has to switch to the proper controller in the presence of an overload condition. In this paper, we present a method for reducing the number of controllers to be designed offline, while still guaranteeing a given control performance. The proposed approach has been integrated with the elastic scheduling theory to promptly react to overload conditions. The effectiveness of the proposed approach has been verified through extensive simulation experiments performed on an inverted pendulum. In addition, the method has been implemented on a real inverted pendulum. Experimental results and implementation issues are reported and discussed.
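Note: the elastic scheduling mechanism referred to above enlarges task periods under overload in proportion to per-task elastic coefficients. A simplified, single-round version of that period adaptation (ignoring the per-task minimum-utilization constraints handled iteratively in the full elastic algorithm):

```python
def elastic_compress(tasks, u_desired):
    """One round of elastic period adaptation: when total utilization exceeds
    u_desired, each task gives up utilization in proportion to its elastic
    coefficient, which enlarges its activation period. tasks = [(C, T, e)]."""
    u_total = sum(c / t for c, t, _ in tasks)
    e_total = sum(e for _, _, e in tasks)
    if u_total <= u_desired:
        return [t for _, t, _ in tasks]          # no overload: keep nominal periods
    new_periods = []
    for c, t, e in tasks:
        u_new = c / t - (u_total - u_desired) * e / e_total
        new_periods.append(c / u_new)            # enlarged period for this task
    return new_periods

# (C_i, T_i, e_i): WCET, nominal period, elastic coefficient -- illustrative values
tasks = [(2, 10, 1.0), (3, 15, 2.0), (4, 20, 1.0)]
print(elastic_compress(tasks, u_desired=0.5))
```

In the paper's setting, each enlarged activation rate would then be served by one of the controllers designed offline, with the proposed method bounding how many such controllers are needed.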