
Showing papers in "IEEE Transactions on Computers in 2002"


Journal Article•DOI•
TL;DR: In this paper, the authors examine the noise characteristics of smart-card power signals, develop an approach to model the signal-to-noise ratio (SNR), and show how the SNR can be significantly improved using a multiple-bit attack.
Abstract: This paper examines how monitoring power consumption signals might breach smart-card security. Both simple power analysis and differential power analysis attacks are investigated. The theory behind these attacks is reviewed. Then, we concentrate on showing how power analysis theory can be applied to attack an actual smart card. We examine the noise characteristics of the power signals and develop an approach to model the signal-to-noise ratio (SNR). We show how this SNR can be significantly improved using a multiple-bit attack. Experimental results against a smart-card implementation of the Data Encryption Standard demonstrate the effectiveness of our multiple-bit attack. Potential countermeasures to these attacks are also discussed.
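
As a rough illustration of the statistic behind such attacks (a generic difference-of-means sketch in Python, not the authors' implementation), the fragment below partitions recorded traces by a predicted intermediate bit; averaging more traces, or partitioning on several predicted bits at once as in the multiple-bit attack, is what raises the SNR of the differential trace. The trace array and selection function are assumed inputs.

    import numpy as np

    def dpa_differential(traces, plaintexts, key_guess, select_bit):
        # traces: (N, T) array of power samples, one row per encryption;
        # select_bit(key_guess, pt) returns the predicted value (0 or 1)
        # of the targeted intermediate bit for plaintext pt.
        bits = np.array([select_bit(key_guess, pt) for pt in plaintexts])
        ones = traces[bits == 1].mean(axis=0)
        zeros = traces[bits == 0].mean(axis=0)
        # For the correct key guess, this difference peaks at the sample
        # times where the predicted bit actually drives power consumption.
        return ones - zeros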

1,554 citations


Journal Article•DOI•
TL;DR: Coding-theoretic bounds on the number of sensors are provided, methods for determining their placement in the sensor field are presented, and it is shown that grid-based sensor placement for single targets provides asymptotically complete (unambiguous) location of multiple targets in the grid.
Abstract: We present novel grid coverage strategies for effective surveillance and target location in distributed sensor networks. We represent the sensor field as a grid (two- or three-dimensional) of points (coordinates) and use the term target location to refer to the problem of locating a target at a grid point at any instant in time. We first present an integer linear programming (ILP) solution for minimizing the cost of sensors for complete coverage of the sensor field. We solve the ILP model using a representative public-domain solver and present a divide-and-conquer approach for solving large problem instances. We then use the framework of identifying codes to determine sensor placement for unique target location. We provide coding-theoretic bounds on the number of sensors and present methods for determining their placement in the sensor field. We also show that grid-based sensor placement for single targets provides asymptotically complete (unambiguous) location of multiple targets in the grid.
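
For concreteness, a toy version of such a coverage ILP can be stated with an off-the-shelf solver; the grid size, unit sensor cost, and Chebyshev-distance sensing range used below are illustrative assumptions rather than the paper's model or parameters.

    import pulp

    # Cover every point of a small grid with the fewest unit-cost sensors,
    # where a sensor placed at p detects targets within Chebyshev distance 1.
    grid = [(x, y) for x in range(4) for y in range(4)]
    radius = 1

    prob = pulp.LpProblem("min_cost_coverage", pulp.LpMinimize)
    place = {p: pulp.LpVariable(f"s_{p[0]}_{p[1]}", cat="Binary") for p in grid}
    prob += pulp.lpSum(place.values())                  # minimize sensor count
    for q in grid:                                      # every point must be covered
        prob += pulp.lpSum(place[p] for p in grid
                           if max(abs(p[0] - q[0]), abs(p[1] - q[1])) <= radius) >= 1
    prob.solve()
    chosen = [p for p in grid if place[p].value() == 1]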

956 citations


Journal Article•DOI•
TL;DR: Novel techniques are developed for the efficient and scalable evaluation of multiple continuous queries on moving objects; a combination of Query Indexing and Velocity Constrained Indexing enables scalable insertion and deletion of queries in addition to processing ongoing queries.
Abstract: Moving object environments are characterized by large numbers of moving objects and numerous concurrent continuous queries over these objects. Efficient evaluation of these queries in response to the movement of the objects is critical for supporting acceptable response times. In such environments, the traditional approach of building an index on the objects (data) suffers from the need for frequent updates and thereby results in poor performance. In fact, a brute force, no-index strategy yields better performance in many cases. Neither the traditional approach nor the brute force strategy achieves reasonable query processing times. This paper develops novel techniques for the efficient and scalable evaluation of multiple continuous queries on moving objects. Our solution leverages two complementary techniques: Query Indexing and Velocity Constrained Indexing (VCI). Query Indexing relies on 1) incremental evaluation, 2) reversing the role of queries and data, and 3) exploiting the relative locations of objects and queries. VCI takes advantage of the maximum possible speed of objects in order to delay the expensive operation of updating an index to reflect the movement of objects. In contrast to an earlier technique that requires exact knowledge about the movement of the objects, VCI does not rely on such information. While Query Indexing outperforms VCI, it does not efficiently handle the arrival of new queries. Velocity Constrained Indexing, on the other hand, is unaffected by changes in queries. We demonstrate that a combination of Query Indexing and Velocity Constrained Indexing enables the scalable execution of insertion and deletion of queries in addition to processing ongoing queries. We also develop several optimizations and present a detailed experimental evaluation of our techniques. The experimental results show that the proposed schemes outperform the traditional approaches by almost two orders of magnitude.
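
The core "reverse the roles of queries and data" idea can be sketched in a few lines (assumed uniform-grid cell size and rectangular query regions; this is an illustration, not the paper's implementation): the static query regions are indexed once, and each object position update only probes the cell it falls in.

    from collections import defaultdict

    CELL = 10.0  # assumed grid cell size

    def build_query_index(queries):
        # queries: {qid: (xmin, ymin, xmax, ymax)}
        index = defaultdict(set)
        for qid, (x1, y1, x2, y2) in queries.items():
            for cx in range(int(x1 // CELL), int(x2 // CELL) + 1):
                for cy in range(int(y1 // CELL), int(y2 // CELL) + 1):
                    index[(cx, cy)].add(qid)
        return index

    def matching_queries(index, queries, x, y):
        # Called on every object update: probe one cell, then confirm containment.
        cell = (int(x // CELL), int(y // CELL))
        return [qid for qid in index.get(cell, ())
                if queries[qid][0] <= x <= queries[qid][2]
                and queries[qid][1] <= y <= queries[qid][3]]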

378 citations


Journal Article•DOI•
TL;DR: It is shown that a Web server augmented with the admission control mechanism is able to provide a fair guarantee of completion, for any accepted session, independent of session length, which is a critical requirement for any e-business.
Abstract: We consider a new, session-based workload for measuring web server performance. We define a session as a sequence of a client's individual requests. Using a simulation model, we show that an overloaded web server can experience a severe loss of throughput measured as the number of completed sessions, compared against the server throughput measured in requests per second. Moreover, statistical analysis of completed sessions reveals that the overloaded web server discriminates against longer sessions. For e-commerce retail sites, longer sessions are typically the ones that would result in purchases, so they are precisely the ones for which the companies want to guarantee completion. To improve Web QoS for commercial Web servers, we introduce a session-based admission control (SBAC) mechanism to prevent a web server from becoming overloaded and to ensure that longer sessions can be completed. We show that a web server augmented with the admission control mechanism is able to provide a fair guarantee of completion, for any accepted session, independent of session length. This provides a predictable and controllable platform for web applications and is a critical requirement for any e-business. Additionally, we propose two new adaptive admission control strategies, hybrid and predictive, aimed at optimizing the performance of the SBAC mechanism. These new adaptive strategies are based on a self-tunable admission control function, which adjusts itself according to variations in traffic loads.
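
In its simplest form, session-based admission control amounts to gating only the first request of a session on the server's measured load; the utilization threshold and interval-based re-evaluation below are illustrative assumptions, not the paper's tuned policy.

    ADMIT_THRESHOLD = 0.9   # assumed utilization above which new sessions are refused

    class SessionAdmission:
        def __init__(self):
            self.admitting = True
            self.active = set()

        def end_of_interval(self, observed_utilization):
            # Re-evaluate the admission decision once per measurement interval.
            self.admitting = observed_utilization < ADMIT_THRESHOLD

        def on_request(self, session_id):
            if session_id in self.active:
                return True              # requests of accepted sessions always pass
            if self.admitting:
                self.active.add(session_id)
                return True
            return False                 # new session deferred or rejected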

297 citations


Journal Article•DOI•
TL;DR: This study investigates a multivariate quality control technique to detect intrusions by building a long-term profile of normal activities in information systems (norm profile) and using the norm profile to detect anomalies.
Abstract: Intrusion detection complements prevention mechanisms, such as firewalls, cryptography, and authentication, to capture intrusions into an information system while they are acting on the information system. Our study investigates a multivariate quality control technique to detect intrusions by building a long-term profile of normal activities in information systems (norm profile) and using the norm profile to detect anomalies. The multivariate quality control technique is based on Hotelling's T^2 test that detects both counterrelationship anomalies and mean-shift anomalies. The performance of the Hotelling's T^2 test is examined on two sets of computer audit data: a small data set and a large multiday data set. Both data sets contain sessions of normal and intrusive activities. For the small data set, the Hotelling's T^2 test signals all the intrusion sessions and produces no false alarms for the normal sessions. For the large data set, the Hotelling's T^2 test signals 92 percent of the intrusion sessions while producing no false alarms for the normal sessions. The performance of the Hotelling's T^2 test is also compared with the performance of a more scalable multivariate technique: a chi-squared distance test.
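
The statistic itself is standard and easy to state; the sketch below (NumPy, assuming each session is summarized as a fixed-length feature vector and that the covariance matrix is nonsingular) shows how a norm profile is fitted and how new sessions are scored.

    import numpy as np

    def fit_norm_profile(X):
        # X: (n_sessions, n_features) matrix built from normal audit data.
        return X.mean(axis=0), np.cov(X, rowvar=False)

    def t_squared(x, mean, cov):
        # Hotelling's T^2 distance of a new session vector x from the norm
        # profile; large values are signaled as anomalies (intrusions).
        d = x - mean
        return float(d @ np.linalg.inv(cov) @ d)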

279 citations


Journal Article•DOI•
TL;DR: It is demonstrated how to choose an optimal value of k for the ED4I transformation; for integer programs, the transformation with k = -2 was the most desirable choice in six out of seven benchmark programs the authors simulated.
Abstract: Errors in computing systems can cause abnormal behavior and degrade data integrity and system availability. Errors should be avoided especially in embedded systems for critical applications. However, as the trend in VLSI technologies has been toward smaller feature sizes, lower supply voltages, and higher frequencies, there is a growing concern about temporary as well as permanent errors in embedded systems; thus, it is essential to detect these errors. Software-implemented hardware fault tolerance (SIHFT) is a low-cost alternative to hardware fault-tolerance techniques for embedded processors: it does not require any hardware modification of commercial off-the-shelf (COTS) processors. ED4I (error detection by data diversity and duplicated instructions) is a SIHFT technique that detects both permanent and temporary errors by executing two "different" programs (with the same functionality) and comparing their outputs. ED4I maps each number x in the original program into a new number x', and then transforms the program so that it operates on the new numbers and the results can be mapped back for comparison with the results of the original program. The mapping in the transformation of ED4I is x' = k·x for integer numbers, where k determines the fault detection probability and data integrity of the system. For floating-point numbers, we find a value k_f for the fraction and k_e for the exponent separately, and use k = k_f × 2^(k_e) as the value of k. We demonstrate how to choose an optimal value of k for the transformation. This paper shows that, for integer programs, the transformation with k = -2 was the most desirable choice in six out of seven benchmark programs we simulated: it maximizes the fault detection probability under the condition that the data integrity is highest.
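
A toy rendering of the ED4I invariant (deliberately simplified; the real transformation is a compiler-level rewrite of the whole program, including handling of overflow and of other operation types) is shown below: every value x is carried as x' = k·x in the diverse copy, additions are unchanged, multiplications are divided by k, and any mismatch between the copy's result and k times the original result flags a fault.

    K = -2   # reported as the most desirable choice for most integer benchmarks

    def add_t(xk, yk):
        return xk + yk               # (k*x) + (k*y) == k*(x + y)

    def mul_t(xk, yk):
        return (xk * yk) // K        # (k*x)*(k*y)/k == k*(x*y), exact for integers

    def checked_eval(a, b):
        r = a * b + a                                   # original program
        rk = add_t(mul_t(K * a, K * b), K * a)          # diverse copy on k-scaled data
        if rk != K * r:                                 # mismatch: hardware fault detected
            raise RuntimeError("ED4I check failed: error detected")
        return r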

270 citations


Journal Article•DOI•
TL;DR: This work presents a novel scheduling framework in which tasks are treated as springs with given elastic coefficients to better conform to the actual load conditions, and under this model, periodic tasks can intentionally change their execution rate to provide different quality of service.
Abstract: An increasing number of real-time applications related to multimedia and adaptive control systems require greater flexibility than classical real-time theory usually permits. We present a novel scheduling framework in which tasks are treated as springs with given elastic coefficients to better conform to the actual load conditions. Under this model, periodic tasks can intentionally change their execution rate to provide different quality of service and the other tasks can automatically adapt their periods to keep the system underloaded. The proposed model can also be used to handle overload conditions in a more flexible way and to provide a simple and efficient mechanism for controlling a system's performance as a function of the current load.
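
The compression step of the elastic model can be illustrated with a simplified utilization calculation (this sketch ignores the minimum-period constraints and the iterative redistribution that the full algorithm performs): when the nominal utilizations exceed the desired bound, each task yields utilization in proportion to its elastic coefficient, much like springs sharing a compression.

    def compress(tasks, u_desired):
        # tasks: list of dicts with computation time 'C', nominal period 'T0',
        # and elastic coefficient 'E' (E == 0 means the task is rigid).
        u_nominal = sum(t['C'] / t['T0'] for t in tasks)
        overload = u_nominal - u_desired
        if overload <= 0:
            return [t['T0'] for t in tasks]         # no compression needed
        e_total = sum(t['E'] for t in tasks)        # assumed nonzero here
        new_periods = []
        for t in tasks:
            u = t['C'] / t['T0'] - overload * t['E'] / e_total
            new_periods.append(t['C'] / u)          # compressed (longer) period
        return new_periods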

270 citations


Journal Article•DOI•
TL;DR: This paper shows how to combine push and pull-based techniques to achieve the best features of both approaches and demonstrates that such adaptive data dissemination is essential to meet diverse temporal coherency requirements, to be resilient to failures, and for the efficient and scalable utilization of server and network resources.
Abstract: An important issue in the dissemination of time-varying Web data such as sports scores and stock prices is the maintenance of temporal coherency. In the case of servers adhering to the HTTP protocol, clients need to frequently pull the data based on the dynamics of the data and a user's coherency requirements. In contrast, servers that possess push capability maintain state information pertaining to clients and push only those changes that are of interest to a user. These two canonical techniques have complementary properties with respect to the level of temporal coherency maintained, communication overheads, state space overheads, and loss of coherency due to (server) failures. In this paper, we show how to combine push and pull-based techniques to achieve the best features of both approaches. Our combined technique tailors the dissemination of data from servers to clients based on 1) the capabilities and load at servers and proxies and 2) clients' coherency requirements. Our experimental results demonstrate that such adaptive data dissemination is essential to meet diverse temporal coherency requirements, to be resilient to failures, and for the efficient and scalable utilization of server and network resources.
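
The pull half of such a combined scheme is often realized as an adaptive time-to-refresh (TTR) computation; the sketch below (assumed constants and a scalar data item, not the paper's exact algorithm) shrinks the polling interval when the item changes by more than the user's coherency requirement c and backs off when it is stable.

    def next_ttr(ttr, last_value, new_value, coherency_c,
                 ttr_min=1.0, ttr_max=300.0, shrink=0.5, grow=1.2):
        if abs(new_value - last_value) > coherency_c:
            ttr *= shrink    # changed more than the user tolerates: poll sooner
        else:
            ttr *= grow      # stable: poll less often
        return max(ttr_min, min(ttr_max, ttr))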

194 citations


Journal Article•DOI•
Huapeng Wu
TL;DR: This article presents an upper complexity bound for modular polynomial reduction and an analytical form for the bit-parallel squaring operation, arguing that computing the multiplicative inverse using a polynomial basis can be at least as good as using a normal basis.
Abstract: Bit-parallel finite field multiplication using a polynomial basis can be realized in two steps: polynomial multiplication and reduction modulo the irreducible polynomial. In this article, we present an upper complexity bound for the modular polynomial reduction. When the field is generated with an irreducible trinomial, closed-form expressions for the coefficients of the product are derived in terms of the coefficients of the multiplicands. The complexity of the multiplier architectures and their critical path lengths are evaluated, and they are comparable to previous proposals for the same class of fields. An analytical form for the bit-parallel squaring operation is also presented. The complexities of the bit-parallel squarer are also derived when an irreducible trinomial is used. Consequently, it is argued that computing the multiplicative inverse using a polynomial basis can be at least as good as using a normal basis.
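
The two steps (polynomial multiplication, then reduction modulo the trinomial) can be modeled in software as below; polynomials over GF(2) are held as integer bit masks, and the example is only meant to show the arithmetic, not the bit-parallel hardware structure whose complexity the paper analyzes.

    def gf2m_mul(a, b, m, k):
        # Multiply a and b in GF(2^m) generated by x^m + x^k + 1.
        # Step 1: carry-less polynomial multiplication.
        p = 0
        while b:
            if b & 1:
                p ^= a
            a <<= 1
            b >>= 1
        # Step 2: reduce modulo x^m + x^k + 1 by replacing x^i (i >= m)
        # with x^(i-m+k) + x^(i-m).
        for i in range(2 * m - 2, m - 1, -1):
            if p & (1 << i):
                p ^= (1 << i) | (1 << (i - m + k)) | (1 << (i - m))
        return p

    # Example: gf2m_mul(0b1011, 0b0110, 7, 1) multiplies two elements of
    # GF(2^7) generated by the irreducible trinomial x^7 + x + 1.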

189 citations


Journal Article•DOI•
TL;DR: A new performance criterion is introduced, called caching efficiency, and a generic method for location-dependent cache invalidation strategies is proposed, and two cache replacement policies, PA and PAID, are proposed.
Abstract: Mobile location-dependent information services (LDISs) have become increasingly popular in recent years. However, data caching strategies for LDISs have thus far received little attention. In this paper, we study the issues of cache invalidation and cache replacement for location-dependent data under a geometric location model. We introduce a new performance criterion, called caching efficiency, and propose a generic method for location-dependent cache invalidation strategies. In addition, two cache replacement policies, PA and PAID, are proposed. Unlike the conventional replacement policies, PA and PAID take into consideration the valid scope area of a data value. We conduct a series of simulation experiments to study the performance of the proposed caching schemes. The experimental results show that the proposed location-dependent invalidation scheme is very effective and the PA and PAID policies significantly outperform the conventional replacement policies.
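
The flavor of the two cost functions can be conveyed with a small sketch (assumed field names and Euclidean distance; the exact definitions of PA and PAID are in the paper): items combining a low access probability, a small valid scope area, and, for PAID, a valid scope far from the client are preferred eviction victims.

    def _distance(p, q):
        return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

    def pa_cost(item):
        return item['access_prob'] * item['scope_area']

    def paid_cost(item, client_pos):
        d = max(_distance(client_pos, item['scope_center']), 1e-9)
        return item['access_prob'] * item['scope_area'] / d

    def choose_victim(cache_items, client_pos, policy='PAID'):
        key = (lambda it: paid_cost(it, client_pos)) if policy == 'PAID' else pa_cost
        return min(cache_items, key=key)    # evict the lowest-cost item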

172 citations


Journal Article•DOI•
TL;DR: A prototype environment, called MAFALDA (Microkernel Assessment by Fault injection AnaLysis and Design Aid), that is aimed at providing objective failure data on a candidate microkernel and also improving its error detection capabilities is described.
Abstract: The commercial offer concerning microkernel technology constitutes an attractive alternative for developing operating systems to suit a wide range of application domains. However, the integration of commercial off-the-shelf (COTS) microkernels into critical embedded computer systems is a problem for system developers, in particular due to the lack of objective data concerning their behavior in the presence of faults. This paper addresses this issue by describing a prototype environment, called MAFALDA (Microkernel Assessment by Fault injection AnaLysis and Design Aid), that is aimed at providing objective failure data on a candidate microkernel and also at improving its error detection capabilities. The paper first presents the overall architecture of MAFALDA. Then, a case study carried out on an instance of the Chorus microkernel is used to illustrate the benefits that can be obtained with MAFALDA, both from the dependability-assessment and design-aid viewpoints. Implementation issues are also addressed that account for the specific API of the target microkernel. Some overall insights and lessons learned, gained during the various studies conducted on both Chorus and another target microkernel (LynxOS), are then presented and discussed. Finally, we conclude the paper by summarizing the main features of the work presented and by identifying future research.

Journal Article•DOI•
TL;DR: This paper proposes an architectural construct and programming model which assumes the existence of a component that is capable of executing timely functions, however asynchronous the rest of the system may be, and uses it to build dependable and timely applications exhibiting varying degrees of timeliness assurance.
Abstract: Current systems are very often based on large-scale, unpredictable and unreliable infrastructures. However, users of these systems increasingly require services with timeliness properties. This creates a difficult-to-solve contradiction with regard to the adequate time model: should it be synchronous, or asynchronous? In this paper, we propose an architectural construct and programming model which address this problem. We assume the existence of a component that is capable of executing timely functions, however asynchronous the rest of the system may be. We call this component the "timely computing base", and it can be used by the other components to execute a set of simple but crucial time-related services. We also show how to use it to build dependable and timely applications exhibiting varying degrees of timeliness assurance, under several synchrony models.

Journal Article•DOI•
TL;DR: This work presents an online adaptive DPM scheme for systems that can be modeled as finite-state Markov chains and introduces two workload learning techniques based on sliding windows and a two-dimensional interpolation technique to obtain adaptive policies from a precomputed look-up table of optimum stationary policies.
Abstract: Dynamic power management (DPM) is a design methodology aimed at reducing power consumption of electronic systems by performing selective shutdown of idle system resources. The effectiveness of a power management scheme depends critically on accurate modeling of service requests and on computation of the control policy. In this work, we present an online adaptive DPM scheme for systems that can be modeled as finite-state Markov chains. Online adaptation is required to deal with initially unknown or nonstationary workloads, which are very common in real-life systems. Our approach starts from exact policy optimization techniques for a known and stationary stochastic environment and extends optimum stationary control policies to handle unknown and nonstationary stochastic environments in practical applications. We introduce two workload learning techniques based on sliding windows and study their properties. Furthermore, a two-dimensional interpolation technique is introduced to obtain adaptive policies from a precomputed look-up table of optimum stationary policies. The effectiveness of our approach is demonstrated by a complete DPM implementation on a laptop computer with a power-manageable hard disk that compares very favorably with existing DPM schemes.
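
A minimal sketch of the adaptation loop (assumed window length and a one-dimensional stand-in for the paper's two-dimensional interpolation) is given below: the sliding window estimates the current request rate, and the rate selects, by interpolation, a policy parameter from a table of precomputed optimum stationary policies.

    from collections import deque

    class RateEstimator:
        def __init__(self, window=50):          # window length is an assumption
            self.times = deque(maxlen=window)

        def observe(self, arrival_time):
            self.times.append(arrival_time)

        def rate(self):
            if len(self.times) < 2:
                return 0.0
            span = self.times[-1] - self.times[0]
            return (len(self.times) - 1) / span if span > 0 else 0.0

    def interpolate_policy(rate, table):
        # table: sorted (rate, policy_parameter) pairs precomputed offline.
        if rate <= table[0][0]:
            return table[0][1]
        if rate >= table[-1][0]:
            return table[-1][1]
        for (r0, p0), (r1, p1) in zip(table, table[1:]):
            if r0 <= rate <= r1:
                w = (rate - r0) / (r1 - r0)
                return p0 + w * (p1 - p0)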

Journal Article•DOI•
TL;DR: It is shown that this type of multiplier contains redundancy not only in that special class of finite fields but also in fields GF(2^m) defined by any irreducible polynomial, and a new architecture for the normal basis parallel multiplier is proposed that is applicable to any finite field and has significantly lower circuit complexity than the original Massey-Omura normal basis parallel multiplier.
Abstract: The Massey-Omura multiplier of GF(2^m) uses a normal basis and its bit-parallel version is usually implemented using m identical combinational logic blocks whose inputs are cyclically shifted from one another. In the past, it was shown that, for a class of finite fields defined by irreducible all-one polynomials, the parallel Massey-Omura multiplier had redundancy and a modified architecture of lower circuit complexity was proposed. In this article, it is shown that, not only does this type of multiplier contain redundancy in that special class of finite fields, but it also has redundancy in fields GF(2^m) defined by any irreducible polynomial. By removing the redundancy, we propose a new architecture for the normal basis parallel multiplier, which is applicable to any arbitrary finite field and has significantly lower circuit complexity compared to the original Massey-Omura normal basis parallel multiplier. The proposed multiplier structure is also modular and, hence, suitable for VLSI realization. When applied to fields defined by the irreducible all-one polynomials, the multiplier's circuit complexity matches the best result available in the open literature.

Journal Article•DOI•
TL;DR: In this paper, the authors present two new design methodologies for modulo 2^n + 1 addition in the diminished-one number system; the first leads to carry look-ahead adder implementations, whereas the second leads to parallel-prefix adder implementations.
Abstract: This paper presents two new design methodologies for modulo 2^n + 1 addition in the diminished-one number system. The first design methodology leads to carry look-ahead adder implementations, whereas the second leads to parallel-prefix adder implementations. VLSI realizations of the proposed circuits in a standard-cell technology are utilized for quantitative comparisons against the existing solutions. Our results indicate that the proposed carry look-ahead adders are area- and time-efficient for small values of n, while for the remaining values of n the proposed parallel-prefix adders are considerably faster than any others known in the open literature.
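
A behavioral model of diminished-one addition makes the arithmetic concrete (operands or sums congruent to zero modulo 2^n + 1 need the special handling that the hardware provides and are omitted here): a nonzero value A in [1, 2^n] is stored as d(A) = A - 1, and the sum is an ordinary n-bit addition plus the complemented end-around carry.

    def dim1_add(da, db, n):
        # da, db: diminished-one representations d(A) = A - 1, d(B) = B - 1.
        mask = (1 << n) - 1
        t = da + db
        carry = t >> n                    # carry out of the n-bit addition
        return (t + (1 - carry)) & mask   # add back the complement of the carry

    # Example with n = 4 (modulo 17): 5 + 7 = 12,
    # and dim1_add(d(5), d(7), 4) == dim1_add(4, 6, 4) == 11 == d(12).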

Journal Article•DOI•
TL;DR: This paper shows how to minimize the average response time over multiple broadcast channels by optimally partitioning data among them, and offers an approximation algorithm that is less complex than the optimal algorithm and whose performance is near-optimal for a wide range of parameters.
Abstract: Broadcast is a scalable way of disseminating data because broadcasting an item satisfies all outstanding client requests for it. However, because the transmission medium is shared, individual requests may have high response times. In this paper, we show how to minimize the average response time given multiple broadcast channels by optimally partitioning data among them. We also offer an approximation algorithm that is less complex than the optimal one and show that its performance is near-optimal for a wide range of parameters. Finally, we briefly discuss the extensibility of our work with two simple, yet seldom researched extensions, namely, handling varying-sized items and generating single-channel schedules.
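
The objective being minimized can be stated with a back-of-the-envelope model (uniform-length items and flat, cyclic per-channel schedules are simplifying assumptions): a request for an item carried on a channel waits, on average, half of that channel's cycle, so the mean response time is the probability-weighted sum of half-cycle lengths over the partition.

    def avg_response_time(partition, access_prob):
        # partition: list of lists of item ids, one inner list per channel.
        total = 0.0
        for items in partition:
            cycle = len(items)                        # cycle length in item slots
            total += sum(access_prob[i] for i in items) * cycle / 2.0
        return total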

Journal Article•DOI•
TL;DR: This paper proposes a proactive cache management scheme that not only improves the cache hit ratio, the throughput, and the bandwidth utilization, but also reduces the query delay and the power consumption.
Abstract: Recent work has shown that invalidation report (IR)-based cache management is an attractive approach for mobile environments. However, the IR-based cache invalidation solution has some limitations, such as long query delays and low bandwidth utilization, and it is not suitable for applications where data change frequently. In this paper, we propose a proactive cache management scheme to address these issues. Instead of passively waiting, the clients intelligently prefetch the data that are most likely to be used in the future. Based on a novel prefetch-access ratio concept, the proposed scheme can dynamically optimize performance or power based on the available resources and the performance requirements. To deal with frequently updated data, different techniques (indexing and caching) are applied to handle different components of the data based on their update frequency. Detailed simulation experiments are carried out to evaluate the proposed methodology. Compared to previous schemes, our solution not only improves the cache hit ratio, the throughput, and the bandwidth utilization, but also reduces the query delay and the power consumption.
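
One plausible reading of the prefetch-access ratio idea is sketched below (the threshold, bookkeeping, and exact definition used in the paper differ; this is only meant to show how such a ratio can gate prefetching to trade power against latency): if too many prefetched items go unused, prefetching is throttled; if most are used, the client keeps prefetching aggressively.

    PAR_LIMIT = 3.0   # assumed: tolerate up to 3 prefetches per actual access

    class PrefetchController:
        def __init__(self):
            self.prefetched = 0
            self.accessed = 0

        def should_prefetch(self):
            if self.accessed == 0:
                return self.prefetched < PAR_LIMIT
            return self.prefetched / self.accessed < PAR_LIMIT

        def on_prefetch(self):
            self.prefetched += 1

        def on_access_of_prefetched_item(self):
            self.accessed += 1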

Journal Article•DOI•
TL;DR: A novel caching scheme that integrates both object placement and replacement policies and which makes caching decisions on all candidate sites in a coordinated fashion is proposed.
Abstract: Web caching is an important technique for reducing Internet access latency, network traffic, and server load. This paper investigates cache management strategies for the en-route web caching environment, where caches are associated with routing nodes in the network. We propose a novel caching scheme that integrates both object placement and replacement policies and which makes caching decisions on all candidate sites in a coordinated fashion. In our scheme, cache status information along the routing path of a request is used in dynamically determining where to cache the requested object and what to replace if there is not enough space. The object placement problem is formulated as an optimization problem and the optimal locations to cache the object are obtained using a low-cost dynamic programming algorithm. Extensive simulation experiments have been performed to evaluate the proposed scheme in terms of a wide range of performance metrics. The results show that the proposed scheme significantly outperforms existing algorithms which consider either object placement or replacement at individual caches only.

Journal Article•DOI•
TL;DR: Two architectures and VLSI implementations of the AES proposal, Rijndael, are presented; both architectures support the encryption and decryption processes, reduce the required hardware resources, and achieve high-speed performance.
Abstract: Two architectures and VLSI implementations of the AES proposal, Rijndael, are presented in this paper. Both architectures support the encryption and decryption processes, reduce the required hardware resources, and achieve high-speed performance, yet their design philosophies are completely different. The first uses feedback logic and reaches a throughput of 259 Mbit/sec; it performs efficiently in applications with limited area resources. The second architecture is optimized for high-speed performance using a pipelined technique; its throughput can reach 3.65 Gbit/sec.

Journal Article•DOI•
TL;DR: This work presents a transaction commit protocol based on a "timeout" approach for Mobile Database Systems (MDS), which can be universally used to reach a final transaction termination decision in any message-oriented system.
Abstract: We present a transaction commit protocol, "Transaction Commit On Timeout (TCOT)," based on a "timeout" approach for Mobile Database Systems (MDS), which can be universally used to reach a final transaction termination decision (e.g., commit or abort) in any message-oriented system. Particularly suited to a wireless environment, a timeout mechanism is the only way to minimize the impact of the slow and unreliable wireless link. We compare TCOT to a modified version of 2PC to show its superiority based on commit time.

Journal Article•DOI•
TL;DR: A mathematical model is developed to predict performance for nondedicated network computing and it separates the influence of machine utilization, sequential job service rate, and parallel task allocation on the parallel completion time.
Abstract: The low cost and wide availability of networks of workstations have made them an attractive solution for high performance computing. However, while a network of workstations may be readily available, these workstations may be privately owned and the owners may not want others to interrupt their priority in using the computer. Assuming machine owners have a preemptive priority, in this paper, we study the parallel processing capacity of a privately owned network of workstations. A mathematical model is developed to predict performance for nondedicated network computing. It also considers systems with heterogeneous machine utilization and heterogeneous service distribution. This model separates the influence of machine utilization, sequential job service rate, and parallel task allocation on the parallel completion time. It is simple and valuable for guiding task scheduling in a nondedicated environment.

Journal Article•DOI•
TL;DR: This article presents simple and highly regular architectures for finite field multipliers using a redundant representation, which provide area-time trade-offs that enable the multipliers to be implemented in a partial-parallel (hybrid) fashion.
Abstract: This article presents simple and highly regular architectures for finite field multipliers using a redundant representation. The basic idea is to embed a finite field into a cyclotomic ring which is based on the elegant multiplicative structure of a cyclic group. One important feature of our architectures is that they provide area-time trade-offs which enable us to implement the multipliers in a partial-parallel/hybrid fashion. This hybrid architecture has great significance in its VLSI implementation in very large fields. The squaring operation using the redundant representation is simply a permutation of the coordinates. It is shown that, when there is an optimal normal basis, the proposed bit-serial and hybrid multiplier architectures have very low space complexity. Constant multiplication is also considered and is shown to have an advantage in using the redundant representation.

Journal Article•DOI•
TL;DR: A new method employs a second-degree minimax polynomial approximation to obtain an accurate initial estimate of the reciprocal and the inverse square root values, and then performs a modified Goldschmidt iteration, significantly reducing the latency of the algorithm.
Abstract: A new method for the high-speed computation of double-precision floating-point reciprocal, division, square root, and inverse square root operations is presented in this paper. This method employs a second-degree minimax polynomial approximation to obtain an accurate initial estimate of the reciprocal and the inverse square root values, and then performs a modified Goldschmidt iteration. The high accuracy of the initial approximation allows us to obtain double-precision results by computing a single Goldschmidt iteration, significantly reducing the latency of the algorithm. Two unfolded architectures are proposed: the first one computing only reciprocal and division operations, and the second one also including the computation of square root and inverse square root. The execution times and area costs for both architectures are estimated, and a comparison with other multiplicative-based methods is presented. The results of this comparison show the achievement of a lower latency than these methods, with similar hardware requirements.
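
The flow of the method can be mimicked with a floating-point model (Python floats stand in for the fixed-point data path, and any rough initial estimate can be supplied in place of the paper's degree-2 minimax polynomial): scale numerator and denominator by the reciprocal estimate, then apply one Goldschmidt correction.

    def goldschmidt_divide(a, b, recip_estimate):
        y = recip_estimate(b)        # initial approximation of 1/b
        n, d = a * y, b * y          # scale so that the denominator approaches 1
        r = 2.0 - d                  # Goldschmidt correction factor
        return n * r                 # one iteration; more would also update d and r

    # Example: goldschmidt_divide(1.0, 3.0, lambda b: 0.3) returns about 0.33;
    # the relative error roughly squares with each iteration, which is why an
    # accurate initial estimate lets a single iteration reach double precision.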

Journal Article•DOI•
TL;DR: A new modular adder design is introduced, based on utilizing concepts developed to realize binary-based adders, that requires less area and time delay than other similar ones.
Abstract: A modular adder is a very instrumental arithmetic component in implementing online residue-based computations for many digital signal processing applications. It is also a basic component in realizing modular multipliers and residue-to-binary converters. Thus, the design of a high-speed and reduced-area modular adder is an important issue. In this paper, we introduce a new modular adder design. It is based on utilizing concepts developed to realize binary-based adders. VLSI layout implementations and comparative analysis showed that the hardware requirements and the time delay of the new proposed structure are significantly less than those of other reported designs. A new modulo (2^n + 1) adder is also presented; compared with other similar designs, this specific modular adder requires less area and time delay.
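
For reference, a common way to build a modular adder from ordinary binary adders is to compute the plain sum and the sum with the modulus pre-subtracted in parallel and select the one that lands in range; the sketch below shows that selection behaviorally (it is a generic technique, not necessarily the exact structure proposed in the paper).

    def mod_add(a, b, m):
        # Assumes 0 <= a, b < m.
        s_plain = a + b          # output of the first binary adder
        s_minus = a + b - m      # output of the second adder (modulus pre-subtracted)
        return s_minus if s_minus >= 0 else s_plain   # multiplexer selection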

Journal Article•DOI•
TL;DR: An architecture is designed for supporting documents that can dynamically select their optimal replication strategy, its feasibility is evaluated, and it is shown that optimal per-document assignments clearly outperform any global strategy.
Abstract: To improve the scalability of the Web, it is common practice to apply caching and replication techniques. Numerous strategies for placing and maintaining multiple copies of Web documents at several sites have been proposed. These approaches essentially apply a global strategy by which a single family of protocols is used to choose replication sites and keep copies mutually consistent. We propose a more flexible approach by allowing each distributed document to have its own associated strategy. We propose a method for assigning an optimal strategy to each document separately and prove that it generates a family of optimal results. Using trace-based simulations, we show that optimal assignments clearly outperform any global strategy. We have designed an architecture for supporting documents that can dynamically select their optimal strategy and evaluate its feasibility.

Journal Article•DOI•
TL;DR: This paper proposes a provably secure fault-tolerant conference-key agreement protocol under the authenticated broadcast channel model and shows that even if the broadcast channel is not authenticated, the protocol is secure against impersonators under the random oracle model.
Abstract: When a group of people want to communicate securely over an open network, they run a conference-key protocol to establish a common conference key K such that all their communications thereafter are encrypted with the key K. In this paper, we propose a provably secure fault-tolerant conference-key agreement protocol under the authenticated broadcast channel model. We show that a passive adversary gets zero knowledge about the conference key established by the honest participants under the assumption of a variant Diffie-Hellman (1976) decision problem. We also show that the honest participants can agree on a common conference key no matter how many participants are malicious. Furthermore, we show that even if the broadcast channel is not authenticated, our protocol is secure against impersonators under the random oracle model.

Journal Article•DOI•
TL;DR: By using simple coordinate calculation and spatial subtraction, the proposed scheme reduces the search space drastically and, hence, can locate a free submesh very quickly and is called the stack-based allocation (SBA) algorithm.
Abstract: Efficient processor allocation is crucial for obtaining high performance in space-shared parallel computers. A good processor allocation algorithm should find available processors for incoming jobs, if they exist, with minimum overhead. In this paper, we propose such a fast and efficient processor allocation scheme for mesh-connected multicomputers. By using simple coordinate calculation and spatial subtraction, the proposed scheme reduces the search space drastically and, hence, can locate a free submesh very quickly. The algorithm is implemented efficiently using a stack and therefore is called the stack-based allocation (SBA) algorithm. Extensive simulation reveals that our scheme incurs much less allocation overhead than all of the existing allocation algorithms, while delivering competitive performance.

Journal Article•DOI•
TL;DR: This paper provides a framework and a fault-injection methodology for mapping an anomaly detector's effective operating space and shows that two detectors, each designed to detect the same phenomenon, may not perform similarly, even when the event to be detected is unequivocally anomalous and should be detected by either detector.
Abstract: By employing fault tolerance, embedded systems can withstand both intentional and unintentional faults. Many fault tolerance mechanisms are invoked only after a fault has been detected by whatever fault-detection mechanism is used; hence, the process of fault detection must itself be dependable if the system is expected to be fault-tolerant. Many faults are detectable only indirectly as a result of performance disorders that manifest as anomalies in monitored system or sensor data. Anomaly detection, therefore, is often the primary means of providing early indications of faults. As with any other kind of detector, one seeks full coverage of the detection space with the anomaly detector being used. Even if coverage of a particular anomaly detector falls short of 100%, detectors can be composed to effect broader coverage, once their respective sweet spots and blind regions are known. This paper provides a framework and a fault-injection methodology for mapping an anomaly detector's effective operating space and shows that two detectors, each designed to detect the same phenomenon, may not perform similarly, even when the event to be detected is unequivocally anomalous and should be detected by either detector. Both synthetic and real-world data are used.
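
The mapping methodology reduces, in outline, to a parameter sweep over injected anomalies; the interfaces below (an injector parameterized by amplitude and duration, and a detector returning a boolean) are hypothetical stand-ins used only to show the shape of the procedure.

    def map_operating_space(detector, inject, amplitudes, durations):
        # Returns {(amplitude, duration): detected}, a map of the detector's
        # operating space that exposes its sweet spots and blind regions.
        coverage = {}
        for a in amplitudes:
            for d in durations:
                trace = inject(amplitude=a, duration=d)   # data with a known, injected anomaly
                coverage[(a, d)] = detector(trace)
        return coverage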

Journal Article•DOI•
TL;DR: It is shown that the proposed improvement further reduces the length of checking sequences produced from minimal, completely specified, and deterministic finite state machines.
Abstract: Here, the method proposed by Ural, Wu and Zhang (1997) for constructing minimal-length checking sequences based on distinguishing sequences is improved. The improvement is based on optimizations of the state recognition sequences and their use in constructing test segments. It is shown that the proposed improvement further reduces the length of checking sequences produced from minimal, completely specified, and deterministic finite state machines.

Journal Article•DOI•
TL;DR: A new steganography scheme is proposed that maintains higher quality of the host image after data hiding by sacrificing some data hiding space, while still offering a good data hiding ratio.
Abstract: In an earlier paper (Chen et al., 2000), we proposed a steganography scheme for hiding a piece of critical information in a host binary image. That scheme ensures that, in each m × n image block of the host image, as many as [log2(mn + 1)] bits can be hidden in the block by changing at most 2 bits in the block. We propose a new scheme that improves (Chen et al., 2000) in its capability to maintain higher quality of the host image after data hiding by sacrificing some data hiding space. The new scheme can still offer a good data hiding ratio. It ensures that, for any bit that is modified in the host image, the bit is adjacent to another bit which has a value equal to the former's new value. Thus, the hiding effect is quite invisible.