
Showing papers on "Overhead (computing)" published in 2009


Journal ArticleDOI
TL;DR: This paper focuses on a MAC-layer protocol stack solution, the Adaptive S-MAC protocol designed for sensor networks, that minimizes the energy consumption and delay required to transmit packets across the network.
Abstract: Sensor networks are deployed in remote locations with limited processor capabilities, memory capacities, and battery supplies. Wireless sensor networks (WSNs) detect environmental information with sensors in remote settings. One problem facing WSNs is the inability to resupply power to these energy-constrained devices due to their remoteness. Therefore, to extend a WSN's effectiveness, the lifetime of the network must be increased by making it as energy efficient as possible. An energy-efficient medium access control (MAC) protocol can boost a WSN's lifetime. This paper focuses on a MAC-layer protocol stack solution that minimizes the energy consumption and delay required to transmit packets across the network. The proposed protocol, Adaptive S-MAC, is based on Sensor Medium Access Control (S-MAC) and is designed for sensor networks. It enables low-duty-cycle operation in a multi-hop network and common sleep schedules to reduce control overhead and enable traffic-adaptive wakeup. To reduce control overhead and latency, it introduces coordinated sleeping among neighboring nodes. It is a contention-based protocol built on the CSMA/CA mechanism. The protocol is simulated in NS-2 and its performance is evaluated using various topologies under various traffic conditions. In addition, we improve the energy efficiency of Adaptive S-MAC with a new design called the Adaptive Cross MAC protocol.
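To illustrate why low-duty-cycle MAC operation matters for lifetime, here is a small back-of-the-envelope sketch in Python; the listen/sleep periods and power draws are assumed values for illustration, not figures from the paper.

```python
# Illustrative sketch (assumed numbers, not from the paper): how an
# S-MAC-style periodic listen/sleep schedule cuts average radio energy.

def radio_energy(listen_ms, sleep_ms, p_listen_mw=60.0, p_sleep_mw=0.03,
                 duration_s=3600):
    """Return (duty cycle, joules spent by the radio over duration_s)."""
    duty_cycle = listen_ms / (listen_ms + sleep_ms)
    avg_power_mw = duty_cycle * p_listen_mw + (1 - duty_cycle) * p_sleep_mw
    return duty_cycle, avg_power_mw * duration_s / 1000.0   # mW * s -> J

for listen, sleep in [(115, 0), (115, 1035), (115, 11385)]:  # ~100%, 10%, 1%
    dc, joules = radio_energy(listen, sleep)
    print(f"duty cycle {dc:6.1%} -> {joules:8.1f} J per hour")
```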

797 citations


Journal ArticleDOI
TL;DR: A novel algorithm called SCALE is derived that provides a significant performance improvement over the existing iterative water-filling (IWF) algorithm in multiuser DSL networks, with comparably low complexity.
Abstract: Dynamic spectrum management of digital subscriber lines (DSLs) has the potential to dramatically increase the capacity of the aging last-mile copper access network. This paper takes an important step toward fulfilling this potential through power spectrum balancing. We derive a novel algorithm, called SCALE, that provides a significant performance improvement over the existing iterative water-filling (IWF) algorithm in multiuser DSL networks, doing so with comparably low complexity. The algorithm is easily distributed through measurement and limited message passing with the use of a spectrum management center. We outline how overhead can be managed, and show that in the limit of zero message-passing, performance reduces to IWF.
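For background, the per-user step that IWF repeats is the classic water-filling power allocation across tones. The sketch below (illustrative channel values; this is the IWF building block, not the SCALE algorithm itself) finds the water level by bisection.

```python
# Minimal water-filling sketch: allocate a power budget across tones with
# per-tone noise-to-gain ratios inv_gain[k]; p_k = max(mu - inv_gain[k], 0).
# Illustrative only -- this is the baseline IWF step, not SCALE.
import numpy as np

def waterfill(inv_gain, p_total, iters=100):
    lo, hi = 0.0, p_total + float(max(inv_gain))   # bracket the water level mu
    for _ in range(iters):
        mu = (lo + hi) / 2
        p = np.maximum(mu - inv_gain, 0.0)
        if p.sum() < p_total:
            lo = mu                                 # water level too low
        else:
            hi = mu
    return np.maximum((lo + hi) / 2 - inv_gain, 0.0)

inv_gain = np.array([0.5, 1.0, 2.0, 4.0])           # assumed noise/|h|^2 per tone
p = waterfill(inv_gain, p_total=4.0)
print(p.round(3), "rate =",
      round(float(np.log2(1 + p / inv_gain).sum()), 2), "bits/symbol")
```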

382 citations


Proceedings ArticleDOI
13 Nov 2009
TL;DR: This paper proposes to encrypt every data block with a different key so that flexible cryptography-based access control can be achieved, investigates the overhead and safety of the proposed approach, and studies mechanisms to improve data access efficiency.
Abstract: Providing secure and efficient access to large scale outsourced data is an important component of cloud computing. In this paper, we propose a mechanism to solve this problem in owner-write-users-read applications. We propose to encrypt every data block with a different key so that flexible cryptography-based access control can be achieved. Through the adoption of key derivation methods, the owner needs to maintain only a few secrets. Analysis shows that the key derivation procedure using hash functions will introduce very limited computation overhead. We propose to use over-encryption and/or lazy revocation to prevent revoked users from getting access to updated data blocks. We design mechanisms to handle both updates to outsourced data and changes in user access rights. We investigate the overhead and safety of the proposed approach, and study mechanisms to improve data access efficiency.
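To make the key-derivation idea concrete, here is a minimal sketch, with assumptions standing in for the paper's exact construction: per-block keys are derived from one master secret with an HMAC, and the "cipher" is a toy hash-keystream XOR only so the example stays dependency-free (a real deployment would use a standard block cipher such as AES).

```python
# Sketch only: derive a distinct key per data block from a single master
# secret, so the owner maintains just one secret. The XOR "cipher" below is a
# stand-in for illustration, not a recommendation.
import hashlib, hmac, os

MASTER_SECRET = os.urandom(32)                  # the only secret the owner keeps

def block_key(block_id: int) -> bytes:
    """Per-block key K_i = HMAC-SHA256(master_secret, block_id)."""
    return hmac.new(MASTER_SECRET, str(block_id).encode(), hashlib.sha256).digest()

def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """Toy keystream XOR (encrypt == decrypt); illustration only."""
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))

block = b"outsourced data block #7"
ciphertext = toy_encrypt(block_key(7), block)
assert toy_encrypt(block_key(7), ciphertext) == block
print(ciphertext.hex())
```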

321 citations


Journal ArticleDOI
TL;DR: MobEyes is described, which is an effective middleware that was specifically designed for proactive urban monitoring and exploits node mobility to opportunistically diffuse sensed data summaries among neighbor vehicles and to create a low-cost index to query monitoring data.
Abstract: Recent advances in vehicular communications make it possible to realize vehicular sensor networks, i.e., collaborative environments where mobile vehicles that are equipped with sensors of different nature (from toxic detectors to still/video cameras) interwork to implement monitoring applications. In particular, there is an increasing interest in proactive urban monitoring, where vehicles continuously sense events from urban streets, autonomously process sensed data (e.g., recognizing license plates), and, possibly, route messages to vehicles in their vicinity to achieve a common goal (e.g., to allow police agents to track the movements of specified cars). This challenging environment requires novel solutions with respect to those of more-traditional wireless sensor nodes. In fact, unlike conventional sensor nodes, vehicles exhibit constrained mobility, have no strict limits on processing power and storage capabilities, and host sensors that may generate sheer amounts of data, thus making already-known solutions for sensor network data reporting inapplicable. This paper describes MobEyes, which is an effective middleware that was specifically designed for proactive urban monitoring and exploits node mobility to opportunistically diffuse sensed data summaries among neighbor vehicles and to create a low-cost index to query monitoring data. We have thoroughly validated the original MobEyes protocols and demonstrated their effectiveness in terms of indexing completeness, harvesting time, and overhead. In particular, this paper includes (1) analytic models for MobEyes protocol performance and their consistency with simulation-based results, (2) evaluation of performance as a function of vehicle mobility, (3) effects of concurrent exploitation of multiple harvesting agents with single/multihop communications, (4) evaluation of network overhead and overall system stability, and (5) performance validation of MobEyes in a challenging urban tracking application where the police reconstruct the movements of a suspicious driver, e.g., by specifying the license number of a car.

250 citations


Journal ArticleDOI
TL;DR: The current state of the art in security mechanisms for WSNs is discussed and various types of attacks are discussed and their countermeasures presented.
Abstract: Wireless sensor networks (WSNs) have recently attracted a lot of interest in the research community due to their wide range of applications. Due to the distributed nature of these networks and their deployment in remote areas, these networks are vulnerable to numerous security threats that can adversely affect their proper functioning. This problem is more critical if the network is deployed for some mission-critical applications such as in a tactical battlefield. Random failure of nodes is also very likely in real-life deployment scenarios. Due to resource constraints in the sensor nodes, traditional security mechanisms with large overhead of computation and communication are infeasible in WSNs. Security in sensor networks is, therefore, a particularly challenging task. This paper discusses the current state of the art in security mechanisms for WSNs. Various types of attacks are discussed and their countermeasures presented. A brief discussion on the future direction of research in WSN security is also included.

229 citations


Proceedings ArticleDOI
22 Jun 2009
TL;DR: Treedoc is described, a novel CRDT design for cooperative text editing in which the identifiers of Treedoc atoms are selected from a dense space; the design is validated with traces from existing edit histories.
Abstract: A Commutative Replicated Data Type (CRDT) is one where all concurrent operations commute. The replicas of a CRDT converge automatically, without complex concurrency control. This paper describes Treedoc, a novel CRDT design for cooperative text editing. An essential property is that the identifiers of Treedoc atoms are selected from a dense space. We discuss practical alternatives for implementing the identifier space based on an extended binary tree. We also discuss storage alternatives for data and meta-data, and mechanisms for compacting the tree. In the best case, Treedoc incurs no overhead with respect to a linear text buffer. We validate the results with traces from existing edit histories.
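To make the dense-identifier idea concrete, the sketch below is an illustrative simplification (not Treedoc's actual structure, which also carries per-site disambiguators for concurrent inserts): identifiers are binary-tree paths ordered by in-order traversal, so a fresh identifier can always be allocated between two adjacent ones.

```python
# Sketch of dense tree-path identifiers (simplified; real Treedoc adds
# disambiguators and tree-compaction mechanisms).
import functools

def cmp_path(a, b):
    """In-order comparison of binary-tree paths given as tuples of 0/1 bits."""
    for x, y in zip(a, b):
        if x != y:
            return -1 if x < y else 1
    if len(a) == len(b):
        return 0
    if len(a) < len(b):                        # b is a descendant of a
        return 1 if b[len(a)] == 0 else -1     # b in a's left subtree => b < a
    return -1 if a[len(b)] == 0 else 1         # a in b's left subtree => a < b

def between(p, q):
    """Allocate an identifier strictly between adjacent identifiers p < q."""
    if q[:len(p)] == p:                        # q lies in p's right subtree
        return q + (0,)                        # go left under q
    return p + (1,)                            # otherwise p's right child is free

doc = {(): "A", (1,): "C"}                     # 'A' at the root, 'C' its right child
doc[between((), (1,))] = "B"                   # new identifier (1, 0)
order = sorted(doc, key=functools.cmp_to_key(cmp_path))
print("".join(doc[i] for i in order))          # -> "ABC"
```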

229 citations


Posted Content
TL;DR: A survey of clustering schemes for ad hoc networks is presented, covering approaches developed by researchers that focus on different performance metrics.
Abstract: Many clustering schemes have been proposed for ad hoc networks. A systematic classification of these clustering schemes enables one to better understand them and make improvements. In mobile ad hoc networks, the movement of the network nodes may quickly change the topology, resulting in increased message overhead for topology maintenance. Protocols try to keep the number of nodes in a cluster around a pre-defined threshold to facilitate the optimal operation of the medium access control protocol. The clusterhead election is invoked on-demand, and is aimed at reducing the computation and communication costs. A large variety of approaches for ad hoc clustering have been developed by researchers, focusing on different performance metrics. This paper presents a survey of different clustering schemes.

229 citations


Journal ArticleDOI
TL;DR: A careful adaptation of the Algorithm-Based Fault Tolerance technique to the needs of parallel distributed computation results in a strongly scalable mechanism for fault tolerance that can also detect and correct errors on the fly during a computation.
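As background on the checksum idea that algorithm-based fault tolerance builds on, here is a small sketch in the classic Huang-Abraham style for matrix multiplication; it illustrates the principle only and is not the paper's parallel distributed scheme.

```python
# Sketch: full-checksum matrix multiplication. A gets an extra checksum row,
# B an extra checksum column; any single erroneous entry of C = A @ B shows up
# as one inconsistent row checksum and one inconsistent column checksum, which
# locates it and allows correction.
import numpy as np

A, B = np.random.rand(4, 4), np.random.rand(4, 4)
Ac = np.vstack([A, A.sum(axis=0)])                      # (5, 4)
Br = np.hstack([B, B.sum(axis=1, keepdims=True)])       # (4, 5)

C = Ac @ Br                                             # (5, 5) checksummed product
C[2, 1] += 0.5                                          # inject a silent error

row_bad = np.flatnonzero(~np.isclose(C[:4, :4].sum(axis=1), C[:4, 4]))
col_bad = np.flatnonzero(~np.isclose(C[:4, :4].sum(axis=0), C[4, :4]))
print("faulty entry:", (int(row_bad[0]), int(col_bad[0])))   # -> (2, 1)

# Correct it from the row checksum.
C[2, 1] = C[2, 4] - (C[2, :4].sum() - C[2, 1])
assert np.allclose(C[:4, :4], A @ B)
```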

215 citations


Proceedings Article
14 Jun 2009
TL;DR: Satori is introduced, an efficient and effective system for sharing memory in virtualised systems that is better able to detect short-lived sharing opportunities, incurs negligible overhead, and maintains performance isolation between virtual machines.
Abstract: We introduce Satori, an efficient and effective system for sharing memory in virtualised systems. Satori uses enlightenments in guest operating systems to detect sharing opportunities and manage the surplus memory that results from sharing. Our approach has three key benefits over existing systems: it is better able to detect short-lived sharing opportunities, it is efficient and incurs negligible overhead, and it maintains performance isolation between virtual machines. We present Satori in terms of hypervisor-agnostic design decisions, and also discuss our implementation for the Xen virtual machine monitor. In our evaluation, we show that Satori quickly exploits up to 94% of the maximum possible sharing with insignificant performance overhead. Furthermore, we demonstrate workloads where the additional memory improves macrobenchmark performance by a factor of two.

206 citations


Proceedings ArticleDOI
23 May 2009
TL;DR: vCUDA is a GPGPU (General Purpose Graphics Processing Unit) computing solution for virtual machines that allows applications executing within virtual machines (VMs) to leverage hardware acceleration, which can be beneficial to the performance of a class of HPC applications.
Abstract: This paper describes vCUDA, a GPGPU (General Purpose Graphics Processing Unit) computing solution for virtual machines. vCUDA allows applications executing within virtual machines (VMs) to leverage hardware acceleration, which can be beneficial to the performance of a class of high performance computing (HPC) applications. The key idea in our design is API call interception and redirection. With API interception and redirection, applications in VMs can access the graphics hardware device and achieve high-performance computing in a transparent way. We carry out a detailed analysis of the performance and overhead of our framework. Our evaluation shows that GPU acceleration for HPC applications in VMs is feasible and competitive with those running in a native, non-virtualized environment. Furthermore, our evaluation also identifies the main cause of overhead in our current framework, and we give some suggestions for future improvement.

203 citations


Proceedings ArticleDOI
19 Apr 2009
TL;DR: This paper considers a network in which each router has a local cache that caches files passing through it and develops a simple content caching, location, and routing system that adopts an implicit, transparent, and best-effort approach towards caching.
Abstract: For several years, web caching has been used to meet the ever-increasing Web access loads. A fundamental capability of all such systems is that of inter-cache coordination, which can be divided into two main types: explicit and implicit coordination. While the former allows for greater control over resource allocation, the latter does not suffer from the additional communication overhead needed for coordination. In this paper, we consider a network in which each router has a local cache that caches files passing through it. By additionally storing minimal information regarding caching history, we develop a simple content caching, location, and routing system that adopts an implicit, transparent, and best-effort approach towards caching. Though only best effort, the policy outperforms classic policies that allow explicit coordination between caches.
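As a toy illustration of implicit, best-effort en-route caching (details assumed; this is not the paper's exact policy), the sketch below walks a request along the router path toward the origin, serves it from the first cache hit, and lets the file populate the LRU caches it passes through on the way back.

```python
# Toy en-route caching sketch: per-router LRU caches, no explicit coordination.
from collections import OrderedDict

class Router:
    def __init__(self, capacity=2):
        self.cache, self.capacity = OrderedDict(), capacity

    def lookup(self, file_id):
        if file_id in self.cache:
            self.cache.move_to_end(file_id)      # refresh LRU position
            return True
        return False

    def store(self, file_id):
        self.cache[file_id] = None
        self.cache.move_to_end(file_id)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)       # evict least-recently-used

def request(path, file_id):
    """Return the hop count at which the request was served."""
    for hops, router in enumerate(path, start=1):
        if router.lookup(file_id):
            break
    else:
        hops = len(path) + 1                     # served by the origin server
    for router in path[:hops - 1]:               # file passes back through these
        router.store(file_id)
    return hops

path = [Router(), Router(), Router()]
print([request(path, f) for f in ["a", "b", "a", "a", "c", "b"]])
```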

Proceedings ArticleDOI
11 Oct 2009
TL;DR: BGI (Byte-Granularity Isolation), a new software fault isolation technique that uses efficient byte-granularity memory protection to isolate kernel extensions in separate protection domains that share the same address space, is presented.
Abstract: Bugs in kernel extensions remain one of the main causes of poor operating system reliability despite proposed techniques that isolate extensions in separate protection domains to contain faults. We believe that previous fault isolation techniques are not widely used because they cannot isolate existing kernel extensions with low overhead on standard hardware. This is a hard problem because these extensions communicate with the kernel using a complex interface and they communicate frequently. We present BGI (Byte-Granularity Isolation), a new software fault isolation technique that addresses this problem. BGI uses efficient byte-granularity memory protection to isolate kernel extensions in separate protection domains that share the same address space. BGI ensures type safety for kernel objects and it can detect common types of errors inside domains. Our results show that BGI is practical: it can isolate Windows drivers without requiring changes to the source code and it introduces a CPU overhead between 0 and 16%. BGI can also find bugs during driver testing. We found 28 new bugs in widely used Windows drivers.

Journal ArticleDOI
TL;DR: Distributed resource allocation schemes in which each transmitter determines its allocation autonomously, based on the exchange of interference prices, can be adapted according to the size of the network.
Abstract: In this article, we discuss distributed resource allocation schemes in which each transmitter determines its allocation autonomously, based on the exchange of interference prices. These schemes have been primarily motivated by the common model for spectrum sharing in which a user or service provider may transmit in a designated band provided that they abide by certain rules (e.g., a standard such as 802.11). An attractive property of these schemes is that they are scalable, i.e., the information exchange and overhead can be adapted according to the size of the network.

Journal ArticleDOI
01 Aug 2009
TL;DR: Glacier, a component library and compositional compiler that transforms continuous queries into logic circuits by composing library components on an operator-level basis, is presented.
Abstract: Taking advantage of many-core, heterogeneous hardware for data processing tasks is a difficult problem. In this paper, we consider the use of FPGAs for data stream processing as coprocessors in many-core architectures. We present Glacier, a component library and compositional compiler that transforms continuous queries into logic circuits by composing library components on an operator-level basis. In the paper we consider selection, aggregation, grouping, as well as windowing operators, and discuss their design as modular elements. We also show how significant performance improvements can be achieved by inserting the FPGA into the system's data path (e.g., between the network interface and the host CPU). Our experiments show that queries on the FPGA can process streams at more than one million tuples per second and that they can do this directly from the network, removing much of the overhead of transferring the data to a conventional CPU.

Patent
30 Sep 2009
TL;DR: In this article, the authors propose a method for providing and receiving venue level transmissions and services, including discovery of a venue specific transmission by receiving an overhead signal from a non-venue network, extracting information for receiving the venue-specific transmission from the overhead signal, and tuning to receive the venue specific transmissions based on the extracted information.
Abstract: A venue-cast system and method for providing and receiving venue-level transmissions and services, including discovery of a venue-specific transmission by receiving an overhead signal from a non-venue network, extracting information for receiving the venue-specific transmission from the overhead signal, and tuning to receive the venue-specific transmission based on the extracted information. The venue-level transmission may be provided and received in a manner that does not prevent an access terminal from receiving a local-area or wide-area transmission.

Book ChapterDOI
20 Feb 2009
TL;DR: These results extend a previous approach of Naor and Pinkas for secure polynomial evaluation to two-party protocols with security against malicious parties and present several solutions which differ in their efficiency, generality, and underlying intractability assumptions.
Abstract: We study the complexity of securely evaluating arithmetic circuits over finite rings. This question is motivated by natural secure computation tasks. Focusing mainly on the case of two-party protocols with security against malicious parties, our main goals are to: (1) only make black-box calls to the ring operations and standard cryptographic primitives, and (2) minimize the number of such black-box calls as well as the communication overhead. We present several solutions which differ in their efficiency, generality, and underlying intractability assumptions. These include: An unconditionally secure protocol in the OT-hybrid model which makes a black-box use of an arbitrary ring R, but where the number of ring operations grows linearly with (an upper bound on) log|R|. Computationally secure protocols in the OT-hybrid model which make a black-box use of an underlying ring, and in which the number of ring operations does not grow with the ring size. The protocols rely on variants of previous intractability assumptions related to linear codes. In the most efficient instance of these protocols, applied to a suitable class of fields, the (amortized) communication cost is a constant number of field elements per multiplication gate and the computational cost is dominated by O(log k) field operations per gate, where k is a security parameter. These results extend a previous approach of Naor and Pinkas for secure polynomial evaluation (SIAM J. Comput., 2006). A protocol for the rings Z_m = Z/mZ which only makes a black-box use of a homomorphic encryption scheme. When m is prime, the (amortized) number of calls to the encryption scheme for each gate of the circuit is constant. All of our protocols are in fact UC-secure in the OT-hybrid model and can be generalized to multiparty computation with an arbitrary number of malicious parties.

Proceedings ArticleDOI
14 Nov 2009
TL;DR: This work leverages the upcoming Phase-Change Random Access Memory (PCRAM) technology and, after a thorough analysis of MPP system failure rates and failure sources, proposes a hybrid local/global checkpointing mechanism to reduce the checkpoint overhead and offer a smooth transition from the conventional pure-HDD checkpoint to the ideal 3D PCRAM mechanism.
Abstract: The scalability of future massively parallel processing (MPP) systems is challenged by high failure rates. Current hard disk drive (HDD) checkpointing results in overhead of 25% or more at the petascale. With a direct correlation between checkpoint frequencies and node counts, novel techniques that can take more frequent checkpoints with minimum overhead are critical to implement a reliable exascale system. In this work, we leverage the upcoming Phase-Change Random Access Memory (PCRAM) technology and propose a hybrid local/global checkpointing mechanism after a thorough analysis of MPP systems failure rates and failure sources. We propose three variants of PCRAM-based hybrid checkpointing schemes, DIMM+HDD, DIMM+DIMM, and 3D+3D, to reduce the checkpoint overhead and offer a smooth transition from the conventional pure HDD checkpoint to the ideal 3D PCRAM mechanism. The proposed pure 3D PCRAM-based mechanism can ultimately take checkpoints with overhead less than 4% on a projected exascale system.
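As a rough illustration of why the checkpoint medium matters, the sketch below uses Young's classic approximation for the optimal checkpoint interval with assumed dump times and system MTBF; both the numbers and the model are illustrative, not the paper's.

```python
# Back-of-the-envelope sketch (Young's approximation, assumed numbers):
# a faster checkpoint medium permits shorter intervals at lower overhead.
import math

def checkpoint_overhead(dump_time_s, mtbf_s):
    tau = math.sqrt(2 * dump_time_s * mtbf_s)    # optimal interval (Young)
    return tau, dump_time_s / tau                # overhead ~ dump time per interval

for medium, dump_time in [("HDD checkpoint", 1800.0), ("PCRAM checkpoint", 30.0)]:
    tau, ovh = checkpoint_overhead(dump_time, mtbf_s=6 * 3600)
    print(f"{medium:16s} interval ~ {tau/60:5.1f} min, overhead ~ {ovh:5.1%}")
```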

Journal ArticleDOI
TL;DR: Numerical simulations demonstrate that D-RLS can outperform existing approaches in terms of estimation performance and noise resilience, while it has the potential of performing efficient tracking.
Abstract: Recursive least-squares (RLS) schemes are of paramount importance for reducing complexity and memory requirements in estimating stationary signals as well as for tracking nonstationary processes, especially when the state and/or data model are not available and fast convergence rates are at a premium. To this end, a fully distributed (D-) RLS algorithm is developed for use by wireless sensor networks (WSNs) whereby sensors exchange messages with one-hop neighbors to consent on the network-wide estimates adaptively. The WSNs considered here do not necessarily possess a Hamiltonian cycle, while the inter-sensor links are challenged by communication noise. The novel algorithm is obtained after judiciously reformulating the exponentially-weighted least-squares cost into a separable form, which is then optimized via the alternating-direction method of multipliers. If powerful error control codes are utilized and communication noise is not an issue, D-RLS is modified to reduce communication overhead when compared to existing noise-unaware alternatives. Numerical simulations demonstrate that D-RLS can outperform existing approaches in terms of estimation performance and noise resilience, while it has the potential of performing efficient tracking.

Proceedings ArticleDOI
20 Sep 2009
TL;DR: An optimal decision strategy that maximizes the CR's average throughput is derived by formulating the sequential sensing/probing process as a rate-of-return problem, which is solved using optimal stopping theory.
Abstract: In this paper, we exploit channel diversity for opportunistic spectrum access (OSA). Our approach uses channel quality as a second criterion (along with the idle/busy status of the channel) in selecting channels to use for opportunistic transmission. The difficulty of the problem comes from the fact that it is practically infeasible for a CR to first scan all channels and then pick the best among them, due to the potentially large number of channels open to OSA and the limited power/hardware capability of a CR. As a result, the CR can only sense and probe channels sequentially. To avoid collisions with other CRs, after sensing and probing a channel, the CR needs to make a decision on whether to terminate the scan and use the underlying channel or to skip it and scan the next one. The optimal use-or-skip decision strategy that maximizes the CR's average throughput is one of our primary concerns in this study. This problem is further complicated by practical considerations, such as sensing/probing overhead and sensing errors. An optimal decision strategy that addresses all the above considerations is derived by formulating the sequential sensing/probing process as a rate-of-return problem, which we solve using optimal stopping theory. We further explore the special structure of this strategy to conduct a "second-round" optimization over the operational parameters, such as the sensing and probing times. We show through simulations that significant throughput gains (e.g., about 100%) are achieved using our joint sensing/probing scheme over the conventional one that uses sensing alone.
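A toy Monte-Carlo sketch of the use-or-skip idea follows (channel statistics, sensing cost, and reward model are all assumed; the paper derives the optimal threshold via optimal stopping theory, whereas here the threshold is simply swept):

```python
# Sequential sense/probe with a stop threshold: scan channels one by one,
# paying a per-channel overhead, and transmit on the first idle channel whose
# probed rate exceeds the threshold. Illustrative parameters only.
import random

def avg_throughput(threshold, n_channels=20, p_idle=0.6,
                   sense_time=0.01, slot_time=1.0, trials=20000):
    total_rate = total_time = 0.0
    for _ in range(trials):
        elapsed = 0.0
        for _ in range(n_channels):
            elapsed += sense_time                    # sensing + probing overhead
            idle = random.random() < p_idle
            rate = random.random()                   # probed channel quality
            if idle and rate >= threshold:
                total_rate += rate * slot_time       # transmit for one slot
                elapsed += slot_time
                break
        total_time += elapsed
    return total_rate / total_time

for th in (0.0, 0.3, 0.6, 0.9):
    print(f"threshold {th:.1f} -> throughput {avg_throughput(th):.3f}")
```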

Proceedings ArticleDOI
Moinuddin K. Qureshi
06 Mar 2009
TL;DR: A simple extension of DSR is proposed that provides Quality of Service (QoS) by guaranteeing that the worst-case performance of each application remains similar to that with no spilling, while still providing an average throughput improvement of 17.5%.
Abstract: In a Chip Multi-Processor (CMP) with private caches, the last level cache is statically partitioned between all the cores. This prevents such CMPs from sharing cache capacity in response to the requirement of individual cores. Capacity sharing can be provided in private caches by spilling a line evicted from one cache to another cache. However, naively allowing all caches to spill evicted lines to other caches has limited performance benefit as such spilling does not take into account which cores benefit from extra capacity and which cores can provide extra capacity. This paper proposes Dynamic Spill-Receive (DSR) for efficient capacity sharing. In a DSR architecture, each cache uses Set Dueling to learn whether it should act as a “spiller cache” or “receiver cache” for best overall performance. We evaluate DSR for a Quad-core system with 1MB private caches using 495 multi-programmed workloads. DSR improves average throughput by 18% (weighted-speedup by 13% and harmonic-mean fairness metric by 36%) compared to no spilling. DSR requires a total storage overhead of less than two bytes per core, does not require any changes to the existing cache structure, and is scalable to a large number of cores (16 in our evaluation). Furthermore, we propose a simple extension of DSR that provides Quality of Service (QoS) by guaranteeing that the worst-case performance of each application remains similar to that with no spilling, while still providing an average throughput improvement of 17.5%.
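The sketch below illustrates the Set Dueling mechanism DSR relies on, in a deliberately simplified form (set counts, counter width, and the update rule are assumptions, not the paper's exact design): a few leader sets always spill, a few always receive, a saturating counter compares their misses, and follower sets adopt whichever policy is currently winning.

```python
# Simplified Set Dueling sketch for spill-vs-receive policy selection.
class SetDuelingCache:
    def __init__(self, n_sets=1024, n_leaders=32, ctr_bits=10):
        self.spill_leaders = set(range(n_leaders))
        self.recv_leaders = set(range(n_leaders, 2 * n_leaders))
        self.ctr = 1 << (ctr_bits - 1)            # PSEL-style saturating counter
        self.ctr_max = (1 << ctr_bits) - 1

    def record_miss(self, set_index):
        """Only leader-set misses steer the policy-selection counter."""
        if set_index in self.spill_leaders and self.ctr < self.ctr_max:
            self.ctr += 1                         # spill leaders missing more
        elif set_index in self.recv_leaders and self.ctr > 0:
            self.ctr -= 1                         # receive leaders missing more

    def policy(self, set_index):
        if set_index in self.spill_leaders:
            return "spill"
        if set_index in self.recv_leaders:
            return "receive"
        # followers copy the policy whose leaders miss less
        return "spill" if self.ctr <= (self.ctr_max >> 1) else "receive"

cache = SetDuelingCache()
for s in [3, 40, 40, 40]:          # receiver leaders miss more often here...
    cache.record_miss(s)
print(cache.policy(500))           # ...so follower sets choose "spill"
```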

Journal IssueDOI
TL;DR: An architecture for quality of service (QoS) control of time-sensitive applications in multi-programmed embedded systems is presented; by combining a resource reservation scheduler and a feedback-based mechanism, it allows applications to meet their QoS requirements with the minimum possible impact on CPU occupation.
Abstract: This paper presents an architecture for quality of service (QoS) control of time-sensitive applications in multi-programmed embedded systems. In such systems, tasks must receive appropriate timeliness guarantees from the operating system independently from one another; otherwise, the QoS experienced by the users may decrease. Moreover, fluctuations in the workloads over time make a static partitioning of the central processing unit (CPU) neither appropriate nor convenient, whereas an adaptive allocation based on on-line monitoring of the application behaviour leads to an optimum design. By combining a resource reservation scheduler and a feedback-based mechanism, we allow applications to meet their QoS requirements with the minimum possible impact on CPU occupation. We implemented the framework in AQuoSA (Adaptive Quality of Service Architecture), a software architecture that runs on top of the Linux kernel. We provide extensive experimental validation of our results and offer an evaluation of the introduced overhead, which is perfectly sustainable in the class of addressed applications.

Proceedings ArticleDOI
TL;DR: A channel selection scheme without negotiation is considered for multi-user and multi-channel cognitive radio systems, and multi-agent reinforcement learning (MARL) is applied in the framework of Q-learning by considering opponent secondary users as a part of the environment.
Abstract: Resource allocation is an important issue in cognitive radio systems. It can be done by carrying out negotiation among secondary users. However, significant overhead may be incurred by the negotiation since the negotiation needs to be done frequently due to the rapid change of primary users' activity. In this paper, a channel selection scheme without negotiation is considered for multi-user and multi-channel cognitive radio systems. To avoid collisions incurred by non-coordination, each secondary user learns how to select channels according to its experience. Multi-agent reinforcement learning (MARL) is applied in the framework of Q-learning by considering the opponent secondary users as a part of the environment. The dynamics of the Q-learning are illustrated using a Metrick-Polak plot. A rigorous proof of the convergence of Q-learning is provided via the similarity between the Q-learning and Robbins-Monro algorithms, as well as the analysis of convergence of the corresponding ordinary differential equation (via a Lyapunov function). Examples are illustrated and the performance of learning is evaluated by numerical simulations.
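As a toy illustration of negotiation-free channel selection via Q-learning (the environment, reward, and parameters are simplified assumptions, not the paper's formulation), two secondary users below learn from collisions alone to settle on different channels.

```python
# Stateless Q-learning sketch: each secondary user keeps a Q-value per channel,
# explores epsilon-greedily, gets reward 1 for a collision-free transmission
# and 0 otherwise; the users typically converge to distinct channels.
import random

N_CHANNELS, N_USERS, EPS, ALPHA = 2, 2, 0.1, 0.1
Q = [[0.0] * N_CHANNELS for _ in range(N_USERS)]

def choose(q_row):
    if random.random() < EPS:
        return random.randrange(N_CHANNELS)                  # explore
    return max(range(N_CHANNELS), key=lambda c: q_row[c])    # exploit

for _ in range(5000):
    picks = [choose(Q[u]) for u in range(N_USERS)]
    for u in range(N_USERS):
        reward = 1.0 if picks.count(picks[u]) == 1 else 0.0
        c = picks[u]
        Q[u][c] += ALPHA * (reward - Q[u][c])                 # stateless Q-update

print([max(range(N_CHANNELS), key=lambda c: Q[u][c]) for u in range(N_USERS)])
```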

Book ChapterDOI
02 Jul 2009
TL;DR: This work uses delimited continuations to reify probabilistic programs as lazy search trees, which inference algorithms may traverse without imposing any interpretive overhead on deterministic parts of a model.
Abstract: Two general techniques for implementing a domain-specific language (DSL) with less overhead are the finally-tagless embedding of object programs and the direct-style representation of side effects. We use these techniques to build a DSL for probabilistic programming, for expressing countable probabilistic models and performing exact inference and importance sampling on them. Our language is embedded as an ordinary OCaml library and represents probability distributions as ordinary OCaml programs. We use delimited continuations to reify probabilistic programs as lazy search trees, which inference algorithms may traverse without imposing any interpretive overhead on deterministic parts of a model. We thus take advantage of the existing OCaml implementation to achieve competitive performance and ease of use. Inference algorithms can easily be embedded in probabilistic programs themselves.

Proceedings ArticleDOI
14 Dec 2009
TL;DR: A novel component and binding model for networked embedded systems (LooCI) is presented that allows developers to model rich component interactions, while providing support for easy interception, re-wiring and re-use, and imposes minimal overhead on developers.
Abstract: Considerable research has been performed in applying run-time reconfigurable component models to the domain of wireless sensor networks. The ability to dynamically deploy and reconfigure software components has clear advantages in sensor network deployments, which are typically large in scale and expected to operate for long periods in the face of node mobility, dynamic environmental conditions and changing application requirements. To date, research on component and binding models for sensor networks has primarily focused on the development of specialized component models that are optimized for use in resource-constrained environments. However, current approaches impose significant overhead upon developers and tend to use inflexible binding models based on remote procedure calls. To address these concerns, we introduce a novel component and binding model for networked embedded systems (LooCI). LooCI components are designed to impose minimal additional overhead on developers. Furthermore, LooCI components use a novel event-based binding model that allows developers to model rich component interactions, while providing support for easy interception, re-wiring and re-use. A prototype implementation of our component and binding model has been realised for the SunSPOT platform. Our preliminary evaluation shows that LooCI has an acceptable memory footprint and imposes minimal overhead on developers.

Proceedings ArticleDOI
19 Apr 2009
TL;DR: Experiments show that C-MAC significantly outperforms the state-of-the-art CSMA protocol in TinyOS with respect to system throughput, delay and energy consumption.
Abstract: This paper presents C-MAC, a new MAC protocol designed to achieve high-throughput bulk communication for data-intensive sensing applications. C-MAC exploits concurrent wireless channel access based on empirical power control and physical interference models. Nodes running C-MAC estimate the level of interference based on the physical signal-to-interference-plus-noise-ratio (SINR) model and adjust the transmission power accordingly for concurrent channel access. C-MAC employs a block-based communication mode that not only amortizes the overhead of channel assessment, but also improves the probability that multiple nodes within the interference range of each other can transmit concurrently. C-MAC has been implemented in TinyOS-1.x and extensively evaluated on Tmote nodes. Our experiments show that C-MAC significantly outperforms the state-of-the-art CSMA protocol in TinyOS with respect to system throughput, delay and energy consumption.
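To illustrate the kind of physical-SINR admission test such concurrent access relies on, here is a small sketch with an assumed log-distance path-loss model and assumed power, noise, and threshold values (not the paper's empirically calibrated model):

```python
# Sketch: a set of simultaneous links is admissible only if every receiver's
# SINR clears the decoding threshold. Path-loss model and numbers are assumed.
import math

NOISE_MW = 1e-9
SINR_THRESHOLD = 10.0           # linear (= 10 dB)
PATH_LOSS_EXP = 3.0

def rx_power(p_mw, tx, rx):
    return p_mw / (math.dist(tx, rx) ** PATH_LOSS_EXP)

def concurrent_ok(links):
    """links: list of (tx_xy, rx_xy, tx_power_mw)."""
    for i, (tx_i, rx_i, p_i) in enumerate(links):
        signal = rx_power(p_i, tx_i, rx_i)
        interference = sum(rx_power(p_j, tx_j, rx_i)
                           for j, (tx_j, _, p_j) in enumerate(links) if j != i)
        if signal / (NOISE_MW + interference) < SINR_THRESHOLD:
            return False
    return True

links = [((0, 0), (5, 0), 1.0), ((60, 0), (55, 0), 1.0)]
print(concurrent_ok(links))     # True: well-separated senders can overlap
```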

Proceedings ArticleDOI
09 Nov 2009
TL;DR: This work analytically calculates the anonymity provided by ShadowWalker and shows that it performs well for moderate levels of attackers, and is much better than the state of the art.
Abstract: Peer-to-peer approaches to anonymous communication promise to eliminate the scalability concerns and central vulnerability points of current networks such as Tor. However, the P2P setting introduces many new opportunities for attack, and previous designs do not provide an adequate level of anonymity. We propose ShadowWalker: a new low-latency P2P anonymous communication system, based on a random walk over a redundant structured topology. We base our design on shadows that redundantly check and certify neighbor information; these certifications enable nodes to perform random walks over the structured topology while avoiding route capture and other attacks. We analytically calculate the anonymity provided by ShadowWalker and show that it performs well for moderate levels of attackers, and is much better than the state of the art. We also design an extension that improves forwarding performance at a slight anonymity cost, while at the same time protecting against selective DoS attacks. We show that our system has manageable overhead and can handle moderate churn, making it an attractive new design for P2P anonymous communication.

Proceedings ArticleDOI
11 Jun 2009
TL;DR: The existing MapReduce framework in a virtualized environment suffers from poor performance, due to the heavy overhead of I/O virtualization and the management difficulty for storage and computation, so the aim of the Cloudlet design is to overcome the overhead of VMs while benefiting from their other features.
Abstract: The existing MapReduce framework in a virtualized environment suffers from poor performance, due to the heavy overhead of I/O virtualization and the management difficulty for storage and computation. To address these problems, we propose Cloudlet, a novel MapReduce framework on virtual machines. The aim of the Cloudlet design is to overcome the overhead of VMs while benefiting from their other features (i.e., management and reliability).

Journal ArticleDOI
TL;DR: The results show that the proposed online-learning algorithm adapts really well and achieves an overall performance comparable to the best-performing expert at any point in time, with energy savings as high as 61% and 49% for HDD and CPU, respectively.
Abstract: In this paper, we propose a novel online-learning algorithm for system-level power management. We formulate both dynamic power management (DPM) and dynamic voltage-frequency scaling problems as one of workload characterization and selection and solve them using our algorithm. The selection is done among a set of experts, which refers to a set of DPM policies and voltage-frequency settings, leveraging the fact that different experts outperform each other under different workloads and device leakage characteristics. The online-learning algorithm adapts to changes in the characteristics and guarantees fast convergence to the best-performing expert. In our evaluation, we perform experiments on a hard disk drive (HDD) and Intel PXA27x core (CPU) with real-life workloads. Our results show that our algorithm adapts really well and achieves an overall performance comparable to the best-performing expert at any point in time, with energy savings as high as 61% and 49% for HDD and CPU, respectively. Moreover, it is extremely lightweight and has negligible overhead.
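The sketch below illustrates the general experts framework such a controller is built on, with synthetic losses and hypothetical policy names (it is not the authors' exact algorithm): keep a weight per candidate policy, observe each policy's loss for the period, and update the weights multiplicatively so the best performer dominates.

```python
# Multiplicative-weights "experts" sketch for DPM policy selection.
# Losses, policies, and parameters are illustrative assumptions.
import random

experts = ["timeout-50ms", "timeout-500ms", "always-on"]   # hypothetical policies
weights = [1.0] * len(experts)
ETA = 0.3

def expert_losses(idle_ms):
    """Synthetic normalized energy+latency loss per policy."""
    return [0.2 if idle_ms > 50 else 0.8,     # aggressive timeout
            0.4 if idle_ms > 500 else 0.6,    # conservative timeout
            0.7]                              # never sleeps

for _ in range(1000):
    idle = random.expovariate(1 / 300.0)      # workload: ~300 ms mean idle periods
    for i, loss in enumerate(expert_losses(idle)):
        weights[i] *= (1 - ETA) ** loss       # multiplicative-weights update
    s = sum(weights)
    weights = [w / s for w in weights]        # renormalize to avoid underflow

best = max(range(len(experts)), key=lambda i: weights[i])
print("selected expert:", experts[best])      # typically "timeout-50ms" here
```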

01 Jan 2009
TL;DR: SkySTM is the first STM that supports privatization and scales on modern multicore multiprocessors with hundreds of hardware threads on multiple chips, and uses a scalable nonzero indicator (SNZI), which was designed for this purpose.
Abstract: Existing software transactional memory (STM) implementations often exhibit poor scalability, usually because of nonscalable mechanisms for read sharing, transactional consistency, and privatization; some STMs also have nonscalable centralized commit mechanisms. We describe novel techniques to eliminate bottlenecks from all of these mechanisms, and present SkySTM, which employs these techniques. SkySTM is the first STM that supports privatization and scales on modern multicore multiprocessors with hundreds of hardware threads on multiple chips. A central theme in this work is avoiding frequent updates to centralized metadata, especially for multi-chip systems, in which the cost of accessing centralized metadata increases dramatically. A key mechanism we use to do so is a scalable nonzero indicator (SNZI), which was designed for this purpose. A secondary contribution of the paper is a new and simplified SNZI algorithm. Our scalable privatization mechanism imposes only about 4% overhead in low-contention experiments; when contention is higher, the overhead still reaches only 35% with over 250 threads. In contrast, prior approaches have been reported as imposing over 100% overhead in some cases, even with only 8 threads.

Journal ArticleDOI
TL;DR: This paper proposes an uncertainty-driven approach to duty-cycling, where a model of long-term clock drift is used to minimize the duty-cycling overhead, and designs a rate-adaptive, energy-efficient long-term time synchronization algorithm that can adapt to changing clock drift and environmental conditions, while achieving application-specific precision with very high probability.
Abstract: Radio duty cycling has received significant attention in sensor networking literature, particularly in the form of protocols for medium access control and topology management. While many protocols have claimed to achieve significant duty-cycling benefits in theory and simulation, these benefits have often not translated into practice. The dominant factor that prevents the optimal usage of the radio in real deployment settings is time uncertainty between sensor nodes which results in overhead in the form of long packet preambles, guard bands, and excessive control packets for synchronization. This paper proposes an uncertainty-driven approach to duty-cycling, where a model of long-term clock drift is used to minimize the duty-cycling overhead. First, we use long-term empirical measurements to evaluate and analyze in-depth the interplay between three key parameters that influence long-term synchronization: synchronization rate, history of past synchronization beacons, and the estimation scheme. Second, we use this measurement-based study to design a rate-adaptive, energy-efficient long-term time synchronization algorithm that can adapt to changing clock drift and environmental conditions, while achieving application-specific precision with very high probability. Finally, we integrate our uncertainty-driven time synchronization scheme with the BMAC medium access control protocol, and demonstrate one to two orders of magnitude reduction in transmission energy consumption with negligible impact on packet loss rate.
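To make the drift-model idea concrete, here is a small sketch (the model, sample values, and the error bound below are assumptions, not the paper's estimator): fit a linear clock-drift model to past synchronization samples, then extend the resynchronization interval as far as the predicted error stays within the application's precision bound.

```python
# Sketch: estimate clock drift by linear regression over past sync samples,
# then choose the next sync interval so that accumulated error (for a node
# that resets its offset at each sync but does not compensate drift) stays
# within the required precision. All numbers are illustrative.
import numpy as np

# (local_time_s, measured_offset_s) pairs from past synchronization beacons
samples = np.array([[0, 0.0000], [60, 0.0031], [120, 0.0059], [180, 0.0092]])
t, offset = samples[:, 0], samples[:, 1]

drift, bias = np.polyfit(t, offset, 1)                   # offset ~ drift * t + bias
residual = np.max(np.abs(offset - (drift * t + bias)))   # crude noise margin

PRECISION_BOUND_S = 0.010                                 # application requirement
dt_max = (PRECISION_BOUND_S - residual) / abs(drift)      # error ~ |drift| * dt
print(f"drift ~ {drift * 1e6:.1f} ppm, resync within ~ {dt_max:.0f} s")
```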