
Showing papers on "Overhead (computing) published in 2012"


Proceedings Article
13 Jun 2012
TL;DR: This paper describes how LRC is used in WAS to provide low overhead durable storage with consistently low read latencies, and introduces a new set of codes for erasure coding called Local Reconstruction Codes (LRC).
Abstract: Windows Azure Storage (WAS) is a cloud storage system that provides customers the ability to store seemingly limitless amounts of data for any duration of time. WAS customers have access to their data from anywhere, at any time, and only pay for what they use and store. To provide durability for that data and to keep the cost of storage low, WAS uses erasure coding. In this paper we introduce a new set of codes for erasure coding called Local Reconstruction Codes (LRC). LRC reduces the number of erasure coding fragments that need to be read when reconstructing data fragments that are offline, while still keeping the storage overhead low. The important benefits of LRC are that it reduces the bandwidth and I/Os required for repair reads over prior codes, while still allowing a significant reduction in storage overhead. We describe how LRC is used in WAS to provide low overhead durable storage with consistently low read latencies.
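The local-group idea behind LRC can be illustrated in a few lines. The sketch below is only a toy model of the reconstruction-cost argument, using a single XOR parity per group of fragments (the actual LRC construction also adds Reed-Solomon style global parities, which are omitted here): rebuilding one offline fragment touches only its local group plus the local parity instead of the whole stripe.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized fragments."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

def local_parity(group):
    """One XOR parity computed over a single local group of data fragments."""
    return xor_blocks(group)

def reconstruct(missing_idx, group, parity):
    """Rebuild the missing fragment by reading only its local group + parity."""
    survivors = [frag for i, frag in enumerate(group) if i != missing_idx]
    return xor_blocks(survivors + [parity])

# Hypothetical 6-fragment local group: repairing fragment 2 needs 6 reads
# (5 surviving fragments + 1 local parity) rather than the entire stripe.
group = [bytes([i] * 8) for i in range(6)]
parity = local_parity(group)
assert reconstruct(2, group, parity) == group[2]
```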

1,002 citations


Proceedings ArticleDOI
13 Aug 2012
TL;DR: Kandoo is proposed, a framework for preserving scalability without changing switches that enables network operators to replicate local controllers on demand and relieve the load on the top layer, which is the only potential bottleneck in terms of scalability.
Abstract: Limiting the overhead of frequent events on the control plane is essential for realizing a scalable Software-Defined Network. One way of limiting this overhead is to process frequent events in the data plane. This requires modifying switches and comes at the cost of visibility in the control plane. Taking an alternative route, we propose Kandoo, a framework for preserving scalability without changing switches. Kandoo has two layers of controllers: (i) the bottom layer is a group of controllers with no interconnection, and no knowledge of the network-wide state, and (ii) the top layer is a logically centralized controller that maintains the network-wide state. Controllers at the bottom layer run only local control applications (i.e., applications that can function using the state of a single switch) near datapaths. These controllers handle most of the frequent events and effectively shield the top layer. Kandoo's design enables network operators to replicate local controllers on demand and relieve the load on the top layer, which is the only potential bottleneck in terms of scalability. Our evaluations show that a network controlled by Kandoo has an order of magnitude lower control channel consumption compared to normal OpenFlow networks.
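The two-layer split can be pictured with a small dispatch sketch. Everything below uses hypothetical names, not the Kandoo API: a local controller handles event types that only need single-switch state and forwards everything else to the logically centralized root controller.

```python
# Hedged sketch of a two-layer controller hierarchy (hypothetical names):
# frequent, switch-local events are absorbed at the bottom layer; only events
# that need network-wide state reach the root controller.
class RootController:
    def __init__(self):
        self.network_state = {}

    def handle(self, event):
        self.network_state[event["switch"]] = event
        return "handled by root"

class LocalController:
    """Runs close to one or a few switches and sees only their local state."""
    LOCAL_EVENT_TYPES = {"flow_arrival", "local_stats_request"}

    def __init__(self, root):
        self.root = root

    def handle(self, event):
        if event["type"] in self.LOCAL_EVENT_TYPES:
            return "handled locally"          # never touches the control channel
        return self.root.handle(event)        # rare event: escalate to the root

root = RootController()
local = LocalController(root)
print(local.handle({"switch": "s1", "type": "flow_arrival"}))       # handled locally
print(local.handle({"switch": "s1", "type": "elephant_detected"}))  # handled by root
```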

697 citations


Journal ArticleDOI
TL;DR: This paper proposes an efficient and privacy-preserving aggregation scheme, named EPPA, for smart grid communications that resists various security threats, preserves user privacy, and has significantly less computation and communication overhead than existing competing approaches.
Abstract: The concept of smart grid has emerged as a convergence of traditional power system engineering and information and communication technology. It is vital to the success of the next generation of the power grid, which is expected to feature reliable, efficient, flexible, clean, friendly, and secure characteristics. In this paper, we propose an efficient and privacy-preserving aggregation scheme, named EPPA, for smart grid communications. EPPA uses a superincreasing sequence to structure multidimensional data and encrypts the structured data with the homomorphic Paillier cryptosystem technique. For data communications from user to smart grid operation center, data aggregation is performed directly on ciphertext at local gateways without decryption, and the aggregation result of the original data can be obtained at the operation center. EPPA also adopts the batch verification technique to reduce authentication cost. Through extensive analysis, we demonstrate that EPPA resists various security threats, preserves user privacy, and has significantly less computation and communication overhead than existing competing approaches.
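The superincreasing-sequence trick can be shown with a small numeric example. The sketch below is a simplification for illustration: plain integer addition stands in for Paillier's additive homomorphism, and the base sizes, user counts, and value ranges are hypothetical.

```python
# Pack a d-dimensional reading into one integer with a superincreasing base,
# aggregate by addition (standing in for Paillier's additive homomorphism),
# then unpack the per-dimension totals at the operation center.
def make_bases(num_dims, num_users, max_value):
    # Each base must exceed the largest possible aggregate of the lower dims.
    step = num_users * max_value + 1
    return [step ** i for i in range(num_dims)]

def pack(reading, bases):
    return sum(x * b for x, b in zip(reading, bases))

def unpack(aggregate, bases):
    out = []
    for b in reversed(bases):
        out.append(aggregate // b)
        aggregate %= b
    return list(reversed(out))

bases = make_bases(num_dims=3, num_users=100, max_value=50)
users = [[5, 10, 2], [7, 0, 3], [1, 4, 9]]       # hypothetical readings
aggregate = sum(pack(u, bases) for u in users)   # done blindly at the gateway
print(unpack(aggregate, bases))                  # -> [13, 14, 14]
```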

682 citations


Proceedings ArticleDOI
13 Aug 2012
TL;DR: This paper introduces the notion of consistent network updates---updates that are guaranteed to preserve well-defined behaviors when transitioning between configurations, and identifies two distinct consistency levels, per-packet and per-flow.
Abstract: Configuration changes are a common source of instability in networks, leading to outages, performance disruptions, and security vulnerabilities. Even when the initial and final configurations are correct, the update process itself often steps through intermediate configurations that exhibit incorrect behaviors. This paper introduces the notion of consistent network updates---updates that are guaranteed to preserve well-defined behaviors when transitioning between configurations. We identify two distinct consistency levels, per-packet and per-flow, and we present general mechanisms for implementing them in Software-Defined Networks using switch APIs like OpenFlow. We develop a formal model of OpenFlow networks, and prove that consistent updates preserve a large class of properties. We describe our prototype implementation, including several optimizations that reduce the overhead required to perform consistent updates. We present a verification tool that leverages consistent updates to significantly reduce the complexity of checking the correctness of network control software. Finally, we describe the results of some simple experiments demonstrating the effectiveness of these optimizations on example applications.
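One way to realize per-packet consistency, in the spirit of the mechanisms discussed above, is a two-phase update in which ingress switches stamp packets with a configuration version and internal rules match on that stamp; old rules are removed only after packets carrying the old version have drained. The sketch below is a toy model with hypothetical switch and rule types, not an OpenFlow API.

```python
import time

class Switch:
    """Toy switch: rules are (version, match, action) tuples."""
    def __init__(self, name, is_ingress=False):
        self.name, self.is_ingress = name, is_ingress
        self.rules, self.stamp_version = [], 1

    def install(self, rules):
        self.rules.extend(rules)

    def garbage_collect(self, keep_version):
        self.rules = [r for r in self.rules if r[0] == keep_version]

def two_phase_update(switches, new_version, new_rules, drain_seconds=0.01):
    # Phase 1: install new-version rules everywhere; old rules stay in place,
    # so every in-flight packet is still handled by one complete configuration.
    for sw in switches:
        sw.install(new_rules)
    # Phase 2: flip the ingress stamp so new packets use only the new policy.
    for sw in switches:
        if sw.is_ingress:
            sw.stamp_version = new_version
    # Once old-version packets have drained, remove the old configuration.
    time.sleep(drain_seconds)
    for sw in switches:
        sw.garbage_collect(new_version)

net = [Switch("s1", is_ingress=True), Switch("s2")]
two_phase_update(net, new_version=2, new_rules=[(2, "dst=10.0.0.0/8", "fwd:3")])
```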

656 citations


Proceedings ArticleDOI
20 May 2012
TL;DR: The results show that SkewTune can significantly reduce job runtime in the presence of skew and adds little to no overhead in the absence of skew.
Abstract: We present an automatic skew mitigation approach for user-defined MapReduce programs and present SkewTune, a system that implements this approach as a drop-in replacement for an existing MapReduce implementation. There are three key challenges: the system must (a) require no extra input from the user yet work for all MapReduce applications, (b) be completely transparent, and (c) impose minimal overhead if there is no skew. The SkewTune approach addresses these challenges and works as follows: When a node in the cluster becomes idle, SkewTune identifies the task with the greatest expected remaining processing time. The unprocessed input data of this straggling task is then proactively repartitioned in a way that fully utilizes the nodes in the cluster and preserves the ordering of the input data so that the original output can be reconstructed by concatenation. We implement SkewTune as an extension to Hadoop and evaluate its effectiveness using several real applications. The results show that SkewTune can significantly reduce job runtime in the presence of skew and adds little to no overhead in the absence of skew.
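The core scheduling step can be sketched briefly. The snippet below is illustrative only (hypothetical field names, simple byte-range repartitioning): it picks the task with the greatest expected remaining time and splits its unprocessed input into contiguous, order-preserving ranges, one per available node, so the outputs can later be concatenated.

```python
def pick_straggler(tasks):
    """tasks: dicts with 'remaining_bytes' and 'rate' (bytes/sec); choose the
    one with the greatest expected remaining processing time."""
    return max(tasks, key=lambda t: t["remaining_bytes"] / t["rate"])

def repartition(byte_range, num_workers):
    """Split the leftover range into contiguous chunks that preserve order."""
    start, end = byte_range
    step = (end - start) // num_workers
    cuts = [start + i * step for i in range(num_workers)] + [end]
    return [(cuts[i], cuts[i + 1]) for i in range(num_workers)]

tasks = [{"id": "map-3", "remaining_bytes": 8_000_000, "rate": 1_000_000},
         {"id": "map-7", "remaining_bytes": 640_000_000, "rate": 2_000_000}]
straggler = pick_straggler(tasks)                      # map-7, ~320 s left
print(repartition((0, straggler["remaining_bytes"]), num_workers=4))
```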

460 citations


Book ChapterDOI
15 Apr 2012
TL;DR: In this paper, a construction of fully homomorphic encryption (FHE) schemes is presented that, for security parameter λ, can evaluate any width-Ω(λ) circuit with t gates in time t · polylog(λ).
Abstract: We show that homomorphic evaluation of (wide enough) arithmetic circuits can be accomplished with only polylogarithmic overhead. Namely, we present a construction of fully homomorphic encryption (FHE) schemes that for security parameter λ can evaluate any width-Ω(λ) circuit with t gates in time t · polylog(λ). To get low overhead, we use the recent batch homomorphic evaluation techniques of Smart-Vercauteren and Brakerski-Gentry-Vaikuntanathan, who showed that homomorphic operations can be applied to "packed" ciphertexts that encrypt vectors of plaintext elements. In this work, we introduce permuting/routing techniques to move plaintext elements across these vectors efficiently. Hence, we are able to implement general arithmetic circuits in a batched fashion without ever needing to "unpack" the plaintext vectors. We also introduce some other optimizations that can speed up homomorphic evaluation in certain cases. For example, we show how to use the Frobenius map to raise plaintext elements to powers of p at the "cost" of a linear operation.

448 citations


Journal ArticleDOI
TL;DR: A new data encoding scheme called layered interleaving is proposed, designed for time-sensitive packet recovery in the presence of bursty loss, which is highly efficient in recovering singleton losses almost immediately as well as recovering from bursty data losses.
Abstract: Cloud computing provides convenient on-demand network access to a shared pool of configurable computing resources. The resources can be rapidly deployed with great efficiency and minimal management overhead. Since the cloud is an insecure computing platform from the viewpoint of cloud users, the system must provide mechanisms that not only protect sensitive information by enabling computations with encrypted data, but also protect users from malicious behaviours by enabling validation of the computation result. In this paper, we propose a new data encoding scheme called layered interleaving, designed for time-sensitive packet recovery in the presence of bursty loss. It is a high-speed data recovery scheme with minimal loss probability that uses a forward error correction scheme to handle bursty loss. The proposed approach is highly efficient in recovering singleton losses almost immediately as well as recovering from bursty data losses.
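The benefit of interleaving against bursty loss can be seen with a plain block interleaver; this is only a generic illustration, not the paper's exact layered scheme: packets are written row by row and sent column by column, so a burst of consecutive losses on the wire lands in different FEC blocks, each of which a simple per-block parity could repair.

```python
def interleave(packets, rows, cols):
    """Write row-by-row (one FEC block per row), transmit column-by-column."""
    assert len(packets) == rows * cols
    grid = [packets[r * cols:(r + 1) * cols] for r in range(rows)]
    return [grid[r][c] for c in range(cols) for r in range(rows)]

def burst_loss(stream, start, length):
    return [None if start <= i < start + length else p
            for i, p in enumerate(stream)]

packets = [f"p{i}" for i in range(12)]                  # 4 FEC blocks of 3 packets
wire_order = interleave(packets, rows=4, cols=3)
received = burst_loss(wire_order, start=4, length=3)    # 3 consecutive losses
# After de-interleaving, each original block has lost at most one packet, so a
# single parity packet per block would be enough to recover everything.
print(received)
```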

421 citations


Proceedings ArticleDOI
01 Apr 2012
TL;DR: This work studies the energy consumption of BLE by measuring real devices with a power monitor, derives models of the basic energy consumption behavior observed from the measurement results, and investigates the overhead of IPv6-based communication over BLE, relevant for future IoT scenarios.
Abstract: Ultra low power communication mechanisms are essential for future Internet of Things deployments. Bluetooth Low Energy (BLE) is one promising candidate for such deployments. We study the energy consumption of BLE by measuring real devices with a power monitor and derive models of the basic energy consumption behavior observed from the measurement results. We also investigate the overhead of IPv6-based communication over BLE, which is relevant for future IoT scenarios. We contrast our results by performing similar measurements with ZigBee/802.15.4 devices. Our results show that when compared to ZigBee, BLE is indeed very energy efficient in terms of number of bytes transferred per Joule spent. In addition, IPv6 communication energy overhead remains reasonable. We also point out a few specific limitations with current stack implementations and explain that removing those limitations could improve energy utility significantly.

375 citations


Journal ArticleDOI
TL;DR: StreamCloud is presented, a scalable and elastic stream processing engine for processing large data stream volumes that uses a novel parallelization technique that splits queries into subqueries that are allocated to independent sets of nodes in a way that minimizes the distribution overhead.
Abstract: Many applications in several domains, such as telecommunications, network security, and large-scale sensor networks, require online processing of continuous data flows. They produce very high loads that require aggregating the processing capacity of many nodes. Current Stream Processing Engines do not scale with the input load due to single-node bottlenecks. Additionally, they are based on static configurations that lead to either under- or overprovisioning. In this paper, we present StreamCloud, a scalable and elastic stream processing engine for processing large data stream volumes. StreamCloud uses a novel parallelization technique that splits queries into subqueries that are allocated to independent sets of nodes in a way that minimizes the distribution overhead. Its elastic protocols exhibit low intrusiveness, enabling effective adjustment of resources to the incoming load. Elasticity is combined with dynamic load balancing to minimize the computational resources used. The paper presents the system design, implementation, and a thorough evaluation of the scalability and elasticity of the fully implemented system.

329 citations


01 Jan 2012
TL;DR: The Kieker framework is reviewed, focusing on its features, its provided extension points for custom components, as well as the imposed monitoring overhead.
Abstract: Kieker is an extensible framework for monitoring and analyzing the runtime behavior of concurrent or distributed software systems. It provides measurement probes for application performance monitoring and control-flow tracing. Analysis plugins extract and visualize architectural models, augmented by quantitative observations. Configurable readers and writers allow Kieker to be used for online and offline analysis. This paper reviews the Kieker framework, focusing on its features, its provided extension points for custom components, as well as the imposed monitoring overhead.

282 citations


Proceedings ArticleDOI
03 Mar 2012
TL;DR: This paper presents libdft, a dynamic data flow tracking (DFT) framework that, unlike previous work, is at once fast, reusable, and compatible with commodity software and hardware, and that provides an API for building DFT-enabled tools that work on unmodified binaries, running on common operating systems and hardware.
Abstract: Dynamic data flow tracking (DFT) deals with tagging and tracking data of interest as they propagate during program execution. DFT has been repeatedly implemented by a variety of tools for numerous purposes, including protection from zero-day and cross-site scripting attacks, detection and prevention of information leaks, and for the analysis of legitimate and malicious software. We present libdft, a dynamic DFT framework that unlike previous work is at once fast, reusable, and works with commodity software and hardware. libdft provides an API for building DFT-enabled tools that work on unmodified binaries, running on common operating systems and hardware, thus facilitating research and rapid prototyping. We explore different approaches for implementing the low-level aspects of instruction-level data tracking, introduce a more efficient and 64-bit capable shadow memory, and identify (and avoid) the common pitfalls responsible for the excessive performance overhead of previous studies. We evaluate libdft using real applications with large codebases like the Apache and MySQL servers, and the Firefox web browser. We also use a series of benchmarks and utilities to compare libdft with similar systems. Our results indicate that it performs at least as fast, if not faster, than previous solutions, and to the best of our knowledge, we are the first to evaluate the performance overhead of a fast dynamic DFT implementation in such depth. Finally, libdft is freely available as open source software.
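The shadow-memory idea at the heart of such DFT frameworks can be sketched in a few lines. This is not libdft's actual API (libdft is a C/C++ tool built on Intel Pin); it is a byte-granularity illustration of how tags attached to data bytes are combined by instruction-level propagation rules.

```python
shadow_mem = {}     # address -> tag bitset
shadow_reg = {}     # register name -> tag bitset

def taint_source(addr, nbytes, tag):
    """Mark bytes coming from a taint source (e.g. network input)."""
    for a in range(addr, addr + nbytes):
        shadow_mem[a] = shadow_mem.get(a, 0) | tag

def propagate_load(reg, addr, nbytes):            # models: mov reg, [addr]
    shadow_reg[reg] = 0
    for a in range(addr, addr + nbytes):
        shadow_reg[reg] |= shadow_mem.get(a, 0)

def propagate_binop(dst, src):                    # models: add dst, src
    shadow_reg[dst] = shadow_reg.get(dst, 0) | shadow_reg.get(src, 0)

taint_source(0x1000, 4, tag=0b01)
propagate_load("eax", 0x1000, 4)
propagate_binop("ebx", "eax")
assert shadow_reg["ebx"] & 0b01                   # the tag reached ebx
```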

Journal ArticleDOI
TL;DR: A generic virtualization architecture for SR-IOV devices is presented, which can be implemented on multiple Virtual Machine Monitors (VMMs) and has better throughput, scalability, and lower CPU utilization than paravirtualization.

Journal ArticleDOI
TL;DR: It is shown that the full multiplexing gain observed with perfect channel knowledge is preserved by analog feedback and that the mean loss in sum rate is bounded by a constant when signal-to-noise ratio is comparable in both forward and feedback channels.
Abstract: Interference alignment (IA) is a multiplexing gain optimal transmission strategy for the interference channel. While the achieved sum rate with IA is much higher than previously thought possible, the improvement comes at the cost of requiring network channel state information at the transmitters. This can be achieved by explicit feedback, a flexible yet potentially costly approach that incurs large overhead. In this paper we propose analog feedback as an alternative to limited feedback or reciprocity based alignment. We show that the full multiplexing gain observed with perfect channel knowledge is preserved by analog feedback and that the mean loss in sum rate is bounded by a constant when signal-to-noise ratio is comparable in both forward and feedback channels. When signal-to-noise ratios are not quite symmetric, a fraction of the multiplexing gain is achieved. We consider the overhead of training and feedback and use this framework to numerically optimize the system's effective throughput. We present simulation results to demonstrate the performance of IA with analog feedback, verify our theoretical analysis, and extend our conclusions on optimal training and feedback length.

Journal ArticleDOI
01 Mar 2012
TL;DR: iMapReduce significantly improves the performance of iterative implementations by reducing the overhead of creating new MapReduce jobs repeatedly, eliminating the shuffling of static data, and allowing asynchronous execution of map tasks.
Abstract: Iterative computation is pervasive in many applications such as data mining, web ranking, graph analysis, online social network analysis, and so on. These iterative applications typically involve massive data sets containing millions or billions of data records. This creates a demand for distributed computing frameworks that can process massive data sets on a cluster of machines. MapReduce is an example of such a framework. However, MapReduce lacks built-in support for iterative processes, which require parsing data sets iteratively. Besides specifying MapReduce jobs, users have to write a driver program that submits a series of jobs and performs convergence testing at the client. This paper presents iMapReduce, a distributed framework that supports iterative processing. iMapReduce allows users to specify the iterative computation with the separated map and reduce functions, and provides the support of automatic iterative processing within a single job. More importantly, iMapReduce significantly improves the performance of iterative implementations by (1) reducing the overhead of creating new MapReduce jobs repeatedly, (2) eliminating the shuffling of static data, and (3) allowing asynchronous execution of map tasks. We implement an iMapReduce prototype based on Apache Hadoop, and show that iMapReduce can achieve up to 5 times speedup over Hadoop for implementing iterative algorithms.

Journal ArticleDOI
TL;DR: In this article, a kinematic coupling-based off-line trajectory planning method for 2D overhead cranes is proposed for smooth trolley transportation and small payload swing, which is proven by Lyapunov techniques and Barbalat's lemmas.
Abstract: Motivated by the desire to achieve smooth trolley transportation and small payload swing, a kinematic coupling-based off-line trajectory planning method is proposed for 2-D overhead cranes. Specifically, to damp out unexpected payload swing, an antiswing mechanism is first introduced into an S-shape reference trajectory based on rigorous analysis for the coupling behavior between the payload and the trolley. After that, the combined trajectory is further tuned through a novel iterative learning strategy, which guarantees accurate trolley positioning. The performance of the proposed trajectory is proven by Lyapunov techniques and Barbalat's lemmas. Finally, some simulation and experiment results are provided to demonstrate the superior performance of the planned trajectory.

Journal ArticleDOI
TL;DR: A consequence of this paper is that it aids in gaining a better understanding of the range and coverage that BPL solutions can achieve; a preliminary step toward the system symbiosis between BPL systems and other broadband technologies in an SG environment.
Abstract: The established statistical analysis, already used to treat overhead transmission power grid networks, is now implemented to examine the factors influencing modal transmission characteristics and modal statistical performance metrics of overhead and underground low-voltage/broadband over power lines (LV/BPL) and medium-voltage/broadband over power lines (MV/BPL) channels associated with power distribution in smart grid (SG) networks. The novelty of this paper is threefold. First, a refined multidimensional chain scattering matrix (TM2) method suitable for overhead and underground LV/BPL and MV/BPL modal channels is evaluated against other relevant theoretically and experimentally proven models. Second, applying the TM2 method, the end-to-end modal channel attenuation of various LV/BPL and MV/BPL multiconductor transmission line (MTL) configurations is determined. The LV/BPL and MV/BPL transmission channels are investigated with regard to their spectral behavior and their end-to-end modal channel attenuation. It is found that the above features depend drastically on the frequency, the type of power grid, the mode considered, the MTL configuration, the physical properties of the cables used, the end-to-end distance, and the number, the electrical length, and the terminations of the branches encountered along the end-to-end BPL signal propagation. Third, the statistical properties of various overhead and underground LV/BPL and MV/BPL modal channels are investigated, revealing the correlation between end-to-end modal channel attenuation and modal root-mean-square delay spread (RMS-DS). Already verified in the case of overhead high-voltage (HV) BPL systems, this fundamental property of several wireline systems is also modally validated against relevant sets of field measurements, numerical results, and recently proposed statistical channel models for various overhead and underground LV/BPL and MV/BPL channels. Based on this common inherent attribute of either transmission or distribution BPL networks, a new unified regression trend line is proposed, giving a further boost towards BPL system intraoperability. A consequence of this paper is that it aids in gaining a better understanding of the range and coverage that BPL solutions can achieve; a preliminary step toward the system symbiosis between BPL systems and other broadband technologies in an SG environment.

Proceedings ArticleDOI
20 May 2012
TL;DR: This talk describes how developers can build applications on DynamoDB without having to deal with the complexity of operating a large-scale database.
Abstract: The reliability and scalability of an application depend on how its application state is managed. Running applications at massive scale requires datastores that can scale to operate seamlessly across thousands of servers and can deal with various failure modes such as server failures, datacenter failures, and network partitions. The goal of Amazon DynamoDB is to eliminate this complexity and operational overhead for our customers by offering a seamlessly scalable database service. In this talk, I will describe how developers can build applications on DynamoDB without having to deal with the complexity of operating a large-scale database.

Proceedings ArticleDOI
25 Feb 2012
TL;DR: A new hybrid approach, based on Algorithm-Based Fault Tolerance (ABFT), to help matrix factorizations algorithms survive fail-stop failures and theoretical analysis shows that the fault tolerance overhead sharply decreases with the scaling in the number of computing units and the problem size.
Abstract: Dense matrix factorizations, such as LU, Cholesky and QR, are widely used for scientific applications that require solving systems of linear equations, eigenvalues and linear least squares problems. Such computations are normally carried out on supercomputers, whose ever-growing scale induces a fast decline of the Mean Time To Failure (MTTF). This paper proposes a new hybrid approach, based on Algorithm-Based Fault Tolerance (ABFT), to help matrix factorization algorithms survive fail-stop failures. We consider extreme conditions, such as the absence of any reliable component and the possibility of losing both data and checksum from a single failure. We present a generic solution for protecting the right factor, where the updates are applied, of all the above-mentioned factorizations. For the left factor, where the panel has been applied, we propose a scalable checkpointing algorithm. This algorithm features a high degree of checkpointing parallelism and cooperatively utilizes the checksum storage leftover from the right factor protection. The fault-tolerant algorithms derived from this hybrid solution are applicable to a wide range of dense matrix factorizations, with minor modifications. Theoretical analysis shows that the fault tolerance overhead sharply decreases with the scaling in the number of computing units and the problem size. Experimental results of LU and QR factorization on the Kraken (Cray XT5) supercomputer validate the theoretical evaluation and confirm negligible overhead, both with and without errors.
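The right-factor checksum idea can be illustrated with a deliberately simplified example. The sketch below only shows the encoding and single-column recovery step; the paper's actual scheme keeps the checksum valid as the factorization updates the trailing matrix and adds a separate checkpointing protocol for the left factor.

```python
import numpy as np

def encode(A):
    """Append a checksum column holding the row-wise sum of A."""
    return np.hstack([A, A.sum(axis=1, keepdims=True)])

def recover_column(Ac, lost):
    """Rebuild one lost data column from the survivors and the checksum."""
    data, checksum = Ac[:, :-1], Ac[:, -1]
    survivors = np.delete(data, lost, axis=1).sum(axis=1)
    return checksum - survivors

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
Ac = encode(A)
Ac[:, 2] = 0                       # simulate a fail-stop loss of column 2
assert np.allclose(recover_column(Ac, lost=2), A[:, 2])
```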

Journal ArticleDOI
TL;DR: The design has been generalized and adopted on both homogeneous and heterogeneous wireless sensor networks and can recover all sensing data even after these data have been aggregated, a property called “recoverable.”
Abstract: Recently, several data aggregation schemes based on privacy homomorphism encryption have been proposed and investigated on wireless sensor networks. These data aggregation schemes provide better security compared with traditional aggregation since cluster heads (aggregators) can directly aggregate the ciphertexts without decryption; consequently, transmission overhead is reduced. However, the base station only retrieves the aggregated result, not individual data, which causes two problems. First, the usage of aggregation functions is constrained. For example, the base station cannot retrieve the maximum value of all sensing data if the aggregated result is the summation of sensing data. Second, the base station cannot confirm data integrity and authenticity by attaching message digests or signatures to each sensing sample. In this paper, we attempt to overcome the above two drawbacks. In our design, the base station can recover all sensing data even if these data have been aggregated. This property is called “recoverable.” Experimental results demonstrate that the transmission overhead is still reduced even though our approach makes sensing data recoverable. Furthermore, the design has been generalized and adopted on both homogeneous and heterogeneous wireless sensor networks.

Proceedings ArticleDOI
19 Sep 2012
TL;DR: A virtually costless coherence that outperforms a MESI directory protocol while at the same time reducing shared cache and network energy consumption for 15 parallel benchmarks, on 16 cores is shown.
Abstract: Much of the complexity and overhead (directory, state bits, invalidations) of a typical directory coherence implementation stems from the effort to make it “invisible” even to the strongest memory consistency model. In this paper, we show that a much simpler, directory-less/broadcast-less, multicore coherence can outperform a directory protocol but without its complexity and overhead. Motivated by recent efforts to simplify coherence, we propose a hardware approach that does not require any application guidance. The cornerstone of our approach is a dynamic, application-transparent, write-policy (write-back for private data, write-through for shared data), simplifying the protocol to just two stable states. Self-invalidation of the shared data at synchronization points allows us to remove the directory (and invalidations) completely, with just a data-race-free guarantee from software. This leads to our main result: a virtually costless coherence that outperforms a MESI directory protocol (by 4.8%) while at the same time reducing shared cache and network energy consumption (by 14.2%) for 15 parallel benchmarks, on 16 cores.
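The dual write policy and self-invalidation can be captured in a toy cache model; this is only an illustration of the idea, not the paper's hardware design, and the private/shared classification is taken as given.

```python
class Cache:
    """Toy private cache backed by a shared last-level cache (a dict)."""
    def __init__(self, llc):
        self.llc = llc
        self.lines = {}                       # addr -> (value, dirty, shared)

    def write(self, addr, value, shared):
        if shared:
            self.llc[addr] = value            # write-through for shared data
            self.lines[addr] = (value, False, True)
        else:
            self.lines[addr] = (value, True, False)   # write-back for private

    def read(self, addr):
        return self.lines[addr][0] if addr in self.lines else self.llc.get(addr)

    def sync(self):
        # At a synchronization point: flush dirty private lines and
        # self-invalidate shared lines; no directory, no invalidation messages.
        for addr, (value, dirty, shared) in list(self.lines.items()):
            if dirty:
                self.llc[addr] = value
            if shared:
                del self.lines[addr]

llc = {}
writer, reader = Cache(llc), Cache(llc)
writer.write(0x40, 7, shared=True)
reader.sync()                                 # reader drops stale shared copies
assert reader.read(0x40) == 7                 # next read fetches from the LLC
```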

Proceedings ArticleDOI
21 Sep 2012
TL;DR: A large-scale observational study that investigated mobile application interruptions in two scenarios: intended back and forth switching between applications and unintended interruptions caused by incoming phone calls reveals that these interruptions rarely happen but when they do, they may introduce a significant overhead.
Abstract: Smartphone users might be interrupted while interacting with an application, either by intended or unintended circumstances. In this paper, we report on a large-scale observational study that investigated mobile application interruptions in two scenarios: (1) intended back and forth switching between applications and (2) unintended interruptions caused by incoming phone calls. Our findings reveal that these interruptions rarely happen (at most 10% of the daily application usage), but when they do, they may introduce a significant overhead (can delay completion of a task by up to 4 times). We conclude with a discussion of the results, their limitations, and a series of implications for the design of mobile phones.

Posted Content
TL;DR: In this article, the authors study the performance of interference alignment in multiple-input multiple-output (MIMO) systems where channel knowledge is acquired through training and analog feedback and design the training and feedback system to maximize IA's effective sum-rate.
Abstract: Interference alignment (IA) is a cooperative transmission strategy that, under some conditions, achieves the interference channel's maximum number of degrees of freedom. Realizing IA gains, however, is contingent upon providing transmitters with sufficiently accurate channel knowledge. In this paper, we study the performance of IA in multiple-input multiple-output systems where channel knowledge is acquired through training and analog feedback. We design the training and feedback system to maximize IA's effective sum-rate: a non-asymptotic performance metric that accounts for estimation error, training and feedback overhead, and channel selectivity. We characterize effective sum-rate with overhead in relation to various parameters such as signal-to-noise ratio, Doppler spread, and feedback channel quality. A main insight from our analysis is that, by properly designing the CSI acquisition process, IA can provide good sum-rate performance in a very wide range of fading scenarios. Another observation from our work is that such overhead-aware analysis can help solve a number of practical network design problems. To demonstrate the concept of overhead-aware network design, we consider the example problem of finding the optimal number of cooperative IA users based on signal power and mobility.

Journal ArticleDOI
TL;DR: In this article, an off-line trolley trajectory planning method for the payload horizontal transferring task of underactuated overhead cranes is presented. And the proposed approach is feasible and efficient for crane control.
Abstract: The authors present a novel off-line trolley trajectory planning method for the payload horizontal transferring task of underactuated overhead cranes. The proposed approach is feasible and efficient for crane control. Specifically, the coupling behaviour between the actuated trolley motion and the unactuated payload swing is successfully addressed via some rigorous geometric analysis in the phase plane. Based on this, an analytical three-segment acceleration trajectory (i.e. a trapezoid velocity trajectory) is firstly obtained by carefully considering the practical constraints of crane control. To tackle the infinite jerk (discontinuity) problem as well as to show the flexibility of the proposed geometric analysis-based method, the authors then introduce some transition mechanisms to smoothen the acceleration trajectory, and to construct two types of modified acceleration trajectories. For any given transferring task, the proposed trajectory planning approach provides a novel mechanism to determine the parameters of the trajectories so that the transportation indexes, including the permitted payload swing, transferring efficiency and so on, can be met without much difficulty. Both simulation and experimental results are exhibited to illustrate the effectiveness and feasibility of the proposed approach.

Journal ArticleDOI
TL;DR: This paper studies the performance of IA in multiple-input multiple-output systems where channel knowledge is acquired through training and analog feedback, and designs the training and feedback system to maximize IA's effective sum-rate: a non-asymptotic performance metric that accounts for estimation error,Training and feedback overhead, and channel selectivity.
Abstract: Interference alignment (IA) is a cooperative transmission strategy that, under some conditions, achieves the interference channel's maximum number of degrees of freedom. Realizing IA gains, however, is contingent upon providing transmitters with sufficiently accurate channel knowledge. In this paper, we study the performance of IA in multiple-input multiple-output systems where channel knowledge is acquired through training and analog feedback. We design the training and feedback system to maximize IA's effective sum-rate: a non-asymptotic performance metric that accounts for estimation error, training and feedback overhead, and channel selectivity. We characterize effective sum-rate with overhead in relation to various parameters such as signal-to-noise ratio, Doppler spread, and feedback channel quality. A main insight from our analysis is that, by properly designing the CSI acquisition process, IA can provide good sum-rate performance in a very wide range of fading scenarios. Another observation from our work is that such overhead-aware analysis can help solve a number of practical network design problems. To demonstrate the concept of overhead-aware network design, we consider the example problem of finding the optimal number of cooperative IA users based on signal power and mobility.
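A common way to formalize the effective sum-rate trade-off sketched above (the paper's expressions are more detailed) is to discount the sum rate by the fraction of each coherence interval spent acquiring CSI and to evaluate the rate under the estimated channel:

$$ R_{\mathrm{eff}} \;=\; \left(1 - \frac{T_{\mathrm{train}} + T_{\mathrm{fb}}}{T_{\mathrm{frame}}}\right) \, \mathbb{E}\!\left[\, \sum_{k=1}^{K} \log_2\!\bigl(1 + \widehat{\mathrm{SINR}}_k\bigr) \right], $$

where $T_{\mathrm{frame}}$ is the coherence interval set by the Doppler spread, $T_{\mathrm{train}} + T_{\mathrm{fb}}$ is the training-and-feedback overhead, and $\widehat{\mathrm{SINR}}_k$ is user $k$'s SINR under the imperfect channel estimate. Spending more symbols on CSI acquisition improves $\widehat{\mathrm{SINR}}_k$ but shrinks the pre-log factor, which is exactly the balance the paper optimizes.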

Proceedings ArticleDOI
11 Jun 2012
TL;DR: This work presents a new precise dynamic race detector that leverages structured parallelism in order to address limitations of existing dynamic race detectors, and requires constant space per memory location, works in parallel, and is efficient in practice.
Abstract: Existing dynamic race detectors suffer from at least one of the following three limitations:(i)space overhead per memory location grows linearly with the number of parallel threads [13], severely limiting the parallelism that the algorithm can handle;(ii)sequentialization: the parallel program must be processed in a sequential order, usually depth-first [12, 24]. This prevents the analysis from scaling with available hardware parallelism, inherently limiting its performance;(iii) inefficiency: even though race detectors with good theoretical complexity exist, they do not admit efficient implem entations and are unsuitable for practical use [4, 18].We present a new precise dynamic race detector that leverages structured parallelism in order to address these limitations. Our algorithm requires constant space per memory location, works in parallel, and is efficient in practice. We implemented and evaluated our algorithm on a set of 15 benchmarks. Our experimental results indicate an average (geometric mean) slowdown of 2.78x on a 16-core SMP system.

Journal ArticleDOI
TL;DR: The complexity of mode-division multiplexed digital signal processing algorithms is compared for different numbers of multiplexed modes in terms of modal dispersion and distance, and it is concluded that for few-mode transmission systems the reduction of modal delay is crucial to enable long-haul performance.
Abstract: The complexities of common equalizer schemes are analytically analyzed in this paper in terms of complex multiplications per bit. Based on this approach we compare the complexity of mode-division multiplexed digital signal processing algorithms with different numbers of multiplexed modes in terms of modal dispersion and distance. It is found that training symbol based equalizers have significantly lower complexity compared to blind approaches for long-haul transmission. Among the training symbol based schemes, OFDM requires the lowest complexity for crosstalk compensation in a mode-division multiplexed receiver. The main challenge for training symbol based schemes is the additional overhead required to compensate modal crosstalk, which increases the data rate. In order to achieve 2000 km transmission, the effective modal dispersion must therefore be below 6 ps/km when the OFDM specific overhead is limited to 10%. It is concluded that for few mode transmission systems the reduction of modal delay is crucial to enable long-haul performance.
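The quoted numbers can be connected by a short back-of-the-envelope calculation, assuming the delay spread accumulates linearly with distance and that the OFDM-specific overhead is dominated by the cyclic prefix:

$$ \Delta\tau_{\max} = 6\ \tfrac{\mathrm{ps}}{\mathrm{km}} \times 2000\ \mathrm{km} = 12\ \mathrm{ns}, \qquad \frac{T_{\mathrm{CP}}}{T_{\mathrm{sym}}} \le 0.1 \;\Rightarrow\; T_{\mathrm{sym}} \ge \frac{12\ \mathrm{ns}}{0.1} = 120\ \mathrm{ns}, $$

i.e. the cyclic prefix must cover the accumulated modal delay, and capping it at 10% of the symbol duration fixes how much effective modal dispersion a 2000 km link can tolerate.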

Journal ArticleDOI
TL;DR: The AHD, a 2-dimensional measure for subacromial space, was found to be smaller on the dominant side in athletes with GIRD and was found to increase after a 6-week sleeper stretch program, but future studies are needed to determine clinical implications.
Abstract: Background:Loss of internal rotation range of motion (ROM) on the dominant side is well documented in athletes performing overhead sports activity. This altered motion pattern has been shown to change glenohumeral and scapular kinematics. This could compromise the subacromial space and explain the association between glenohumeral internal rotation deficit (GIRD) and subacromial impingement.Purpose:First, to quantify acromiohumeral distance (AHD) and compare between the dominant and nondominant side in overhead athletes with GIRD of more than 15°. Second, to investigate the effect of a sleeper stretch program on ROM and AHD.Study Design:Controlled laboratory study.Methods:Range of motion was measured with a digital inclinometer and AHD was measured with ultrasound in 62 overhead athletes with GIRD (>15°) at baseline. Differences between sides were analyzed. Athletes were randomly allocated to the stretch (n = 30) or control group (n = 32). The stretch group performed a 6-week sleeper stretch program on the...

Posted Content
TL;DR: ACE is introduced: a system that uses a probabilistic energy-minimization framework that combines a conditional random field with a Markov model to capture the temporal and spatial relations between the entities' poses and achieves its goals of having an accurate and efficient multi-entity indoors localization.
Abstract: Device-free (DF) localization in WLANs has been introduced as a value-added service that allows tracking indoor entities that do not carry any devices. Previous work in DF WLAN localization focused on the tracking of a single entity due to the intractability of the multi-entity tracking problem whose complexity grows exponentially with the number of humans being tracked. In this paper, we introduce Spot as an accurate and efficient system for multi-entity DF detection and tracking. Spot is based on a probabilistic energy minimization framework that combines a conditional random field with a Markov model to capture the temporal and spatial relations between the entities' poses. A novel cross-calibration technique is introduced to reduce the calibration overhead of multiple entities to linear, regardless of the number of humans being tracked. This also helps in increasing the system accuracy. We design the energy minimization function with the goal of being efficiently solved in mind. We show that the designed function can be mapped to a binary graph-cut problem whose solution has a linear complexity on average and a third order polynomial in the worst case. We further employ clustering on the estimated location candidates to reduce outliers and obtain more accurate tracking. Experimental evaluation in two typical testbeds, with a side-by-side comparison with the state-of-the-art, shows that Spot can achieve a multi-entity tracking accuracy of less than 1.1m. This corresponds to at least 36% enhancement in median distance error over the state-of-the-art DF localization systems, which can only track a single entity. In addition, Spot can estimate the number of entities correctly to within one difference error. This highlights that Spot achieves its goals of having an accurate and efficient software-only DF tracking solution of multiple entities in indoor environments.

Proceedings ArticleDOI
21 Oct 2012
TL;DR: This paper analyzes and compares the existing wake-up receiver prototypes and explores their benefits using simulations of two typical scenarios: with and without addressing requirements.
Abstract: Since in most wireless sensor network (WSN) scenarios nodes must operate autonomously for months or years, power management of the radio (usually consuming the largest amount of a node's energy) is crucial. In particular, reducing the power consumption during listening plays a fundamental role in the whole energy balance of a sensor node, since shutting down the receiver when no messages are expected can remarkably increase the autonomy. Idle listening is a hard challenge because incoming messages are often unpredictable and developers have to trade off low power consumption and high quality of service. This paper focuses on the benefits of introducing a wake-up receiver over simple duty cycling (wake-on radio). We analyze and compare the existing wake-up receiver prototypes and explore their benefits using simulations of two typical scenarios: with and without addressing requirements. A particular approach outperforms other solutions in terms of lifetime extension because of its very low power consumption (1 µW). We also evaluate the overhead of the addressing capability, which sometimes has a non-negligible impact on the performance.

Journal ArticleDOI
01 Dec 2012
TL;DR: This paper introduces very lightweight locking (VLL), an alternative approach to pessimistic concurrency control for main-memory database systems that avoids almost all overhead associated with traditional lock manager operations, and proposes a protocol called selective contention analysis (SCA), which enables systems implementing VLL to achieve high transactional throughput under high contention workloads.
Abstract: Locking is widely used as a concurrency control mechanism in database systems. As more OLTP databases are stored mostly or entirely in memory, transactional throughput is less and less limited by disk IO, and lock managers increasingly become performance bottlenecks. In this paper, we introduce very lightweight locking (VLL), an alternative approach to pessimistic concurrency control for main-memory database systems that avoids almost all overhead associated with traditional lock manager operations. We also propose a protocol called selective contention analysis (SCA), which enables systems implementing VLL to achieve high transactional throughput under high contention workloads. We implement these protocols both in a traditional single-machine multi-core database server setting and in a distributed database where data is partitioned across many commodity machines in a shared-nothing cluster. Our experiments show that VLL dramatically reduces locking overhead and thereby increases transactional throughput in both settings.
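The idea of colocating lightweight lock state with each record, rather than consulting a central lock manager, can be sketched with a pair of per-record counters. This is a simplification for illustration only; the full VLL protocol and its SCA extension handle queueing and blocked-transaction analysis that are omitted here.

```python
from collections import defaultdict

CX = defaultdict(int)   # per-record count of outstanding exclusive requests
CS = defaultdict(int)   # per-record count of outstanding shared requests

def request_locks(write_set, read_set):
    """Request all locks at once; return True if every lock was granted
    immediately (the transaction is 'free'), False if it must wait."""
    free = True
    for key in write_set:
        if CX[key] > 0 or CS[key] > 0:
            free = False
        CX[key] += 1
    for key in read_set:
        if CX[key] > 0:
            free = False
        CS[key] += 1
    return free

def release_locks(write_set, read_set):
    for key in write_set:
        CX[key] -= 1
    for key in read_set:
        CS[key] -= 1

print(request_locks({"acct:1"}, {"acct:2"}))   # True: no contention
print(request_locks({"acct:2"}, set()))        # False: acct:2 is share-locked
```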