
Showing papers on "Latency (engineering)" published in 2015


Proceedings ArticleDOI
17 Aug 2015
TL;DR: TIMELY is the first delay-based congestion control protocol for use in the datacenter, and it achieves its results despite having an order of magnitude fewer RTT signals than earlier delay-based schemes such as Vegas.
Abstract: Datacenter transports aim to deliver low latency messaging together with high throughput. We show that simple packet delay, measured as round-trip times at hosts, is an effective congestion signal without the need for switch feedback. First, we show that advances in NIC hardware have made RTT measurement possible with microsecond accuracy, and that these RTTs are sufficient to estimate switch queueing. Then we describe how TIMELY can adjust transmission rates using RTT gradients to keep packet latency low while delivering high bandwidth. We implement our design in host software running over NICs with OS-bypass capabilities. We show using experiments with up to hundreds of machines on a Clos network topology that it provides excellent performance: turning on TIMELY for OS-bypass messaging over a fabric with PFC lowers 99th-percentile tail latency by 9X while maintaining near line-rate throughput. Our system also outperforms DCTCP running in an optimized kernel, reducing tail latency by 13X. To the best of our knowledge, TIMELY is the first delay-based congestion control protocol for use in the datacenter, and it achieves its results despite having an order of magnitude fewer RTT signals (due to NIC offload) than earlier delay-based schemes such as Vegas.
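
To make the RTT-gradient idea concrete, the following is a minimal Python sketch of gradient-based rate adaptation in the spirit of TIMELY. It is not the paper's implementation: the class, parameter names, thresholds, and constants (EWMA weight alpha, decrease factor beta, additive step delta, and the low/high RTT thresholds) are illustrative assumptions.

# Minimal sketch of RTT-gradient rate control in the spirit of TIMELY.
# Parameter names and values are illustrative assumptions, not the paper's.
class GradientRateControl:
    def __init__(self, rate_mbps=10_000.0, t_low_us=50, t_high_us=500,
                 alpha=0.875, beta=0.3, delta_mbps=10, min_rtt_us=20):
        self.rate = rate_mbps        # current sending rate
        self.t_low, self.t_high = t_low_us, t_high_us
        self.alpha = alpha           # EWMA weight for the RTT difference
        self.beta = beta             # multiplicative decrease factor
        self.delta = delta_mbps      # additive increase step
        self.min_rtt = min_rtt_us    # propagation RTT, used to normalize the gradient
        self.prev_rtt = None
        self.rtt_diff = 0.0          # smoothed per-update RTT change

    def on_rtt_sample(self, rtt_us):
        if self.prev_rtt is None:
            self.prev_rtt = rtt_us
            return self.rate
        new_diff = rtt_us - self.prev_rtt
        self.prev_rtt = rtt_us
        self.rtt_diff = (1 - self.alpha) * self.rtt_diff + self.alpha * new_diff
        gradient = self.rtt_diff / self.min_rtt
        if rtt_us < self.t_low:            # queues empty: grow
            self.rate += self.delta
        elif rtt_us > self.t_high:         # queues long: back off hard
            self.rate *= (1 - self.beta)
        elif gradient <= 0:                # queues draining: additive increase
            self.rate += self.delta
        else:                              # queues building: decrease with the gradient
            self.rate *= (1 - self.beta * min(gradient, 1.0))
        return self.rate

ctrl = GradientRateControl()
for rtt in (40, 45, 60, 180, 650, 90):
    print(f"rtt={rtt:>3} us -> rate={ctrl.on_rtt_sample(rtt):.0f} Mbps")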

442 citations


Proceedings ArticleDOI
17 Aug 2015
TL;DR: The Pingmesh system for large-scale data center network latency measurement and analysis is developed to answer the following question affirmatively: can we get network latency between any two servers at any time in large-scale data center networks?
Abstract: Can we get network latency between any two servers at any time in large-scale data center networks? The collected latency data can then be used to address a series of challenges: telling if an application perceived latency issue is caused by the network or not, defining and tracking network service level agreement (SLA), and automatic network troubleshooting. We have developed the Pingmesh system for large-scale data center network latency measurement and analysis to answer the above question affirmatively. Pingmesh has been running in Microsoft data centers for more than four years, and it collects tens of terabytes of latency data per day. Pingmesh is widely used by not only network software developers and engineers, but also application and service developers and operators.

336 citations


Journal ArticleDOI
TL;DR: It is shown that protein kinase C agonists in combination with bromodomain inhibitor JQ1 or histone deacetylase inhibitors robustly induce HIV-1 transcription and virus production when directly compared with maximum reactivation by T cell activation.
Abstract: Reversal of HIV-1 latency by small molecules is a potential cure strategy. This approach will likely require effective drug combinations to achieve high levels of latency reversal. Using resting CD4+ T cells (rCD4s) from infected individuals, we developed an experimental and theoretical framework to identify effective latency-reversing agent (LRA) combinations. Utilizing ex vivo assays for intracellular HIV-1 mRNA and virion production, we compared 2-drug combinations of leading candidate LRAs and identified multiple combinations that effectively reverse latency. We showed that protein kinase C agonists in combination with bromodomain inhibitor JQ1 or histone deacetylase inhibitors robustly induce HIV-1 transcription and virus production when directly compared with maximum reactivation by T cell activation. Using the Bliss independence model to quantitate combined drug effects, we demonstrated that these combinations synergize to induce HIV-1 transcription. This robust latency reversal occurred without release of proinflammatory cytokines by rCD4s. To extend the clinical utility of our findings, we applied a mathematical model that estimates in vivo changes in plasma HIV-1 RNA from ex vivo measurements of virus production. Our study reconciles diverse findings from previous studies, establishes a quantitative experimental approach to evaluate combinatorial LRA efficacy, and presents a model to predict in vivo responses to LRAs.
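
The Bliss independence calculation used above has a simple arithmetic core: if two drugs acting independently produce fractional effects f_A and f_B, the expected combined effect is f_A + f_B - f_A*f_B, and an observed effect above that expectation indicates synergy. A minimal sketch with made-up effect values (not the paper's data):

# Bliss independence: predicted combined effect of two independently acting drugs.
# Effect values here are fractions of a maximal response (e.g., relative to
# full T cell activation) and are made up purely for illustration.

def bliss_predicted(f_a, f_b):
    """Expected combined effect if drugs A and B act independently."""
    return f_a + f_b - f_a * f_b

f_pkc, f_jq1 = 0.30, 0.20          # hypothetical single-drug effects
observed_combo = 0.70              # hypothetical measured combination effect

expected = bliss_predicted(f_pkc, f_jq1)      # 0.30 + 0.20 - 0.06 = 0.44
excess = observed_combo - expected            # positive excess => synergy

print(f"Bliss expectation: {expected:.2f}, observed: {observed_combo:.2f}, "
      f"excess over Bliss: {excess:+.2f}")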

335 citations


Journal ArticleDOI
TL;DR: A framework for the joint optimization of the radio and computational resource usage exploiting the tradeoff between energy consumption and latency is provided and the minimization of the total consumed energy without latency constraints is analyzed.
Abstract: Providing femto access points (FAPs) with computational capabilities will allow (either total or partial) offloading of highly demanding applications from smartphones to the so-called femto-cloud. Such offloading promises to be beneficial in terms of battery savings at the mobile terminal (MT) and/or in latency reduction in the execution of applications. However, for this promise to become a reality, the energy and/or the time required for the communication process must be compensated by the energy and/or the time savings that result from the remote computation at the FAPs. For this problem, we provide in this paper a framework for the joint optimization of the radio and computational resource usage exploiting the tradeoff between energy consumption and latency. Multiple antennas are assumed to be available at the MT and the serving FAP. As a result of the optimization, the optimal communication strategy (e.g., transmission power, rate, and precoder) is obtained, as well as the optimal distribution of the computational load between the handset and the serving FAP. This paper also establishes the conditions under which total or no offloading is optimal, determines which is the minimum affordable latency in the execution of the application, and analyzes, as a particular case, the minimization of the total consumed energy without latency constraints.
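
At its coarsest, the offloading trade-off described here comes down to comparing the time and energy of local execution against transmission plus remote execution. The sketch below is a deliberately simplified, hypothetical model (fixed link rate and transmit power, single antenna, result download ignored); the paper instead jointly optimizes the precoder, rate, power, and the split of computational load.

# Coarse offloading decision: compare local execution vs. full offload.
# All numbers and the single-rate link model are simplifying assumptions;
# the paper jointly optimizes the radio and computational resources instead.

def local_cost(cycles, cpu_hz, joules_per_cycle):
    time_s = cycles / cpu_hz
    energy_j = cycles * joules_per_cycle
    return time_s, energy_j

def offload_cost(input_bits, rate_bps, tx_power_w, remote_hz, cycles):
    tx_time = input_bits / rate_bps
    time_s = tx_time + cycles / remote_hz      # ignore the (small) result download
    energy_j = tx_power_w * tx_time            # MT only spends energy transmitting
    return time_s, energy_j

t_loc, e_loc = local_cost(cycles=2e9, cpu_hz=1e9, joules_per_cycle=1e-9)
t_off, e_off = offload_cost(input_bits=8e6, rate_bps=20e6,
                            tx_power_w=0.5, remote_hz=8e9, cycles=2e9)

print(f"local  : {t_loc:.2f} s, {e_loc:.2f} J")
print(f"offload: {t_off:.2f} s, {e_off:.2f} J")
# Offloading wins on both axes here only because the assumed link is fast
# relative to the input size; a slower link flips the energy comparison.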

330 citations


Journal ArticleDOI
TL;DR: P3 onset latency is shorter when stopping is successful, is highly correlated with SSRT, and coincides with the purported timing of the inhibition process (towards the end of SSRT).
Abstract: The frontocentral P3 event-related potential has been proposed as a neural marker of response inhibition. However, this association is disputed: some argue that P3 latency is too late relative to the timing of action stopping (stop-signal reaction time; SSRT) to index response inhibition. We tested whether P3 onset latency is a marker of response inhibition, and whether it coincides with the timing predicted by neurocomputational models. We measured EEG in 62 participants during the stop-signal task, and used independent component analysis and permutation statistics to measure the P3 onset in each participant. We show that P3 onset latency is shorter when stopping is successful, that it is highly correlated with SSRT, and that it coincides with the purported timing of the inhibition process (towards the end of SSRT). These results demonstrate the utility of P3 onset latency as a noninvasive, temporally precise neural marker of the response inhibition process.

205 citations


Journal ArticleDOI
26 Feb 2015-Cell
TL;DR: By synthetically decoupling viral dependence on the cellular environment from viral transcription, the authors show that Tat feedback is sufficient to regulate latency independent of cellular activation, demonstrating that a largely autonomous, viral-encoded program underlies HIV latency.

200 citations


Proceedings Article
04 May 2015
TL;DR: It is shown that QJUMP achieves bounded latency and reduces in-network interference by up to 300×, outperforming Ethernet Flow Control (802.3x), ECN (WRED), and DCTCP, and that it improves average flow completion times, performing close to or better than DCTCP and pFabric.
Abstract: QJUMP is a simple and immediately deployable approach to controlling network interference in datacenter networks. Network interference occurs when congestion from throughput-intensive applications causes queueing that delays traffic from latency-sensitive applications. To mitigate network interference, QJUMP applies Internet QoS-inspired techniques to datacenter applications. Each application is assigned to a latency sensitivity level (or class). Packets from higher levels are rate-limited in the end host, but once allowed into the network can "jump-the-queue" over packets from lower levels. In settings with known node counts and link speeds, QJUMP can support service levels ranging from strictly bounded latency (but with low rate) through to line-rate throughput (but with high latency variance). We have implemented QJUMP as a Linux Traffic Control module. We show that QJUMP achieves bounded latency and reduces in-network interference by up to 300×, outperforming Ethernet Flow Control (802.3x), ECN (WRED) and DCTCP. We also show that QJUMP improves average flow completion times, performing close to or better than DCTCP and pFabric.
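
The "strictly bounded latency at low rate" end of QJUMP's spectrum can be illustrated with a back-of-the-envelope bound. Under the simplifying assumptions below (each of n hosts injects at most one maximum-size packet per epoch, worst-case fan-in to a single link, per-hop overheads folded into one constant), the bound is just n serialization times plus a constant; this sketch illustrates the idea and is not the paper's exact formula.

# Back-of-the-envelope latency bound in the spirit of QJUMP's highest class.
# Assumption (not the paper's exact formula): if each of n hosts may inject at
# most one packet of p bytes per "epoch" and the worst case is all of them
# fanning in to one r-bit/s link, a packet waits behind at most n-1 others,
# plus a fixed per-hop overhead term.

def epoch_bound_us(n_hosts, packet_bytes, link_bps, fixed_overhead_us=10.0):
    serialization_us = 8 * packet_bytes / link_bps * 1e6
    return n_hosts * serialization_us + fixed_overhead_us

# Rate limiting the top class to one packet per epoch trades throughput for a
# hard latency bound; lower classes get higher rates but only statistical latency.
bound = epoch_bound_us(n_hosts=1000, packet_bytes=1500, link_bps=10e9)
print(f"worst-case epoch bound ~ {bound:.0f} us")   # ~1210 us for these assumptions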

176 citations


Proceedings Article
04 May 2015
TL;DR: The design and implementation of an adaptive replica selection mechanism, C3, that is robust to performance variability in the environment is presented and results show that C3 significantly improves the latencies along the mean, median, and tail and provides higher system throughput.
Abstract: Achieving predictable performance is critical for many distributed applications, yet difficult to achieve due to many factors that skew the tail of the latency distribution even in well-provisioned systems. In this paper, we present the fundamental challenges involved in designing a replica selection scheme that is robust in the face of performance fluctuations across servers. We illustrate these challenges through performance evaluations of the Cassandra distributed database on Amazon EC2. We then present the design and implementation of an adaptive replica selection mechanism, C3, that is robust to performance variability in the environment. We demonstrate C3's effectiveness in reducing the latency tail and improving throughput through extensive evaluations on Amazon EC2 and through simulations. Our results show that C3 significantly improves the latencies along the mean, median, and tail (up to 3 times improvement at the 99.9th percentile) and provides higher system throughput.
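
The heart of an adaptive replica selection scheme like C3 is a per-replica score that blends a smoothed response-time estimate with a concurrency-compensated queue estimate and penalizes queueing super-linearly (C3 uses a cubic term). The ranking function below is an illustrative sketch with invented field names and constants, not the system's implementation.

# Illustrative replica ranking in the spirit of C3: combine an EWMA of observed
# response time with a concurrency-compensated queue estimate, and penalize
# queueing cubically so heavily loaded replicas fall sharply in the ranking.
# Field names and the exact scoring expression are assumptions for illustration.

def replica_score(ewma_response_ms, ewma_service_ms, server_queue,
                  outstanding_from_me, n_clients):
    # Compensate for the fact that many clients besides us keep requests
    # outstanding at the same replica.
    q_hat = 1 + outstanding_from_me * n_clients + server_queue
    return (ewma_response_ms - ewma_service_ms) + (q_hat ** 3) * ewma_service_ms

replicas = {
    "r1": replica_score(4.0, 1.0, server_queue=2, outstanding_from_me=1, n_clients=10),
    "r2": replica_score(6.0, 1.5, server_queue=0, outstanding_from_me=0, n_clients=10),
}
best = min(replicas, key=replicas.get)   # lowest score wins
print(best, replicas)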

169 citations


Journal ArticleDOI
19 Nov 2015
TL;DR: This work analyzes how task replication reduces latency, and proposes a heuristic algorithm to search for the best replication strategies when it is difficult to model the empirical behavior of task execution time and use the proposed analysis techniques.
Abstract: In cloud computing jobs consisting of many tasks run in parallel, the tasks on the slowest machines (straggling tasks) become the bottleneck in the completion of the job. One way to combat the variability in machine response time is to add replicas of straggling tasks and wait for the earliest copy to finish. Using the theory of extreme order statistics, we analyze how task replication reduces latency, and its impact on the cost of computing resources. We also propose a heuristic algorithm to search for the best replication strategies when it is difficult to model the empirical behavior of task execution time and use the proposed analysis techniques. Evaluation of the heuristic policies on Google Trace data shows a significant latency reduction compared to the replication strategy used in MapReduce.
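
The order-statistics effect is easiest to see under an exponential assumption (chosen here purely for convenience; the paper handles general and empirically measured execution-time distributions): the earliest of r replicas finishes in expected time 1/(r·μ), and if the losing copies are canceled at that moment, the expected machine time stays at 1/μ. A small Monte Carlo check:

# Monte Carlo check of the replication trade-off for exponential task times.
# With r replicas of a rate-mu exponential task and earliest-copy-wins,
# E[latency] = 1/(r*mu).  If the losing copies are canceled the moment the
# first finishes, the machine time spent is r*min, whose expectation stays
# at 1/mu for exponential service, so replication here cuts latency for free.
import random

def simulate(r, mu=1.0, trials=200_000, seed=0):
    rng = random.Random(seed)
    latency = cost = 0.0
    for _ in range(trials):
        first = min(rng.expovariate(mu) for _ in range(r))
        latency += first
        cost += r * first      # machine time if losers are canceled at first finish
    return latency / trials, cost / trials

for r in (1, 2, 4):
    lat, cost = simulate(r)
    print(f"r={r}: mean latency ~ {lat:.3f} (theory {1/r:.3f}), "
          f"mean machine time ~ {cost:.3f} (theory 1.000)")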

144 citations


Journal ArticleDOI
TL;DR: The model can be applied to describe the within-host dynamics of HBV, HIV, or HTLV-1 infection and it is found that the global stability of the chronic infection equilibrium might change in some special cases when the assumptions do not hold.
Abstract: A within-host viral infection model with both virus-to-cell and cell-to-cell transmissions and three distributed delays is investigated, in which the first distributed delay describes the intracellular latency for the virus-to-cell infection, the second delay represents the intracellular latency for the cell-to-cell infection, and the third delay describes the time period that viruses penetrated into cells and infected cells release new virions. The global stability analysis of the model is carried out in terms of the basic reproduction number R0. If R0≤1, the infection-free (semi-trivial) equilibrium is the unique equilibrium and is globally stable; if R0>1, the chronic infection (positive) equilibrium exists and is globally stable under certain assumptions. Examples and numerical simulations for several special cases are presented, including various within-host dynamics models with discrete or distributed delays that have been well-studied in the literature. It is found that the global stability of the chronic infection equilibrium might change in some special cases when the assumptions do not hold. The results show that the model can be applied to describe the within-host dynamics of HBV, HIV, or HTLV-1 infection.
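
For orientation, a standard undelayed version of this model class already shows where the two transmission routes enter R0 (the paper's model replaces these instantaneous terms with three distributed-delay kernels; the system below is a simplified sketch, not the paper's exact equations):

\dot{T} = \lambda - d\,T - \beta_1 T V - \beta_2 T I, \qquad
\dot{I} = \beta_1 T V + \beta_2 T I - a\,I, \qquad
\dot{V} = k\,I - u\,V,

R_0 = \frac{\beta_1 \lambda k}{d\,a\,u} + \frac{\beta_2 \lambda}{d\,a},

where T, I, and V are uninfected target cells, infected cells, and free virus; the first term of R_0 comes from virus-to-cell transmission and the second from cell-to-cell transmission. The infection-free equilibrium is stable when R_0 ≤ 1 and a chronic (positive) equilibrium appears when R_0 > 1, mirroring the threshold behavior stated in the abstract.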

137 citations


Proceedings ArticleDOI
01 Jun 2015
TL;DR: A model is introduced for estimating the latency of a data flow when the degrees of parallelism of its tasks are changed, and it is described how the model can be used to enforce latency guarantees by determining appropriate scaling actions at runtime.
Abstract: Many Big Data applications in science and industry have arisen, that require large amounts of streamed or event data to be analyzed with low latency. This paper presents a reactive strategy to enforce latency guarantees in data flows running on scalable Stream Processing Engines (SPEs), while minimizing resource consumption. We introduce a model for estimating the latency of a data flow, when the degrees of parallelism of the tasks within are changed. We describe how to continuously measure the necessary performance metrics for the model, and how it can be used to enforce latency guarantees, by determining appropriate scaling actions at runtime. Therefore, it leverages the elasticity inherent to common cloud technology and cluster resource management systems. We have implemented our strategy as part of the Nephele SPE. To showcase the effectiveness of our approach, we provide an experimental evaluation on a large commodity cluster, using both a synthetic workload as well as an application performing real-time sentiment analysis on real-world social media data.
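
A toy version of such a latency model: approximate each task instance as an M/M/1 queue, so a task with degree of parallelism p and total arrival rate λ contributes roughly 1/(μ − λ/p) to the end-to-end latency, and pick the smallest parallelism per task that keeps the total under the target. This is a hedged illustration only; the paper builds its model from measured queue-wait and service-time statistics rather than from this formula, and rebalances across the whole flow.

# Hedged sketch: estimate end-to-end data-flow latency as a function of the
# degree of parallelism per task, then pick the cheapest parallelism vector
# that meets a latency target.  Each task instance is approximated as an
# M/M/1 queue purely for illustration.

def stage_latency_ms(arrival_per_s, parallelism, service_ms):
    mu = 1000.0 / service_ms                 # per-instance service rate (1/s)
    lam = arrival_per_s / parallelism        # per-instance arrival rate
    if lam >= mu:
        return float("inf")                  # unstable: latency unbounded
    return 1000.0 / (mu - lam)               # M/M/1 sojourn time in ms

def cheapest_scaling(stages, arrival_per_s, target_ms, max_parallel=64):
    plan = []
    for service_ms in stages:
        # Greedily give each stage the smallest parallelism that keeps its
        # share of the budget; a real controller rebalances across stages.
        budget = target_ms / len(stages)
        p = 1
        while p <= max_parallel and stage_latency_ms(arrival_per_s, p, service_ms) > budget:
            p += 1
        plan.append(p)
    total = sum(stage_latency_ms(arrival_per_s, p, s) for p, s in zip(plan, stages))
    return plan, total

plan, total = cheapest_scaling(stages=[2.0, 5.0, 1.0], arrival_per_s=800, target_ms=60)
print(plan, f"{total:.1f} ms")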

Proceedings ArticleDOI
14 Mar 2015
TL;DR: Few-to-Many (FM) incremental parallelization is introduced, which dynamically increases parallelism to reduce tail latency; it improves the 99th-percentile response time by up to 32% in Lucene and 26% in Bing compared to prior state-of-the-art parallelization, and improves tail latency by a factor of two compared to running requests sequentially.
Abstract: Interactive services, such as Web search, recommendations, games, and finance, must respond quickly to satisfy customers. Achieving this goal requires optimizing tail (e.g., 99th+ percentile) latency. Although every server is multicore, parallelizing individual requests to reduce tail latency is challenging because (1) service demand is unknown when requests arrive; (2) blindly parallelizing all requests quickly oversubscribes hardware resources; and (3) parallelizing the numerous short requests will not improve tail latency. This paper introduces Few-to-Many (FM) incremental parallelization, which dynamically increases parallelism to reduce tail latency. FM uses request service demand profiles and hardware parallelism in an offline phase to compute a policy, represented as an interval table, which specifies when and how much software parallelism to add. At runtime, FM adds parallelism as specified by the interval table indexed by dynamic system load and request execution time progress. The longer a request executes, the more parallelism FM adds. We evaluate FM in Lucene, an open-source enterprise search engine, and in Bing, a commercial Web search engine. FM improves the 99th percentile response time up to 32% in Lucene and up to 26% in Bing, compared to prior state-of-the-art parallelization. Compared to running requests sequentially in Bing, FM improves tail latency by a factor of two. These results illustrate that incremental parallelism is a powerful tool for reducing tail latency.
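
The interval-table mechanism can be sketched as a lookup keyed by the current system load and by how long a request has already been executing; all thresholds and parallelism degrees below are invented for illustration, whereas FM computes them offline from measured service-demand profiles.

# Sketch of an interval-table policy in the spirit of FM parallelization:
# given system load and how long a request has been running, look up how much
# software parallelism it should have by now.
import bisect

# interval_table[load_level] = list of (elapsed_ms_threshold, target_parallelism)
INTERVAL_TABLE = {
    "low":  [(0, 1), (5, 2), (20, 4), (80, 8)],
    "high": [(0, 1), (20, 2), (120, 4)],       # add parallelism later under load
}

def target_parallelism(load_level, elapsed_ms):
    table = INTERVAL_TABLE[load_level]
    thresholds = [t for t, _ in table]
    idx = bisect.bisect_right(thresholds, elapsed_ms) - 1
    return table[idx][1]

# A short request never gets extra threads; a long-running one ramps up.
for elapsed in (2, 30, 150):
    print(elapsed, target_parallelism("low", elapsed), target_parallelism("high", elapsed))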

Journal ArticleDOI
26 Feb 2015-Cell
TL;DR: It is proposed that latency is an evolutionary "bet-hedging" strategy whose frequency has been optimized to maximize lentiviral transmission by reducing viral extinction during mucosal infections.

Proceedings ArticleDOI
24 Aug 2015
TL;DR: Hermes, a novel fully polynomial-time approximation scheme (FPTAS), is proposed to minimize latency while meeting prescribed resource utilization constraints.
Abstract: With mobile devices increasingly able to connect to cloud servers from anywhere, resource-constrained devices can potentially perform offloading of computational tasks to either improve resource usage or improve performance. It is of interest to find optimal assignments of tasks to local and remote devices that can take into account the application-specific profile, availability of computational resources, and link connectivity, and find a balance between energy consumption costs of mobile devices and latency for delay-sensitive applications. Given an application described by a task dependency graph, we formulate an optimization problem to minimize the latency while meeting prescribed resource utilization constraints. Different from most existing works, which either rely on an integer linear programming formulation, which is NP-hard and not applicable to general task dependency graphs for latency metrics, or on intuitively derived heuristics that offer no theoretical performance guarantees, we propose Hermes, a novel fully polynomial time approximation scheme (FPTAS), to solve this problem. Hermes provides a solution with latency no more than (1 + ε) times the minimum while incurring complexity that is polynomial in the problem size and 1/ε. We evaluate the performance using real data sets collected from several benchmarks, and show that Hermes improves the latency by 16% (36% for larger-scale applications) compared to a previously published heuristic and increases CPU computing time by only 0.4% of overall latency.

Proceedings Article
04 May 2015
TL;DR: CosTLO is designed to satisfy any application's goals for latency variance by estimating the latency variance offered by any particular configuration, efficiently searching through the configuration space to select a cost-effective configuration among the ones that can offer the desired latency variance.
Abstract: We present CosTLO, a system that reduces the high latency variance associated with cloud storage services by augmenting GET/PUT requests issued by end-hosts with redundant requests, so that the earliest response can be considered. To reduce the cost overhead imposed by redundancy, unlike prior efforts that have used this approach, CosTLO combines the use of multiple forms of redundancy. Since this results in a large number of configurations in which CosTLO can issue redundant requests, we conduct a comprehensive measurement study on S3 and Azure to identify the configurations that are viable in practice. Informed by this study, we design CosTLO to satisfy any application's goals for latency variance by 1) estimating the latency variance offered by any particular configuration, 2) efficiently searching through the configuration space to select a cost-effective configuration among the ones that can offer the desired latency variance, and 3) preserving data consistency despite CosTLO's use of redundant requests. We show that, for the median PlanetLab node, CosTLO can halve the latency variance associated with fetching content from Amazon S3, with only a 25% increase in cost.
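
The underlying "issue redundant requests, keep the earliest reply" pattern is easy to sketch; the snippet below is a generic illustration with a placeholder fetch() standing in for an S3/Azure GET, not CosTLO itself (which combines several forms of redundancy and also preserves consistency for PUTs).

# Generic "issue redundant requests, keep the earliest response" pattern.
# fetch() is a placeholder; in a real deployment it would be a cloud-storage
# GET, possibly sent to different front-ends or copies of the object.
import concurrent.futures, random, time

def fetch(replica_id, key):
    time.sleep(random.uniform(0.01, 0.2))      # stand-in for variable service time
    return replica_id, f"value-of-{key}"

def redundant_get(key, n_copies=2):
    with concurrent.futures.ThreadPoolExecutor(max_workers=n_copies) as pool:
        futures = [pool.submit(fetch, i, key) for i in range(n_copies)]
        done, not_done = concurrent.futures.wait(
            futures, return_when=concurrent.futures.FIRST_COMPLETED)
        for f in not_done:
            f.cancel()                         # best effort; the cost was already incurred
        return next(iter(done)).result()

print(redundant_get("object-42"))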

Book ChapterDOI
19 Mar 2015
TL;DR: Data center network operators have to continually monitor path latency to quickly detect and re-route traffic away from high-delay path segments; existing techniques either actively send probes from end-hosts, which can only measure end-to-end latencies, or passively capture and aggregate traffic on network devices, which requires hardware modifications.
Abstract: Data center network operators have to continually monitor path latency to quickly detect and re-route traffic away from high-delay path segments. Existing latency monitoring techniques in data centers rely on either (1) actively sending probes from end-hosts, which is restricted in some cases and can only measure end-to-end latencies, or (2) passively capturing and aggregating traffic on network devices, which requires hardware modifications.

Proceedings ArticleDOI
18 Apr 2015
TL;DR: The first experiment extends previous efforts to measure latency perception by reporting on a unified study in which direct and indirect form-factors are compared for both tapping and dragging tasks, showing significant effects from both form-factor and task.
Abstract: This paper reports on two experiments designed to further our understanding of users' perception of latency in touch-based systems. The first experiment extends previous efforts to measure latency perception by reporting on a unified study in which direct and indirect form-factors are compared for both tapping and dragging tasks. Our results show significant effects from both form-factor and task, and inform system designers as to what input latencies they should aim to achieve in a variety of system types. A follow-up experiment investigates people's ability to perceive small improvements to latency in direct and indirect form-factors for tapping and dragging tasks. Our results provide guidance to system designers on the relative value of making improvements in latency that reduce but do not fully eliminate lag from their systems.

Proceedings Article
Changhyun Lee1, Chunjong Park1, Keon Jang2, Sue Moon1, Dongsu Han1 
08 Jul 2015
TL;DR: It is demonstrated that latency-based implicit feedback is accurate enough to signal a single packet's queuing delay in 10 Gbps networks, and the latency feedback can be used to perform practical and fine-grained congestion control in high-speed datacenter networks.
Abstract: The nature of congestion feedback largely governs the behavior of congestion control. In datacenter networks, where RTTs are in hundreds of microseconds, accurate feedback is crucial to achieve both high utilization and low queueing delay. Proposals for datacenter congestion control predominantly leverage ECN or even explicit in-network feedback (e.g., RCP-type feedback) to minimize the queuing delay. In this work we explore latency-based feedback as an alternative and show its advantages over ECN. Against the common belief that such implicit feedback is noisy and inaccurate, we demonstrate that latency-based implicit feedback is accurate enough to signal a single packet's queuing delay in 10 Gbps networks. DX enables accurate queuing delay measurements whose error falls within 1.98 and 0.53 microseconds using software-based and hardware-based latency measurements, respectively. This enables us to design a new congestion control algorithm that performs fine-grained control to adjust the congestion window just enough to achieve very low queuing delay while attaining full utilization. Our extensive evaluation shows that 1) the latency measurement accurately reflects the one-way queuing delay at the single-packet level; 2) the latency feedback can be used to perform practical and fine-grained congestion control in high-speed datacenter networks; and 3) DX outperforms DCTCP with 5.33× smaller median queueing delay at 1 Gbps and 1.57× at 10 Gbps.
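
As a generic illustration of how measured queuing delay can drive the congestion window (a hedged sketch of the general idea, not DX's actual update rule or its measurement machinery): grow additively while the measured queue is empty, and back off in proportion to the fraction of the RTT spent queueing.

# Generic delay-based congestion window update targeting near-zero queueing.
# This is an illustrative rule; parameter names and values are assumptions.

def update_cwnd(cwnd, queuing_delay_us, base_rtt_us, max_backoff=0.5):
    if queuing_delay_us <= 0:
        return cwnd + 1                           # queue empty: additive increase
    # Back off in proportion to how much of the RTT is spent queueing,
    # capped so a single noisy sample cannot collapse the window.
    backoff = min(queuing_delay_us / base_rtt_us, max_backoff)
    return max(1.0, cwnd * (1 - backoff) + 1)

cwnd = 10.0
for q in (0, 0, 40, 120, 0):                      # queuing-delay samples in microseconds
    cwnd = update_cwnd(cwnd, q, base_rtt_us=200)
    print(f"q={q:>3} us -> cwnd={cwnd:.2f}")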

Journal ArticleDOI
TL;DR: An extensive and careful study of latent reservoir decay by Crooks et al, reported in this issue of the Journal, confirms that the stability of the latent reservoir is not determined by treatment regimens.
Abstract: The modern era of antiretroviral therapy (ART) for human immunodeficiency virus type 1 (HIV-1) infection began in the mid-1990s with the introduction of 2 new classes of antiretroviral drugs, the protease inhibitors (PIs) and the nonnucleoside reverse-transcriptase inhibitors. Combinations consisting of 1 of these drugs along with 2 nucleoside analogue reverse-transcriptase inhibitors rapidly reduced plasma HIV-1 RNA levels to below the limit of detection of clinical assays [1, 2], leading to predictions that continued treatment for 2–3 years could cure the infection [3]. Although it did not prove curative, combination ART became the mainstay of HIV treatment, allowing durable control of viral replication and reversal or prevention of immunodeficiency [4]. A major reason why ART did not prove curative is the persistence of a latent form of the virus in a small population of resting memory CD4 T cells [5, 6]. In these cells, the viral genome is stably integrated into host cell DNA, but viral genes are not expressed at significant levels, in part because of the absence of key host transcription factors that are recruited to the HIV promoter only after T-cell activation. The latent reservoir for HIV-1 was originally demonstrated using an assay in which resting cells from patients are activated to reverse latency [6]. Viruses released from individual latently infected cells are expanded in culture. This viral outgrowth assay (VOA) was used to demonstrate the remarkable stability of the latent reservoir [7–9]. The half-life of this pool of cells was shown to be 44 months. At this rate of decay, >70 years would be required for a pool of just 10^6 cells to decay completely [8, 9]. Initial studies of the decay of the latent reservoir were completed in 2003 [9]. Since that time, remarkable advances in ART have taken place, including the introduction of new classes of antiretroviral drugs, such as integrase inhibitors, and the development of simplified regimens in which multiple antiretroviral drugs are combined into a single pill that can be taken once daily [4]. In this context, an extensive and careful study of latent reservoir decay by Crooks et al [10], reported in this issue of the Journal, is of particular interest. The authors have reexamined the stability of the latent reservoir using longitudinal VOAs in a series of 37 patients, some of whom have been receiving treatment for most of the modern ART era. Despite the long duration of treatment in some patients and the changes in ART, the authors found that the decay rate of the latent reservoir is almost exactly the same as that reported in 2003. The half-life measured by Crooks et al is 43 months [10]. The fact that the decay rate measured in the present study is no different from that measured more than a decade ago confirms that the stability of the latent reservoir is not determined by treatment regimens. As long as the regimen produces a complete or near-complete arrest of new infection events, the decay of the reservoir is determined by the biology of the resting memory T cells that harbor persistent HIV-1. Pharmacodynamic studies indicate that the nonnucleoside reverse-transcriptase inhibitors and PIs possess a remarkable potential to inhibit viral replication, a property that reflects an unexpected degree of cooperativity in their dose-response curves [11, 12]. At clinical concentrations, the best PIs can actually produce a 10 billion–fold inhibition of a single round of HIV-1 replication.
Thus, even the early combination therapy regimens may have produced complete or near-complete inhibition of new infection events in drug-adherent patients. Subsequent improvements in ART have largely affected tolerability and convenience. Viewed in this light, the finding that the reservoir decay is constant is not surprising. The cures now being routinely achieved with direct-acting antiviral drugs ...
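
The stability numbers quoted above map directly onto the often-cited eradication timescale: at a 43–44 month half-life, a reservoir of roughly 10^6 latently infected cells needs about 20 halvings, i.e., on the order of 70 years, to decay away. A quick check of that arithmetic:

# Decay time implied by the measured half-life of the latent reservoir.
import math

half_life_months = 44            # the editorial above cites half-lives of 44 and 43 months
initial_cells = 1e6              # order of magnitude of the latent reservoir

halvings_needed = math.log2(initial_cells)          # ~19.9 halvings to reach ~1 cell
years = halvings_needed * half_life_months / 12
print(f"{halvings_needed:.1f} half-lives ~ {years:.0f} years")   # ~73 years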

Proceedings ArticleDOI
18 Apr 2015
TL;DR: This work tested local latency in a variety of real-world gaming scenarios and carried out a controlled study focusing on targeting and tracking activities in an FPS game with varying degrees of local latency, showing that local latency is a real and substantial problem -- but games can mitigate the problem with appropriate compensation methods.
Abstract: Real-time games such as first-person shooters (FPS) are sensitive to even small amounts of lag. The effects of network latency have been studied, but less is known about local latency, the lag caused by input devices and displays. While local latency is important to gamers, we do not know how it affects aiming performance and whether we can reduce its negative effects. To explore these issues, we tested local latency in a variety of real-world gaming scenarios and carried out a controlled study focusing on targeting and tracking activities in an FPS game with varying degrees of local latency. In addition, we tested the ability of a lag compensation technique (based on aim assistance) to mitigate the negative effects. Our study found local latencies in the real world ranging from 23 to 243 ms, which cause significant and substantial degradation in performance (even for latencies as low as 41 ms). The study also showed that our compensation technique worked extremely well, reducing the problems caused by lag in the case of targeting, and removing the problem altogether in the case of tracking. Our work shows that local latency is a real and substantial problem -- but games can mitigate the problem with appropriate compensation methods.

Posted Content
TL;DR: A general redundancy strategy is designed that achieves a good latency-cost trade-off for an arbitrary service time distribution and generalizes and extends some results in the analysis of fork-join queues.
Abstract: In cloud computing systems, assigning a task to multiple servers and waiting for the earliest copy to finish is an effective method to combat the variability in response time of individual servers, and reduce latency. But adding redundancy may result in higher cost of computing resources, as well as an increase in queueing delay due to higher traffic load. This work helps understand when and how redundancy gives a cost-efficient reduction in latency. For a general task service time distribution, we compare different redundancy strategies in terms of the number of redundant tasks, and time when they are issued and canceled. We get the insight that the log-concavity of the task service time creates a dichotomy of when adding redundancy helps. If the service time distribution is log-convex (i.e. log of the tail probability is convex) then adding maximum redundancy reduces both latency and cost. And if it is log-concave (i.e. log of the tail probability is concave), then less redundancy, and early cancellation of redundant tasks is more effective. Using these insights, we design a general redundancy strategy that achieves a good latency-cost trade-off for an arbitrary service time distribution. This work also generalizes and extends some results in the analysis of fork-join queues.
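
The dichotomy is easy to see numerically by comparing two service-time distributions under earliest-copy-wins with cancellation at the first finish: a pure exponential (the log-linear boundary case) versus a shifted exponential (log-concave because of its deterministic component). The parameters below are made up, and queueing delay is ignored here, which the paper does account for.

# Service-time-only comparison (queueing effects ignored) of replicating a task
# r times and canceling the losers when the first copy finishes.
#  - Exponential service (log-linear boundary case): latency drops, cost is flat.
#  - Shifted exponential (log-concave): latency drops a little, cost grows with r.
import random

def trial(r, sampler, trials=200_000, seed=1):
    rng = random.Random(seed)
    lat = cost = 0.0
    for _ in range(trials):
        first = min(sampler(rng) for _ in range(r))
        lat += first
        cost += r * first
    return lat / trials, cost / trials

exp_service     = lambda rng: rng.expovariate(1.0)          # mean 1
shifted_service = lambda rng: 1.0 + rng.expovariate(2.0)    # constant 1 + mean 0.5

for name, sampler in (("exponential", exp_service), ("shifted exp", shifted_service)):
    for r in (1, 2, 4):
        lat, cost = trial(r, sampler)
        print(f"{name:11s} r={r}: latency~{lat:.2f}  machine-time~{cost:.2f}")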

Journal ArticleDOI
TL;DR: The role of miRNAs in virus latency and persistence, specifically focusing on herpesviruses, is discussed, along with potential areas of future research and how novel technologies may aid in determining how miRNAs shape virus latency in the context of herpesvirus infections.
Abstract: The identification of virally encoded microRNAs (miRNAs) has had a major impact on the field of herpes virology. Given their ability to target cellular and viral transcripts, and the lack of immune response to small RNAs, miRNAs represent an ideal mechanism of gene regulation during viral latency and persistence. In this review, we discuss the role of miRNAs in virus latency and persistence, specifically focusing on herpesviruses. We cover the current knowledge on miRNAs in establishing and maintaining virus latency and promoting survival of infected cells through targeting of both viral and cellular transcripts, highlighting key publications in the field. We also discuss potential areas of future research and how novel technologies may aid in determining how miRNAs shape virus latency in the context of herpesvirus infections.

Book ChapterDOI
Chuangen Gao1, Hua Wang1, Fangjin Zhu1, Linbo Zhai1, Shanwen Yi1 
18 Nov 2015
TL;DR: A particle swarm optimization algorithm is proposed to solve the global latency controller placement problem with capacitated controllers, taking into consideration both the latency between controllers and the capacities of controllers.
Abstract: Software-defined networking (SDN) decouples the control plane from packet processing devices and introduces the controller placement problem. Previous methods focus only on propagation latency between controllers and switches but ignore either the latency from controllers to controllers or the capacities of controllers, both of which are critical factors in real networks. In this paper, we define a global latency controller placement problem with capacitated controllers, taking into consideration both the latency between controllers and the capacities of controllers. This paper proposes a particle swarm optimization algorithm to solve the problem for the first time. Simulation results show that the algorithm has better performance in propagation latency, computation time, and convergence.
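
One way to apply PSO to a discrete placement problem like this (a hedged sketch, not the paper's encoding) is to let each particle carry a real-valued score per node, decode the k highest-scoring nodes as controller sites, and use standard velocity/position updates. The fitness below combines worst switch-to-controller latency with worst inter-controller latency; the paper additionally enforces controller capacity constraints, which this sketch omits.

# Hedged sketch: continuous PSO adapted to discrete controller placement.
import random

def fitness(sites, dist):
    n = len(dist)
    to_ctrl = max(min(dist[v][c] for c in sites) for v in range(n))
    between = max(dist[a][b] for a in sites for b in sites)
    return to_ctrl + between

def pso_placement(dist, k, particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    n = len(dist)
    decode = lambda x: sorted(range(n), key=lambda i: -x[i])[:k]
    X = [[rng.random() for _ in range(n)] for _ in range(particles)]
    V = [[0.0] * n for _ in range(particles)]
    pbest = [list(x) for x in X]
    pbest_f = [fitness(decode(x), dist) for x in X]
    g = pbest[min(range(particles), key=lambda i: pbest_f[i])][:]
    for _ in range(iters):
        for i in range(particles):
            for d in range(n):
                r1, r2 = rng.random(), rng.random()
                V[i][d] = (w * V[i][d] + c1 * r1 * (pbest[i][d] - X[i][d])
                           + c2 * r2 * (g[d] - X[i][d]))
                X[i][d] += V[i][d]
            f = fitness(decode(X[i]), dist)
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = X[i][:], f
        g = pbest[min(range(particles), key=lambda i: pbest_f[i])][:]
    return decode(g), fitness(decode(g), dist)

# Tiny random symmetric latency matrix just to exercise the sketch.
rng = random.Random(42)
N = 8
D = [[0] * N for _ in range(N)]
for a in range(N):
    for b in range(a + 1, N):
        D[a][b] = D[b][a] = rng.randint(1, 20)
print(pso_placement(D, k=2))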

Journal ArticleDOI
16 Sep 2015
TL;DR: This work analyzes the trade-off between latency and the cost of computing resources in queues with redundancy, without assuming exponential service time, and studies a generalized fork-join queueing model where finishing any k out of n tasks is sufficient to complete a job.
Abstract: A major advantage of cloud computing and storage is the large-scale sharing of resources, which provides scalability and flexibility. But resource-sharing causes variability in the latency experienced by the user, due to several factors such as virtualization, server outages, network congestion, etc. This problem is further aggravated when a job consists of several parallel tasks, because the task running on the slowest machine becomes the latency bottleneck. A promising method to reduce latency is to assign a task to multiple machines and wait for the earliest to finish. Similarly, in cloud storage systems requests to download the content can be assigned to multiple replicas, such that it is sufficient to download any one replica. Although studied actively in systems in the past few years, there is little work on rigorous analysis of how redundancy affects latency. The effect of redundancy in queueing systems was first analyzed only recently in [2, 3, 6], assuming exponential service time. General service time distribution, in particular the effect of its tail, is considered in [7, 8]. This work analyzes the trade-off between latency and the cost of computing resources in queues with redundancy, without assuming exponential service time. We study a generalized fork-join queueing model where finishing any k out of n tasks is sufficient to complete a job. The redundant tasks can be canceled when any k tasks finish, or earlier, when any k tasks start service. For the k = 1 case, we get an elegant latency and cost analysis by identifying equivalences between systems without and with early redundancy cancellation to M/G/1 and M/G/n queues, respectively. For general k, we derive bounds on the latency and cost. Please see [4] for an extended version of this work.
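
The k = 1 equivalences are useful precisely because M/G/1 latency has a closed form. Writing λ for the job arrival rate and S for the effective service time implied by the redundancy and cancellation policy (for example, the minimum of the n replicas' service times when losers are canceled at the first finish), the Pollaczek–Khinchine formula gives the expected time in system:

E[T] \;=\; E[S] \;+\; \frac{\lambda\, E[S^2]}{2\,\bigl(1 - \lambda\, E[S]\bigr)},

valid while the queue is stable (λ E[S] < 1). Redundancy changes both E[S] and E[S^2], which is how it moves both the latency and the computing cost.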

Proceedings ArticleDOI
05 Dec 2015
TL;DR: This paper proposes a novel asymmetric DRAM capable of low-cost data migration between subarrays, together with a simple management mechanism and an exploration of management policies, achieving 7.25% and 11.77% performance improvement in single- and multi-programmed workloads, respectively, over a system with traditional homogeneous DRAM.
Abstract: The evolution of DRAM technology has been driven by capacity and bandwidth during the last decade. In contrast, DRAM access latency stays relatively constant and is trending to increase. Much effort has been devoted to tolerating memory access latency, but these techniques have reached the point of diminishing returns. Having shorter bitline and wordline lengths in a DRAM device will reduce the access latency; however, doing so impacts the array efficiency. In the mainstream market, manufacturers are not willing to trade capacity for latency. Prior works have proposed hybrid-bitline DRAM designs to overcome this problem. However, those methods are either intrusive to the circuit and layout of the DRAM design, or there is no direct way to migrate data between the fast and slow levels. In this paper, we propose a novel asymmetric DRAM with the capability to perform low-cost data migration between subarrays. Based on this design, we devise a simple management mechanism and explore many management-related policies. We show that with this new design and our simple management technique we can achieve 7.25% and 11.77% performance improvement in single- and multi-programmed workloads, respectively, over a system with traditional homogeneous DRAM. This gain is above 80% of the potential performance gain of a system based on a hypothetical DRAM made entirely out of short bitlines.

Proceedings ArticleDOI
02 Feb 2015
TL;DR: The proposed prediction framework has a unique set of characteristics to predict long-running queries with high recall and improved precision; it is effective in reducing the extreme tail latency compared to a state-of-the-art predictor and improves server throughput by more than 70% because of its improved precision.
Abstract: A commercial web search engine shards its index among many servers, and therefore the response time of a search query is dominated by the slowest server that processes the query. Prior approaches target improving responsiveness by reducing the tail latency of an individual search server. They predict query execution time, and if a query is predicted to be long-running, it runs in parallel, otherwise it runs sequentially. These approaches are, however, not accurate enough for reducing a high tail latency when responses are aggregated from many servers, because this requires each server to reduce a substantially higher tail latency (e.g., the 99.99th-percentile), which we call extreme tail latency. We propose a prediction framework to reduce the extreme tail latency of search servers. The framework has a unique set of characteristics to predict long-running queries with high recall and improved precision. Specifically, prediction is delayed by a short duration to allow many short-running queries to complete without parallelization, and to allow the predictor to collect a set of dynamic features using runtime information. These features estimate query execution time with high accuracy. We also use them to estimate the prediction errors to override an uncertain prediction by selectively accelerating the query for a higher recall. We evaluate the proposed prediction framework to improve search engine performance in two scenarios using a simulation study: (1) query parallelization on a multicore processor, and (2) query scheduling on a heterogeneous processor. The results show that, for both scenarios, the proposed framework is effective in reducing the extreme tail latency compared to a state-of-the-art predictor because of its higher recall, and it improves server throughput by more than 70% because of its improved precision.
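
The control flow described above (delay briefly, then predict, then selectively parallelize long or uncertain predictions) can be sketched as follows. The stub runner, the toy feature, and the linear predictor are all invented for illustration; the paper learns its predictor from dynamic features collected while the query runs sequentially.

# Sketch of the "delay, then predict, then maybe parallelize" control flow.

def run(cost_ms, parallel=False, speedup=4.0):
    """Stand-in for executing (the rest of) a query; returns elapsed ms."""
    return cost_ms / speedup if parallel else cost_ms

def predictor(postings_scanned):
    """Toy predictor: estimated total time and an error (uncertainty) estimate."""
    predicted = 10.0 * postings_scanned       # invented linear model
    error = 0.3 * predicted                   # invented uncertainty proxy
    return predicted, error

def handle_query(cost_ms, delay_ms=5, long_ms=50, uncertainty_ms=40):
    if cost_ms <= delay_ms:
        return run(cost_ms)                   # short query: finishes before prediction
    postings_scanned = 0.1 * cost_ms          # toy runtime feature gathered during the delay
    predicted, error = predictor(postings_scanned)
    # Parallelize if predicted long, or if the prediction is too uncertain to
    # trust (selectively accelerating uncertain queries raises recall).
    if predicted >= long_ms or error >= uncertainty_ms:
        return delay_ms + run(cost_ms - delay_ms, parallel=True)
    return run(cost_ms)

for cost in (3, 20, 200):
    print(f"{cost:>3} ms of sequential work finishes in {handle_query(cost):.1f} ms")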

Proceedings ArticleDOI
01 Feb 2015
TL;DR: A combination of Early Read and Turbo Read can reduce the PCM read latency by 30%, improve the system performance by 21%, and reduce the Energy Delay Product (EDP) by 28%, while requiring minimal changes to the memory system.
Abstract: Phase Change Memory (PCM) is an emerging memory technology that can enable scalable high-density main memory systems. Unfortunately, PCM has higher read latency than DRAM, resulting in lower system performance. This paper investigates architectural techniques to improve the read latency of PCM. We observe that there is a wide distribution in cell resistance in both the SET state and the RESET state, and that the read latency of PCM is designed conservatively to handle the worst case cell. If PCM sensing can be tuned to exploit the variability in cell resistance, then we can get reduced read latency. We propose two schemes to enable better-than-worst-case read latency for PCM systems. Our first proposal, Early Read, reads the data earlier than the specified time period. Our key observation that Early Read causes only unidirectional errors (SET being read as RESET) allows us to efficiently detect data errors using Berger codes. In the uncommon case that Early Read causes data error(s), we simply retry the read operation with original latency. Our evaluations show that Early Read can reduce the read latency by 25% while incurring a storage overhead of only 10 bits per 64 byte line. Our second proposal, Turbo Read, reduces the sensing time for read operations by pumping higher current, at the expense of accidentally switching the PCM cell with small probability during the read operation. We analyze Error Correction Codes (ECC) and Probabilistic Row Scrubbing (PRS) for maintaining data integrity under Turbo Read. We show that a combination of Early Read and Turbo Read can reduce the PCM read latency by 30%, improve the system performance by 21%, and reduce the Energy Delay Product (EDP) by 28%, while requiring minimal changes to the memory system.
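
The reason Berger codes suffice here is the unidirectional-error property noted in the abstract: the check symbol is simply the count of zero bits in the data word, so any error pattern that flips bits in only one direction changes that count and is detected. The self-check below models "SET read as RESET" as a 1 -> 0 flip, which is an assumed mapping chosen for illustration.

# Berger code check: the check symbol is the count of 0-bits in the data word.
# Any purely unidirectional error pattern increases (or decreases) that count
# and is therefore detected; detection triggers a retry at the full read latency.
import random

def berger_check(data_bits):
    return data_bits.count(0)

def read_with_early_detection(stored_bits, stored_check, flip_positions):
    read_bits = list(stored_bits)
    for p in flip_positions:            # model "SET read as RESET" as a 1 -> 0 flip
        read_bits[p] = 0
    ok = berger_check(read_bits) == stored_check
    return read_bits, ok                # ok == False would trigger a retry at full latency

rng = random.Random(7)
data = [rng.randint(0, 1) for _ in range(64)]
check = berger_check(data)

ones = [i for i, b in enumerate(data) if b == 1]
_, ok_clean = read_with_early_detection(data, check, [])
_, ok_err = read_with_early_detection(data, check, rng.sample(ones, 3))
print("clean read accepted:", ok_clean)                 # True
print("3 unidirectional errors detected:", not ok_err)  # True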

Journal ArticleDOI
TL;DR: The results show that while the AQM algorithms can significantly improve steady state performance, they exacerbate TCP flow unfairness and severely struggle to quickly control queueing latency at flow startup, which can lead to large latency spikes that hurt the perceived performance.

Proceedings ArticleDOI
03 Dec 2015
TL;DR: This work experimentally demonstrates the transmission of 48 20-MHz LTE signals with a CPRI-equivalent data rate of 59 Gb/s, achieving a low round-trip digital-signal-processing latency of <2 μs and a low mean error-vector magnitude of ~2.5% after fiber transmission.
Abstract: We experimentally demonstrate the transmission of 48 20-MHz LTE signals with a CPRI-equivalent data rate of 59 Gb/s, achieving a low round-trip digital-signal-processing latency of <2 μs and a low mean error-vector magnitude of ∼2.5 % after fiber transmission.

Proceedings ArticleDOI
15 Oct 2015
TL;DR: In this article, the authors analyze how different redundancy strategies (e.g., the number of replicas, and the time when they are issued and canceled) affect the latency and computing cost.
Abstract: In cloud computing systems, assigning a job to multiple servers and waiting for the earliest copy to finish is an effective method to combat the variability in response time of individual servers. Although adding redundant replicas always reduces service time, the total computing time spent per job may be higher, thus increasing waiting time in queue. The total time spent per job is also proportional to the cost of computing resources. We analyze how different redundancy strategies, e.g., the number of replicas and the time when they are issued and canceled, affect the latency and computing cost. We get the insight that the log-concavity of the service time distribution is a key factor in determining whether adding redundancy reduces latency and cost. If the service distribution is log-convex, then adding maximum redundancy reduces both latency and cost. And if it is log-concave, then having fewer replicas and canceling the redundant requests early is more effective.