
Showing papers on "Latency (engineering)" published in 2019


Proceedings ArticleDOI
Mingxing Tan1, Bo Chen1, Ruoming Pang1, Vijay K. Vasudevan1, Mark Sandler1, Andrew Howard1, Quoc V. Le1 
01 Jun 2019
TL;DR: In this article, the authors propose an automated mobile neural architecture search (MNAS) approach, which explicitly incorporates model latency into the main objective so that the search can identify a model that achieves a good trade-off between accuracy and latency.
Abstract: Designing convolutional neural networks (CNN) for mobile devices is challenging because mobile models need to be small and fast, yet still accurate. Although significant efforts have been dedicated to design and improve mobile CNNs on all dimensions, it is very difficult to manually balance these trade-offs when there are so many architectural possibilities to consider. In this paper, we propose an automated mobile neural architecture search (MNAS) approach, which explicitly incorporates model latency into the main objective so that the search can identify a model that achieves a good trade-off between accuracy and latency. Unlike previous work, where latency is considered via another, often inaccurate proxy (e.g., FLOPS), our approach directly measures real-world inference latency by executing the model on mobile phones. To further strike the right balance between flexibility and search space size, we propose a novel factorized hierarchical search space that encourages layer diversity throughout the network. Experimental results show that our approach consistently outperforms state-of-the-art mobile CNN models across multiple vision tasks. On the ImageNet classification task, our MnasNet achieves 75.2% top-1 accuracy with 78ms latency on a Pixel phone, which is 1.8× faster than MobileNetV2 with 0.5% higher accuracy and 2.3× faster than NASNet with 1.2% higher accuracy. Our MnasNet also achieves better mAP quality than MobileNets for COCO object detection. Code is at https://github.com/tensorflow/tpu/tree/master/models/official/mnasnet.
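
To make the accuracy/latency trade-off concrete, here is a minimal sketch of a scalarized search reward that folds measured on-device latency into the objective, in the spirit of the soft trade-off described above; the target latency and weighting exponent are illustrative assumptions, not values taken from the paper.

```python
# Illustrative sketch: fold measured on-device latency into a single scalar
# reward for architecture search. The target latency and weighting exponent
# below are assumptions for illustration only.

def search_reward(accuracy: float, latency_ms: float,
                  target_ms: float = 78.0, w: float = -0.07) -> float:
    """Scalarized objective: accuracy scaled by a soft latency penalty.

    Latency below the target slightly boosts the reward, latency above the
    target penalizes it, so the search can trade the two smoothly instead of
    treating the latency target as a hard cut-off.
    """
    return accuracy * (latency_ms / target_ms) ** w

# Example: a model at exactly the target latency scores its accuracy,
# while a slower model must gain accuracy to keep the same reward.
print(search_reward(0.752, 78.0))   # 0.752
print(search_reward(0.757, 90.0))   # slightly lower despite higher accuracy
```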

1,841 citations


Proceedings ArticleDOI
01 Oct 2019
TL;DR: In this paper, a Harmonic Densely Connected Network (HDN) was proposed to achieve high efficiency in terms of both low MACs and memory traffic for real-time object detection and semantic segmentation.
Abstract: State-of-the-art neural network architectures such as ResNet, MobileNet, and DenseNet have achieved outstanding accuracy over low MACs and small model size counterparts. However, these metrics might not be accurate for predicting the inference time. We suggest that memory traffic for accessing intermediate feature maps can be a factor dominating the inference latency, especially in such tasks as real-time object detection and semantic segmentation of high-resolution video. We propose a Harmonic Densely Connected Network to achieve high efficiency in terms of both low MACs and memory traffic. The new network achieves 35%, 36%, 30%, 32%, and 45% inference time reduction compared with FC-DenseNet-103, DenseNet-264, ResNet-50, ResNet-152, and SSD-VGG, respectively. We use tools including Nvidia profiler and ARM Scale-Sim to measure the memory traffic and verify that the inference latency is indeed proportional to the memory traffic consumption and the proposed network consumes low memory traffic. We conclude that one should take memory traffic into consideration when designing neural network architectures for high-resolution applications at the edge.
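
As a back-of-the-envelope illustration of why MACs alone can mispredict inference time, the sketch below contrasts the MAC count of a single convolution with the bytes moved for its intermediate feature maps; the cost model is a deliberate simplification and not the paper's measurement methodology.

```python
# Rough cost model contrasting MACs with intermediate-feature-map memory
# traffic for one convolution layer. The byte accounting is a simplification;
# real traffic depends on tiling, cache sizes, and the accelerator's dataflow.

def conv_costs(h, w, c_in, c_out, k=3, bytes_per_elem=2):
    macs = h * w * c_in * c_out * k * k                  # multiply-accumulates
    traffic = bytes_per_elem * (h * w * c_in             # read input feature map
                                + h * w * c_out          # write output feature map
                                + k * k * c_in * c_out)  # read weights
    return macs, traffic

# High-resolution layers (e.g. segmentation of video frames) have large
# feature maps, so memory traffic can dominate even when MACs look modest.
macs, traffic = conv_costs(h=512, w=1024, c_in=64, c_out=64)
print(f"MACs: {macs/1e9:.1f} G, feature-map traffic: {traffic/1e6:.1f} MB")
```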

238 citations


Journal ArticleDOI
TL;DR: This paper evaluates the relevant PHY and MAC techniques for their ability to improve the reliability and reduce the latency and identifies that enabling long-term evolution to coexist in the unlicensed spectrum is also a potential enabler of URLLC in the unlicensed band.
Abstract: Future 5th generation networks are expected to enable three key services—enhanced mobile broadband, massive machine type communications and ultra-reliable and low latency communications (URLLC). As per the 3rd generation partnership project URLLC requirements, it is expected that the reliability of one transmission of a 32 byte packet will be at least 99.999% and the latency will be at most 1 ms. This unprecedented level of reliability and latency will yield various new applications, such as smart grids, industrial automation and intelligent transport systems. In this survey we present potential future URLLC applications, and summarize the corresponding reliability and latency requirements. We provide a comprehensive discussion on physical (PHY) and medium access control (MAC) layer techniques that enable URLLC, addressing both licensed and unlicensed bands. This paper evaluates the relevant PHY and MAC techniques for their ability to improve the reliability and reduce the latency. We identify that enabling long-term evolution to coexist in the unlicensed spectrum is also a potential enabler of URLLC in the unlicensed band, and provide numerical evaluations. Lastly, this paper discusses the potential future research directions and challenges in achieving the URLLC requirements.

185 citations


Proceedings ArticleDOI
01 Jul 2019
TL;DR: This paper proposes a prefix-to-prefix framework for simultaneous translation that implicitly learns to anticipate in a single translation model, which achieves low latency and reasonable quality (compared to full-sentence translation) on 4 directions.
Abstract: Simultaneous translation, which translates sentences before they are finished, is useful in many scenarios but is notoriously difficult due to word-order differences. While the conventional seq-to-seq framework is only suitable for full-sentence translation, we propose a novel prefix-to-prefix framework for simultaneous translation that implicitly learns to anticipate in a single translation model. Within this framework, we present a very simple yet surprisingly effective “wait-k” policy trained to generate the target sentence concurrently with the source sentence, but always k words behind. Experiments show our strategy achieves low latency and reasonable quality (compared to full-sentence translation) on 4 directions: zh↔en and de↔en.
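
The wait-k schedule itself is easy to state in code. The sketch below shows only the read/write loop; translate_step is a hypothetical stand-in for a prefix-to-prefix model that predicts the next target token, not part of the paper's released code.

```python
# Minimal sketch of a "wait-k" prefix-to-prefix schedule: start emitting
# target words k source words behind, then alternate read/write. The
# translate_step() callable is a hypothetical stand-in for a model that
# predicts the next target token from the current source prefix and the
# target tokens produced so far.

def wait_k_decode(source_stream, translate_step, k=3, eos="</s>"):
    src_prefix, target = [], []
    for word in source_stream:          # source words arrive incrementally
        src_prefix.append(word)
        if len(src_prefix) >= k:        # stay exactly k words behind the source
            target.append(translate_step(src_prefix, target))
    # source finished: flush the remaining target words (length cap avoids an
    # unbounded loop if the model never emits end-of-sentence)
    while (not target or target[-1] != eos) and len(target) < 2 * len(src_prefix) + 10:
        target.append(translate_step(src_prefix, target))
    return target
```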

163 citations


Proceedings Article
01 Jan 2019
TL;DR: Shenango achieves tail latency and throughput comparable to ZygOS, a state-of-the-art, kernel-bypass network stack, but can linearly trade latency-sensitive application throughput for batch processing application throughput, vastly increasing CPU efficiency.
Abstract: Datacenter applications demand microsecond-scale tail latencies and high request rates from operating systems, and most applications handle loads that have high variance over multiple timescales. Achieving these goals in a CPU-efficient way is an open problem. Because of the high overheads of today’s kernels, the best available solution to achieve microsecond-scale latencies is kernel-bypass networking, which dedicates CPU cores to applications for spin-polling the network card. But this approach wastes CPU: even at modest average loads, one must dedicate enough cores for the peak expected load. Shenango achieves comparable latencies but at far greater CPU efficiency. It reallocates cores across applications at very fine granularity—every 5 μs—enabling cycles unused by latency-sensitive applications to be used productively by batch processing applications. It achieves such fast reallocation rates with (1) an efficient algorithm that detects when applications would benefit from more cores, and (2) a privileged component called the IOKernel that runs on a dedicated core, steering packets from the NIC and orchestrating core reallocations. When handling latency-sensitive applications, such as memcached, we found that Shenango achieves tail latency and throughput comparable to ZygOS, a state-of-the-art, kernel-bypass network stack, but can linearly trade latency-sensitive application throughput for batch processing application throughput, vastly increasing CPU efficiency.
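
A heavily simplified sketch of the core-reallocation idea follows: on each tick, grant spare cores to applications whose queues are not draining and reclaim cores that sit idle. The congestion test and data structures are illustrative simplifications of the algorithm summarized above, not the system's implementation.

```python
# Simplified sketch of fine-grained core reallocation: on every scheduling
# tick (microseconds apart in the real system), detect applications whose
# packet/thread queues are not draining and shift spare cores toward them.
# The detection rule and data structures here are illustrative only.

def reallocate(apps, idle_cores):
    for app in apps:
        congested = app["queued"] > 0 and app["queued"] >= app["queued_last_tick"]
        if congested and idle_cores:
            app["cores"] += 1               # grant a spare core
            idle_cores -= 1
        elif not congested and app["queued"] == 0 and app["cores"] > app["min_cores"]:
            app["cores"] -= 1               # return a core for batch work
            idle_cores += 1
        app["queued_last_tick"] = app["queued"]
    return idle_cores

# Example: a latency-sensitive app with a growing queue gains a core, while
# an idle batch app gives one back.
apps = [{"name": "memcached", "queued": 8, "queued_last_tick": 5, "cores": 2, "min_cores": 1},
        {"name": "batch",     "queued": 0, "queued_last_tick": 0, "cores": 3, "min_cores": 1}]
print(reallocate(apps, idle_cores=1), [a["cores"] for a in apps])
```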

154 citations


Journal ArticleDOI
TL;DR: A novel V2V-enabled resource allocation scheme based on C-V2X technology is proposed to improve the reliability and latency of VANETs and significantly outperforms the existing schemes in terms of latency, throughput, and packet delivery ratio.
Abstract: In vehicular ad hoc networks (VANETs), cellular vehicle-to-everything (C-V2X) is an emerging technology for vehicle-to-infrastructure, vehicle-to-pedestrian, and vehicle-to-network communications, which improves traffic efficiency, road safety, and the availability of infotainment services. Herein, a novel V2V-enabled resource allocation scheme based on C-V2X technology is proposed to improve the reliability and reduce the latency of VANETs. The key idea is that V2V communications based on cellular-V2X technology among vehicles remove the contention latency and can assist longer-distance communications. Particularly, we propose a hybrid architecture, where the V2V links are controlled by the cellular eNodeB in the overlay scheme. In this scheme, every vehicle periodically checks its packet lifetime and requests the cellular eNodeB to determine V2V links. The optimum resource allocation problem at the cellular eNodeB is to choose optimum receiver vehicles to determine V2V links and allocate suitable channels to minimize the total latency. This problem is equivalent to the maximum weighted independent set problem with associated weights (MWIS-AW), which is NP-hard. In order to compute the weights, an analytical approach is developed to model the expected latency and packet delivery ratio. Moreover, a greedy cellular-based V2V link selection algorithm is proposed to solve the MWIS-AW problem and develop a theoretical performance lower bound. Simulation results show that the proposed scheme significantly outperforms the existing schemes in terms of latency, throughput, and packet delivery ratio.
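
As an illustration of the selection step, the sketch below implements a generic greedy heuristic for a maximum-weight independent set on a conflict graph; the weights and conflict construction are placeholders rather than the paper's exact latency and delivery-ratio model.

```python
# Generic greedy heuristic for a maximum-weight independent set, one way to
# realize a greedy V2V link selection: repeatedly pick the remaining candidate
# link with the best weight (e.g. expected delivery ratio per unit latency)
# and discard links that conflict with it (shared vehicle or channel).
# Graph construction and weights are placeholders, not the paper's model.

def greedy_mwis(weights, conflicts):
    """weights: {link: weight}; conflicts: {link: set of conflicting links}."""
    chosen, remaining = [], set(weights)
    while remaining:
        best = max(remaining, key=lambda link: weights[link])
        chosen.append(best)
        remaining -= {best} | conflicts.get(best, set())
    return chosen

# Toy example: links a and b share a channel, so only the heavier one is kept.
weights = {"a": 0.9, "b": 0.7, "c": 0.5}
conflicts = {"a": {"b"}, "b": {"a"}, "c": set()}
print(greedy_mwis(weights, conflicts))   # ['a', 'c']
```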

114 citations


Journal ArticleDOI
01 Feb 2019
TL;DR: In this paper, the authors propose a holistic analysis and classification of the main design principles and enabling technologies that will make it possible to deploy low-latency wireless communication networks and discuss open problems for future research.
Abstract: While the current generation of mobile and fixed communication networks has been standardized for mobile broadband services, the next generation is driven by the vision of the Internet of Things and mission-critical communication services requiring latency on the order of milliseconds or submilliseconds. However, these new stringent requirements have a large technical impact on the design of all layers of the communication protocol stack. The cross-layer interactions are complex due to the multiple design principles and technologies that contribute to the layers’ design and fundamental performance limitations. We will be able to develop low-latency networks only if we address the problem of these complex interactions from the new point of view of submillisecond latency. In this paper, we propose a holistic analysis and classification of the main design principles and enabling technologies that will make it possible to deploy low-latency wireless communication networks. We argue that these design principles and enabling technologies must be carefully orchestrated to meet the stringent requirements and to manage the inherent tradeoffs between low latency and traditional performance metrics. We also review currently ongoing standardization activities in prominent standards associations, and discuss open problems for future research.

101 citations


Journal ArticleDOI
TL;DR: The authors study performance and competition among high-frequency traders (HFTs) and find that differences in relative latency account for large differences in HFTs’ trading performance and competitive advantage.
Abstract: We study performance and competition among high-frequency traders (HFTs). We construct measures of latency and find that differences in relative latency account for large differences in HFTs’ tradi ...

100 citations


Proceedings ArticleDOI
21 Oct 2019
TL;DR: An overview of grant-free random access in 5G New Radio is provided, and two reliability-enhancing solutions are presented that result in significant performance gains, in terms of reliability as well as resource efficiency.
Abstract: Ultra-reliable low latency communication requires innovative resource management solutions that can guarantee high reliability at low latency. Grant-free random access, where channel resources are accessed without undergoing assignment through a handshake process, is proposed in 5G New Radio as an important latency reducing solution. However, this comes at an increased likelihood of collisions resulting from uncoordinated channel access. Novel reliability enhancement techniques are therefore needed. This article provides an overview of grant-free random access in 5G New Radio focusing on the ultra-reliable low latency communication service class, and presents two reliability-enhancing solutions. The first proposes retransmissions over shared resources, whereas the second proposal incorporates grant-free transmission with non-orthogonal multiple access where overlapping transmissions are resolved through the use of advanced receivers. Both proposed solutions result in significant performance gains, in terms of reliability as well as resource efficiency. For example, the proposed non-orthogonal multiple access scheme can support a normalized load of more than 1.5 users/slot at packet loss rates of ~10⁻⁵, a significant improvement over conventional grant-free schemes like slotted ALOHA.
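
A small Monte-Carlo sketch can illustrate the qualitative gap between collision-limited grant-free access and a receiver that resolves a few overlapping transmissions; the arrival model, parameters, and resolvability assumption below are illustrative and not the evaluation set-up of the article.

```python
# Monte-Carlo sketch comparing conventional grant-free access (any collision
# on a resource is lost) with a NOMA-style receiver that can resolve up to
# `resolvable` overlapping transmissions per resource. Poisson arrivals and
# all parameter values are illustrative assumptions.
import numpy as np

def packet_loss(load_per_slot, n_resources, resolvable, n_slots=100_000, seed=0):
    rng = np.random.default_rng(seed)
    sent = lost = 0
    for _ in range(n_slots):
        n = rng.poisson(load_per_slot)                 # users transmitting this slot
        if n == 0:
            continue
        picks = rng.integers(0, n_resources, size=n)   # each picks a resource at random
        counts = np.bincount(picks, minlength=n_resources)
        lost += int(np.where(counts > resolvable, counts, 0).sum())
        sent += n
    return lost / sent

# Collision-limited access degrades quickly past ~1 user/slot, while a
# receiver resolving a few overlaps sustains a higher normalized load.
print(packet_loss(1.5, n_resources=1, resolvable=1))   # ALOHA-like receiver
print(packet_loss(1.5, n_resources=1, resolvable=4))   # NOMA-like receiver
```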

95 citations


Proceedings Article
26 Feb 2019
TL;DR: It is demonstrated that Shinjuku provides significant tail latency and throughput improvements over IX and ZygOS for a wide range of workload scenarios and achieves up to 6.6× higher throughput and 88% lower tail latency.
Abstract: The recently proposed dataplanes for microsecond scale applications, such as IX and ZygOS, use nonpreemptive policies to schedule requests to cores. For the many real-world scenarios where request service times follow distributions with high dispersion or a heavy tail, they allow short requests to be blocked behind long requests, which leads to poor tail latency. Shinjuku is a single-address space operating system that uses hardware support for virtualization to make preemption practical at the microsecond scale. This allows Shinjuku to implement centralized scheduling policies that preempt requests as often as every 5µsec and work well for both light and heavy tailed request service time distributions. We demonstrate that Shinjuku provides significant tail latency and throughput improvements over IX and ZygOS for a wide range of workload scenarios. For the case of a RocksDB server processing both point and range queries, Shinjuku achieves up to 6.6× higher throughput and 88% lower tail latency.

94 citations


Journal ArticleDOI
TL;DR: This paper performs spectrum and power allocation to maximize the sum ergodic capacity of vehicle-to-infrastructure (V2I) links while guaranteeing the LVP for vehicle-to-vehicle (V2V) links using the effective capacity theory.
Abstract: Vehicular communications face a tremendous challenge in guaranteeing low latency for safety-critical information exchange due to fast varying channels caused by high mobility. Focusing on the tail behavior of random latency experienced by packets, latency violation probability (LVP) deserves particular attention. Based on only large-scale channel information, this paper performs spectrum and power allocation to maximize the sum ergodic capacity of vehicle-to-infrastructure (V2I) links while guaranteeing the LVP for vehicle-to-vehicle (V2V) links. Using the effective capacity theory, we explicitly express the latency constraint with introduced latency exponents. Then, the resource allocation problem is decomposed into a pure power allocation subproblem and a pure spectrum allocation subproblem, both of which can be solved with global optimum in polynomial time. Simulation results show that the effective capacity model can accurately characterize the LVP. In addition, the effectiveness of the proposed algorithm is demonstrated from the perspectives of the capacity of the V2I links and the latency of the V2V links.
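
For reference, the textbook effective-capacity relations that underlie such a latency-exponent formulation are sketched below; the paper's exact notation and constraints may differ.

```latex
% Textbook effective-capacity relations (for reference only; the paper's
% exact notation may differ). S(t) = \sum_{i=1}^{t} R[i] is the accumulated
% service (channel) process and \theta the latency (QoS) exponent:
\begin{aligned}
  E_C(\theta) &= -\lim_{t\to\infty}\frac{1}{\theta t}
      \ln \mathbb{E}\!\left[e^{-\theta S(t)}\right],\\[2pt]
  \Pr\{D > D_{\max}\} &\lesssim e^{-\theta\,\mu\,D_{\max}}
  \quad\text{for a constant arrival rate } \mu \le E_C(\theta).
\end{aligned}
% Thus choosing \theta with \theta\,\mu\,D_{\max} \ge \ln(1/\varepsilon)
% turns a latency-violation-probability target \varepsilon into an
% effective-capacity (rate) constraint on each V2V link.
```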

Proceedings ArticleDOI
25 Mar 2019
TL;DR: In this article, a low latency on-device inference runtime is proposed, which accelerates each NN layer by simultaneously utilizing diverse heterogeneous processors on a mobile device and by performing computations using processor-friendly quantization.
Abstract: Emerging mobile services heavily utilize Neural Networks (NNs) to improve user experiences. Such NN-assisted services depend on fast NN execution for high responsiveness, demanding mobile devices to minimize the NN execution latency by efficiently utilizing their underlying hardware resources. To better utilize the resources, existing mobile NN frameworks either employ various CPU-friendly optimizations (e.g., vectorization, quantization) or exploit data parallelism using heterogeneous processors such as GPUs and DSPs. However, their performance is still bounded by the performance of the single target processor, so that realtime services such as voice-driven search often fail to react to user requests in time. It is obvious that this problem will become more serious with the introduction of more demanding NN-assisted services. In this paper, we propose μLayer, a low latency on-device inference runtime which significantly improves the latency of NN-assisted services. μLayer accelerates each NN layer by simultaneously utilizing diverse heterogeneous processors on a mobile device and by performing computations using processor-friendly quantization. Two key findings motivate our work: 1) the existing frameworks are limited by single-processor performance as they execute an NN layer using only a single processor, and 2) the CPU and the GPU on the same mobile device achieve comparable computational throughput, making cooperative acceleration highly promising. First, to accelerate an NN layer using both the CPU and the GPU at the same time, μLayer employs a layer distribution mechanism which completely removes redundant computations between the processors. Next, μLayer optimizes the per-processor performance by making the processors utilize different data types that maximize their utilization. In addition, to minimize potential latency increases due to overly aggressive workload distribution, μLayer selectively increases the distribution granularity to divergent layer paths. Our experiments using representative NNs and mobile devices show that μLayer significantly improves the speed and the energy efficiency of on-device inference by up to 69.6% and 58.1%, respectively, over the state-of-the-art NN execution mechanism.
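
The layer-distribution idea can be illustrated with a channel-wise split of a single convolution across two processors; numpy stands in for the real CPU/GPU kernels, and the 60/40 split ratio is an assumed value rather than one chosen by a runtime.

```python
# Sketch of channel-wise distribution of one convolution layer across two
# processors: each processor computes a disjoint slice of the output
# channels, so no computation is duplicated, and the slices are concatenated
# afterwards. numpy stands in for the real CPU/GPU kernels; the split ratio
# is an illustrative assumption (a runtime would derive it from measured
# per-processor throughput).
import numpy as np

def conv1x1(x, w):                       # x: (C_in, H, W), w: (C_out, C_in)
    return np.tensordot(w, x, axes=([1], [0]))

def distributed_conv1x1(x, w, gpu_share=0.6):
    split = int(w.shape[0] * gpu_share)           # output channels per processor
    out_gpu = conv1x1(x, w[:split])               # would run on the GPU
    out_cpu = conv1x1(x, w[split:])               # would run on the CPU, in parallel
    return np.concatenate([out_gpu, out_cpu], axis=0)

x = np.random.rand(32, 56, 56).astype(np.float32)
w = np.random.rand(64, 32).astype(np.float32)
# The distributed result matches the single-processor result exactly.
assert np.allclose(distributed_conv1x1(x, w), conv1x1(x, w), atol=1e-4)
```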

Journal ArticleDOI
TL;DR: The heterogeneous multi-layer MEC (HetMEC) is proposed where data that cannot be timely processed at the edge are allowed to be offloaded to the upper-layer MEC servers, and finally to the cloud center (CC) with more powerful computing capacity.
Abstract: Driven by great demands on low-latency services of the edge devices (EDs), mobile edge computing (MEC) has been proposed to enable the computing capacities at the edge of the radio access network. However, conventional MEC servers suffer some disadvantages such as limited computing capacity, preventing computation-intensive tasks from being processed on time. To relieve this issue, we propose the heterogeneous multi-layer MEC (HetMEC) where data that cannot be timely processed at the edge are allowed to be offloaded to the upper-layer MEC servers, and finally to the cloud center (CC) with more powerful computing capacity. We aim to minimize the system latency, i.e., the total computing and transmission time on all layers for the data generated by the EDs. We design the latency minimization algorithm by jointly coordinating the task assignment, computing, and transmission resources among the EDs, multi-layer MEC servers, and the CC. The simulation results indicate that our proposed algorithm can achieve a lower latency and higher processing rate than the conventional MEC scheme.

Proceedings Article
24 May 2019
TL;DR: In this paper, the authors combined homomorphic encryption with neural networks to make inferences while protecting against information leakage, and applied the method of transfer learning to provide private inference services using deep networks with latency of ∼0.16 seconds.
Abstract: When applying machine learning to sensitive data, one has to find a balance between accuracy, information security, and computational-complexity. Recent studies combined Homomorphic Encryption with neural networks to make inferences while protecting against information leakage. However, these methods are limited by the width and depth of neural networks that can be used (and hence the accuracy) and exhibit high latency even for relatively simple networks. In this study we provide two solutions that address these limitations. In the first solution, we present more than 10× improvement in latency and enable inference on wider networks compared to prior attempts with the same level of security. The improved performance is achieved by novel methods to represent the data during the computation. In the second solution, we apply the method of transfer learning to provide private inference services using deep networks with latency of ∼0.16 seconds. We demonstrate the efficacy of our methods on several computer vision tasks.

Proceedings ArticleDOI
21 Sep 2019
TL;DR: This work investigates the effects of latency on task performance and perceived workload for different driving scenarios and suggests that latency has a negative influence on driving performance and subjective factors and led to decreased confidence in Teleoperated Driving during the study.
Abstract: In the domain of automated driving, numerous (technological) problems were solved in recent years, but many limitations remain that could eventually prevent the deployment of automated driving systems (ADS) beyond SAE level 3. A remote operating fallback authority might be a promising solution. In order for teleoperation to function reliably and universally, it will make use of existing infrastructure, such as cellular networks. Unfortunately, cellular networks might suffer from variable performance. In this work, we investigate the effects of latency on task performance and perceived workload for different driving scenarios. Results from a simulator study (N=28) suggest that latency has a negative influence on driving performance and subjective factors and led to decreased confidence in Teleoperated Driving during the study. A latency of about 300 ms already led to a deteriorated driving performance, whereas variable latency did not consequently deteriorate driving performance.

Journal ArticleDOI
13 Nov 2019-PLOS ONE
TL;DR: The proposed intelligent FC analytical model and algorithm use a fuzzy inference system combined with reinforcement learning and neural network evolution strategies for data packet allocation and selection in an IoT–FC environment, and the results indicated better performance of the proposed approach compared with existing methods.
Abstract: Fog computing (FC) is an evolving computing technology that operates in a distributed environment. FC aims to bring cloud computing features close to edge devices. The approach is expected to fulfill the minimum latency requirement for healthcare Internet-of-Things (IoT) devices. Healthcare IoT devices generate various volumes of healthcare data. This large volume of data results in high data traffic that causes network congestion and high latency. An increase in round-trip time delay owing to large data transmission and large hop counts between IoTs and cloud servers render healthcare data meaningless and inadequate for end-users. Time-sensitive healthcare applications require real-time data. Traditional cloud servers cannot fulfill the minimum latency demands of healthcare IoT devices and end-users. Therefore, communication latency, computation latency, and network latency must be reduced for IoT data transmission. FC affords the storage, processing, and analysis of data from cloud computing to a network edge to reduce high latency. A novel solution for the abovementioned problem is proposed herein. It includes an analytical model and a hybrid fuzzy-based reinforcement learning algorithm in an FC environment. The aim is to reduce high latency among healthcare IoTs, end-users, and cloud servers. The proposed intelligent FC analytical model and algorithm use a fuzzy inference system combined with reinforcement learning and neural network evolution strategies for data packet allocation and selection in an IoT-FC environment. The approach is tested on simulators iFogSim (Net-Beans) and Spyder (Python). The obtained results indicated the better performance of the proposed approach compared with existing methods.

Journal ArticleDOI
TL;DR: In this paper, the authors outline the delay components and packet loss probabilities in typical ultrareliable low-latency communications (URLLC) scenarios and formulate the constraints on E2E delay and overall packet loss probability.
Abstract: Ultrareliable low-latency communications (URLLC) is one of three emerging application scenarios in 5G new radio (NR) for which physical layer design aspects have been specified. With 5G NR, we can guarantee reliability and latency in radio access networks. However, for communication scenarios where the transmission involves both radio access and wide-area core networks, the delay in radio access networks contributes to only a portion of the end-to-end (E2E) delay. In this article, we outline the delay components and packet loss probabilities in typical URLLC scenarios and formulate the constraints on E2E delay and overall packet loss probability. Then, we summarize possible solutions in the physical, link, and network layers as well as the cross-layer design. Finally, we discuss open issues in prediction and communication codesign for URLLC in wide-area, large-scale networks.
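
Written as simple component budgets, such E2E constraints take roughly the following standard form (for orientation only; the article's exact decomposition may differ):

```latex
% E2E latency and reliability written as simple component budgets (a
% standard budgeting view, not necessarily the article's decomposition).
% With radio-access, core/backhaul, queueing and processing delays
% D_r, D_c, D_q, D_p and per-segment loss probabilities \epsilon_i:
\begin{aligned}
  D_{\mathrm{E2E}} &= D_r + D_c + D_q + D_p \;\le\; D_{\max},\\
  1-\prod_i (1-\epsilon_i) \;&\approx\; \sum_i \epsilon_i \;\le\;
  \epsilon_{\mathrm{E2E}} \qquad (\epsilon_i \ll 1),
\end{aligned}
% so each segment must be allotted a share of both the delay budget
% D_{\max} and the overall loss budget \epsilon_{\mathrm{E2E}}.
```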

Journal ArticleDOI
TL;DR: In this paper, the authors focus on the virally encoded G-protein coupled receptor, US28, which suppresses the major immediate early promoter (MIEP) in early myeloid lineage cells.
Abstract: Human cytomegalovirus (HCMV) latency and reactivation is regulated by the chromatin structure at the major immediate early promoter (MIEP) within myeloid cells. Both cellular and viral factors are known to control this promoter during latency; here we review the known mechanisms for MIEP regulation during latency. We will then focus on the virally encoded G-protein coupled receptor, US28, which suppresses the MIEP in early myeloid lineage cells. The importance of this function is underlined by the fact that US28 is essential for HCMV latency in CD34+ progenitor cells and CD14+ monocytes. We will describe cellular signalling pathways modulated by US28 to direct MIEP suppression during latency and demonstrate how US28 is able to ‘regulate the regulators’ of HCMV latency. Finally, we will describe how cell-surface US28 can be a target for antiviral therapies directed at the latent viral reservoir.

Proceedings ArticleDOI
24 Jun 2019
TL;DR: This work compares three variants of an ILP-based algorithm that aim to minimize the E2E latency of requested services, the service provisioning cost, and the VSF migration frequency, respectively, and proposes a heuristic to address the scalability issue of the ILP-based solutions.
Abstract: The 5th generation mobile network (5G) is expected to support numerous services with versatile quality of service (QoS) requirements such as high data rates and low end-to-end (E2E) latency. It is widely agreed that E2E latency can be significantly reduced by moving content/computing capability closer to the network edge. However, since the edge nodes (i.e., base stations) have limited computing capacity, mobile network operators shall make a decision on how to provision the computing resources to the services in order to make sure that the E2E latency requirements of the services are satisfied while the network resources (e.g., computing, radio, and transport network resources) are used in an efficient manner. In this work, we employ integer linear programming (ILP) techniques to formulate and solve a joint user association, service function chain (SFC) placement, and resource allocation problem where SFCs, composed of virtualized service functions (VSFs), represent user requested services that have certain E2E latency and data rate requirements. Specifically, we compare three variants of an ILP-based algorithm that aim to minimize E2E latency of requested services, service provisioning cost, and VSF migration frequency, respectively. We then propose a heuristic in order to address the scalability issue of the ILP-based solutions. Simulation results demonstrate the effectiveness of the proposed heuristic algorithm.

Journal ArticleDOI
TL;DR: Several design implications are suggested for improving performance, including adding features to the automation that will allow the operator to use common strategies and providing necessary information using multiple sensory channels.

Proceedings ArticleDOI
04 Apr 2019
TL;DR: RPCValet is introduced, an NI-driven RPC load-balancing design for architectures with hardware-terminated protocols and integrated NIs, that delivers near-optimal tail latency and improves throughput under tight tail latency goals by up to 1.4x, and reduces tail latency before saturation by 4x for RPCs with μs-scale service times.
Abstract: Modern online services come with stringent quality requirements in terms of response time tail latency. Because of their decomposition into fine-grained communicating software layers, a single user request fans out into a plethora of short, μs-scale RPCs, aggravating the need for faster inter-server communication. In reaction to that need, we are witnessing a technological transition characterized by the emergence of hardware-terminated user-level protocols (e.g., InfiniBand/RDMA) and new architectures with fully integrated Network Interfaces (NIs). Such architectures offer a unique opportunity for a new NI-driven approach to balancing RPCs among the cores of manycore server CPUs, yielding major tail latency improvements for μs-scale RPCs. We introduce RPCValet, an NI-driven RPC load-balancing design for architectures with hardware-terminated protocols and integrated NIs, that delivers near-optimal tail latency. RPCValet's RPC dispatch decisions emulate the theoretically optimal single-queue system, without incurring synchronization overheads currently associated with single-queue implementations. Our design improves throughput under tight tail latency goals by up to 1.4x, and reduces tail latency before saturation by up to 4x for RPCs with μs-scale service times, as compared to current systems with hardware support for RPC load distribution. RPCValet performs within 15% of the theoretically optimal single-queue system.

Journal ArticleDOI
TL;DR: In this paper, the authors analyze the origin of the intrinsic timing jitter in superconducting nanowire single photon detectors (SNSPDs) in terms of fluctuations in the latency of the detector response, which is determined by the microscopic physics of the photon detection process.
Abstract: We analyze the origin of the intrinsic timing jitter in superconducting nanowire single photon detectors (SNSPDs) in terms of fluctuations in the latency of the detector response, which is determined by the microscopic physics of the photon detection process. We demonstrate that fluctuations in the physical parameters which determine the latency give rise to the intrinsic timing jitter. We develop a general description of latency by introducing the explicit time dependence of the internal detection efficiency. By considering the dynamic Fano fluctuations together with static spatial inhomogeneities, we study the details of the connection between latency and timing jitter. We develop both a simple phenomenological model and a more general microscopic model of detector latency and timing jitter based on the solution of the generalized time-dependent Ginzburg-Landau equations for the 1D hotbelt geometry. While the analytical model is sufficient for qualitative interpretation of recent data, the general approach establishes the framework for a quantitative analysis of detector latency and the fundamental limits of intrinsic timing jitter. These theoretical advances can be used to interpret the results of recent experiments measuring the dependence of detection latency and timing jitter on photon energy to the few-picosecond level.

Proceedings ArticleDOI
09 May 2019
TL;DR: Using the reinforcement learning (RL) algorithm Deep Q-network (DQN) to select the users who offload at the same time, without knowing the actions of other users in advance, this paper obtains the optimal user combination state and minimizes the system offloading latency.
Abstract: Both non-orthogonal multiple access (NOMA) and mobile edge computing (MEC) have been recognized as important techniques in future wireless networks, and the combination of them has received attention recently. It has been demonstrated that in a dual-user scenario, the use of NOMA can effectively reduce the latency and energy consumption of MEC offloading. However, the more practical scenario of multiple users needs to be considered further. In this paper, we consider a NOMA-MEC system with multiple users and a single MEC server, and investigate the problem of minimizing offloading latency. By using the reinforcement learning (RL) algorithm Deep Q-network (DQN) to select the users who offload at the same time, without knowing the actions of other users in advance, we obtain the optimal user combination state and minimize the system offloading latency. Simulation results show that the proposed method can significantly reduce the system offloading latency in the multi-user scenario of applying NOMA to MEC.
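
As a rough illustration of the selection loop, the sketch below uses a bandit-style tabular update in place of the DQN; the latency model, action set, and all parameters are placeholders, not the system model of the paper.

```python
# Heavily simplified, bandit-style stand-in for DQN-based selection: actions
# are candidate user combinations offloading in the same NOMA transmission,
# the reward is the negative of a simulated system offloading latency, and an
# epsilon-greedy rule learns which combination to prefer. The latency model
# and all parameters are illustrative placeholders only.
import itertools, random

USERS = range(4)
ACTIONS = [c for r in (1, 2, 3) for c in itertools.combinations(USERS, r)]

def simulated_latency(combo, rng):
    # Placeholder: more simultaneous users share the channel via NOMA but add
    # interference; a real model would use rates, task sizes and MEC capacity.
    return 8.0 / len(combo) + 0.8 * len(combo) + rng.gauss(0, 0.3)

def train(episodes=5000, eps=0.1, lr=0.05, seed=0):
    rng = random.Random(seed)
    q = {a: 0.0 for a in ACTIONS}
    for _ in range(episodes):
        a = rng.choice(ACTIONS) if rng.random() < eps else max(q, key=q.get)
        reward = -simulated_latency(a, rng)     # lower latency -> higher reward
        q[a] += lr * (reward - q[a])            # incremental value update
    return max(q, key=q.get)                    # learned best user combination

print(train())
```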

Journal ArticleDOI
TL;DR: The proposed compressed sensing-based random access protocol (CS-RACH), which is suitable for servicing a large number of machine-type communication devices in an Internet of Things (IoT) network, considerably reduces the access latency under reasonable conditions in IoT environments.
Abstract: This paper proposes a compressed sensing-based random access protocol (CS-RACH), which is suitable for servicing a large number of machine-type communication devices in an Internet of Things (IoT) network. In CS-RACH, we utilize a larger number of unique preambles compared to conventional LTE-RACH; however, the compressed sensing technique makes it possible to simultaneously detect the users with high accuracy. Compared to the user detection in conventional LTE-RACH, the proposed user detection can eliminate preamble collisions and decrease the collision probability, and thereby the overall access latency is significantly reduced. To prove the benefits of the proposed CS-RACH, we mathematically analyze and compare the access latency performance of LTE-RACH and CS-RACH. In particular, based on the least absolute shrinkage and selection operator approach, we derive a normalized throughput, access success probability, and average access latency. Our simulation results also exhibit that the proposed CS-RACH considerably reduces the access latency under reasonable conditions in IoT environments.
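
The detection step can be illustrated with a LASSO recovery of which preambles are active in a received superposition; the real-valued signal model, preamble dictionary, and detection threshold below are illustrative assumptions rather than the paper's exact formulation.

```python
# Sketch of compressed-sensing preamble/user detection: the received slot is
# modeled as a sparse superposition of known preamble sequences, and an ISTA
# solver for the LASSO recovers which preambles are active. Real-valued
# signals, the random dictionary, noise level and threshold are illustrative.
import numpy as np

def ista_lasso(A, y, lam=0.05, n_iter=500):
    L = np.linalg.norm(A, 2) ** 2                # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - y) / L            # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft threshold
    return x

rng = np.random.default_rng(1)
n_preambles, seq_len, n_active = 256, 64, 5
A = rng.standard_normal((seq_len, n_preambles)) / np.sqrt(seq_len)  # preamble dictionary
active = rng.choice(n_preambles, n_active, replace=False)
x_true = np.zeros(n_preambles)
x_true[active] = 1.0
y = A @ x_true + 0.01 * rng.standard_normal(seq_len)                # received superposition

x_hat = ista_lasso(A, y)
detected = np.flatnonzero(np.abs(x_hat) > 0.3)
print(sorted(active), "->", sorted(detected))   # detected preambles should match the active set
```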

Journal ArticleDOI
TL;DR: This work utilizes k-means clustering to help store data blocks and thereby improve the read and write latency of the DCN, where k is the number of cores in the fat-tree.
Abstract: Low-latency data access is becoming an increasingly important challenge. The proper placement of data blocks can reduce data travel among distributed storage systems, which contributes significantly to the latency reduction. However, the dominant data placement optimization has primarily relied on prior known data requests or static initial data distribution, which ignores the dynamics of clients' data access requests and networks. Learning technology can help the data center networks (DCNs) learn from historical access information and make optimal data storage decisions. Considering a more practical DCN with a fat-tree topology, we utilize the k-means learning technique to help store data blocks and thereby improve the read and write latency of the DCN, where k is the number of cores in the fat-tree. The evaluation results demonstrate that the average write and read latency of the whole system can be lowered by 33% and 45%, respectively. The best setting of the parameter k, which equals the number of cores in the DCN, is analyzed and recommended to provide guidance for real applications.
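
One simple reading of the k-means placement idea is sketched below: cluster blocks by the pods that access them and store each cluster near its dominant pod; the feature construction and mapping rule are illustrative assumptions, not the paper's algorithm.

```python
# Sketch of k-means-driven block placement: each data block is described by
# its historical access counts from the k pods of a fat-tree, blocks are
# clustered into k groups, and each group is stored under the pod it is
# accessed from the most, so reads/writes usually avoid core-layer hops.
# The synthetic access counts and mapping rule are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

k = 4                                                   # cores/pods in the fat-tree
rng = np.random.default_rng(0)
# rows: data blocks; columns: access counts observed from each pod
access = rng.poisson(2, size=(200, k)) + 20 * np.eye(k)[rng.integers(0, k, 200)]

labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(access)

# place each cluster under the pod that generates most of its accesses
placement = {c: int(np.argmax(access[labels == c].sum(axis=0))) for c in range(k)}
block_to_pod = [placement[label] for label in labels]
print(placement)
```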

Journal ArticleDOI
26 Mar 2019-Mbio
TL;DR: The QUECEL model closely mimics RNA induction profiles seen in cells from well-suppressed HIV patient samples using the envelope detection of in vitro transcription sequencing (EDITS) assay and is a robust and reproducible tool to study the molecular mechanisms underlying HIV latency.
Abstract: The latent HIV reservoir is generated following HIV infection of activated effector CD4 T cells, which then transition to a memory phenotype. Here, we describe an ex vivo method, called QUECEL (quiescent effector cell latency), that mimics this process efficiently and allows production of large numbers of latently infected CD4+ T cells. Naive CD4+ T cells were polarized into the four major T cell subsets (Th1, Th2, Th17, and Treg) and subsequently infected with a single-round reporter virus which expressed GFP/CD8a. The infected cells were purified and coerced into quiescence using a defined cocktail of cytokines, including tumor growth factor beta, interleukin-10 (IL-10), and IL-8, producing a homogeneous population of latently infected cells. Flow cytometry and transcriptome sequencing (RNA-Seq) demonstrated that the cells maintained the correct polarization phenotypes and had withdrawn from the cell cycle. Key pathways and gene sets enriched during transition from quiescence to reactivation include E2F targets, G2M checkpoint, estrogen response late gene expression, and c-myc targets. Reactivation of HIV by latency-reversing agents (LRAs) closely mimics RNA induction profiles seen in cells from well-suppressed HIV patient samples using the envelope detection of in vitro transcription sequencing (EDITS) assay. Since homogeneous populations of latently infected cells can be recovered, the QUECEL model has an excellent signal-to-noise ratio and has been extremely consistent and reproducible in numerous experiments performed during the last 4 years. The ease, efficiency, and accuracy of the mimicking of physiological conditions make the QUECEL model a robust and reproducible tool to study the molecular mechanisms underlying HIV latency.IMPORTANCE Current primary cell models for HIV latency correlate poorly with the reactivation behavior of patient cells. We have developed a new model, called QUECEL, which generates a large and homogenous population of latently infected CD4+ memory cells. By purifying HIV-infected cells and inducing cell quiescence with a defined cocktail of cytokines, we have eliminated the largest problems with previous primary cell models of HIV latency: variable infection levels, ill-defined polarization states, and inefficient shutdown of cellular transcription. Latency reversal in the QUECEL model by a wide range of agents correlates strongly with RNA induction in patient samples. This scalable and highly reproducible model of HIV latency will permit detailed analysis of cellular mechanisms controlling HIV latency and reactivation.

Journal ArticleDOI
TL;DR: It is shown that a member of the Apobec3A (apolipoprotein B mRNA editing enzyme catalytic subunit 3A) family, A3A, suppresses HIV-1 reactivation by recruiting chromatin-modifying enzymes to impose repressive marks around the long terminal repeat promoter region.
Abstract: HIV-1 integrates into the genome of target cells and establishes latency indefinitely. Understanding the molecular mechanism of HIV-1 latency maintenance is needed for therapeutic strategies to combat existing infection. In this study, we found an unexpected role for Apobec3A (apolipoprotein B mRNA editing enzyme catalytic subunit 3A, abbreviated "A3A") in maintaining the latency state within HIV-1-infected cells. Overexpression of A3A in latently infected cell lines led to lower reactivation, while knockdown or knockout of A3A led to increased spontaneous and inducible HIV-1 reactivation. A3A maintains HIV-1 latency by associating with proviral DNA at the 5' long terminal repeat region, recruiting KAP1 and HP1, and imposing repressive histone marks. We show that knockdown of A3A in latently infected human primary CD4 T cells enhanced HIV-1 reactivation. Collectively, we provide evidence and a mechanism by which A3A reinforces HIV-1 latency in infected CD4 T cells.

Journal ArticleDOI
TL;DR: The real-time packet transmission of up to 50 Gb/s without packet loss over 48 hours through channel bonding in a single optical distribution network (ODN) with 20 km reach and a 1:64 split ratio is experimentally demonstrated.
Abstract: To meet the latency and bandwidth requirements of 5G mobile services and next-generation residential/business services, we present a high-speed and low-latency passive optical network (PON) enabled by time controlled-tactile optical access (TIC-TOC) technology. The TIC-TOC technology could support bandwidth-intensive as well as low-latency services for 5G mobile networks by using channel bonding and low-latency oriented dynamic bandwidth allocation (DBA). In order to confirm the technical feasibility of TIC-TOC, FPGA-based OLT and multi-speed ONU prototypes are implemented. We experimentally demonstrate real-time packet transmission up to 50 Gb/s without packet loss during 48 hours through channel bonding in a single optical distribution network (ODN) with 20 km reach and a 1:64 split ratio. Less than 400 μs latency for 5G mobile services is also demonstrated. Furthermore, we confirm that the total throughput could be expanded up to 100 Gb/s by adding more channels.

Journal ArticleDOI
20 Aug 2019-Mbio
TL;DR: It is determined that US28 is required for reactivation but not for maintaining latency, as HCMV-mediated changes in hematopoiesis during latency in vivo and in vitro was dependent upon US28, as US28 directly promoted differentiation toward the myeloid lineage.
Abstract: Human cytomegalovirus (HCMV) infection of CD34+ hematopoietic progenitor cells (CD34+ HPCs) provides a critical reservoir of virus in stem cell transplant patients, and viral reactivation remains a significant cause of morbidity and mortality. The HCMV chemokine receptor US28 is implicated in the regulation of viral latency and reactivation. To explore the role of US28 signaling in latency and reactivation, we analyzed protein tyrosine kinase signaling in CD34+ HPCs expressing US28. US28-ligand signaling in CD34+ HPCs induced changes in key regulators of cellular activation and differentiation. In vitro latency and reactivation assays utilizing CD34+ HPCs indicated that US28 was required for viral reactivation but not latency establishment or maintenance. Similarly, humanized NSG mice (huNSG) infected with TB40E-GFP-US28stop failed to reactivate upon treatment with granulocyte-colony-stimulating factor, but viral genome levels were maintained. Interestingly, HCMV-mediated changes in hematopoiesis during latency in vivo and in vitro was also dependent upon US28, as US28 directly promoted differentiation toward the myeloid lineage. To determine whether US28 constitutive activity and/or ligand-binding activity were required for latency and reactivation, we infected both huNSG mice and CD34+ HPCs in vitro with HCMV TB40E-GFP containing the US28-R129A mutation (no CA) or Y16F mutation (no ligand binding). TB40E-GFP-US28-R129A was maintained during latency and exhibited normal reactivation kinetics. In contrast, TB40E-GFP-US28-Y16F exhibited high levels of viral genome during latency and reactivation, indicating that the virus did not establish latency. These data indicate that US28 is necessary for viral reactivation and ligand binding activity is required for viral latency, highlighting the complex role of US28 during HCMV latency and reactivation.IMPORTANCE Human cytomegalovirus (HCMV) can establish latency following infection of CD34+ hematopoietic progenitor cells (HPCs), and reactivation from latency is a significant cause of viral disease and accelerated graft failure in bone marrow and solid-organ transplant patients. The precise molecular mechanisms of HCMV infection in HPCs are not well defined; however, select viral gene products are known to regulate aspects of latency and reactivation. The HCMV-encoded chemokine receptor US28, which binds multiple CC chemokines as well as CX3CR1, is expressed both during latent and lytic phases of the virus life cycle and plays a role in latency and reactivation. However, the specific timing of US28 expression and the role of ligand binding in these processes are not well defined. In this report, we determined that US28 is required for reactivation but not for maintaining latency. However, when present during latency, US28 ligand binding activity is critical to maintaining the virus in a quiescent state. We attribute the regulation of both latency and reactivation to the role of US28 in promoting myeloid lineage cell differentiation. These data highlight the dynamic and multifunctional nature of US28 during HCMV latency and reactivation.