Topic

Latency (engineering)

About: Latency (engineering) is a research topic. Over the lifetime, 7278 publications have been published within this topic receiving 115409 citations. The topic is also known as: lag.

...read moreread less

Papers published on a yearly basis

1 / 2

Papers

PDF

Open Access

More filters

Patent•

Traffic switching using multi-dimensional packet classification

[...]

Hamid R. Mehrvar¹, Roland A. Smith¹, Xiaoming Fan¹, Mark Stephen Wight¹•Institutions (1)

Nortel¹

22 Feb 2002

TL;DR: In this paper, a multi-dimensional traffic classification scheme is proposed, in which multiple orthogonal traffic classification methods are successively implemented for each traffic stream traversing the system.

...read moreread less

Abstract: A method and system for conveying an arbitrary mixture of high and low latency traffic streams across a common switch fabric implements a multi-dimensional traffic classification scheme, in which multiple orthogonal traffic classification methods are successively implemented for each traffic stream traversing the system. At least two diverse paths are mapped through the switch fabric, each path being optimized to satisfy respective different latency requirements. A latency classifier is adapted to route each traffic stream to a selected path optimized to satisfy latency requirements most closely matching a respective latency requirement of the traffic stream. A prioritization classifier independently prioritizes traffic streams in each path. A fairness classifier at an egress of each path can be used to enforce fairness between responsive and non-responsive traffic streams in each path. This arrangement enables traffic streams having similar latency requirements to traverse the system through a path optimized for those latency requirements.

...read moreread less

166 citations

Proceedings Article•DOI•

STACL: Simultaneous Translation with Implicit Anticipation and Controllable Latency using Prefix-to-Prefix Framework

[...]

Mingbo Ma¹, Liang Huang², Xiong Hao¹, Renjie Zheng³, Kaibo Liu⁴, Baigong Zheng⁴, Zhang Chuanqiang¹, Zhongjun He¹, Hairong Liu¹, Xing Li, Hua Wu¹, Haifeng Wang¹ - Show less +8 more•Institutions (4)

Baidu¹, University of Science and Technology of China², Fudan University³, Oregon State University⁴

01 Jul 2019

TL;DR: This paper propose a prefix-to-prefix framework for multaneous translation that implicitly learns to anticipate in a single translation model, which achieves low latency and reasonable qual- ity (compared to full-sentence translation) on 4 directions.

...read moreread less

Abstract: Simultaneous translation, which translates sentences before they are finished, is use- ful in many scenarios but is notoriously dif- ficult due to word-order differences. While the conventional seq-to-seq framework is only suitable for full-sentence translation, we pro- pose a novel prefix-to-prefix framework for si- multaneous translation that implicitly learns to anticipate in a single translation model. Within this framework, we present a very sim- ple yet surprisingly effective “wait-k” policy trained to generate the target sentence concur- rently with the source sentence, but always k words behind. Experiments show our strat- egy achieves low latency and reasonable qual- ity (compared to full-sentence translation) on 4 directions: zh↔en and de↔en.

...read moreread less

163 citations

Proceedings Article•DOI•

Ubik: efficient cache sharing with strict qos for latency-critical workloads

[...]

Harshad Kasture¹, Daniel Sanchez¹•Institutions (1)

Massachusetts Institute of Technology¹

24 Feb 2014

TL;DR: This work proposes Ubik, a dynamic partitioning technique that predicts and exploits the transient behavior of latency-critical workloads to maintain their tail latency while maximizing the cache space available to batch applications.

...read moreread less

Abstract: Chip-multiprocessors (CMPs) must often execute workload mixes with different performance requirements. On one hand, user-facing, latency-critical applications (e.g., web search) need low tail (i.e., worst-case) latencies, often in the millisecond range, and have inherently low utilization. On the other hand, compute-intensive batch applications (e.g., MapReduce) only need high long-term average performance. In current CMPs, latency-critical and batch applications cannot run concurrently due to interference on shared resources. Unfortunately, prior work on quality of service (QoS) in CMPs has focused on guaranteeing average performance, not tail latency. In this work, we analyze several latency-critical workloads, and show that guaranteeing average performance is insufficient to maintain low tail latency, because microarchitectural resources with state, such as caches or cores, exert inertia on instantaneous workload performance. Last-level caches impart the highest inertia, as workloads take tens of milliseconds to warm them up. When left unmanaged, or when managed with conventional QoS frameworks, shared last-level caches degrade tail latency significantly. Instead, we propose Ubik, a dynamic partitioning technique that predicts and exploits the transient behavior of latency-critical workloads to maintain their tail latency while maximizing the cache space available to batch applications. Using extensive simulations, we show that, while conventional QoS frameworks degrade tail latency by up to 2.3x, Ubik simultaneously maintains the tail latency of latency-critical workloads and significantly improves the performance of batch applications.

...read moreread less

163 citations

Patent•

System and method for reducing latency in call setup and teardown

[...]

Rajat Ghai¹, Jim Towey¹•Institutions (1)

Cisco Systems, Inc.¹

07 Jul 2008

TL;DR: In this paper, a network device with integrated functionalities and a cache is provided that stores policy information to reduce the amount of signaling that is necessary to setup and teardown sessions.

...read moreread less

Abstract: Systems and methods for reducing latency in call setup and teardown are provided. A network device with integrated functionalities and a cache is provided that stores policy information to reduce the amount of signaling that is necessary to setup and teardown sessions. By handling various aspects of the setup and teardown within a network device, latency is reduced and the amount of bandwidth needed for setup signaling is also reduced.

...read moreread less

161 citations

Journal Article•DOI•

An analysis of latency and interresponse time in free recall.

[...]

Doug Rohrer¹, John T. Wixted¹•Institutions (1)

University of California, San Diego¹

01 Sep 1994-Memory & Cognition

TL;DR: In four experiments, subjects freely recalled previously studied items while a voice key and computer recorded each item’s recall latency relative to the onset of the recall period, suggesting that retrieval includes a brief normally distributed initiation stage followed by a longer exponentially distributed search stage.

...read moreread less

Abstract: In four experiments, subjects freely recalled previously studied items while a voice key and computer recorded each item’s recall latency relative to the onset of the recall period. The measures of recall probability and mean recall latency were shown to be empirically independent, demonstrating that there exists no a priori relationship between the two. In all four experiments, latency distributions were fit well by the ex-Gaussian, suggesting that retrieval includes a brief normally distributed initiation stage followed by a longer exponentially distributed search stage. Further, the variation in mean latency stemmed from the variation in the duration of the search stage, not the initiation stage. Interresponse times (IRTs), the time elapsed between two successive item recalls, were analyzed as well. The growth of mean IRTs, plotted as a function of output position, was shown to be a simple function of the number of items not yet recalled. Finally, the mathematical nature of both free recall latency and IRT growth are shown to be consistent with a simple theoretical account of retrieval that depicts mean recall latency as a measure of the breadth of search.

...read moreread less

161 citations

Collapse

Network Information

Performance

Metrics

7,278

Papers

130,965

Citations

No. of papers in the topic in previous years
Year	Papers
2022	2
2021	485
2020	529
2019	533
2018	500
2017	405

Latency (engineering)

Papers published on a yearly basis

Papers

Trending Questions (10)

Network Information

Related Topics (5)

Performance

Metrics