Topic

Latency (engineering)

About: Latency (engineering) is a research topic. Over its lifetime, 3,729 publications have been published within this topic, receiving 39,210 citations. The topic is also known as: lag.


Papers
Proceedings ArticleDOI
19 Sep 2021
TL;DR: In this article, the authors propose a novel semantic segmentation architecture as a single unified model that performs object center detection using key points, box prediction, and binned orientation classification in a simpler Bird's Eye View (BEV) 2D representation.
Abstract: 3D object detection based on LiDAR point clouds is a crucial module in autonomous driving, particularly for long-range sensing. Most research focuses on achieving higher accuracy, and these models are not optimized for deployment on embedded systems in terms of latency and power efficiency. For high-speed driving scenarios, latency is a crucial parameter, as lower latency leaves more time to react to dangerous situations. Typically, a voxel- or point-cloud-based 3D convolution approach is utilized for this module. Firstly, such approaches are inefficient on embedded platforms as they are not suitable for efficient parallelization. Secondly, they have a variable runtime due to the level of sparsity of the scene, which works against the determinism needed in a safety system. In this work, we aim to develop a very low latency algorithm with fixed runtime. We propose a novel semantic segmentation architecture as a single unified model for object center detection using key points, box prediction, and orientation prediction using binned classification in a simpler Bird's Eye View (BEV) 2D representation. The proposed architecture can be trivially extended to include semantic segmentation classes such as road without any additional computation. The proposed model has a latency of 4 ms on the embedded Nvidia Xavier platform. The model is 5X faster than other top-accuracy models with a minimal accuracy degradation of 2% in Average Precision at IoU = 0.5 on the KITTI dataset.

14 citations
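To make the unified head described in the abstract above more concrete, here is a minimal sketch in PyTorch of a BEV head with per-class center-keypoint heatmaps, box regression, and binned orientation classification. It is not the authors' model; the channel counts, number of yaw bins, and all names are hypothetical placeholders.

```python
# Minimal sketch (not the paper's code) of a single unified BEV head that
# predicts object-center keypoint heatmaps, box parameters, and binned
# orientation from a 2D BEV feature map. All sizes are hypothetical.
import torch
import torch.nn as nn

class BEVDetectionHead(nn.Module):
    def __init__(self, in_channels=64, num_classes=3, num_yaw_bins=12):
        super().__init__()
        def head(out_channels):
            return nn.Sequential(
                nn.Conv2d(in_channels, in_channels, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_channels, out_channels, 1),
            )
        self.center_heatmap = head(num_classes)   # per-class center keypoints
        self.box_regression = head(4)             # e.g. length, width, center offsets
        self.yaw_bins = head(num_yaw_bins)        # binned orientation classification

    def forward(self, bev_features):
        return {
            "heatmap": torch.sigmoid(self.center_heatmap(bev_features)),
            "boxes": self.box_regression(bev_features),
            "yaw_logits": self.yaw_bins(bev_features),
        }

# Usage on a dummy 64-channel, 256x256 BEV grid.
if __name__ == "__main__":
    head = BEVDetectionHead()
    out = head(torch.randn(1, 64, 256, 256))
    print({k: v.shape for k, v in out.items()})
```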

Proceedings Article
22 Sep 2002
TL;DR: The round-trip time for AOTF on this incompletely tuned DIMMnet-1 is 7.5 times shorter than that of Myrinet2000, and the barrier synchronization time is 4 times shorter than that of an SR8000 supercomputer, showing that DIMMnet-1 holds promise for applications in which scalable performance is difficult to achieve with traditional approaches because of frequent data exchange.
Abstract: DIMMnet-1 is a high-performance network interface for PC clusters that can be plugged directly into the DIMM slot of a PC. By using both low-latency AOTF (Atomic On-The-Fly) sending and high-bandwidth BOTF (Block On-The-Fly) sending, it can overcome the overhead caused by standard I/O such as the PCI bus. Two types of DIMMnet-1 prototype boards (providing optical and electrical network interfaces) containing a Martini network interface controller chip are currently available. They can be plugged into a 100 MHz DIMM slot of a PC with a Pentium-3, Pentium-4, or Athlon processor. The round-trip time for AOTF on this incompletely tuned DIMMnet-1 is 7.5 times shorter than that of Myrinet2000. The barrier synchronization time for AOTF is 4 times shorter than that of an SR8000 supercomputer. The inter-two-node floating-point sum operation time is 1903 ns. This shows that DIMMnet-1 holds promise for applications in which scalable performance with traditional approaches is difficult because of frequent data exchange.

14 citations
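The round-trip figures above come from ping-pong style measurements between two nodes. The sketch below illustrates that generic methodology only, using ordinary TCP sockets; it has nothing DIMMnet-specific, and absolute numbers on commodity networking will be orders of magnitude higher than the hardware results quoted in the abstract. Host, port, message size, and iteration count are arbitrary.

```python
# Generic ping-pong round-trip-time microbenchmark sketch (not DIMMnet code).
# Run server() on one node and client() on the other.
import socket
import time

MSG = b"x" * 8          # small payload: this is a latency test, not a bandwidth test
ITERATIONS = 10000

def server(port=9000):
    with socket.create_server(("0.0.0.0", port)) as srv:
        conn, _ = srv.accept()
        with conn:
            for _ in range(ITERATIONS):
                data = conn.recv(len(MSG))   # a robust version would loop until len(MSG) bytes arrive
                if not data:
                    break
                conn.sendall(data)           # echo back immediately

def client(host="127.0.0.1", port=9000):
    with socket.create_connection((host, port)) as sock:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # avoid Nagle batching
        start = time.perf_counter()
        for _ in range(ITERATIONS):
            sock.sendall(MSG)
            sock.recv(len(MSG))
        elapsed = time.perf_counter() - start
        print(f"avg round-trip time: {elapsed / ITERATIONS * 1e6:.1f} us")
```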

Patent
Opher D. Kahn, Doron Orenstein
14 Feb 2000
TL;DR: In this patent, a system is presented that includes a processor, an operating system, and a memory subsystem requiring initialization commands to exit a memory low-power state; control logic detects exit from a low-latency wake-up low-power state and responsively generates a plurality of initialization commands.
Abstract: A method and apparatus for resuming operations from a low-latency wake-up low-power state. One embodiment provides a system including a processor, an operating system, and a memory subsystem that requires initialization commands to exit a memory low-power state. Control logic detects exit from an operating-system low-latency low-power state and responsively generates a plurality of initialization commands to remove the memory subsystem from the memory low-power state prior to deasserting a stop-clock signal and allowing execution to resume.

14 citations
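Purely as an illustration of the ordering the abstract describes (memory initialization commands are issued before the stop-clock signal is deasserted and execution resumes), here is a small Python sketch. All class and method names are hypothetical; it does not reflect the patented hardware.

```python
# Illustrative sketch of the resume ordering only, not the patent's implementation.
class MemorySubsystem:
    def __init__(self):
        self.in_low_power_state = True

    def issue_initialization_commands(self):
        # e.g. bring DRAM out of its low-power state, restore configuration
        self.in_low_power_state = False

class ResumeController:
    def __init__(self, memory):
        self.memory = memory
        self.stop_clock_asserted = True

    def on_low_power_exit_detected(self):
        # Step 1: remove the memory subsystem from its low-power state first.
        self.memory.issue_initialization_commands()
        # Step 2: only then deassert the stop-clock signal...
        self.stop_clock_asserted = False
        # Step 3: ...so execution resumes with memory already usable.
        assert not self.memory.in_low_power_state
        return "execution resumed"

if __name__ == "__main__":
    controller = ResumeController(MemorySubsystem())
    print(controller.on_low_power_exit_detected())
```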

Proceedings Article
01 Jan 2021
TL;DR: FusionRAID is proposed, a new RAID architecture that achieves consistent, low latency on commodity SSD arrays; by spreading requests to all SSDs in a shared, large storage pool, bursty application workloads can be served by plenty of “normal-behaving” drives.
Abstract: The use of all-flash arrays has been increasing. Compared to their hard-disk counterparts, each drive offers higher performance but also undergoes more severe periodic performance degradation (due to internal operations such as garbage collection). With a detailed study of widely used applications/traces and 6 SSD models, we confirm that individual SSDs' performance jitter is further magnified in RAID arrays. Our results also reveal that, with SSD latency low and decreasing, the software overhead of RAID writes creates long, complex write paths involving more drives, raising both average-case latency and the risk of exposing worst-case performance. Based on these findings, we propose FusionRAID, a new RAID architecture that achieves consistent, low latency on commodity SSD arrays. By spreading requests to all SSDs in a shared, large storage pool, bursty application workloads can be served by plenty of “normal-behaving” drives. By performing temporary, replicated writes, it retains RAID fault tolerance yet greatly accelerates small, random writes. Blocks of such transient data replicas are created in stripe-ready locations based on RAID declustering, enabling effortless conversion to long-term RAID storage. Finally, using lightweight SSD latency-spike detection and request redirection, FusionRAID avoids drives under transient but severe performance degradation. Our evaluation with traces and applications shows that FusionRAID brings a 22%–98% reduction in median latency and a 2.7×–62× reduction in tail latency, with a moderate and temporary space overhead.

14 citations
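The sketch below illustrates, in simplified Python, two of the ideas the abstract describes: tracking per-SSD latency to detect transient spikes, and steering a temporary replicated write toward “normal-behaving” drives in a shared pool. It is a conceptual sketch, not the FusionRAID implementation; the window size, spike threshold, and replica count are invented.

```python
# Conceptual sketch of latency-spike detection and request redirection
# over a shared SSD pool. Not the FusionRAID code; parameters are made up.
import random
from collections import deque, defaultdict

class SsdPool:
    def __init__(self, num_drives=12, window=64, spike_threshold_us=2000):
        self.latencies = defaultdict(lambda: deque(maxlen=window))  # per-drive recent samples
        self.num_drives = num_drives
        self.spike_threshold_us = spike_threshold_us

    def record_latency(self, drive, latency_us):
        self.latencies[drive].append(latency_us)

    def is_spiking(self, drive):
        samples = self.latencies[drive]
        if not samples:
            return False
        # crude spike detector: recent average latency well above the threshold
        return sum(samples) / len(samples) > self.spike_threshold_us

    def pick_write_targets(self, replicas=2):
        # Redirect the temporary, replicated write to drives that currently look healthy.
        healthy = [d for d in range(self.num_drives) if not self.is_spiking(d)]
        candidates = healthy if len(healthy) >= replicas else list(range(self.num_drives))
        return random.sample(candidates, replicas)

if __name__ == "__main__":
    pool = SsdPool()
    for d in range(pool.num_drives):
        for _ in range(10):
            pool.record_latency(d, random.randint(50, 300))
    # Simulate drive 3 entering garbage collection.
    for _ in range(10):
        pool.record_latency(3, 8000)
    print("replicated write goes to drives:", pool.pick_write_targets())
```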

Journal ArticleDOI
TL;DR: A memory-efficient architecture for single-pass connected components analysis, suited for high-throughput embedded image processing systems, is proposed; it achieves a speedup by partitioning the image into slices and allows reuse of the labels associated with image objects.
Abstract: In this paper, a memory-efficient architecture for single-pass connected components analysis suited for high-throughput embedded image processing systems is proposed, which achieves a speedup by partitioning the image into slices. Although global data dependencies between image segments spanning several image slices exist, a temporally and spatially local algorithm is proposed, together with a suitable FPGA hardware architecture that processes pixel data at low latency. The low latency of the proposed architecture allows reuse of the labels associated with image objects. This reduces the amount of memory by a factor of more than 5 in the considered implementations, which is a significant contribution, since memory is a critical resource in embedded image processing on FPGAs. Therefore, a significantly higher bandwidth of pixel data can be processed with this architecture compared to state-of-the-art architectures using the same amount of hardware resources.

14 citations
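For readers unfamiliar with single-pass connected components analysis, the sketch below shows the underlying software algorithm: one raster scan with union-find that accumulates per-component features (area and bounding box) on the fly, so no relabeling pass over the image is needed. It illustrates the general technique only, not the paper's FPGA architecture, slice partitioning, or label-reuse scheme; 4-connectivity is used for brevity.

```python
# Single-pass connected components analysis sketch (software illustration only).
import numpy as np

def analyse_components(binary_image):
    """Return {component_label: (area, (min_y, min_x, max_y, max_x))} in one raster scan."""
    height, width = binary_image.shape
    row_above = np.zeros(width, dtype=np.int32)   # labels of the previous row
    parent = [0]                                  # union-find forest; index 0 = background
    features = {}                                 # label -> [area, min_y, min_x, max_y, max_x]

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]         # path halving
            x = parent[x]
        return x

    def merge(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return ra
        keep, drop = min(ra, rb), max(ra, rb)
        parent[drop] = keep
        fa, fb = features[keep], features[drop]
        features[keep] = [fa[0] + fb[0],
                          min(fa[1], fb[1]), min(fa[2], fb[2]),
                          max(fa[3], fb[3]), max(fa[4], fb[4])]
        del features[drop]
        return keep

    next_label = 1
    for y in range(height):
        row_current = np.zeros(width, dtype=np.int32)
        for x in range(width):
            if not binary_image[y, x]:
                continue
            left = row_current[x - 1] if x > 0 else 0
            up = row_above[x]
            if left == 0 and up == 0:             # new provisional label
                parent.append(next_label)
                features[next_label] = [0, y, x, y, x]
                label = next_label
                next_label += 1
            elif left and up:                     # two neighbours: merge their components
                label = merge(left, up)
            else:
                label = find(left or up)
            row_current[x] = label
            f = features[label]                   # accumulate features on the fly
            features[label] = [f[0] + 1, min(f[1], y), min(f[2], x),
                               max(f[3], y), max(f[4], x)]
        row_above = row_current

    return {lbl: (f[0], tuple(f[1:])) for lbl, f in features.items()}

if __name__ == "__main__":
    img = np.array([[1, 1, 0, 0],
                    [0, 1, 0, 1],
                    [0, 0, 0, 1],
                    [1, 0, 1, 1]], dtype=bool)
    print(analyse_components(img))
```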


Network Information
Related Topics (5)
Network packet: 159.7K papers, 2.2M citations, 92% related
Server: 79.5K papers, 1.4M citations, 91% related
Wireless: 133.4K papers, 1.9M citations, 90% related
Wireless sensor network: 142K papers, 2.4M citations, 90% related
Wireless network: 122.5K papers, 2.1M citations, 90% related
Performance Metrics
No. of papers in the topic in previous years:
Year    Papers
2022    10
2021    692
2020    481
2019    389
2018    366
2017    227