Topic

Latency (engineering)

About: Latency (engineering) is a research topic. Over its lifetime, 3,729 publications have been published within this topic, receiving 39,210 citations. The topic is also known as: lag.


Papers
Proceedings ArticleDOI
19 Sep 2021
TL;DR: In this article, the authors propose a novel semantic segmentation architecture as a single unified model that performs object center detection using key points, box prediction, and binned orientation classification in a simpler Bird's Eye View (BEV) 2D representation.
Abstract: 3D object detection based on LiDAR point clouds is a crucial module in autonomous driving, particularly for long-range sensing. Most research focuses on achieving higher accuracy, and these models are not optimized for deployment on embedded systems in terms of latency and power efficiency. For high-speed driving scenarios, latency is a crucial parameter, as lower latency leaves more time to react to dangerous situations. Typically, a voxel- or point-cloud-based 3D convolution approach is utilized for this module. Firstly, such approaches are inefficient on embedded platforms as they are not suitable for efficient parallelization. Secondly, they have a variable runtime due to the level of sparsity of the scene, which works against the determinism needed in a safety system. In this work, we aim to develop a very low latency algorithm with fixed runtime. We propose a novel semantic segmentation architecture as a single unified model for object center detection using key points, box prediction, and orientation prediction using binned classification in a simpler Bird's Eye View (BEV) 2D representation. The proposed architecture can be trivially extended to include semantic segmentation classes such as road without any additional computation. The proposed model has a latency of 4 ms on the embedded Nvidia Xavier platform. The model is 5X faster than other top-accuracy models with a minimal accuracy degradation of 2% in Average Precision at IoU = 0.5 on the KITTI dataset.

14 citations
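To make the unified head described in the abstract above more concrete, here is a minimal sketch in PyTorch of a BEV head with per-class center-keypoint heatmaps, box regression, and binned orientation classification. It is not the authors' model; the channel counts, number of yaw bins, and all names are hypothetical placeholders.

```python
# Minimal sketch (not the paper's code) of a single unified BEV head that
# predicts object-center keypoint heatmaps, box parameters, and binned
# orientation from a 2D BEV feature map. All sizes are hypothetical.
import torch
import torch.nn as nn

class BEVDetectionHead(nn.Module):
    def __init__(self, in_channels=64, num_classes=3, num_yaw_bins=12):
        super().__init__()
        def head(out_channels):
            return nn.Sequential(
                nn.Conv2d(in_channels, in_channels, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_channels, out_channels, 1),
            )
        self.center_heatmap = head(num_classes)   # per-class center keypoints
        self.box_regression = head(4)             # e.g. length, width, center offsets
        self.yaw_bins = head(num_yaw_bins)        # binned orientation classification

    def forward(self, bev_features):
        return {
            "heatmap": torch.sigmoid(self.center_heatmap(bev_features)),
            "boxes": self.box_regression(bev_features),
            "yaw_logits": self.yaw_bins(bev_features),
        }

# Usage on a dummy 64-channel, 256x256 BEV grid.
if __name__ == "__main__":
    head = BEVDetectionHead()
    out = head(torch.randn(1, 64, 256, 256))
    print({k: v.shape for k, v in out.items()})
```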

Proceedings Article
22 Sep 2002
TL;DR: The round-trip time for AOTF on this incompletely tuned DIMMnet-1 is 7.5 times shorter than that of Myrinet2000, and the barrier synchronization time is 4 times shorter than that of an SR8000 supercomputer, showing that DIMMnet-1 holds promise for applications in which scalable performance is difficult to achieve with traditional approaches because of frequent data exchange.
Abstract: DIMMnet-1 is a high-performance network interface for PC clusters that can be plugged directly into the DIMM slot of a PC. By using both low-latency AOTF (Atomic On-The-Fly) sending and high-bandwidth BOTF (Block On-The-Fly) sending, it can overcome the overhead caused by standard I/O such as the PCI bus. Two types of DIMMnet-1 prototype boards (providing optical and electrical network interfaces) containing a Martini network interface controller chip are currently available. They can be plugged into a 100 MHz DIMM slot of a PC with a Pentium-3, Pentium-4, or Athlon processor. The round-trip time for AOTF on this incompletely tuned DIMMnet-1 is 7.5 times shorter than that of Myrinet2000. The barrier synchronization time for AOTF is 4 times shorter than that of an SR8000 supercomputer. The inter-two-node floating-point sum operation time is 1903 ns. This shows that DIMMnet-1 holds promise for applications in which scalable performance with traditional approaches is difficult because of frequent data exchange.

14 citations
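The round-trip figures above come from ping-pong style measurements between two nodes. The sketch below illustrates that generic methodology only, using ordinary TCP sockets; it has nothing DIMMnet-specific, and absolute numbers on commodity networking will be orders of magnitude higher than the hardware results quoted in the abstract. Host, port, message size, and iteration count are arbitrary.

```python
# Generic ping-pong round-trip-time microbenchmark sketch (not DIMMnet code).
# Run server() on one node and client() on the other.
import socket
import time

MSG = b"x" * 8          # small payload: this is a latency test, not a bandwidth test
ITERATIONS = 10000

def server(port=9000):
    with socket.create_server(("0.0.0.0", port)) as srv:
        conn, _ = srv.accept()
        with conn:
            for _ in range(ITERATIONS):
                data = conn.recv(len(MSG))   # a robust version would loop until len(MSG) bytes arrive
                if not data:
                    break
                conn.sendall(data)           # echo back immediately

def client(host="127.0.0.1", port=9000):
    with socket.create_connection((host, port)) as sock:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # avoid Nagle batching
        start = time.perf_counter()
        for _ in range(ITERATIONS):
            sock.sendall(MSG)
            sock.recv(len(MSG))
        elapsed = time.perf_counter() - start
        print(f"avg round-trip time: {elapsed / ITERATIONS * 1e6:.1f} us")
```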

Patent
Opher D. Kahn, Doron Orenstein
14 Feb 2000
TL;DR: In this patent, a system is presented that includes a processor, an operating system, and a memory subsystem requiring initialization commands to exit a memory low-power state; control logic detects exit from a low-latency wake-up low-power state and responsively generates a plurality of initialization commands.
Abstract: A method and apparatus for resuming operations from a low-latency wake-up low-power state. One embodiment provides a system including a processor, an operating system, and a memory subsystem that requires initialization commands to exit a memory low-power state. Control logic detects exit from an operating-system low-latency low-power state and responsively generates a plurality of initialization commands to remove the memory subsystem from the memory low-power state prior to deasserting a stop-clock signal and allowing execution to resume.

14 citations
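Purely as an illustration of the ordering the abstract describes (memory initialization commands are issued before the stop-clock signal is deasserted and execution resumes), here is a small Python sketch. All class and method names are hypothetical; it does not reflect the patented hardware.

```python
# Illustrative sketch of the resume ordering only, not the patent's implementation.
class MemorySubsystem:
    def __init__(self):
        self.in_low_power_state = True

    def issue_initialization_commands(self):
        # e.g. bring DRAM out of its low-power state, restore configuration
        self.in_low_power_state = False

class ResumeController:
    def __init__(self, memory):
        self.memory = memory
        self.stop_clock_asserted = True

    def on_low_power_exit_detected(self):
        # Step 1: remove the memory subsystem from its low-power state first.
        self.memory.issue_initialization_commands()
        # Step 2: only then deassert the stop-clock signal...
        self.stop_clock_asserted = False
        # Step 3: ...so execution resumes with memory already usable.
        assert not self.memory.in_low_power_state
        return "execution resumed"

if __name__ == "__main__":
    controller = ResumeController(MemorySubsystem())
    print(controller.on_low_power_exit_detected())
```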

Proceedings Article
01 Jan 2021
TL;DR: FusionRAID is proposed, a new RAID architecture that achieves consistent, low latency on commodity SSD arrays; by spreading requests to all SSDs in a shared, large storage pool, bursty application workloads can be served by plenty of “normal-behaving” drives.
Abstract: The use of all-flash arrays has been increasing. Compared to their hard-disk counterparts, each drive offers higher performance but also undergoes more severe periodic performance degradation (due to internal operations such as garbage collection). With a detailed study of widely used applications/traces and 6 SSD models, we confirm that individual SSDs' performance jitter is further magnified in RAID arrays. Our results also reveal that, with SSD latency low and decreasing, the software overhead of RAID writes creates long, complex write paths involving more drives, raising both average-case latency and the risk of exposing worst-case performance. Based on these findings, we propose FusionRAID, a new RAID architecture that achieves consistent, low latency on commodity SSD arrays. By spreading requests to all SSDs in a shared, large storage pool, bursty application workloads can be served by plenty of “normal-behaving” drives. By performing temporary, replicated writes, it retains RAID fault tolerance yet greatly accelerates small, random writes. Blocks of such transient data replicas are created in stripe-ready locations based on RAID declustering, enabling effortless conversion to long-term RAID storage. Finally, using lightweight SSD latency-spike detection and request redirection, FusionRAID avoids drives under transient but severe performance degradation. Our evaluation with traces and applications shows that FusionRAID brings a 22%–98% reduction in median latency and a 2.7×–62× reduction in tail latency, with a moderate and temporary space overhead.

14 citations
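The sketch below illustrates, in simplified Python, two of the ideas the abstract describes: tracking per-SSD latency to detect transient spikes, and steering a temporary replicated write toward “normal-behaving” drives in a shared pool. It is a conceptual sketch, not the FusionRAID implementation; the window size, spike threshold, and replica count are invented.

```python
# Conceptual sketch of latency-spike detection and request redirection
# over a shared SSD pool. Not the FusionRAID code; parameters are made up.
import random
from collections import deque, defaultdict

class SsdPool:
    def __init__(self, num_drives=12, window=64, spike_threshold_us=2000):
        self.latencies = defaultdict(lambda: deque(maxlen=window))  # per-drive recent samples
        self.num_drives = num_drives
        self.spike_threshold_us = spike_threshold_us

    def record_latency(self, drive, latency_us):
        self.latencies[drive].append(latency_us)

    def is_spiking(self, drive):
        samples = self.latencies[drive]
        if not samples:
            return False
        # crude spike detector: recent average latency well above the threshold
        return sum(samples) / len(samples) > self.spike_threshold_us

    def pick_write_targets(self, replicas=2):
        # Redirect the temporary, replicated write to drives that currently look healthy.
        healthy = [d for d in range(self.num_drives) if not self.is_spiking(d)]
        candidates = healthy if len(healthy) >= replicas else list(range(self.num_drives))
        return random.sample(candidates, replicas)

if __name__ == "__main__":
    pool = SsdPool()
    for d in range(pool.num_drives):
        for _ in range(10):
            pool.record_latency(d, random.randint(50, 300))
    # Simulate drive 3 entering garbage collection.
    for _ in range(10):
        pool.record_latency(3, 8000)
    print("replicated write goes to drives:", pool.pick_write_targets())
```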

Journal ArticleDOI
TL;DR: A memory-efficient architecture for single-pass connected components analysis, suited for high-throughput embedded image processing systems, is proposed; it achieves a speedup by partitioning the image into slices and allows reuse of the labels associated with image objects.
Abstract: In this paper, a memory-efficient architecture for single-pass connected components analysis suited for high-throughput embedded image processing systems is proposed, which achieves a speedup by partitioning the image into slices. Although global data dependencies between image segments spanning several image slices exist, a temporally and spatially local algorithm is proposed, together with a suitable FPGA hardware architecture that processes pixel data at low latency. The low latency of the proposed architecture allows reuse of the labels associated with image objects. This reduces the amount of memory by a factor of more than 5 in the considered implementations, which is a significant contribution, since memory is a critical resource in embedded image processing on FPGAs. Therefore, a significantly higher bandwidth of pixel data can be processed with this architecture compared to state-of-the-art architectures using the same amount of hardware resources.

14 citations
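For readers unfamiliar with single-pass connected components analysis, the sketch below shows the underlying software algorithm: one raster scan with union-find that accumulates per-component features (area and bounding box) on the fly, so no relabeling pass over the image is needed. It illustrates the general technique only, not the paper's FPGA architecture, slice partitioning, or label-reuse scheme; 4-connectivity is used for brevity.

```python
# Single-pass connected components analysis sketch (software illustration only).
import numpy as np

def analyse_components(binary_image):
    """Return {component_label: (area, (min_y, min_x, max_y, max_x))} in one raster scan."""
    height, width = binary_image.shape
    row_above = np.zeros(width, dtype=np.int32)   # labels of the previous row
    parent = [0]                                  # union-find forest; index 0 = background
    features = {}                                 # label -> [area, min_y, min_x, max_y, max_x]

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]         # path halving
            x = parent[x]
        return x

    def merge(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return ra
        keep, drop = min(ra, rb), max(ra, rb)
        parent[drop] = keep
        fa, fb = features[keep], features[drop]
        features[keep] = [fa[0] + fb[0],
                          min(fa[1], fb[1]), min(fa[2], fb[2]),
                          max(fa[3], fb[3]), max(fa[4], fb[4])]
        del features[drop]
        return keep

    next_label = 1
    for y in range(height):
        row_current = np.zeros(width, dtype=np.int32)
        for x in range(width):
            if not binary_image[y, x]:
                continue
            left = row_current[x - 1] if x > 0 else 0
            up = row_above[x]
            if left == 0 and up == 0:             # new provisional label
                parent.append(next_label)
                features[next_label] = [0, y, x, y, x]
                label = next_label
                next_label += 1
            elif left and up:                     # two neighbours: merge their components
                label = merge(left, up)
            else:
                label = find(left or up)
            row_current[x] = label
            f = features[label]                   # accumulate features on the fly
            features[label] = [f[0] + 1, min(f[1], y), min(f[2], x),
                               max(f[3], y), max(f[4], x)]
        row_above = row_current

    return {lbl: (f[0], tuple(f[1:])) for lbl, f in features.items()}

if __name__ == "__main__":
    img = np.array([[1, 1, 0, 0],
                    [0, 1, 0, 1],
                    [0, 0, 0, 1],
                    [1, 0, 1, 1]], dtype=bool)
    print(analyse_components(img))
```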


Network Information
Related Topics (5)
Network packet: 159.7K papers, 2.2M citations, 92% related
Server: 79.5K papers, 1.4M citations, 91% related
Wireless: 133.4K papers, 1.9M citations, 90% related
Wireless sensor network: 142K papers, 2.4M citations, 90% related
Wireless network: 122.5K papers, 2.1M citations, 90% related
Performance Metrics
No. of papers in the topic in previous years:
Year    Papers
2022    10
2021    692
2020    481
2019    389
2018    366
2017    227