Journal

arXiv: Performance

About: arXiv: Performance is an academic journal that publishes mainly in the areas of Server and Queueing theory. Over its lifetime, it has published 872 publications, which have received 5021 citations.

Topics: Server, Queueing theory, Scheduling (computing), Cache, Queue

Papers

Posted Content

Optimal Power Cost Management Using Stored Energy in Data Centers

Rahul Urgaonkar¹, Bhuvan Urgaonkar², Michael J. Neely³, Anand Sivasubramaniam²

BBN Technologies¹, Pennsylvania State University², University of Southern California³

16 Mar 2011-arXiv: Performance

TL;DR: This work investigates cost reduction opportunities that arise by the use of uninterrupted power supply units as energy storage devices and develops an online control algorithm that can optimally exploit these devices to minimize the time average cost.

Abstract: Since the electricity bill of a data center constitutes a significant portion of its overall operational costs, reducing this has become important. We investigate cost reduction opportunities that arise by the use of uninterrupted power supply (UPS) units as energy storage devices. This represents a deviation from the usual use of these devices as mere transitional fail-over mechanisms between utility and captive sources such as diesel generators. We consider the problem of opportunistically using these devices to reduce the time average electric utility bill in a data center. Using the technique of Lyapunov optimization, we develop an online control algorithm that can optimally exploit these devices to minimize the time average cost. This algorithm operates without any knowledge of the statistics of the workload or electricity cost processes, making it attractive in the presence of workload and pricing uncertainties. An interesting feature of our algorithm is that its deviation from optimality reduces as the storage capacity is increased. Our work opens up a new area in data center power management.

335 citations

Posted Content

A Performance Comparison of CUDA and OpenCL

Kamran Karimi, Neil G. Dickson, Firas Hamze

14 May 2010-arXiv: Performance

TL;DR: This paper uses complex, near-identical kernels from a Quantum Monte Carlo application to compare the performance of CUDA and OpenCL and shows that when using NVIDIA compiler tools, converting a CUDA kernel to an OpenCL kernel involves minimal modifications.

Abstract: CUDA and OpenCL are two different frameworks for GPU programming. OpenCL is an open standard that can be used to program CPUs, GPUs, and other devices from different vendors, while CUDA is specific to NVIDIA GPUs. Although OpenCL promises a portable language for GPU programming, its generality may entail a performance penalty. In this paper, we use complex, near-identical kernels from a Quantum Monte Carlo application to compare the performance of CUDA and OpenCL. We show that when using NVIDIA compiler tools, converting a CUDA kernel to an OpenCL kernel involves minimal modifications. Making such a kernel compile with ATI's build tools involves more modifications. Our performance tests measure and compare data transfer times to and from the GPU, kernel execution times, and end-to-end application execution times for both CUDA and OpenCL.

225 citations

Journal Article

A General Formula for the Stationary Distribution of the Age of Information and Its Application to Single-Server Queues

Yoshiaki Inoue¹, Hiroyuki Masuyama², Tetsuya Takine¹, Toshiyuki Tanaka²

Osaka University¹, Kyoto University²

17 Apr 2018-arXiv: Performance

TL;DR: In this paper, the stationary distribution of the AoI in information update systems is analyzed in terms of the stationary distributions of the system delay and the peak AoI for different service disciplines.

Abstract: This paper considers the stationary distribution of the age of information (AoI) in information update systems. We first derive a general formula for the stationary distribution of the AoI, which holds for a wide class of information update systems. The formula indicates that the stationary distribution of the AoI is given in terms of the stationary distributions of the system delay and the peak AoI. To demonstrate its applicability and usefulness, we analyze the AoI in single-server queues with four different service disciplines: first-come first-served (FCFS), preemptive last-come first-served (LCFS), and two variants of non-preemptive LCFS service disciplines. For the FCFS and the preemptive LCFS service disciplines, the GI/GI/1, M/GI/1, and GI/M/1 queues are considered, and for the non-preemptive LCFS service disciplines, the M/GI/1 and GI/M/1 queues are considered. With these results, we further show comparison results for the mean AoI's in the M/GI/1 and GI/M/1 queues under those service disciplines.
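To make the two quantities named in this abstract concrete, here is a minimal sketch (not taken from the paper) that computes the time-average AoI and the peak AoI values along a single sample path. It assumes the standard definition of the AoI as the current time minus the generation time of the freshest update received so far. The function name `aoi_sample_path`, its parameters, and the example data are hypothetical.

```python
from typing import List, Tuple


def aoi_sample_path(
    updates: List[Tuple[float, float]],
    t_end: float,
    initial_age: float = 0.0,
) -> Tuple[float, List[float]]:
    """Return (time-average AoI over [0, t_end], list of peak AoI values).

    `updates` holds (generation_time, delivery_time) pairs. The age at time t
    is t minus the generation time of the freshest update delivered by t.
    Assumption: at t = 0 the monitor holds an update of age `initial_age`.
    """
    freshest_gen = -initial_age  # generation time of the freshest update held
    t = 0.0                      # time up to which the area has been integrated
    area = 0.0                   # integral of the age curve
    peaks: List[float] = []

    for gen, delivered in sorted(updates, key=lambda u: u[1]):
        if delivered > t_end:
            break
        # The age grows linearly between deliveries: integrate (s - freshest_gen).
        area += ((delivered - freshest_gen) ** 2 - (t - freshest_gen) ** 2) / 2.0
        t = delivered
        if gen > freshest_gen:
            # Informative update: record the age just before the drop.
            peaks.append(delivered - freshest_gen)
            freshest_gen = gen

    # Final segment from the last delivery up to t_end.
    area += ((t_end - freshest_gen) ** 2 - (t - freshest_gen) ** 2) / 2.0
    return area / t_end, peaks


if __name__ == "__main__":
    # Two updates, generated at t = 1 and t = 3, each delivered one time unit later.
    mean_aoi, peaks = aoi_sample_path([(1.0, 2.0), (3.0, 4.0)], t_end=5.0)
    print(mean_aoi, peaks)
```

Updates that arrive after a fresher one are treated as non-informative and do not reset the age, which is why only informative deliveries contribute a peak.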

134 citations

Journal Article

Benchmarking TinyML Systems: Challenges and Direction

Colby R. Banbury¹, Vijay Janapa Reddi, Max W. Y. Lam, William Fu, Amin Fazel, Jeremy Holleman, Xinyuan Huang, Robert Hurtado, David Kanter, Anton Lokhmotov, David A. Patterson, Danilo Pau, Jae-sun Seo, Jeff Sieracki, Urmish Thakker, Marian Verhelst, Poonam Yadav

Harvard University¹

10 Mar 2020-arXiv: Performance

TL;DR: The current landscape of TinyML is presented, and the challenges and direction towards developing a fair and useful hardware benchmark for TinyML workloads are discussed, along with three preliminary benchmarks and the selection methodology.

Abstract: Recent advancements in ultra-low-power machine learning (TinyML) hardware promise to unlock an entirely new class of smart applications. However, continued progress is limited by the lack of a widely accepted benchmark for these systems. Benchmarking allows us to measure and thereby systematically compare, evaluate, and improve the performance of systems and is therefore fundamental to a field reaching maturity. In this position paper, we present the current landscape of TinyML and discuss the challenges and direction towards developing a fair and useful hardware benchmark for TinyML workloads. Furthermore, we present our four benchmarks and discuss our selection methodology. Our viewpoints reflect the collective thoughts of the TinyMLPerf working group that is comprised of over 30 organizations.

127 citations

Posted Content

AI Benchmark: All About Deep Learning on Smartphones in 2019

Andrey Ignatov¹, Radu Timofte¹, Andrei Kulik², Seung-Soo Yang³, Ke Wang⁴, Felix Baum⁵, Max Wu⁶, Lirong Xu, Luc Van Gool¹

ETH Zurich¹, Google², Samsung³, Huawei⁴, Qualcomm⁵, MediaTek⁶

15 Oct 2019-arXiv: Performance

TL;DR: This paper evaluates the performance and compares the results of all chipsets from Qualcomm, HiSilicon, Samsung, MediaTek and Unisoc that are providing hardware acceleration for AI inference and discusses the recent changes in the Android ML pipeline.

Abstract: The performance of mobile AI accelerators has been evolving rapidly in the past two years, nearly doubling with each new generation of SoCs. The current 4th generation of mobile NPUs is already approaching the results of CUDA-compatible Nvidia graphics cards presented not long ago, which together with the increased capabilities of mobile deep learning frameworks makes it possible to run complex and deep AI models on mobile devices. In this paper, we evaluate the performance and compare the results of all chipsets from Qualcomm, HiSilicon, Samsung, MediaTek and Unisoc that are providing hardware acceleration for AI inference. We also discuss the recent changes in the Android ML pipeline and provide an overview of the deployment of deep learning models on mobile devices. All numerical results provided in this paper can be found and are regularly updated on the official project website: this http URL.

88 citations

Network Information

Related Journals (5)

Journal	Papers	Citations	Related
IEEE Transactions on Parallel and Distributed Systems	5.2K	237.8K	85%
IEEE ACM Transactions on Networking	4K	296.9K	83%
arXiv: Learning	45K	837.1K	81%
Computer Networks	5.9K	261.1K	(not shown)
(journal name not shown)	7.3K	647.9K	79%

Performance Metrics

Papers: 872

Citations: 6,466

No. of papers from the Journal in previous years
Year	Papers
2021	73
2020	122
2019	126
2018	94
2017	87
2016	57