Topic

Performance per watt

About: Performance per watt is a research topic. Over its lifetime, 315 publications have appeared within this topic, receiving 5778 citations.

Papers

Thermal-aware design and analysis techniques for integrated circuits and high-performance microprocessor systems

Seda Ogrenci Memik¹, Rajarshi Mukherjee¹

Northwestern University¹

01 Jan 2006

TL;DR: The results show that the algorithm, by incorporating physical interaction of the cores, consistently succeeds in maximizing the operating frequency of the most critical core while successfully relieving the thermal emergency of the core.

Abstract: Physical phenomena such as temperature and power have an increasingly important role in performance and reliability of modern process technologies. This trend will only strengthen with future generations. In this dissertation we present mechanisms for thermal management and power optimizations of integrated circuits. We present three thermal-aware high-level synthesis techniques for peak temperature reduction targeting ASIC design flow. Decisions made during high-level synthesis impact the activity of functional resources and their power consumption. Power consumed is dissipated as heat. The first approach consists of two constructive temperature-aware resource allocation and binding algorithms: temperature-constrained resource minimization and resource-constrained temperature minimization. The second technique is an iterative temperature-aware binding algorithm, which evenly distributes activity across functional units. The third mechanism combines temperature-aware scheduling and binding based on feedback from post-floorplan thermal simulation. Our techniques are effective in reducing peak temperature, leakage, and total power consumption. In order to maintain performance per Watt in microprocessors, there is a shift towards the chip-level multiprocessing paradigm. With such large-scale integration and increasing power densities, Dynamic Thermal Management (DTM) continues to be a significant design effort to maintain performance and reliability. We present two mechanisms to perform real-time frequency scaling as part of dynamic frequency and voltage scaling to assist DTM. The results show that our algorithm, by incorporating physical interaction of the cores, consistently succeeds in maximizing the operating frequency of the most critical core while successfully relieving the thermal emergency of the core. DTM techniques rely on accurate readings of on-die thermal sensors. Next, we present novel techniques for determining the optimal locations and allocations for thermal sensors to provide a high-fidelity thermal profile of a complex microprocessor system. We show that our tool is able to create a sensor distribution for a given microprocessor architecture providing accurate thermal measurements. Increased logic density and programmability of FPGAs cause high power dissipation and on-chip temperature. We present techniques for placement and minimization of sensors, which can then be mapped onto the FPGA post-fabrication for thermal monitoring, and for power-driven netlist partitioning for realizing low-power FPGAs.

1 citation

Proceedings Article

Improving performance per Watt of non-monotonic Multicore Processors via bottleneck-based online program phase classification

Sudarshan Srinivasan¹, Israel Koren¹, Sandip Kundu¹

University of Massachusetts Amherst¹

01 Oct 2016

TL;DR: Proposes a novel online program phase detection technique based on the frequency of cache misses and processor stalls, which correspond to core resource bottlenecks, and demonstrates as much as 22% improvement in average performance/Watt using Instructions per Second (IPS) as the performance metric.

Abstract: Heterogeneous architectures offer the promise of higher performance/Watt compared to symmetric multi-cores. Recent works have proposed the use of non-monotonic (NM) heterogeneous architectures with diverse core types where each core has unique power and performance characteristics. However, the power and performance benefits achieved by NM architectures are highly dependent on assigning the application to the most suitable core type for each program phase. In this paper we propose a novel online program phase detection technique that is based on the frequency of cache misses and processor stalls, which correspond to core resource bottlenecks. We track performance monitors to formulate a Bottleneck Type Vector (BTV) that helps direct the application to the most appropriate core type for execution. We compare the proposed BTV-based core assignment method to prior online core assignment approaches and demonstrate as much as 22% improvement in average performance/Watt using Instructions per Second (IPS) as the performance metric.
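
The abstract does not say how the BTV is built, so the following is only a rough sketch of the general idea: turn per-interval counter rates into a bit vector of active bottlenecks, then pick the core type that addresses the most of them. The counter names, thresholds, core types, and both function names are hypothetical and do not come from the paper.

```python
from typing import Dict, Mapping, Tuple

# Hypothetical thresholds in events per kilo-instruction. The paper's actual
# counters and cut-offs are not given in the abstract.
THRESHOLDS: Dict[str, float] = {
    "l1d_miss": 20.0,
    "l2_miss": 5.0,
    "rob_stall": 50.0,
    "fp_stall": 30.0,
}


def bottleneck_type_vector(counters: Mapping[str, int], instructions: int) -> Tuple[int, ...]:
    """Return one bit per counter: 1 if its per-kilo-instruction rate exceeds the threshold."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    kilo_instr = instructions / 1000.0
    return tuple(
        int(counters.get(name, 0) / kilo_instr > limit)
        for name, limit in THRESHOLDS.items()
    )


def pick_core_type(btv: Tuple[int, ...], core_strengths: Mapping[str, Tuple[int, ...]]) -> str:
    """Pick the core type whose (assumed) strength bits cover the most flagged bottlenecks."""
    def covered(core: str) -> int:
        return sum(b & s for b, s in zip(btv, core_strengths[core]))
    return max(core_strengths, key=covered)


# Illustrative counter readings for one sampling interval (made-up numbers).
interval = {"l1d_miss": 30_000, "l2_miss": 2_000, "rob_stall": 10_000, "fp_stall": 40_000}
btv = bottleneck_type_vector(interval, instructions=1_000_000)

# Hypothetical core types, each described by the bottlenecks it relieves.
cores = {
    "large_cache": (1, 1, 0, 0),
    "fp_heavy": (1, 0, 0, 1),
    "wide_ooo": (0, 0, 1, 0),
}
print(btv, pick_core_type(btv, cores))
```

A change in the vector between consecutive intervals can be treated as a phase change, which is the point at which a scheduler would reconsider the core assignment.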

1 citation

Journal Article

Disk I/O Performance-per-Watt Analysis for Cloud Computing

Joseph Issa, Abdallah Kassem

18 Jul 2014 - International Journal of Computer Applications

TL;DR: This paper analyzes disk I/O performance by assessing disk bandwidth and latency for different read and write configurations with sequential and random patterns, and proposes a method that estimates disk latency at different disk queue depth settings.

Abstract: The disk I/O performance and power consumption associated with a given cloud workload are important, especially for workloads that are bound by disk I/O. Disk performance becomes a bottleneck for achieving higher performance and lower power consumption, especially when memory size is not enough to process large blocks of data. This has a negative impact on Quality of Service (QoS). In this paper, we analyze disk I/O performance by assessing the disk bandwidth and latency for different read and write configurations with sequential and random patterns. The systems used are based on ATOM D525 and Xeon X5660 processors. We analyze power consumption for both systems and provide a performance-per-watt optimum operation point. We also propose an estimation method which estimates disk latency at different disk queue depth settings. The estimation method is verified to estimate disk latency with < 5% error margin. Keywords: I/O performance, performance-per-watt analysis, cloud computing
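
One plausible reading of a "performance-per-watt optimum operation point" is the configuration that maximizes throughput divided by power. The sketch below illustrates that reading only. The `Sample` type, the function names, and every number are placeholders, not measurements or methods from the paper.

```python
from typing import Iterable, NamedTuple


class Sample(NamedTuple):
    queue_depth: int
    bandwidth_mb_s: float  # measured disk throughput
    power_w: float         # measured system power


def perf_per_watt(sample: Sample) -> float:
    """Throughput per watt (MB/s per W) for one measurement."""
    if sample.power_w <= 0:
        raise ValueError("power must be positive")
    return sample.bandwidth_mb_s / sample.power_w


def optimum_operating_point(samples: Iterable[Sample]) -> Sample:
    """Return the sample with the highest throughput per watt."""
    return max(samples, key=perf_per_watt)


# Placeholder measurements, invented purely to exercise the function.
samples = [
    Sample(queue_depth=1, bandwidth_mb_s=40.0, power_w=20.0),
    Sample(queue_depth=4, bandwidth_mb_s=90.0, power_w=25.0),
    Sample(queue_depth=16, bandwidth_mb_s=100.0, power_w=40.0),
]
best = optimum_operating_point(samples)
print(best.queue_depth, round(perf_per_watt(best), 2))
```

With these placeholder values the highest-bandwidth setting is not the most efficient one, which is the kind of trade-off an optimum-point analysis is meant to expose.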

1 citation

Patent

Intelligent multicore control for optimal performance per watt

Hee-Jun Park, Steven Thomson, Ronald Frank Alton, Edoardo Regini, Satish Goverdhan, Pieter-Louis Dam Backer

07 Aug 2014

TL;DR: The device and method identify and enable the optimum set of processor cores to achieve the best performance at the lowest power consumption level, or within a given power budget, for a given workload.

Abstract: Various aspects provide a device and method for the intelligent control of a plurality of processor cores of a multi-core integrated circuit. The aspects may identify and enable the optimum set of processor cores in order to achieve the best performance for the lowest power consumption level, or within a given power budget, for a given workload. The optimal set of processor cores may be specified as a number of active processor cores or as a designation of specific active processor cores. If the temperature reading of the processor cores is below the threshold, a set of processor cores may be selected to provide the lowest power consumption for a given workload. If the temperature reading of the processor cores is above the threshold, a set of processor cores may be selected to provide the best performance for a given power budget.
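
The two-branch policy in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: it assumes each core has a single fixed performance and power figure and that the candidate sets are all non-empty subsets of the cores. The `Core` type, `select_cores`, and all numbers are hypothetical.

```python
from itertools import combinations
from typing import NamedTuple, Sequence, Tuple


class Core(NamedTuple):
    name: str
    perf: float   # hypothetical throughput units when active
    power: float  # hypothetical watts when active


def select_cores(cores: Sequence[Core], temperature: float, threshold: float,
                 workload: float, power_budget: float) -> Tuple[Core, ...]:
    """Below the threshold: lowest-power set that meets the workload.
    At or above it: highest-performance set that fits the power budget."""
    # Exhaustive enumeration is exponential; acceptable only for a few cores.
    subsets = [s for k in range(1, len(cores) + 1) for s in combinations(cores, k)]

    def total_perf(s: Tuple[Core, ...]) -> float:
        return sum(c.perf for c in s)

    def total_power(s: Tuple[Core, ...]) -> float:
        return sum(c.power for c in s)

    if temperature < threshold:
        feasible = [s for s in subsets if total_perf(s) >= workload]
        if not feasible:
            raise ValueError("no core set can satisfy the workload")
        return min(feasible, key=total_power)

    feasible = [s for s in subsets if total_power(s) <= power_budget]
    if not feasible:
        raise ValueError("no core set fits within the power budget")
    return max(feasible, key=total_perf)


# Made-up core figures: two small cores and one large core.
cores = [Core("L0", 1.0, 0.5), Core("L1", 1.0, 0.5), Core("B0", 3.0, 2.5)]
cool = select_cores(cores, temperature=60, threshold=80, workload=2.0, power_budget=3.0)
hot = select_cores(cores, temperature=90, threshold=80, workload=2.0, power_budget=3.0)
print([c.name for c in cool], [c.name for c in hot])
```

In the cool case the two small cores meet the workload at the lowest power. In the hot case the budget admits one small core plus the large core, which gives the highest throughput that fits.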

1 citation

Virtualization with the Intel® Xeon® Processor 5500 Series: A Proof of Concept

Sudip Chahal, Sudhir S. Bangalore, Raghu Yeluri, Stephen G. Anderson, Ashok Emani

01 Jan 2009

TL;DR: Proof-of-concept testing and total cost of ownership (TCO) analysis were conducted and seamless live migration between servers based on Intel Xeon processor 5500 series and previous Intel processor generations was verified using VMware Enhanced VMotion* and Intel Virtualization Technology FlexMigration assist.

Abstract: Intel IT, together with Intel’s Digital Enterprise Group, End User Platform Integration, and Intel’s Software and Services Group, conducted proof-of-concept testing and total cost of ownership (TCO) analysis to assess the virtualization capabilities of Intel® Xeon® processor 5500 series. A server based on Intel® Xeon® processor X5570 delivered up to 2.6x the performance and up to 2.05x the performance per watt of a server based on Intel® Xeon® processor E5450, resulting in the ability to support approximately twice as many virtual machines for the same TCO. We also verified seamless live migration between servers based on Intel Xeon processor 5500 series and previous Intel® processor generations using VMware Enhanced VMotion* and Intel® Virtualization Technology FlexMigration assist.
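
Because performance per watt is performance divided by power, the two ratios quoted in this abstract imply a relative power draw. The sketch below shows that arithmetic, assuming both ratios refer to the same workload and measurement point, which the abstract does not confirm since both are "up to" figures. The function name is illustrative.

```python
def implied_power_ratio(perf_ratio: float, perf_per_watt_ratio: float) -> float:
    """Relative power draw implied by a performance ratio and a perf/watt ratio.

    perf_per_watt = perf / power, so power_ratio = perf_ratio / perf_per_watt_ratio.
    """
    if perf_per_watt_ratio <= 0:
        raise ValueError("perf_per_watt_ratio must be positive")
    return perf_ratio / perf_per_watt_ratio


# Ratios quoted in the abstract (X5570-based server vs. E5450-based server).
ratio = implied_power_ratio(2.6, 2.05)
print(f"implied power ratio: {ratio:.2f}x")
```

Under that assumption the newer server would draw roughly 1.27x the power while delivering 2.6x the performance.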

1 citation

Metrics

Papers: 315

Citations: 6,353

No. of papers in the topic in previous years
Year	Papers
2021	14
2020	15
2019	15
2018	36
2017	25
2016	31
