Abstract:
This paper derives simple, yet fundamental formulas to describe the interplay between parallelism of an application, program performance, and energy consumption. Given the ratio of serial and parallel portions in an application and the number of processors, we derive optimal frequencies allocated to the serial and parallel regions in an application to either minimize the total energy consumption or minimize the energy-delay product. The impact of static power is revealed by considering the ratio between static and dynamic power and quantifying the advantages of adding to the architecture capability to turn off individual processors and save static energy. We further determine the conditions under which one can obtain both energy and speed improvement, as well as the amount of improvement. While the formulas we obtain use simplifying assumptions, they provide valuable theoretical insights into energy-aware processor resource management. Our results form a basis for several interesting research directions in the area of energy-aware multicore processor architectures.
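The setup the abstract describes can be illustrated with a minimal sketch. This is an assumption on our part, not the authors' exact formulation: a serial fraction `s` runs on one core at frequency `fs`, the parallel region runs on `n` cores at `fp`, dynamic power scales cubically with frequency, and each powered-on core leaks a constant static power `rho`.

```python
def exec_time(s, n, fs, fp):
    """Amdahl-style execution time: serial region at fs, parallel region at fp."""
    return s / fs + (1 - s) / (n * fp)

def energy(s, n, fs, fp, rho=0.2, per_core_gating=False):
    """Total energy = dynamic + static, under the cubic dynamic-power assumption."""
    t_ser = s / fs
    t_par = (1 - s) / (n * fp)
    dynamic = fs**3 * t_ser + n * fp**3 * t_par
    # Without per-core gating, all n cores leak static power even
    # during the serial region; with gating, only one core does.
    active_serial = 1 if per_core_gating else n
    static = rho * (active_serial * t_ser + n * t_par)
    return dynamic + static

def edp(s, n, fs, fp, **kw):
    """Energy-delay product, the second objective considered in the paper."""
    return energy(s, n, fs, fp, **kw) * exec_time(s, n, fs, fp)
```

Under this toy model, one can numerically explore the trade-offs the paper treats analytically, e.g. sweeping `fs` and `fp` to minimize `energy` or `edp`, or comparing the gated and ungated machine models.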
TL;DR: A sleep-mode handling method is developed for processor architectures that support per-core power gating. The solution, inspired by cellular automata, shows that a distributed core sleep-mode handling policy can be constructed that depends only on local information (e.g., the power states of neighboring cores) yet performs comparably to non-distributed policies.
TL;DR: A novel energy-efficient policy is presented that aims to reduce the data-access energy consumption of workflow applications executed in cloud environments; it outperforms existing approaches in both energy efficiency and execution performance.
TL;DR: Nine algorithms are proposed for energy- and time-constrained scheduling of independent sequential tasks on a multiprocessor computer with bounded, discrete, and irregular levels of clock frequency, supply voltage, execution speed, and power consumption; the combination of the largest-execution-requirement-first method for task selection and the longest-time-first method for list scheduling is found to yield the best performance.
TL;DR: This paper considers three major families of multicore processors - symmetric, asymmetric, and dynamic - and proposes three separate corollaries to the standard Amdahl's law to model the performance of different multicore configurations with different modes of operation.
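The symmetric and asymmetric corollaries from that paper can be sketched as follows, under Hill and Marty's assumption (Pollack's rule) that a core built from `r` base-core equivalents (BCEs) delivers `perf(r) = sqrt(r)`; here `f` is the parallelizable fraction and `n` the total BCE budget per chip.

```python
import math

def perf(r):
    # Pollack's-rule assumption used by Hill & Marty:
    # performance grows as the square root of core resources.
    return math.sqrt(r)

def speedup_symmetric(f, n, r):
    """Speedup of n BCEs organized as n/r identical cores of r BCEs each."""
    return 1.0 / ((1 - f) / perf(r) + f * r / (perf(r) * n))

def speedup_asymmetric(f, n, r):
    """Speedup with one large core of r BCEs plus n - r single-BCE cores."""
    return 1.0 / ((1 - f) / perf(r) + f / (perf(r) + n - r))
```

For moderately parallel workloads, the asymmetric organization typically dominates the symmetric one in this model, which is one of the headline observations of that paper.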
TL;DR: A new method, based on convex problem solving, that determines the most energy efficient operating point in terms of frequency and number of active cores in an MPSoC is introduced.
TL;DR: This best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design, has been updated throughout to address the most important trends facing computer designers today.
TL;DR: In this paper, the authors argue that the organization of a single computer has reached its limits and that truly significant advances can be made only by interconnection of a multiplicity of computers in such a manner as to permit cooperative solution.
TL;DR: The parallel landscape is framed with seven questions, and the following are recommended to explore the design space rapidly: the overarching goal should be to make it easy to write programs that execute efficiently on highly parallel computing systems, and the target should be thousands of cores per chip, built from processing elements that are the most efficient in MIPS (million instructions per second) per watt, MIPS per area of silicon, and MIPS per development dollar.
TL;DR: This paper proposes a simple model of job scheduling aimed at capturing some key aspects of energy minimization, and gives an off-line algorithm that computes, for any set of jobs, a minimum-energy schedule.
Q1. What are the contributions mentioned in the paper "On the interplay of parallelization, program performance, and energy consumption" ?
This paper derives simple, yet fundamental formulas to describe the interplay between parallelism of an application, program performance, and energy consumption. The authors further determine the conditions under which one can obtain both energy and speed improvement, as well as the amount of improvement. While the formulas the authors obtain use simplifying assumptions, they provide valuable theoretical insights into energy-aware processor resource management.
Q2. What have the authors stated for future works in "On the interplay of parallelization, program performance, and energy consumption" ?
In this paper, the authors developed an analytical framework to study the trade-offs between parallelization, program performance, and energy consumption. They considered two machine models: one assumes that individual processors cannot be turned off independently, and the other assumes that they can. When processors can be individually turned off, the analysis indicates that the minimum total energy is independent of the number of processors used for executing the parallel section, while the energy-delay product is minimized when the maximum number of available processors is used during the parallel section. The substantial power advantage demonstrated from turning off individual processors is a strong incentive to design multicore processors with the capability of turning off individual processors.
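The n-independence claim can be checked numerically under a simple model of our own (an assumption, not the authors' exact formulation): dynamic power scales as the cube of frequency per active core, static power is a constant `rho` per powered-on core, and per-core power gating means only active cores leak. The parallel-region terms then cancel `n` exactly.

```python
def total_energy(s, n, fs, fp, rho=0.2):
    """Energy with per-core power gating: serial fraction s on 1 core at fs,
    parallel fraction on n cores at fp, dynamic power ~ f**3 per core."""
    t_ser = s / fs               # serial region: 1 active core
    t_par = (1 - s) / (n * fp)   # parallel region: n active cores
    dynamic = fs**3 * t_ser + n * fp**3 * t_par
    static = rho * (1 * t_ser + n * t_par)
    return dynamic + static

def exec_time(s, n, fs, fp):
    return s / fs + (1 - s) / (n * fp)

# With gating, n cancels: total energy simplifies to
# s*fs**2 + (1-s)*fp**2 + rho*(s/fs + (1-s)/fp), independent of n,
# while execution time (and hence the energy-delay product) still
# shrinks as n grows.
e_small = total_energy(0.3, 2, 1.0, 0.8)
e_large = total_energy(0.3, 16, 1.0, 0.8)
```

This mirrors the paper's conclusion: minimum energy does not depend on the core count, but the energy-delay product favors using all available cores.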