Support: My students and I gratefully acknowledge the following agencies for their generous support:
Motivation: Every major manufacturer of high-performance microprocessors has shifted to multicore processors as a solution to the soaring power consumption of single-core processors. As this trend continues toward processors with hundreds of cores, we will once again face the problem of soaring power dissipation. In contrast to single-core processors, multicore processors will require far more sophisticated techniques to control the heat generated and maximize performance. With single-core processors, the thermal package and cooling system were designed to safely handle close to the worst-case power dissipation, so voltage and frequency scaling could be used sparingly. This approach leads to poor performance and energy efficiency when applied to multicore processors. The number of threads ready for execution at any instant varies significantly over time, which results in spatial and temporal variations in die activity and, hence, temperature. Therefore, in contrast to a single-core processor, the average power consumption of a multicore processor will be a smaller fraction of the worst-case power consumption, and the chip's thermal design power has to be set much closer to the average power. As a result, there has to be greater reliance on dynamic thermal management (DTM) techniques, not only to protect the chip but also to avoid wasting performance and to improve energy efficiency.
The dynamic power consumption of a core is a time-varying function of the core speed s(t), the voltage v(t), and the task allocated to it. The standby leakage power depends on the core voltage and temperature, and the temperature of a core depends in turn on its dynamic power consumption and standby leakage power. The core speed is related not only to the core voltage but also to the temperature, because transistor mobility degrades at higher temperatures.
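The feedback loop described above (temperature raises leakage, leakage raises temperature) can be illustrated with a minimal single-core sketch. This is not the model used in this project: the functional forms (dynamic power proportional to v²·f, leakage growing exponentially with temperature, a single lumped thermal resistance) and every constant below are assumptions chosen only for illustration, and heat coupling between cores is ignored.

```python
import math

def steady_state_temperature(v, f_ghz, activity=1.0, t_amb=45.0,
                             c_eff=2.0, r_th=0.8, i_leak0=1.0,
                             k_leak=0.02, iters=200, tol=1e-6):
    """Fixed-point solve of the temperature/leakage feedback loop for one core.

    All parameters are hypothetical:
      c_eff   effective switched capacitance (W per V^2 per GHz)
      r_th    lumped thermal resistance (deg C per W)
      i_leak0 leakage current at ambient temperature (A)
      k_leak  exponential temperature sensitivity of leakage (1 per deg C)
    Returns (temperature, dynamic power, leakage power).
    """
    # Dynamic power depends on speed, voltage, and task activity.
    p_dyn = activity * c_eff * v ** 2 * f_ghz
    temp = t_amb
    for _ in range(iters):
        # Leakage depends on voltage and on the current temperature estimate.
        p_leak = v * i_leak0 * math.exp(k_leak * (temp - t_amb))
        # Temperature depends on the total power dissipated.
        new_temp = t_amb + r_th * (p_dyn + p_leak)
        if new_temp > 150.0:
            raise RuntimeError("thermal runaway: no safe operating point")
        if abs(new_temp - temp) < tol:
            return new_temp, p_dyn, p_leak
        temp = new_temp
    raise RuntimeError("fixed-point iteration did not converge")
```

Calling `steady_state_temperature(1.0, 3.0)` and `steady_state_temperature(0.8, 2.0)` shows how lowering voltage and frequency reduces both the dynamic term and, through the lower temperature, the leakage term.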
Thus the problem of controlling the components of an SoC is a nonlinear, multidimensional, constrained optimization problem. It requires extremizing a chosen objective function over the control variables, with constraints arising from the relationships between the control variables, the sources of power dissipation, and temperature. The primary objective of this project is to develop a robust, optimal controller for performing DTM in real time for embedded processors or servers. The DTM optimizer should be fast enough to adjust the speeds and voltages within a 10-20 ms time frame and to migrate tasks within a 100-200 ms interval. Robustness requires error correction based on the differences between the predicted and estimated states. The corrected state is then used to determine the optimal values of core voltages and speeds for the subsequent interval.
The figure on the right shows the block diagram of the DTM controller, called STEAM, which stands for Smart Temperature and Energy-Aware Multicore controller. STEAM predicts the optimal voltage/frequency setting for maximum energy efficiency through online estimation of power and thermal models and minimization of the prediction error using recursive least squares (RLS) and Kalman filters. STEAM can also be used to optimize other objective functions, including maximization of performance, maximization of performance per watt (PPW), and minimization of peak temperature or power consumption while satisfying task deadlines. The structure consists of a DTM controller that generates optimal DVFS states for the next time interval based on the computed power and thermal models, the current package temperature, and core utilization. To predict the optimal voltage/frequency setting in real time, STEAM incorporates detailed power and thermal models that account for the coupled heat interaction between all cores as well as temperature-dependent transistor leakage.
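The predict-then-correct idea mentioned above can be sketched with a scalar Kalman filter tracking one core temperature. This is only an illustration, not STEAM's estimator: the first-order thermal model T[k+1] = a·T[k] + b·P[k] + c, its coefficients, the process-noise level, and the function names `kalman_step` and `simulate` are all hypothetical. The 2.1 °C sensor noise is taken from the sensor figure quoted on this page.

```python
import random

def kalman_step(t_est, var_est, power, measured,
                a=0.9, b=0.08, c=4.5, q=0.01, r=2.1 ** 2):
    """One predict/correct cycle for the assumed model T[k+1] = a*T[k] + b*P[k] + c.

    q is the assumed process-noise variance, r the sensor-noise variance.
    """
    # Predict the next temperature and its uncertainty from the model.
    t_pred = a * t_est + b * power + c
    var_pred = a * a * var_est + q
    # Correct the prediction using the difference from the sensor reading.
    gain = var_pred / (var_pred + r)
    t_new = t_pred + gain * (measured - t_pred)
    var_new = (1.0 - gain) * var_pred
    return t_new, var_new

def simulate(steps=500, power=30.0, sensor_std=2.1, seed=0):
    """Compare the raw sensor error with the filtered error on synthetic data."""
    rng = random.Random(seed)
    a, b, c = 0.9, 0.08, 4.5          # same hypothetical model as the filter
    t_true, t_est, var_est = 45.0, 45.0, 1.0
    raw_sq = filt_sq = 0.0
    for _ in range(steps):
        # Synthetic "true" temperature with a small random disturbance.
        t_true = a * t_true + b * power + c + rng.gauss(0.0, 0.1)
        # Noisy sensor reading of that temperature.
        measured = t_true + rng.gauss(0.0, sensor_std)
        t_est, var_est = kalman_step(t_est, var_est, power, measured)
        raw_sq += (measured - t_true) ** 2
        filt_sq += (t_est - t_true) ** 2
    # Root-mean-square errors of the raw sensor and of the filtered estimate.
    return (raw_sq / steps) ** 0.5, (filt_sq / steps) ** 0.5
```

`simulate()` returns the RMS error of the raw sensor followed by that of the filtered estimate. In this idealized setting the filter knows the model exactly, so the second value is far smaller than on real hardware, where the model itself must be estimated online.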
To reduce noise from the thermal sensors, STEAM uses RLS and Kalman filters to recursively filter the noisy sensor inputs. As a result, STEAM's prediction model is self-optimizing over time. STEAM optimizes a generalized form of energy efficiency, Perf^α/Power, where α is a parameter that assigns greater importance to either performance or power in the optimization. In summary, STEAM is a lightweight closed-loop controller that coordinates the multiple, independent DVFS settings of all cores on a multicore platform.
STEAM is implemented and evaluated on a Linux-based system running on an Intel quad-core Sandy Bridge-based processor. It was evaluated by running the MiBench benchmark suite. In this set of experiments, STEAM kept core temperatures within 3 °C above the specified maximum core temperature. Experiments analyzing STEAM's ability to track power consumption and temperature have shown that its temperature predictions have a mean error within 0.1 °C and a standard deviation of error of 2.5 °C. For power consumption, it has a mean error of 0.04 W and a standard deviation of error of 0.6 W at an average workload power consumption of 30 W. In comparison, the standard deviation of the sensor noise is 2.1 °C for the temperature sensors and 0.7 W for the power sensor. Since the prediction error is close to the sensor noise range, the STEAM controller has demonstrated excellent tracking ability on a real processor.
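The role of α in the Perf^α/Power objective can be sketched as a search over candidate DVFS states for a single core. The state list, the toy `predict` model, and the function names are hypothetical and are not taken from STEAM. A real controller would use its estimated power and thermal models and would coordinate all cores.

```python
def pick_dvfs_state(states, alpha, t_max, predict):
    """Return the (voltage, GHz) state maximizing perf**alpha / power.

    States whose predicted temperature exceeds t_max are skipped.
    Returns None if no state satisfies the temperature limit.
    """
    best_state, best_score = None, float("-inf")
    for v, f in states:
        perf, power, temp = predict(v, f)
        if temp > t_max:
            continue  # violates the thermal constraint
        score = perf ** alpha / power
        if score > best_score:
            best_state, best_score = (v, f), score
    return best_state

def predict(v, f):
    """Toy model (assumed): perf ~ f, power = 2*v^2*f + v, temp = 45 + 0.8*power."""
    power = 2.0 * v ** 2 * f + 1.0 * v
    return f, power, 45.0 + 0.8 * power

# Hypothetical (voltage in V, frequency in GHz) operating points.
STATES = [(0.8, 1.6), (0.9, 2.2), (1.0, 2.8), (1.1, 3.4)]

low_alpha = pick_dvfs_state(STATES, alpha=1.0, t_max=60.0, predict=predict)
high_alpha = pick_dvfs_state(STATES, alpha=3.0, t_max=60.0, predict=predict)
capped = pick_dvfs_state(STATES, alpha=3.0, t_max=51.0, predict=predict)
```

With this toy model, α = 1 favors the lowest-power state, α = 3 favors the fastest state, and tightening the temperature limit to 51 °C forces the α = 3 choice down by one step.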
The figure above summarizes the results comparing STEAM with the DTM techniques available on Linux, namely the Performance (a strategy that maximizes performance), Ondemand (a strategy that reacts to workload activity to maximize performance), and Powersave (a strategy aimed at minimizing overall energy consumption) governors. The size of the circles in the figure represents relative MIPS/Watt. STEAM outperforms all of the existing policies on Linux by at least 32%. The black dotted curve in the figure shows the energy-delay pattern and where a new policy might fit in.
