
Power and Energy Management of Multicore Processors

Support: My students and I gratefully acknowledge the following agencies for their generous support: 

  • National Science Foundation (CNS-0509540, CNS-0905035)
  • Science Foundation Arizona (SFAz SRG 0211-04), the Star Dust Foundation
  • Center for Embedded Systems (DWS-0086)

Motivation: Every major manufacturer of high-performance microprocessors has shifted to multi-core processors as a solution to the soaring power consumption of single-core processors. As this trend continues toward processors with hundreds of cores, we will once again face the problem of soaring power dissipation. In contrast to single-core processors, multi-core processors will require far more sophisticated techniques to control the heat generated and to maximize performance. With single-core processors, the thermal package and cooling system were designed to safely handle close to the worst-case power dissipation, so voltage and frequency scaling could be used sparingly. This approach leads to poor performance and energy efficiency when applied to multicore processors. The number of threads ready for execution at any instant varies significantly over time, which results in spatial and temporal variations in die activity and, hence, in temperature. Therefore, in contrast to a single-core processor, the average power consumption of a multicore processor will be a smaller fraction of its worst-case power consumption, and the chip's thermal design power has to be set much closer to the average power. As a result, there has to be greater reliance on dynamic thermal management (DTM) techniques, not only to protect the chip but also to avoid wasting performance and to improve energy efficiency.

The dynamic power consumption of a core is a time-varying function of the core speed s(t), the voltage v(t), and the task allocated to it. The standby leakage power depends on the core voltage and temperature, and the temperature of a core in turn depends on its dynamic power consumption and standby leakage power. The core speed is related not only to the core voltage but also to the temperature, because transistor mobility degrades at higher temperatures. Thus, controlling the components of an SoC is a non-linear, multi-dimensional, constrained optimization problem. It requires extremizing a chosen objective function (for example performance, energy efficiency, or peak temperature) over the control variables, subject to constraints that arise from the relationships between the control variables, the sources of power dissipation, and temperature.
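The coupling between leakage power and temperature described above can be illustrated with a small fixed-point calculation. This is only a sketch under stated assumptions, not the model used in this project: the switching-power form C·V²·f is standard, but the exponential leakage form, the single lumped thermal resistance, and every parameter value (`c_eff`, `i0`, `k_t`, `r_th`, `t_amb`) are hypothetical and chosen purely for illustration.

```python
import math


def dynamic_power(c_eff, v, f):
    """Switching power of one core: P_dyn = C_eff * V^2 * f (watts)."""
    return c_eff * v * v * f


def leakage_power(v, temp_c, i0=0.5, k_t=0.02, t_ref=45.0):
    """Hypothetical leakage model: linear in voltage, exponential in temperature."""
    return v * i0 * math.exp(k_t * (temp_c - t_ref))


def steady_state_temperature(v, f, c_eff=1.0e-9, r_th=0.8, t_amb=45.0,
                             tol=1e-6, max_iter=200):
    """Iterate T = T_amb + R_th * (P_dyn + P_leak(T)) until it stops changing.

    Returns (temperature in deg C, total power in W). All defaults are
    illustrative assumptions, not measured values.
    """
    temp = t_amb
    for _ in range(max_iter):
        power = dynamic_power(c_eff, v, f) + leakage_power(v, temp)
        new_temp = t_amb + r_th * power
        if abs(new_temp - temp) < tol:
            return new_temp, power
        temp = new_temp
    raise RuntimeError("no convergence (possible thermal runaway)")


if __name__ == "__main__":
    for v, f in [(0.9, 2.0e9), (1.1, 3.0e9)]:
        t, p = steady_state_temperature(v, f)
        print(f"V={v:.1f} V, f={f / 1e9:.1f} GHz -> T={t:.2f} C, P={p:.2f} W")
```

Because leakage grows with temperature and temperature grows with total power, the steady state has to be found iteratively (or by solving the implicit equation). A real multicore model would also include the thermal coupling between neighboring cores.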

The primary objective of this project is to develop a robust, optimal controller that performs DTM in real time for embedded processors or servers. The DTM optimizer should be fast enough to adjust speeds and voltages within a 10-20 ms time frame and to migrate tasks within a 100-200 ms interval. Robustness requires error correction based on the differences between the predicted and the estimated state. The corrected state will then be used to determine the optimal core voltages and speeds for the subsequent interval.
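The predict-then-correct idea in the paragraph above can be sketched with a scalar Kalman filter that tracks one core's temperature rise above ambient. This is a minimal illustration, not the project's controller: the first-order thermal model (`a`, `b`), the noise variances, and the class name `ScalarKalman` are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class ScalarKalman:
    """One-state Kalman filter for x[k+1] = a*x[k] + b*power[k] + noise.

    x is the estimated temperature rise above ambient (deg C). The model
    coefficients a and b and the variances q and r are hypothetical.
    """
    a: float        # fraction of the temperature rise retained per step
    b: float        # temperature rise added per watt per step
    q: float        # process (model) noise variance
    r: float        # sensor noise variance
    x: float        # current state estimate
    p: float = 1.0  # variance of the current estimate

    def predict(self, power_w):
        """Propagate the estimate one interval ahead using the thermal model."""
        self.x = self.a * self.x + self.b * power_w
        self.p = self.a * self.p * self.a + self.q
        return self.x

    def correct(self, measured):
        """Blend in a sensor reading, weighted by the Kalman gain."""
        gain = self.p / (self.p + self.r)
        self.x += gain * (measured - self.x)  # correction from prediction error
        self.p *= 1.0 - gain
        return self.x


if __name__ == "__main__":
    kf = ScalarKalman(a=0.9, b=0.1, q=0.01, r=2.1 ** 2, x=0.0)
    for reading in [9.5, 11.8, 12.1, 13.0]:  # made-up sensor readings
        kf.predict(power_w=20.0)
        print(round(kf.correct(reading), 3))
```

In each control interval the filter first predicts the next state from the model, then corrects that prediction in proportion to the difference between the prediction and the sensor reading. The corrected estimate is what an optimizer would use to choose the next voltage and speed settings.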





The figure on the right shows the block diagram of the DTM controller called STEAM, which stands for Smart Temperature and Energy-Aware Multicore controller. STEAM predicts the voltage/frequency setting that maximizes energy efficiency by estimating power and thermal models online and minimizing the prediction error with recursive least squares (RLS) and Kalman filters. STEAM can also optimize other objectives, including maximizing performance, maximizing performance per watt (PPW), and minimizing peak temperature or power consumption while satisfying task deadlines. The structure consists of a DTM controller that generates the optimal DVFS states for the next time interval based on the computed power and thermal models, the current package temperature, and the core utilization. To predict the optimal voltage/frequency setting in real time, STEAM incorporates detailed power and thermal models that account for the coupled heat interaction between all cores as well as temperature-dependent transistor leakage. To reduce the noise from the thermal sensors, STEAM uses RLS and Kalman filters to recursively filter the noisy sensor inputs. As a result, STEAM's prediction model refines itself over time. STEAM optimizes a generalized form of energy efficiency, Perf^α/Power, where α is a parameter that assigns greater importance to either performance or power in the optimization. In summary, STEAM is a lightweight closed-loop controller that coordinates the multiple, independent DVFS settings of all cores on a multicore platform.
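To make the role of α in the Perf^α/Power objective concrete, the sketch below picks one DVFS state from a table of predicted operating points. It is a simplified illustration rather than STEAM's optimizer: the `Prediction` type, the `pick_state` function, the optional temperature cap, and all numbers in `CANDIDATES` are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Prediction(NamedTuple):
    """Predicted outcome of running one DVFS state for the next interval."""
    freq_ghz: float
    perf: float      # predicted throughput, e.g. MIPS
    power_w: float   # predicted power, watts
    temp_c: float    # predicted peak core temperature, deg C


def pick_state(preds: Sequence[Prediction], alpha: float,
               t_max_c: Optional[float] = None) -> Prediction:
    """Return the state maximizing perf**alpha / power under a temperature cap."""
    feasible = [p for p in preds if t_max_c is None or p.temp_c <= t_max_c]
    if not feasible:
        # Nothing meets the cap: fall back to the coolest state.
        return min(preds, key=lambda p: p.temp_c)
    return max(feasible, key=lambda p: p.perf ** alpha / p.power_w)


# Made-up operating points, for illustration only.
CANDIDATES = [
    Prediction(1.6, 1600.0, 10.0, 55.0),
    Prediction(2.4, 2400.0, 20.0, 68.0),
    Prediction(3.2, 3200.0, 38.0, 82.0),
]

if __name__ == "__main__":
    for alpha in (1.0, 2.0, 3.0):
        print(alpha, pick_state(CANDIDATES, alpha).freq_ghz)
    print("capped:", pick_state(CANDIDATES, 3.0, t_max_c=80.0).freq_ghz)
```

With these made-up numbers, α = 1 (plain performance per watt) selects the lowest frequency, while larger values of α shift the choice toward higher frequencies. A temperature cap removes states whose predicted temperature is too high before the objective is evaluated.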

STEAM was implemented and evaluated on a Linux-based system running on a quad-core Intel Sandy Bridge processor, using the MIBench benchmark suite. In this set of experiments, STEAM kept core temperatures from exceeding the specified maximum core temperature by more than 3 °C. Experiments analyzing STEAM's ability to track power consumption and temperature show that its temperature predictions have a mean error within 0.1 °C and a standard deviation of 2.5 °C. Its power predictions have a mean error of 0.04 W and a standard deviation of 0.6 W for an average workload power consumption of 30 W. In comparison, the standard deviation of the sensor noise is 2.1 °C for the temperature sensors and 0.7 W for the power sensor. Since the prediction error is close to the sensor noise range, the STEAM controller has demonstrated excellent tracking ability on a real processor.



The figure above summarizes the results of comparing STEAM with the other DTM techniques available on Linux: the Performance governor (a strategy that maximizes performance), the Ondemand governor (a strategy that reacts to workload activity to maximize performance), and the Powersave governor (a strategy aimed at minimizing overall energy consumption). The size of each circle in the figure represents relative MIPS/Watt. STEAM outperforms all of the existing Linux policies by at least 32%. The black dotted curve in the figure shows the energy-delay pattern and where a new policy might fit in.


Relevant Publications

  • V. Hanumaiah, D. Desai, B. Gaudette, C.-J. Wuu, and S. Vrudhula, “STEAM: A Smart Temperature and Energy Aware Multicore Controller,” ACM Transactions on Embedded Computing Systems, vol. 13, Sept. 2014.
  • V. Hanumaiah and S. Vrudhula, “Energy-efficient Operation of Multi-core Processors by DVFS, Task Migration and Active Cooling,” IEEE Transactions on Computers, vol. 63, pp. 349–360, Feb. 2014.
  • Vinay Hanumaiah, Sarma Vrudhula. "Temperature-aware DVFS for Hard Real-Time Applications on Multi-core Processors". In IEEE Transactions on Computers, October 2012.
  • Vinay Hanumaiah, Sarma Vrudhula, Karam Chatha. "Performance Optimal Online DVFS and Task Migration Techniques for Thermally Constrained Multi-core Processors". In IEEE Transactions on Computer-Aided Design (TCAD), Vol. 30(11):1677-1690, November 2011.
  • Ravishankar Rao, Sarma Vrudhula. "Fast and accurate prediction of the steady state throughput of multi-core processors under thermal constraints". In IEEE Transactions on Computer-Aided Design (TCAD), Vol. 28(10):1559-1572, October 2009.
  • Vinay Hanumaiah, Sarma Vrudhula, Karam Chatha. "Maximizing performance of thermally constrained multi-core processors by dynamic voltage and frequency control". In Proceedings of the IEEE International Conference on Computer Aided Design (ICCAD), San Jose, CA., November 2009.
  • Vinay Hanumaiah, Ravishankar Rao, Sarma Vrudhula, Karam Chatha. "Throughput Optimal Task Allocation under Thermal Constraints for Multi-core Processors". In Proceedings of the IEEE/ACM Design Automation Conference (DAC), San Francisco, CA., July 2009.
  • Vinay Hanumaiah, Sarma Vrudhula, Karam Chatha. "Performance optimal speed control of multi-core processors under thermal constraints". In Proceedings of the Design, Automation & Test in Europe (DATE), Nice, France, 20-24 March 2009.
  • Ravishankar Rao and Sarma Vrudhula. "Analytical results for design space exploration of multi-core processors employing thread migration". In Proceedings of the ACM International Symposium on Low Power Electronics and Design (ISLPED), Pages 229-232, Bangalore, India, 2008.
  • Ravishankar Rao and Sarma Vrudhula. "Performance optimal processor throttling under thermal constraints". In Proceedings of the ACM/IEEE International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES), Pages 257-266, Salzburg, Austria, 1-3 October 2007.
  • Ravishankar Rao, Sarma Vrudhula, Chaitali Chakrabarti and Naehyuck Chang. "An optimal analytical solution for processor speed control with thermal constraints". In Proceedings of the IEEE International Symposium on Low Power Electronics and Design (ISLPED), Pages 292-297, Tegernsee, Germany, 4-6 October 2006.
Sarma Vrudhula,
May 5, 2016, 8:29 AM