ISCA-45 Tutorial for Energy Efficient Computing

Energy Efficient Computing in Multicore CPUs: Design Margins and Variability

Sunday, 3 June 2018, Los Angeles, California, USA
Morning tutorial held in conjunction with the 45th ACM/IEEE International Symposium on Computer Architecture (ISCA 2018)

Organizers/Presenters: Dimitris Gizopoulos, George Papadimitriou, Athanasios Chatzidimitriou (University of Athens)

Tutorial Summary

Conservative design margins in modern multicore CPU chips aim to guarantee correct execution of the software running on a computing system under various operating conditions, accounting for the inherent variability among different cores of a chip, among different manufactured chips, and among different workloads. However, guard-banding the main operational parameters of CPU chips (voltage, frequency) limits energy efficiency.

In this tutorial we will present different aspects of the above topic.

(a) We will present the main challenges, and how they can be addressed, in the massive process of identifying the design margins and the different types of variability of modern multicore CPUs, as well as in characterizing system behavior in scaled conditions: what types of malfunctions are observed (silent data corruptions of program output, corrected and uncorrected errors captured by the hardware, application and system crashes) and with what probabilities. We will discuss how such a process can be automated and how the margins of different CPU chips can be efficiently recorded. An implementation of the characterization process on state-of-the-art servers will be presented.

(b) We will analyze how much energy can be saved by exploiting the margins and the variability. The characterized design margins and the variability among cores and chips can drive static or dynamic workload-balancing decisions at the system level, using the voltage and frequency scaling knobs of the underlying hardware.

(c) We will discuss real measurements on different multicore server CPU chips, mainly based on the ARMv8 architecture (including AppliedMicro’s X-Gene 2 and X-Gene 3 as well as Cavium’s ThunderX). We will also compare these implementations with one another and with different architectures.

(d) We will discuss how microarchitectural simulators can be used to model the behavior of CPUs operating in scaled conditions. Different types of malfunctions and their corresponding models will be presented.
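
As an illustration of the automated characterization described in item (a), the following is a minimal sketch and not the presenters' actual framework. It lowers the supply voltage step by step, repeats a benchmark at each step, classifies every run, and records the lowest voltage reached before any abnormal behavior. The function run_benchmark is a hypothetical stand-in for real hardware: its per-core thresholds and error bands are invented for the example, and the nominal voltage, step size, and number of runs are likewise assumptions.

```python
from collections import Counter
from enum import Enum


class Outcome(Enum):
    NORMAL = "normal"
    CORRECTED = "corrected error"
    SDC = "silent data corruption"
    CRASH = "crash"


def run_benchmark(core, voltage_mv):
    """HYPOTHETICAL toy model of one benchmark run on one core.

    A real campaign would set the voltage through the platform's
    management interface, run the program, compare its output with a
    golden reference, and read the hardware error logs.
    """
    threshold_mv = 900 + 5 * core      # invented per-core variability
    deficit = threshold_mv - voltage_mv
    if deficit <= 0:
        return Outcome.NORMAL
    if deficit <= 10:
        return Outcome.CORRECTED
    if deficit <= 20:
        return Outcome.SDC
    return Outcome.CRASH


def characterize(core, nominal_mv=980, step_mv=5, floor_mv=800, runs=3):
    """Sweep the voltage downward and log the outcome counts per step.

    Returns (safe_vmin_mv, log), where safe_vmin_mv is the lowest voltage
    reached before the first abnormal run (None if even the nominal
    voltage misbehaves) and log maps voltage to a Counter of outcomes.
    """
    log = {}
    safe_vmin_mv = None
    abnormal_seen = False
    for voltage_mv in range(nominal_mv, floor_mv - 1, -step_mv):
        counts = Counter(run_benchmark(core, voltage_mv) for _ in range(runs))
        log[voltage_mv] = counts
        if counts[Outcome.NORMAL] == runs and not abnormal_seen:
            safe_vmin_mv = voltage_mv
        else:
            abnormal_seen = True
        if counts[Outcome.CRASH] > 0:
            break                      # a real setup would reboot here
    return safe_vmin_mv, log


if __name__ == "__main__":
    for core in range(4):
        vmin, log = characterize(core)
        print(f"core {core}: safe Vmin = {vmin} mV, swept down to {min(log)} mV")
```

Dividing each count in the log by the number of runs gives a rough per-voltage probability for each malfunction type; a real study would need many more runs and several workloads per step.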

The main purpose of the tutorial is to summarize recent characterization and exploitation findings on ARMv8-based server machines, to emphasize the potential for energy savings through the identification and exploitation of design margins, and to discuss our findings in relation to those for other machines similarly studied in the past.
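
To make the energy-saving potential of item (b) concrete, here is a rough sketch under stated assumptions, not a result from the tutorial. It uses the standard first-order model in which dynamic power scales with V^2 * f, so at a fixed frequency the dynamic energy of a task scales with V^2. It also assumes, hypothetically, that all cores share one voltage rail, so the rail must satisfy the weakest core in use. The per-core Vmin values and the 980 mV nominal voltage are invented for illustration.

```python
def dynamic_energy_saving(v_nominal_mv, v_scaled_mv):
    """Fraction of dynamic energy saved at fixed frequency (energy ~ V^2)."""
    return 1.0 - (v_scaled_mv / v_nominal_mv) ** 2


def pick_cores(core_vmin_mv, n_threads):
    """Static placement sketch: use the n_threads cores with the lowest Vmin.

    core_vmin_mv maps core id -> characterized safe Vmin in mV.
    Returns (chosen_cores, rail_mv). ASSUMPTION: a single shared voltage
    rail, so the rail voltage is the highest Vmin among the chosen cores.
    """
    if not 0 < n_threads <= len(core_vmin_mv):
        raise ValueError("n_threads must be between 1 and the number of cores")
    chosen = sorted(core_vmin_mv, key=core_vmin_mv.get)[:n_threads]
    rail_mv = max(core_vmin_mv[core] for core in chosen)
    return chosen, rail_mv


if __name__ == "__main__":
    vmin = {0: 900, 1: 915, 2: 905, 3: 920}   # invented characterization data
    cores, rail = pick_cores(vmin, n_threads=2)
    saving = dynamic_energy_saving(980, rail)
    print(f"run on cores {cores} at {rail} mV: "
          f"about {saving:.1%} dynamic energy saved vs. 980 mV")
```

The model ignores static (leakage) power and any safety margin kept above the characterized Vmin, so it is only an estimate of the saving, not a measurement.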