Computer Architecture Lab

Dept. of Informatics & Telecommunications @ University of Athens

CAL's paper has been accepted at IEEE CLUSTER 2024 and nominated for the Best Paper Award!

L. Yang, G. Papadimitriou, D. Sartzetakis, A. Jog, E. Smirni, and D. Gizopoulos, “GPU Reliability Assessment: Insights Across the Abstraction Layers”, IEEE International Conference on Cluster Computing (CLUSTER 2024), Kobe, Japan, September 2024.

The Lab

Computer Architecture Lab of the Department of Informatics & Telecommunications, University of Athens

CAL@DI Profile

CAL focuses on different aspects of Computer Architecture. We deal with computing systems built around general-purpose and specialized CPUs, memory systems, and accelerators such as GPUs. We care about the complex interactions among different parameters including performance, power/energy and dependability/reliability. We deliver methods and tools for fast evaluation of reliability, for energy-efficient computing, for error detection and recovery, as well as for silicon debug and validation.

Support

Our research is generously supported through European Union and national funds as well as by leading companies of the computing industry. CAL participates in several National and European research projects and Networks of Excellence.

Research

Performance

We design mechanisms for improving the performance of computing systems, focusing on emerging application domains that stress the limits of the memory system and its impact on execution. Our hardware, software, and co-design approaches span all layers of the computing stack, from the application, the runtime and the OS to the architecture and microarchitecture layers.

Silicon Validation

We devise models, methods and tools for the silicon debug and validation of CPUs and memories, aiming to detect and locate hard-to-detect design bugs that escape pre-silicon, simulation-based verification. Our methods aim to improve the coverage of the silicon debug process (and thus its bug detection capability) while significantly reducing the time it takes.

Reliability Assessment

We deliver methods and tools for evaluating the reliability of computing systems under different types of fault models (transient or permanent) and for all major hardware components, including CPUs, GPUs and memories. The main focus is on the speed and accuracy of the reliability evaluation. We work at the microarchitecture level, the software level and the register-transfer (RT) level.

Energy-Efficient Computing

We investigate the design margins of modern computing hardware to reveal the potential for energy and power savings when it operates beyond nominal conditions of voltage, frequency and refresh rate. We characterize how these margins vary among different chips, different cores within a chip and different workloads, aiming to predict safe and energy-efficient operating points of modern hardware.

Error Detection and Fault Tolerance

We design hardware-based and software-based methods for the detection and tolerance of transient and permanent faults in the hardware components of computing systems. We provide solutions for CPUs, GPUs and memories. The main focus is the error coverage of the methods as well as the minimization of their cost in terms of system performance, energy/power and hardware area.

Latest News

Stay tuned for the latest CAL@DI news.

CAL’s recent paper in IEEE TC has received the 2022 Best Paper Award

2022 Best Paper Award from IEEE Transactions on Computers by the IEEE Computer Society Publications Board:

P. R. Bodmann, G. Papadimitriou, R. L. Rech Jr, D. Gizopoulos, and P. Rech, “Soft Error Effects on Arm Microprocessors: Early Estimations vs. Chip Measurements”, IEEE Transactions on Computers (IEEE TC), Volume: 71, Issue: 10, pp. 2358-2369, October...

Lab’s Paper Accepted @ IEEE HPCA 2024

O. Chatzopoulos, G. Papadimitriou, V. Karakostas, and D. Gizopoulos, “gem5-MARVEL: Microarchitecture-Level Resilience Analysis of Heterogeneous SoC Architectures”, IEEE International Symposium on High-Performance Computer Architecture (HPCA 2024), Edinburgh, Scotland, March 2024.

Lab’s Paper Accepted @ IEEE/ACM MICRO 2023

D. Agiakatsikas, G. Papadimitriou, V. Karakostas, D. Gizopoulos, M. Psarakis, C. Belanger-Champagne, and E. Blackmore, “Impact of Voltage Scaling on Soft Errors Susceptibility of Multicore Server CPUs”, IEEE/ACM International Symposium on Microarchitecture (MICRO 2023), Toronto, Canada, October 2023.

Lab’s Paper Accepted @ IEEE HPCA 2023

G. Papadimitriou and D. Gizopoulos, “AVGI: Microarchitecture-Driven, Fast and Accurate Vulnerability Assessment”, IEEE International Symposium on High-Performance Computer Architecture (HPCA 2023), Montreal, QC, Canada, February 2023.