Energy and Performance Analysis for AI Workloads

AI workloads consume vast amounts of compute resources and energy, so using those resources efficiently is a primary concern. However, the rapid pace of development and the complexity of the software environments often make a thorough analysis of performance and energy usage impractical. In addition, common Python wrappers conceal the lower-level libraries they build on and encourage a culture of copying and pasting ready-made workflow setups to manage the overall complexity. The aim of this work is to support the DSgenAI project in understanding and optimizing the performance and energy efficiency of its relevant application workloads.

This will be achieved by:

Defining proxy workload setups to benchmark and profile the applications and to gain insight into their core algorithmic patterns, performance bottlenecks, and the parameters that affect energy consumption.
Investigating optimization opportunities, whether by changing operating parameters, suggesting alternative implementations, or applying software optimizations.
Creating a map of how common AI algorithmic patterns and frameworks perform for the applications used in DSgenAI.

Experience gained from performance and energy research on common HPC workloads will be transferred to AI workloads, drawing on a growing AI support team with expert domain knowledge.

Author: Dr. Jan Eitzinger
