Scaling GPUs with EC2 UltraClusters
Welcome to this QA lesson, where you’ll learn about high-performance cloud computing with Amazon EC2 UltraClusters.
This lesson covers Amazon’s accelerated computing instances: EC2 P5 and P4d instances, powered by NVIDIA GPUs, and EC2 Trn1 instances, powered by AWS Trainium accelerators. These instances can be deployed in EC2 UltraClusters, scaling to thousands of GPUs or Trainium accelerators and delivering computing power comparable to a supercomputer. This capability can be transformative for machine learning, AI, and other high-performance computing workloads.
By the end of this lesson, you will have an understanding of high-performance computing, including:
- The core principles of GPU Computing
- An overview of Amazon’s EC2 UltraClusters
- Amazon’s GPU-powered instances, EC2 P5 and EC2 P4d
- Amazon’s Trainium accelerator-powered instances, EC2 Trn1
- The scaling capabilities of these instances within EC2 UltraClusters
Intended Audience
This lesson has been created for those who are interested in High-Performance Computing, Machine Learning, and Generative AI.
Prerequisites
To get the most out of this lesson, you should have an understanding of EC2 instances and Amazon FSx for Lustre.