AWS AI chips deliver high performance and low cost for Llama 3.1 models on AWS

John Gray

Today, we are excited to announce AWS Trainium and AWS Inferentia support for fine-tuning and inference of the Llama 3.1 models. The Llama 3.1 family of multilingual large language models (LLMs) is a collection of pre-trained and instruction-tuned generative models in 8B, 70B, and 405B sizes. In a previous post, we covered how to deploy Llama 3 models on AWS Trainium- and Inferentia-based instances in Amazon SageMaker JumpStart. In this post, we outline how to get started with fine-tuning and deploying the Llama 3.1 family of models on AWS AI chips to realize their price-performance benefits.
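As a starting point, the sketch below shows how a Llama 3.1 model on a Neuron (Inferentia-based) instance might be deployed through the SageMaker JumpStart Python SDK. The specific model ID and instance type are assumptions for illustration; consult the JumpStart model catalog for the exact identifiers available in your Region.

```python
# A minimal sketch of deploying a Llama 3.1 model on an Inferentia-based
# instance via SageMaker JumpStart. The model_id and instance_type below
# are assumed values; verify them against the JumpStart catalog.
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(
    model_id="meta-textgenerationneuron-llama-3-1-8b",  # assumed Neuron variant ID
    instance_type="ml.inf2.24xlarge",  # Inferentia2-based instance
)

# Llama 3.1 is a gated model, so deployment requires accepting the EULA.
predictor = model.deploy(accept_eula=True)

# Send a simple text-generation request to the endpoint.
response = predictor.predict({
    "inputs": "What is machine learning?",
    "parameters": {"max_new_tokens": 128, "top_p": 0.9, "temperature": 0.6},
})
print(response)
```

After testing, remember to delete the endpoint (for example, with `predictor.delete_endpoint()`) to avoid ongoing charges.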
