Below is a list of resources describing optimization techniques for AI models.
- Accelerate PyTorch Models using torch.compile on AMD GPUs with ROCm
- Accelerating Large Language Models with Flash Attention on AMD GPUs
- Reduce Memory Footprint and Improve Performance Running LLMs on AMD Ryzen™ AI and Radeon™ Platforms
- Unveiling performance insights with PyTorch Profiler on an AMD GPU
- Quantized 8-bit LLM training and inference using bitsandbytes on AMD GPUs
- Leaner LLM Inference with INT8 Quantization on AMD GPUs using PyTorch
Copyright (C) 2025 Advanced Micro Devices, Inc. All rights reserved.
SPDX-License-Identifier: MIT