New AMD ROCm 6.3 Release Expands AI and HPC Horizons
Introducing SGLang, a new runtime in ROCm 6.3 optimized for inference on state-of-the-art generative models such as LLMs and VLMs on AMD Instinct GPUs
According to published benchmarks, SGLang can outperform current systems on LLM inference by up to 6X in throughput, enabling organizations to serve AI applications at scale
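SGLang serves models behind an OpenAI-compatible HTTP API. The sketch below builds (but does not send) a chat-completion request against a locally launched SGLang server; the port and model name are assumptions for illustration, not values prescribed by ROCm.

```python
import json
from urllib import request

# Assumed endpoint: an SGLang server started locally, e.g. with
#   python -m sglang.launch_server --model-path <model> --port 30000
# SGLang exposes an OpenAI-compatible /v1/chat/completions route.
SERVER_URL = "http://localhost:30000/v1/chat/completions"  # assumption

def build_chat_request(prompt: str, max_tokens: int = 128) -> request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": "default",  # illustrative; SGLang serves one model per instance
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return request.Request(
        SERVER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("Summarize ROCm 6.3 in one sentence.")
# urllib.request.urlopen(req) would dispatch it to a running server.
```

Because the payload follows the OpenAI chat schema, existing client tooling can point at the SGLang endpoint with no code changes beyond the base URL.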
With Python pre-installed and pre-configured in the ROCm Docker containers, developers can quickly build scalable cloud backends
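As a minimal sketch of such a backend, the stdlib-only WSGI app below exposes a JSON health-check endpoint of the kind one might run inside a ROCm container; the route, field names, and port are illustrative assumptions, not part of the ROCm images.

```python
import json

def app(environ, start_response):
    """Tiny WSGI application reporting service status as JSON.

    Illustrative only: a real backend would also report GPU state,
    model readiness, etc.
    """
    body = json.dumps({"status": "ok", "runtime": "rocm"}).encode("utf-8")
    start_response(
        "200 OK",
        [("Content-Type", "application/json"),
         ("Content-Length", str(len(body)))],
    )
    return [body]

# To serve it inside the container (port is an assumption):
#   from wsgiref.simple_server import make_server
#   make_server("", 8000, app).serve_forever()
```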
ROCm 6.3 addresses the memory and compute bottlenecks of standard attention with a FlashAttention-2 implementation designed for AMD GPUs, enabling faster, more efficient training and inference
Direct GPU Offloading: Use OpenMP offloading to take advantage of AMD Instinct GPUs and speed up important scientific applications
Backward Compatibility: Leverage AMD’s next-generation GPU capabilities while building on existing Fortran code
Streamlined Integrations: Call ROCm libraries and HIP kernels with ease, eliminating the need for complex code rewrites
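As a minimal sketch of the direct GPU offloading described above, the Fortran fragment below offloads a SAXPY loop to the GPU with standard OpenMP target directives; the program name, array sizes, and values are illustrative assumptions, not taken from AMD's documentation.

```fortran
! Sketch: a SAXPY loop offloaded via OpenMP target directives,
! the mechanism AMD's Fortran compiler uses to reach Instinct GPUs.
program saxpy_offload
  implicit none
  integer, parameter :: n = 1000000
  real :: a, x(n), y(n)
  integer :: i

  a = 2.0
  x = 1.0
  y = 0.0

  ! map(to:) copies x to the device; map(tofrom:) copies y in and back.
  !$omp target teams distribute parallel do map(to: x) map(tofrom: y)
  do i = 1, n
     y(i) = a * x(i) + y(i)
  end do
  !$omp end target teams distribute parallel do
end program saxpy_offload
```

The same loop body runs unchanged on the CPU when offloading is disabled, which is what makes incremental porting of existing Fortran code practical.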