MLPerf Inference v4.1

AMD Instinct MI300X GPUs, powered by one of the most recent releases of the open-source ROCm software stack, obtained strong results in the MLPerf Inference v4.1 round.

These results set a high benchmark for AMD Instinct MI300X accelerators, outperforming the NVIDIA H100 in generative AI inference.

As large language models (LLMs) grow in size and complexity, both inference and training demand more efficient and economical performance.

The AMD Instinct MI300X performed well in its first MLPerf Inference submission, running on the Supermicro AS-8125GS-TNMR2 system with four notable LLaMA2-70B entries.

Dell also submitted LLaMA2-70B results on its PowerEdge XE9680 server, confirming platform-level performance of the AMD Instinct accelerators in an 8x MI300X configuration.

The AMD Instinct MI300X offers the largest GPU memory capacity in its class (192 GB of HBM3), so the entire LLaMA2-70B model fits in the memory of a single GPU while leaving substantial room for the KV cache.
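To make the memory claim concrete, here is a rough back-of-the-envelope sketch in Python. The model dimensions come from LLaMA2-70B's published architecture (80 layers, grouped-query attention with 8 KV heads, head dimension 128); the batch size and sequence length below are illustrative assumptions, not values from the MLPerf submission.

```python
# Back-of-the-envelope memory math for LLaMA2-70B on a single MI300X.
# Architecture numbers are from the Llama 2 paper; the serving load
# (batch size, sequence length) is an illustrative assumption.

GIB = 1024**3

# LLaMA2-70B architecture
n_params   = 70e9   # ~70 billion parameters
n_layers   = 80
n_kv_heads = 8      # grouped-query attention
head_dim   = 128
bytes_fp16 = 2

# Model weights at FP16 precision
weights_bytes = n_params * bytes_fp16

# KV cache per token: 2 tensors (K and V) per layer, each
# n_kv_heads * head_dim values at 2 bytes apiece
kv_per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_fp16

# Illustrative serving load (assumed, not from the MLPerf entry)
batch_size = 32
seq_len    = 4096
kv_cache_bytes = kv_per_token * batch_size * seq_len

mi300x_hbm = 192 * GIB  # MI300X HBM3 capacity

print(f"weights:  {weights_bytes / GIB:6.1f} GiB")
print(f"KV cache: {kv_cache_bytes / GIB:6.1f} GiB "
      f"({batch_size} seqs x {seq_len} tokens)")
print(f"total:    {(weights_bytes + kv_cache_bytes) / GIB:6.1f} GiB "
      f"of {mi300x_hbm / GIB:.0f} GiB HBM3")
```

Under these assumptions the FP16 weights alone come to roughly 130 GiB, so a 192 GB MI300X can hold the full model on one device and still dedicate tens of GiB to KV cache, whereas an 80 GB GPU such as the H100 must shard the model across multiple devices.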

The favorable MLPerf Inference results with LLaMA2-70B establish a foundation for effectiveness with larger models such as Llama 3.1.