Azure powers the intelligent services that have recently captured so much attention, such as Microsoft Copilot, Bing, and Azure OpenAI Service.

Large language models (LLMs) are the secret sauce behind these services, bringing generative AI to a wide range of applications such as Microsoft Office 365, chatbots, and search engines.

How Microsoft uses LLMs to their fullest potential

But developing new LLMs, or improving the accuracy of existing ones, is a difficult undertaking.

Azure's most recent MLPerf Training 3.1 results demonstrate its continued commitment to building high-quality, high-performance cloud platforms that achieve unmatched efficiency in training LLMs at scale.

Azure trained the GPT-3 LLM benchmark, with its 175 billion parameters, on 1,344 ND H100 v5 virtual machines (VMs), representing 10,752 NVIDIA H100 Tensor Core GPUs connected by the NVIDIA Quantum-2 InfiniBand networking infrastructure.
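To put that scale in perspective, each ND H100 v5 VM carries eight H100 GPUs, so the headline numbers check out with simple arithmetic. A back-of-the-envelope sketch; the 8-GPU-per-VM figure is the ND H100 v5 specification, and the per-GPU split on the last line is a hypothetical even division, not how the run actually shards weights:

```python
# Back-of-the-envelope check of the MLPerf 3.1 GPT-3 run on Azure.
vms = 1_344                  # ND H100 v5 virtual machines
gpus_per_vm = 8              # each ND H100 v5 VM hosts 8 NVIDIA H100 GPUs
total_gpus = vms * gpus_per_vm
assert total_gpus == 10_752  # matches the reported GPU count

params = 175_000_000_000     # GPT-3's 175 billion parameters
# Hypothetical even split for intuition only; real training replicates
# and shards weights through a mix of parallelism strategies.
print(f"{total_gpus} GPUs, ~{params / total_gpus / 1e6:.1f}M params per GPU")
```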

The workload stresses the H100 GPUs' Tensor Cores, the direct-attached Non-Volatile Memory Express (NVMe) disks, and the NVLink interconnect.
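Intra-node NVLink traffic and inter-node InfiniBand traffic are typically driven through a collective-communication library such as NCCL. The sketch below shows a generic PyTorch setup of that kind; it is an illustration under common conventions (torchrun-style environment variables), not the code from Azure's actual submission:

```python
import os

import torch
import torch.distributed as dist


def init_distributed() -> int:
    """Join the training job's process group using the NCCL backend.

    NCCL is the usual backend for multi-GPU training: it moves
    GPU-to-GPU traffic over NVLink inside a node and over InfiniBand
    (via RDMA) between nodes. RANK/WORLD_SIZE/LOCAL_RANK are assumed
    to be set by the launcher (e.g. torchrun).
    """
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    return local_rank


if __name__ == "__main__":
    local_rank = init_distributed()
    # An all-reduce like this dominates gradient synchronization and is
    # what exercises both the NVLink and InfiniBand fabrics at scale.
    t = torch.ones(1, device=f"cuda:{local_rank}")
    dist.all_reduce(t)  # default op is SUM across all ranks
    if dist.get_rank() == 0:
        print(f"world size: {dist.get_world_size()}, sum: {t.item()}")
    dist.destroy_process_group()
```

Launched with `torchrun --nnodes <N> --nproc-per-node 8`, each VM would contribute its eight GPUs to the job, and NCCL handles routing between NVLink and InfiniBand automatically.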

Azure has made remarkable progress in optimizing training at scale, as evidenced by this submission, the largest in the history of MLPerf Training.

Compared with the NVIDIA bare-metal submission, this translates to just a 2 percent increase in training time on Azure, which delivers best-in-class virtual machine performance across cloud HPC instance offerings.