NVIDIA Accelerates Meta’s Llama 3: Faster AI for All

NVIDIA announced platform-wide optimisations aimed at speeding up Meta Llama 3, the latest generation of Meta’s large language model (LLM).

Meta tuned its network, software, and model designs for its flagship LLM with assistance from NVIDIA.

Meta did so while maintaining its leadership position in the responsible use and deployment of LLMs.

Meta improved upon Llama 2 in a number of significant ways. With a vocabulary of 128K tokens, Llama 3’s tokenizer encodes language far more efficiently, substantially improving model performance.
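To see why a larger vocabulary helps, here is a minimal toy sketch (not Meta's actual tokenizer; both vocabularies and the text are invented): when the vocabulary contains longer merged entries, the same text compresses into fewer tokens, so the model sees more content per fixed-length sequence.

```python
def greedy_tokenize(text, vocab):
    """Greedy longest-match tokenisation against a fixed vocabulary."""
    tokens = []
    i = 0
    while i < len(text):
        # Pick the longest vocabulary entry matching at position i,
        # falling back to a single character if nothing matches.
        match = next(
            (v for v in sorted(vocab, key=len, reverse=True)
             if text.startswith(v, i)),
            text[i],
        )
        tokens.append(match)
        i += len(match)
    return tokens

# A small vocabulary of short fragments vs. a larger one that also
# contains longer merged entries (purely illustrative).
small_vocab = {"lang", "uage", "mod", "el"}
large_vocab = small_vocab | {"language", "model", "language model"}

text = "language model"
print(len(greedy_tokenize(text, small_vocab)))  # 5 tokens
print(len(greedy_tokenize(text, large_vocab)))  # 1 token
```

The same effect, scaled up to a 128K-entry vocabulary, is what lets Llama 3 encode text more efficiently than Llama 2.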

When training the models on sequences of 8,192 tokens, they used an attention mask to ensure self-attention does not cross document boundaries.

The Llama 3 training dataset has four times more code and is seven times larger than the one used for Llama 2.

They created a number of data-filtering pipelines to ensure that Llama 3 is trained on the highest-quality data.
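As a rough sketch of what such a pipeline involves (the specific heuristics and thresholds below are invented for illustration, not Meta's actual filters), a filter pass might drop documents that are too short or highly repetitive, and deduplicate exact copies by hashing:

```python
import hashlib

def passes_heuristics(doc):
    """Illustrative quality heuristics; thresholds are made up."""
    words = doc.split()
    if len(words) < 5:                      # too short to be useful
        return False
    if len(set(words)) / len(words) < 0.3:  # highly repetitive text
        return False
    return True

def filter_corpus(docs):
    """Keep documents that pass heuristics; drop exact duplicates."""
    seen = set()
    kept = []
    for doc in docs:
        digest = hashlib.sha256(doc.encode()).hexdigest()
        if digest in seen or not passes_heuristics(doc):
            continue
        seen.add(digest)
        kept.append(doc)
    return kept

docs = [
    "the quick brown fox jumps over the lazy dog",
    "the quick brown fox jumps over the lazy dog",  # duplicate
    "a a a a a a a a a a",                          # repetitive
    "hi",                                           # too short
]
print(filter_corpus(docs))  # only the first sentence survives
```

Production pipelines layer many more stages (model-based quality classifiers, semantic deduplication, safety filters), but the shape is the same: cheap rules first, then progressively more expensive filters.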