NVIDIA Accelerates Microsoft's Open Phi-3 Mini Language Model

NVIDIA announced that it has accelerated Microsoft's new Phi-3 Mini open language model with NVIDIA TensorRT-LLM, an open-source library for optimizing large language model inference on NVIDIA GPUs from PC to cloud
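
As a rough illustration of what that looks like in practice, the sketch below uses TensorRT-LLM's high-level Python entry point to load a Phi-3 Mini checkpoint and run inference. It is a minimal example, not NVIDIA's reference code: it assumes a recent TensorRT-LLM release that exposes the `LLM` class and that the `microsoft/Phi-3-mini-4k-instruct` checkpoint is reachable from Hugging Face.

```python
# Minimal sketch: run Phi-3 Mini through TensorRT-LLM's high-level API.
# Assumes a recent tensorrt-llm release whose LLM entry point builds an
# optimized engine from a Hugging Face checkpoint on first use.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="microsoft/Phi-3-mini-4k-instruct")  # downloads weights, builds engine

prompts = ["Explain what a language model token is in one sentence."]
params = SamplingParams(max_tokens=64, temperature=0.2)

for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```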

Phi-3 Mini advances beyond the research-only Phi-2, bringing the power of models 10x its size to the masses

Phi-3 Mini comes in two versions: one supporting a 4K-token context window and another supporting up to 128K tokens, making it the first model in its class to handle extremely long contexts

This means developers can prompt the model with up to 128,000 tokens, the atomic units of language that the model processes, and get more relevant responses
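
To make "tokens" concrete, the short example below counts how many tokens a prompt occupies in the model's context window. It assumes the `transformers` library and the publicly hosted `microsoft/Phi-3-mini-128k-instruct` tokenizer; the exact count depends on the tokenizer version.

```python
# Count the tokens a prompt occupies in Phi-3 Mini's context window.
# Assumes the transformers library and the hosted Phi-3 Mini tokenizer.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-128k-instruct")

prompt = "NVIDIA TensorRT-LLM optimizes large language model inference."
token_ids = tokenizer.encode(prompt)

print(f"{len(token_ids)} tokens out of a 131,072-token (128K) context window")
```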

It is packaged as an NVIDIA NIM, a microservice with a standard API that can be deployed anywhere
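
Because a NIM exposes an OpenAI-compatible API, a deployed Phi-3 Mini microservice can be queried with standard client libraries. The sketch below assumes a NIM running locally; the `base_url` and model name are illustrative and should be replaced with your deployment's values.

```python
# Query a deployed Phi-3 Mini NIM through its OpenAI-compatible endpoint.
# The base_url and model name are illustrative; substitute your deployment's.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used")

response = client.chat.completions.create(
    model="microsoft/phi-3-mini-128k-instruct",
    messages=[{"role": "user", "content": "Summarize Phi-3 Mini in two sentences."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```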

Phi-3 Mini has 3.8 billion parameters, making it small enough to run efficiently on edge devices
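
As a back-of-envelope check on why 3.8 billion parameters suits edge hardware, the weights alone occupy roughly parameters times bytes per parameter. The figures below are estimates of weight memory only, not measured footprints; the KV cache and activations add more on top.

```python
# Rough weight-memory estimate for a 3.8B-parameter model (weights only;
# KV cache and activations add more). Figures are estimates.
params = 3.8e9

for precision, bytes_per_param in [("FP16", 2), ("INT8", 1), ("INT4", 0.5)]:
    gib = params * bytes_per_param / 2**30
    print(f"{precision}: ~{gib:.1f} GiB")  # FP16 ~7.1, INT8 ~3.5, INT4 ~1.8
```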

Phi-3 can help in use cases where resources and costs are constrained, particularly for simpler tasks