Gemma 2 on Vertex AI

Gemma 2 delivers class-leading performance, broad hardware compatibility, and easy integration with common AI tooling.

Google unveiled the Gemma family of lightweight, state-of-the-art open models earlier this year, followed by CodeGemma, RecurrentGemma, and PaliGemma, each offering specialised capabilities for different AI tasks.

Google is now officially making Gemma 2 available to researchers and developers worldwide.

Gemma 2 surpasses the first generation in inference performance and efficiency at parameter sizes of 9 billion (9B) and 27 billion (27B).

It can now run on a single NVIDIA H100 Tensor Core GPU or a single TPU host, greatly lowering the cost of deployment.
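A rough back-of-the-envelope calculation (an illustrative sketch, not an official sizing guide) shows why a single 80 GB accelerator can hold the 27B model's weights: in bfloat16 precision, each parameter occupies 2 bytes.

```python
# Back-of-the-envelope memory estimate for serving Gemma 2 27B in bfloat16.
# Illustrative only: real deployments also need memory for the KV cache,
# activations, and framework overhead.

BYTES_PER_PARAM_BF16 = 2  # bfloat16 = 16 bits = 2 bytes per parameter

def weight_memory_gb(num_params: float,
                     bytes_per_param: int = BYTES_PER_PARAM_BF16) -> float:
    """Approximate memory footprint of model weights, in gigabytes."""
    return num_params * bytes_per_param / 1e9

gemma2_27b_gb = weight_memory_gb(27e9)  # ~54 GB of weights
h100_memory_gb = 80                     # NVIDIA H100 80GB

print(f"Gemma 2 27B weights: ~{gemma2_27b_gb:.0f} GB")
print(f"Weights fit on one H100 80GB: {gemma2_27b_gb < h100_memory_gb}")
```

The same arithmetic explains why the first-generation approach of sharding larger models across multiple accelerators is unnecessary here: the 27B weights leave headroom on a single 80 GB device.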

The 27B Gemma 2 model is the best-performing model in its size class and offers a competitive alternative to models more than twice its size.

The 9B Gemma 2 model delivers class-leading performance, outperforming Llama 3 8B and other open models in its size group.

Running on a single Google Cloud TPU host, NVIDIA A100 80GB Tensor Core GPU, or NVIDIA H100 Tensor Core GPU, the 27B Gemma 2 model offers a cost-effective deployment option.

Gemma 2 runs swiftly across a range of hardware, from powerful gaming laptops and high-end desktops to cloud-based environments.

Gemma 2 is optimised with NVIDIA TensorRT-LLM to run as an NVIDIA NIM inference microservice or on NVIDIA-accelerated infrastructure.

Starting next month, Google Cloud customers will be able to quickly and easily deploy and manage Gemma 2 on Vertex AI.
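Once Gemma 2 is deployed to a Vertex AI endpoint, querying it could look roughly like the sketch below, using the `google-cloud-aiplatform` Python SDK. The project, region, endpoint ID, and the instance payload schema are placeholder assumptions for illustration; the exact request format depends on the serving container you deploy with.

```python
# Hypothetical sketch: querying a Gemma 2 model deployed on a Vertex AI
# endpoint via the google-cloud-aiplatform SDK. The endpoint resource name
# and the instance field names below are placeholder assumptions.

def build_instance(prompt: str,
                   max_tokens: int = 256,
                   temperature: float = 0.7) -> dict:
    """Build a single prediction instance. The field names here are an
    assumed schema; check your serving container's documentation."""
    return {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

def query_gemma2(endpoint_name: str, prompt: str):
    # Imported lazily so the payload helper works without the SDK installed.
    from google.cloud import aiplatform

    endpoint = aiplatform.Endpoint(endpoint_name)
    response = endpoint.predict(instances=[build_instance(prompt)])
    return response.predictions

if __name__ == "__main__":
    # Placeholder resource name; replace with your own deployed endpoint:
    # query_gemma2(
    #     "projects/my-project/locations/us-central1/endpoints/1234567890",
    #     "Write a haiku about open models.",
    # )
    print(build_instance("Hello"))
```

Keeping the payload construction in a separate helper makes it easy to adapt the request shape to whichever serving container (for example, a TensorRT-LLM or vLLM image) backs the endpoint.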

With Gemma 2, developers can launch even more ambitious projects and unlock the full performance potential of their AI applications.