Scaling Prediction Guard's Privacy-Preserving LLM Platform on Intel Gaudi 2 AI Accelerators
Prediction Guard, a large language model (LLM) platform built on Intel Gaudi 2 AI accelerators, is leading the way in privacy-focused AI.
Prediction Guard hosts cutting-edge, open-source LLMs, including Meta Llama 3, Neural-Chat-7B, and DeepSeek, and has pioneered an LLM platform that achieves both goals: strong performance and privacy.
By hosting on cost-effective, scalable Intel Gaudi 2 AI accelerators, organizations in the legal, healthcare, and financial industries can deploy privacy-preserving LLM applications.
Finally, following guidance from the Intel Gaudi product team, Prediction Guard tuned the KV cache size, numerical precision, and other hyperparameters to optimize inference performance.
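To make the tuning step concrete, here is a minimal sketch of the kinds of serving knobs involved. The function, parameter names, and default values below are purely illustrative assumptions, not Prediction Guard's actual configuration or any real Gaudi API:

```python
# Hypothetical serving configuration illustrating common LLM inference knobs:
# KV-cache budget, numerical precision, and batching. All names and defaults
# are assumptions for illustration only.

def make_serving_config(max_kv_cache_tokens: int = 8192,
                        dtype: str = "bfloat16",
                        max_batch_size: int = 32) -> dict:
    """Bundle inference-tuning hyperparameters into a single config dict."""
    if dtype not in {"float32", "bfloat16", "float8"}:
        raise ValueError(f"unsupported dtype: {dtype}")
    return {
        # Upper bound on cached attention keys/values; trades memory for
        # longer contexts and more concurrent sequences.
        "max_kv_cache_tokens": max_kv_cache_tokens,
        # Lower precision generally raises throughput at a small accuracy cost.
        "dtype": dtype,
        # How many requests are batched into one forward pass.
        "max_batch_size": max_batch_size,
    }

config = make_serving_config(dtype="bfloat16", max_batch_size=64)
print(config)
```

In practice, these values are chosen empirically: the KV cache is sized to fit accelerator memory at the target concurrency, and precision is lowered only as far as output quality allows.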
Prediction Guard's servers are powered by Intel Gaudi 2 AI accelerator instances on the Intel Tiber Developer Cloud.
Even during periods of high demand, Prediction Guard deployments on Intel Gaudi 2 accelerators handled the load with ease.