When building an AI inference platform to serve predictions for Oracle Cloud Infrastructure's (OCI) Vision AI service, the software architect chose NVIDIA Triton Inference Server.
More specifically, for the OCI Vision and Document Understanding Service models that were migrated to Triton, the server reduced inference latency by 51% and increased prediction throughput by 76%.
Additionally, Oracle NetSuite, a suite of business applications used by more than 37,000 organizations worldwide, offers OCI AI. One application is automated invoice recognition.
OCI's Data Science service provides the machine learning foundation for NetSuite and Oracle Fusion software-as-a-service applications.
Its users span enterprises in manufacturing, retail, transportation, and other sectors, and they are building and deploying AI models of nearly every size and shape.
Since its March launch on OCI, Triton has drawn interest from numerous Oracle internal teams that want to use it for inference jobs that require serving predictions from several AI models simultaneously.
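Chaining several models in a single request is typically done in Triton with an ensemble model. The sketch below is a minimal, hypothetical `config.pbtxt` for an ensemble that pipes a preprocessing model into a classifier; the model names, tensor names, and shapes are illustrative assumptions, not details from the Oracle deployment.

```protobuf
# Hypothetical ensemble config: routes one request through two models.
name: "vision_pipeline"
platform: "ensemble"
max_batch_size: 8
input [
  { name: "RAW_IMAGE", data_type: TYPE_UINT8, dims: [ -1 ] }
]
output [
  { name: "LABEL", data_type: TYPE_STRING, dims: [ 1 ] }
]
ensemble_scheduling {
  step [
    {
      # First model: decode/normalize the raw bytes (assumed name).
      model_name: "preprocess"
      model_version: -1
      input_map  { key: "INPUT",  value: "RAW_IMAGE" }
      output_map { key: "OUTPUT", value: "preprocessed_tensor" }
    },
    {
      # Second model: classify the preprocessed tensor (assumed name).
      model_name: "classifier"
      model_version: -1
      input_map  { key: "INPUT",  value: "preprocessed_tensor" }
      output_map { key: "OUTPUT", value: "LABEL" }
    }
  ]
}
```

Triton executes the steps as one server-side pipeline, passing intermediate tensors between models without extra network round trips, which is one way a single request can draw on predictions from several models at once.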
Looking ahead, Keisar's group is evaluating NVIDIA TensorRT-LLM software to accelerate inference on the complex large language models (LLMs) that have captured the interest of so many users.