Google Cloud Storage FUSE Speeds Model + Weight Load Times
Using secondary boot disks to cache container images, including your inference engine and relevant libraries, directly on the GKE node can speed up container load times
Using Cloud Storage FUSE or Hyperdisk ML to speed up model + weight load times from Google Cloud Storage
Cloud Storage FUSE (2) and Hyperdisk ML (3) are options for connecting the pod to model + weight data stored in Cloud Storage or on a network-attached disk
At node creation time, GKE lets you pre-cache your container image on a secondary boot disk attached to the node
When a 16 GB container image is pre-cached on a secondary boot disk, load times may be as much as 29 times faster than when the image is downloaded
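As a rough illustration of the pre-caching step, the sketch below assembles a gcloud command that creates a node pool with a secondary boot disk used as a container image cache. It assumes a disk image holding the container image has already been built. The helper name node_pool_create_command and all example values (pool, cluster, location, image path) are hypothetical placeholders, and the flags should be checked against the current GKE documentation before use.

```python
import shlex


def node_pool_create_command(pool, cluster, location, disk_image):
    """Build a gcloud command that attaches a pre-built disk image to each
    node as a secondary boot disk used as a container image cache.

    All arguments are placeholders supplied by the caller; nothing is executed.
    """
    return [
        "gcloud", "container", "node-pools", "create", pool,
        f"--cluster={cluster}",
        f"--location={location}",
        # Image streaming is enabled alongside the secondary boot disk.
        "--enable-image-streaming",
        # The disk image is assumed to already contain the cached container image.
        f"--secondary-boot-disk=disk-image={disk_image},mode=CONTAINER_IMAGE_CACHE",
    ]


if __name__ == "__main__":
    cmd = node_pool_create_command(
        pool="inference-pool",            # hypothetical node pool name
        cluster="inference-cluster",      # hypothetical cluster name
        location="us-central1",           # hypothetical location
        disk_image="global/images/inference-image-cache",  # hypothetical image
    )
    # Print a shell-safe command line for review instead of running it.
    print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere and makes the flags easy to inspect.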
The two primary options for accessing your data at the GKE pod level, with Cloud Storage as the source of truth, are Cloud Storage FUSE and Hyperdisk ML
For model weights stored in object storage buckets, Cloud Storage FUSE gives the pod direct access to Cloud Storage by mounting the bucket as a local file system
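To make the Cloud Storage FUSE option more concrete, here is a minimal sketch of a pod manifest that mounts a bucket through the GKE Cloud Storage FUSE CSI driver, built as a Python dictionary and printed as JSON (which kubectl accepts). It assumes the CSI driver is enabled on the cluster and that the Kubernetes service account has read access to the bucket. The helper name gcsfuse_pod_manifest, the volume name, and all example values are hypothetical.

```python
import json


def gcsfuse_pod_manifest(pod_name, image, bucket, service_account,
                         mount_path="/models"):
    """Return a pod manifest that mounts a Cloud Storage bucket read-only
    at mount_path using the Cloud Storage FUSE CSI driver."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": pod_name,
            # This annotation asks GKE to inject the Cloud Storage FUSE sidecar.
            "annotations": {"gke-gcsfuse/volumes": "true"},
        },
        "spec": {
            "serviceAccountName": service_account,
            "containers": [{
                "name": "inference",
                "image": image,
                "volumeMounts": [{
                    "name": "model-weights",
                    "mountPath": mount_path,
                    "readOnly": True,
                }],
            }],
            "volumes": [{
                "name": "model-weights",
                "csi": {
                    "driver": "gcsfuse.csi.storage.gke.io",
                    "readOnly": True,
                    "volumeAttributes": {
                        "bucketName": bucket,
                        # Lets directories that exist only as object prefixes be listed.
                        "mountOptions": "implicit-dirs",
                    },
                },
            }],
        },
    }


if __name__ == "__main__":
    manifest = gcsfuse_pod_manifest(
        pod_name="llm-server",                       # hypothetical pod name
        image="example.com/inference-engine:latest",  # hypothetical image
        bucket="my-model-weights",                   # hypothetical bucket
        service_account="inference-sa",              # hypothetical service account
    )
    print(json.dumps(manifest, indent=2))
```

The output can be saved to a file and applied with kubectl apply -f; the inference engine then reads the weights from the mount path as ordinary files.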