Google multimodal embeddings

Text, photos, and videos abound in the digital world.

The rise of natural language processing and Google multimodal embeddings lets your clients search for images, videos, and other information the same way they search text.

The architecture stores media assets in Google Cloud Storage and exposes them to BigQuery through object tables.

BigQuery indexes the semantic embeddings that the multimodal embedding model produces for those images and videos, enabling similarity search and seamless cross-modal search, such as retrieving images from a text query.
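
Because text and media land in the same embedding space, a single similarity comparison works across modalities. Below is a minimal sketch using the Vertex AI SDK, assuming a configured Google Cloud project; the project ID, region, image path, and query text are placeholders.

```python
import numpy as np
import vertexai
from vertexai.vision_models import Image, MultiModalEmbeddingModel

# Placeholder project, region, image path, and query text.
vertexai.init(project="your-project-id", location="us-central1")

model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding@001")
response = model.get_embeddings(
    image=Image.load_from_file("sample_photo.jpg"),
    contextual_text="a red bicycle leaning against a brick wall",
)

image_vec = np.array(response.image_embedding)
text_vec = np.array(response.text_embedding)

# Cosine similarity: larger values mean the text describes the image more closely.
similarity = float(image_vec @ text_vec) / (
    np.linalg.norm(image_vec) * np.linalg.norm(text_vec)
)
print(f"text-image similarity: {similarity:.3f}")
```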

For the experiment, Google Cloud used sample movies and photos hosted on GitHub.

Create a BigQuery object table that points to your source picture and video files in Cloud Storage, as sketched below.
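
A minimal sketch of that step with the BigQuery Python client; the dataset, connection, and bucket names are placeholder assumptions, and the connection must already have read access to the bucket.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Object table over the media files in Cloud Storage; dataset, connection,
# and bucket names are placeholders.
create_object_table = """
CREATE OR REPLACE EXTERNAL TABLE `your_dataset.media_objects`
WITH CONNECTION `us.your_gcs_connection`
OPTIONS (
  object_metadata = 'SIMPLE',
  uris = ['gs://your-media-bucket/images/*', 'gs://your-media-bucket/videos/*']
);
"""

client.query(create_object_table).result()  # wait for the DDL job to finish
```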

Google Cloud's pretrained multimodal embedding model represents the media data numerically as embedding vectors, as sketched below.
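
One way to run that step inside BigQuery ML is to register the embedding endpoint as a remote model and call ML.GENERATE_EMBEDDING over the object table. The sketch below continues the placeholder dataset, connection, and table names from above.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Register the multimodal embedding endpoint as a BigQuery remote model.
create_remote_model = """
CREATE OR REPLACE MODEL `your_dataset.multimodal_embedding_model`
REMOTE WITH CONNECTION `us.your_gcs_connection`
OPTIONS (ENDPOINT = 'multimodalembedding@001');
"""

# Embed every image and video referenced by the object table.
generate_embeddings = """
CREATE OR REPLACE TABLE `your_dataset.media_embeddings` AS
SELECT *
FROM ML.GENERATE_EMBEDDING(
  MODEL `your_dataset.multimodal_embedding_model`,
  TABLE `your_dataset.media_objects`
);
"""

client.query(create_remote_model).result()
client.query(generate_embeddings).result()
```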

Create a BigQuery VECTOR INDEX on your photo and video embeddings so they can be stored and queried efficiently; the sketch below also shows a cross-modal text-to-media query.
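
A minimal sketch of the index DDL and a VECTOR_SEARCH lookup that embeds a text query and retrieves the nearest media files, again using the placeholder names from the earlier sketches; the embedding column name assumes ML.GENERATE_EMBEDDING's default output column.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Index the embedding column for approximate nearest-neighbor search.
create_index = """
CREATE VECTOR INDEX media_embeddings_index
ON `your_dataset.media_embeddings`(ml_generate_embedding_result)
OPTIONS (index_type = 'IVF', distance_type = 'COSINE');
"""

# Cross-modal query: embed a text prompt, then return the closest media files.
cross_modal_search = """
SELECT base.uri AS uri, distance
FROM VECTOR_SEARCH(
  TABLE `your_dataset.media_embeddings`,
  'ml_generate_embedding_result',
  (
    SELECT ml_generate_embedding_result
    FROM ML.GENERATE_EMBEDDING(
      MODEL `your_dataset.multimodal_embedding_model`,
      (SELECT 'a dog catching a frisbee on the beach' AS content)
    )
  ),
  top_k => 5,
  distance_type => 'COSINE'
);
"""

client.query(create_index).result()
for row in client.query(cross_modal_search).result():
    print(row.uri, row.distance)
```

For small tables, VECTOR_SEARCH can also run without an index by comparing against every row, at a higher query cost.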