TreeAH Vector Index Preview On Google Cloud

Google Cloud is pleased to present the preview of the TreeAH vector index.

TreeAH is a type of vector index based on ScaNN, an algorithm developed by Google.

TreeAH performs best on batch queries that process hundreds of query vectors or more.

Because it uses product quantization, TreeAH can reduce latency and cost by orders of magnitude compared with IVF.
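
To make the idea of product quantization concrete, here is a minimal sketch in plain Python. It is a toy illustration of the general technique, not BigQuery's or ScaNN's implementation: every function name, the tiny k-means trainer, and the sizes (8 dimensions, 4 subspaces, 4 centroids each) are assumptions chosen for readability. Each vector is cut into subvectors, each subvector is replaced by the index of its nearest centroid, and a query is then scored with one table lookup per subspace instead of a full distance calculation.

```python
# Illustrative product quantization with per-query distance tables.
# Toy code only; all names and sizes are hypothetical.
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split(vec, m):
    """Cut a vector into m contiguous subvectors of equal length."""
    step = len(vec) // m
    return [vec[i * step:(i + 1) * step] for i in range(m)]


def train_codebook(subvectors, k, iterations=10):
    """Tiny k-means: returns k centroids for one subspace."""
    centroids = [list(v) for v in subvectors[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in subvectors:
            nearest = min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            groups[nearest].append(v)
        for c, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def encode(vec, codebooks):
    """Replace each subvector with the index of its nearest centroid."""
    return [
        min(range(len(cb)), key=lambda c: sq_dist(sub, cb[c]))
        for sub, cb in zip(split(vec, len(codebooks)), codebooks)
    ]


def distance_table(query, codebooks):
    """Per-subspace squared distances from the query to every centroid."""
    return [
        [sq_dist(sub, centroid) for centroid in cb]
        for sub, cb in zip(split(query, len(codebooks)), codebooks)
    ]


def approx_sq_dist(code, table):
    """Approximate distance: one table lookup per subspace."""
    return sum(table[i][c] for i, c in enumerate(code))


random.seed(0)
DIM, M, K = 8, 4, 4  # vector size, number of subspaces, centroids per subspace
data = [[random.random() for _ in range(DIM)] for _ in range(200)]
codebooks = [
    train_codebook([split(v, M)[i] for v in data], K) for i in range(M)
]
codes = [encode(v, codebooks) for v in data]  # each vector becomes M small ints

query = [random.random() for _ in range(DIM)]
table = distance_table(query, codebooks)  # built once per query
approx = [approx_sq_dist(code, table) for code in codes]
```

The distance table is computed once per query and reused for every stored code, which is where the savings in this sketch come from. The result is an approximation: each score is the distance from the query to the quantized version of a vector, not to the original.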

With a vector index, BigQuery can optimise the lookups and distance calculations needed to find closely matching embeddings.
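
As a rough illustration of what creating and using such an index might look like, the Python sketch below only assembles SQL text; it does not connect to BigQuery. The helper functions and the dataset, table, column, and index names are hypothetical placeholders. The statements follow the general shape of BigQuery's CREATE VECTOR INDEX statement and VECTOR_SEARCH function, and they assume the query table stores its embeddings in a column with the same name as the base table. Check the current BigQuery documentation before running anything like this.

```python
# Hypothetical helpers that only build SQL text; nothing is sent to BigQuery.
# Dataset, table, column, and index names are placeholders.


def create_index_sql(index_name, table, column, index_type="TREE_AH",
                     distance_type="COSINE"):
    """Return a CREATE VECTOR INDEX statement for one embedding column."""
    return (
        f"CREATE VECTOR INDEX {index_name}\n"
        f"ON {table}({column})\n"
        f"OPTIONS (index_type = '{index_type}', distance_type = '{distance_type}')"
    )


def batch_search_sql(base_table, column, query_table, top_k=10,
                     distance_type="COSINE"):
    """Return a VECTOR_SEARCH query that matches every row of query_table."""
    return (
        "SELECT *\n"
        "FROM VECTOR_SEARCH(\n"
        f"  TABLE {base_table}, '{column}',\n"
        f"  TABLE {query_table},\n"
        f"  top_k => {top_k}, distance_type => '{distance_type}')"
    )


ddl = create_index_sql("my_index", "my_dataset.embeddings", "embedding")
search = batch_search_sql("my_dataset.embeddings", "embedding",
                          "my_dataset.query_batch")
print(ddl)
print(search)
```

Passing a whole query table, rather than one vector at a time, corresponds to the batch-query case in which TreeAH is said to perform best.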

The IVF and TreeAH indexes let BigQuery perform approximate nearest neighbour (ANN) search instead of exact nearest neighbour search.

With an index in place, the VECTOR_SEARCH function performs significantly fewer distance calculations when it searches the vector data.
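
The reduction in distance calculations can be illustrated with a small, partition-based search in plain Python. This is a simplified sketch of the general inverted-file idea, not BigQuery's implementation: the function names, the randomly sampled centroids, and the `n_probe` parameter are all assumptions made for the example. Exact search compares the query with every vector, while the approximate search compares it with the centroids first and then only with the vectors in the closest partitions.

```python
# Illustrative comparison of exact search and a partitioned ANN search.
# Toy code only; all names and parameters are hypothetical.
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_search(query, vectors):
    """Compare the query with every vector; returns (best_index, n_distances)."""
    best = min(range(len(vectors)), key=lambda i: sq_dist(query, vectors[i]))
    return best, len(vectors)


def build_partitions(vectors, centroids):
    """Assign each vector index to the partition of its nearest centroid."""
    partitions = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)),
                      key=lambda c: sq_dist(v, centroids[c]))
        partitions[nearest].append(i)
    return partitions


def ann_search(query, vectors, centroids, partitions, n_probe):
    """Search only the n_probe partitions whose centroids are closest."""
    order = sorted(range(len(centroids)),
                   key=lambda c: sq_dist(query, centroids[c]))
    candidates = [i for c in order[:n_probe] for i in partitions[c]]
    # Cost: one distance per centroid plus one per candidate vector.
    n_distances = len(centroids) + len(candidates)
    best = min(candidates, key=lambda i: sq_dist(query, vectors[i]),
               default=None)
    return best, n_distances


random.seed(1)
vectors = [[random.random() for _ in range(16)] for _ in range(2000)]
centroids = random.sample(vectors, 40)  # crude stand-in for trained centroids
partitions = build_partitions(vectors, centroids)
query = [random.random() for _ in range(16)]

exact_best, exact_cost = exact_search(query, vectors)
ann_best, ann_cost = ann_search(query, vectors, centroids, partitions, n_probe=4)
print(f"exact search: {exact_cost} distance calculations")
print(f"ANN search:   {ann_cost} distance calculations")
```

The trade-off is accuracy: the approximate search can miss the true nearest neighbour when it lies in a partition that was not probed, and raising `n_probe` recovers accuracy at the price of more distance calculations.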

For large query batches, the TreeAH index already performs significantly better than the IVF index.

To compare TreeAH with IVF, Google Cloud's engineering team ran benchmarks across a range of table configurations and query batch sizes.