AI Hypercomputer’s New Resource Hub & Speed Enhancements

The open software layer of AI Hypercomputer offers reference implementations and workload optimizations to enhance the time-to-value for your particular use case

Google Cloud is launching the AI Hypercomputer GitHub organization to make the advancements in its open software stack easily accessible to developers and practitioners

Accurate Quantized Training (AQT), the quantization library that drives INT8 mixed-precision training on Cloud TPUs, is how it added FP8 capability to MaxText

Performance-optimized LLM training examples are now available for A3 Mega VMs, which provide a 2X increase in GPU-to-GPU network capacity over A3 VMs

Lastly, it implemented ragged attention kernels and KV cache quantization in JetStream, an open-source throughput-and-memory-optimized engine for LLM inference

Google Cloud is launching the AI Hypercomputer GitHub organization to make the advancements in its open software stack easily accessible to developers and practitioners

To make it possible for collaborative communication and computing on GPUs to overlap, Google Cloud collaborated closely with NVIDIA to enhance JAX and XLA