IBM Z AI Grows With Telum II Processor

The IBM Telum II Processor and IBM Spyre Accelerator enable enterprise-scale AI capabilities such as large language models and generative AI

At Hot Chips 2024, IBM disclosed the architecture for the future IBM Spyre Accelerator and IBM Telum II Processor

Many generative AI projects use large language models (LLMs), which require scalable, secure, and power-efficient solutions as they move from proof of concept to production

Compared with the first-generation Telum chip, Telum II adds a coherently coupled Data Processing Unit (DPU), 40% more on-chip cache, higher frequency, and an improved AI accelerator core

Together, the Telum II and Spyre chips support scalable ensemble modeling that combines traditional machine learning or deep learning models with encoder LLMs
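
To make the ensemble idea concrete, here is a minimal Python sketch of how a fraud-scoring pipeline might blend a traditional machine learning score with an encoder-LLM score; the feature names, placeholder scoring functions, and weights are illustrative assumptions, not IBM's implementation

```python
# Illustrative sketch only: combine a classical ML score with an
# encoder-LLM score for one transaction. Models, weights, and feature
# names are hypothetical, not IBM's implementation.

def ml_score(features: dict[str, float]) -> float:
    """Stand-in for a classical model (e.g., gradient-boosted trees)
    scoring structured transaction features; returns a probability."""
    return min(1.0, 0.4 * features["amount_zscore"] + 0.2 * features["velocity"])

def llm_score(description: str) -> float:
    """Stand-in for an encoder LLM (e.g., a BERT-style classifier)
    scoring the free-text transaction description; returns a probability."""
    return 0.9 if "wire transfer" in description.lower() else 0.1

def ensemble_score(features: dict[str, float], description: str,
                   w_ml: float = 0.6, w_llm: float = 0.4) -> float:
    """Weighted average of the two model scores (weights are assumptions)."""
    return w_ml * ml_score(features) + w_llm * llm_score(description)

if __name__ == "__main__":
    txn = {"amount_zscore": 1.2, "velocity": 0.5}
    print(f"fraud risk: {ensemble_score(txn, 'international wire transfer'):.2f}")
```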

Each core has a 36MB L2 cache, and total on-chip cache capacity has increased by 40% to 360MB
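
As a rough consistency check on those figures, the arithmetic below assumes a first-generation Telum total of 8 x 32MB = 256MB (a figure not given in this article) and shows that 360MB implies ten 36MB L2 instances and roughly a 40% increase

```python
# Back-of-the-envelope check of the cache figures quoted above; the
# first-generation total of 8 x 32MB = 256MB is an assumption, not stated here
L2_PER_CACHE_MB = 36
TELUM_II_TOTAL_MB = 360
TELUM_I_TOTAL_MB = 8 * 32  # assumed first-generation Telum total

cache_instances = TELUM_II_TOTAL_MB // L2_PER_CACHE_MB   # -> 10 L2 instances on chip
growth = TELUM_II_TOTAL_MB / TELUM_I_TOTAL_MB - 1.0      # -> ~0.41, i.e. ~40%
print(f"{cache_instances} x {L2_PER_CACHE_MB}MB L2 caches, +{growth:.0%} vs first generation")
```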

The Spyre Accelerator provides up to 1TB of memory to support AI model workloads on the mainframe and is designed to consume no more than 75W per card
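
A rough budget for a fully populated IO drawer follows, assuming eight Spyre cards per drawer and 128GB per card; the per-card memory figure is inferred from 1TB / 8 and is an assumption, not an IBM specification

```python
# Hypothetical per-drawer budget: eight Spyre cards at 128GB and 75W each;
# the eight-card count and 128GB per card are assumptions for illustration
CARDS_PER_DRAWER = 8
MEM_PER_CARD_GB = 128   # assumed so that eight cards total roughly 1TB
POWER_PER_CARD_W = 75   # design target quoted above

total_mem_tb = CARDS_PER_DRAWER * MEM_PER_CARD_GB / 1024   # -> 1.0 TB
total_power_w = CARDS_PER_DRAWER * POWER_PER_CARD_W        # -> 600 W
print(f"{total_mem_tb:.1f}TB of accelerator memory within a {total_power_w}W power envelope")
```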