The IBM Telum II Processor and the IBM Spyre Accelerator enable enterprise-scale AI features such as large language models and generative AI.
At Hot Chips 2024, IBM disclosed the architecture of the upcoming IBM Telum II Processor and IBM Spyre Accelerator.
Many generative AI projects use large language models (LLMs), which require scalable, safe, and power-efficient solutions from proof of concept through production.
Compared with the first-generation Telum chip, Telum II adds a coherently coupled Data Processing Unit (DPU), 40% more on-chip cache, higher frequency, and an improved AI accelerator core.
Together, the Telum II and Spyre chips support scalable ensemble modeling, combining machine learning or deep learning AI models with encoder LLMs.
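The ensemble idea above can be sketched in a few lines: a fast traditional model scores every request, and a heavier encoder-LLM-style model is consulted only for borderline cases. This is a minimal illustrative sketch, not IBM's implementation; the model functions, feature names, and thresholds are all hypothetical stand-ins.

```python
# Hypothetical ensemble-scoring sketch: a cheap ML model screens every
# transaction; an encoder-LLM-style model is invoked only when the fast
# model's score is inconclusive. All functions and features are stand-ins.

def ml_score(features):
    """Stand-in for a lightweight ML model (e.g. gradient-boosted trees)."""
    # Hypothetical scoring rule: weighted feature sum clamped to [0, 1].
    s = 0.6 * features["amount_zscore"] + 0.4 * features["velocity"]
    return max(0.0, min(1.0, s))

def encoder_llm_score(features):
    """Stand-in for an encoder LLM scoring the transaction text."""
    return 0.9 if "wire transfer" in features["memo"].lower() else 0.1

def ensemble_score(features, low=0.2, high=0.8):
    """Escalate only uncertain cases to the expensive model."""
    fast = ml_score(features)
    if fast < low or fast > high:
        return fast  # fast model is confident: skip the heavy model
    slow = encoder_llm_score(features)
    return 0.5 * fast + 0.5 * slow  # blend scores for borderline cases

tx = {"amount_zscore": 0.5, "velocity": 0.5, "memo": "Wire transfer to new payee"}
print(round(ensemble_score(tx), 2))  # → 0.7
```

The escalation pattern matters for the hardware story: the frequent, cheap path can run on the on-chip AI accelerator, while the occasional heavy path is offloaded to an attached accelerator card.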
Each core has 36MB of L2 cache, and total on-chip cache capacity has been increased by 40%, to 360MB.
The Spyre Accelerator has up to 1TB of memory to support AI model workloads on the mainframe and is designed to consume no more than 75W per card.