IBM LSM: New Watson Large Speech Model

Most people are familiar with the term large language model (LLM), thanks to generative AI’s remarkable ability to produce text and images.

Rigid conversational experiences (yes, Interactive Voice Response, or IVR) remain the norm in today’s contact centers. Large Speech Models (LSMs) are designed to change that.

Over the last few months, the development teams of IBM Watsonx and IBM Research have been building a brand-new, cutting-edge Large Speech Model (LSM).

IBM’s LSM is designed with customer care use cases in mind, such as real-time call transcription and self-service phone assistants.

IBM is thrilled to announce the launch of new LSMs in both English and Japanese. They are currently in closed beta and available only to Watson Speech to Text and Watsonx Assistant phone customers.

According to internal benchmarking, the new LSM is IBM’s most accurate speech model to date and outperforms OpenAI’s Whisper model on short-form English use cases.

IBM’s LSM has five times fewer parameters than the Whisper model and processes audio ten times faster on the same hardware.
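To make the speed claim concrete, here is a minimal sketch of the arithmetic. It assumes "ten times faster" means a tenfold reduction in processing time relative to Whisper on the same hardware. The baseline timing and clip length below are hypothetical placeholders, not figures from IBM's benchmarks, and the function names are illustrative only.

```python
def processing_time(baseline_seconds: float, speedup: float) -> float:
    """Return the processing time after applying a speedup factor to a baseline."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return baseline_seconds / speedup


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration (below 1.0 is faster than real time)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Hypothetical inputs, chosen only to illustrate the calculation.
AUDIO_SECONDS = 60.0      # length of the call recording
BASELINE_SECONDS = 30.0   # assumed time a baseline model needs for that recording
SPEEDUP = 10.0            # the "ten times faster" figure quoted above

faster_seconds = processing_time(BASELINE_SECONDS, SPEEDUP)
print(f"Baseline: {BASELINE_SECONDS:.1f} s, "
      f"RTF {real_time_factor(BASELINE_SECONDS, AUDIO_SECONDS):.2f}")
print(f"At {SPEEDUP:.0f}x: {faster_seconds:.1f} s, "
      f"RTF {real_time_factor(faster_seconds, AUDIO_SECONDS):.2f}")
```

A lower real-time factor matters most for the live use cases mentioned earlier, such as real-time call transcription, where the transcript has to keep pace with the speaker.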