CUDA 12.8 & GeForce RTX GPUs

LM Studio 0.3.15 leverages CUDA 12.8 to significantly improve model load and response times on NVIDIA GeForce RTX GPUs, enabling faster local LLM inference

The update introduces a revamped system prompt editor for handling longer prompts and the "tool_choice" parameter for precise control over external tool usage

LM Studio is available for free on Linux, macOS, and Windows, offering users a powerful tool for local AI development and experimentation

LM Studio supports a variety of open models, including Gemma, Llama 3, Mistral, and Orca, across quantization formats ranging from 4-bit to full precision
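One quick way to see which models are available locally is to query the server's OpenAI-style model listing. The sketch below assumes the local server is running on LM Studio's default port (1234); the note about quantization suffixes in model identifiers is an observation, not a guarantee.

```python
# Sketch: list locally available models via the OpenAI-compatible /v1/models endpoint.
# Assumes the LM Studio server is running on the default port 1234.
import requests

resp = requests.get("http://localhost:1234/v1/models")
resp.raise_for_status()
for model in resp.json()["data"]:
    # Model identifiers often encode the quantization, e.g. a Q4_K_M suffix.
    print(model["id"])
```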

The new editor provides a larger visual space for editing lengthy prompts, while the sidebar's compact editor remains available for quick adjustments

LM Studio now supports RTX 50-series GPUs via CUDA 12.8, improving first-time model load times and overall performance

The "tool_choice" parameter in the OpenAI-compatible REST API allows developers to control tool usage with options like "none," "auto," or "required

Users can now create, share, and download presets for system prompts and model settings, fostering collaboration and customization within the community

LM Studio supports RAG workflows, document-based Q&A, multi-turn chat with long context windows, and local agent pipelines for advanced AI applications
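For the multi-turn chat case, a minimal sketch against the local OpenAI-compatible endpoint looks like the following; the port, API key placeholder, and model name are assumptions, and conversation history is simply resent on each turn.

```python
# Sketch: multi-turn chat with a locally loaded model via LM Studio's
# OpenAI-compatible endpoint. Port and model name are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

history = [{"role": "system", "content": "You are a concise assistant."}]

def ask(question: str) -> str:
    """Append the user turn, call the local model, and keep the reply in history."""
    history.append({"role": "user", "content": question})
    reply = client.chat.completions.create(
        model="your-loaded-model",  # placeholder model identifier
        messages=history,
    ).choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply

print(ask("What is retrieval-augmented generation, in one sentence?"))
print(ask("Give a concrete example of when it helps."))
```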