Google has introduced Gemini 3.1 Flash-Lite, its newest AI model, aimed at developers running complex, high-volume workloads. Google positions it as the fastest and most cost-effective model in the Gemini 3 series, built to handle heavier data loads at scale.

The model is priced at $0.25 per 1 million input tokens and $1.50 per 1 million output tokens. Google cites key improvements over the earlier 2.5 Flash model, including a 2.5x faster time to first answer token and 45% faster output. The model also scored 1,432 on the Arena.ai Leaderboard.
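To put the quoted per-token rates in concrete terms, here is a minimal Python sketch of the arithmetic. The function name and the sample workload (10,000 requests of 2,000 input and 500 output tokens each) are illustrative assumptions, not figures from Google; only the two prices come from the announcement.

```python
# Prices as quoted in the article (USD per 1 million tokens).
INPUT_PRICE_PER_MILLION = 0.25
OUTPUT_PRICE_PER_MILLION = 1.50


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload at the quoted rates."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload: 10,000 requests, each with 2,000 input tokens
# and 500 output tokens (assumed numbers, for illustration only).
requests = 10_000
total = estimate_cost(requests * 2_000, requests * 500)
print(f"Estimated cost: ${total:.2f}")  # prints "Estimated cost: $12.50"
```

In this assumed workload, output tokens account for $7.50 of the $12.50 total despite being a quarter of the input volume, since the output rate is six times the input rate.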

According to Google, Gemini 3.1 Flash-Lite outperforms other models in reasoning and multimodal understanding benchmarks. Developers can adjust how much the model "thinks" on a given task to manage workloads at scale, and the model can generate UI, create simulations, and follow complex instructions. Early testing by companies like Latitude and Whering in AI Studio and Vertex AI has yielded positive feedback.

The model is available in preview starting March 3 via the Gemini API. It continues Google's push into lightweight, high-performance AI for developers, succeeding the 2.5 Flash model with faster responses, lower cost, and greater customization for demanding workloads.