Gemini 3.1 Flash-Lite: Built for intelligence at scale

Gemini 3.1 Flash-Lite is a new, fast, and cost-efficient AI model available in preview for high-volume workloads.

Summary

Google has introduced Gemini 3.1 Flash-Lite, the fastest and most cost-efficient model in the Gemini 3 series, designed for high-volume developer workloads. It is rolling out in preview to developers via the Gemini API in Google AI Studio and to enterprises via Vertex AI. The model is priced at $0.25 per 1M input tokens and $1.50 per 1M output tokens. Compared with its predecessor, 2.5 Flash, it delivers a 2.5X faster Time to First Answer Token and 45% higher output speed while maintaining high quality, as reflected in its Elo score of 1432 on Arena.ai. Configurable thinking levels let developers adjust how much reasoning the model applies, so it can handle tasks ranging from high-volume translation and content moderation to more complex work such as generating user interfaces and simulations. Early-access users have praised its efficiency and its reasoning capabilities for solving complex problems at scale.
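To make the quoted pricing concrete, the following is a minimal sketch of a cost estimate for a high-volume workload. It uses only the two preview prices stated in the summary. The function name `estimate_cost` and the example workload (request count and tokens per request) are hypothetical and chosen purely for illustration. Actual billing may differ, for example if thinking tokens, caching, or batch discounts apply.

```python
from decimal import Decimal

# Preview prices quoted in the summary (USD per 1M tokens).
INPUT_PRICE_PER_M = Decimal("0.25")
OUTPUT_PRICE_PER_M = Decimal("1.50")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost for the given token volumes.

    Hypothetical helper: it applies the two quoted prices linearly and
    ignores any other billing factors.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_M * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_M * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Assumed workload: 1M moderation requests, each with roughly
# 400 input tokens and 20 output tokens (illustrative numbers only).
requests = 1_000_000
cost = estimate_cost(requests * 400, requests * 20)

# 400M input tokens -> $100.00, 20M output tokens -> $30.00
print(f"Estimated cost: ${cost:.2f}")  # Estimated cost: $130.00
```

Because output tokens cost six times as much as input tokens at these prices, workloads with short outputs, such as classification or moderation, are dominated by input cost, while generation-heavy workloads shift the balance toward output cost.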

(Source: Gemini)