Google has launched an 'implicit caching' feature in the Gemini API to make its latest AI models cheaper for developers.
The feature can deliver 75% cost savings on repetitive context passed to models via the Gemini API, and supports the Gemini 2.5 Pro and 2.5 Flash models.
Caching, a common practice in the AI industry, reuses frequently accessed data to reduce computing requirements and costs.
Unlike explicit caching, Google's implicit caching is automatic: a request that repeats context from an earlier request can hit the cache and receive the discount, provided the prompt meets a minimum token count required for eligibility.
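Because implicit caching rewards repeated context, developers can increase the chance of a cache hit by keeping the stable part of a prompt (instructions, reference material) at the front and the variable part (the user's question) at the end. The sketch below illustrates that prompt-structuring idea in plain Python; the `build_prompt` helper is hypothetical, not part of the Gemini API, and the cache-hit behavior itself is handled server-side by Google.

```python
# Sketch: structure prompts so the repeated context forms a shared prefix.
# The stable material goes first and the per-request question goes last,
# so consecutive requests begin with an identical, cacheable span.
# build_prompt and common_prefix_len are illustrative helpers, not Gemini APIs.

def build_prompt(shared_context: str, question: str) -> str:
    """Place the unchanging context first so it repeats across requests."""
    return f"{shared_context}\n\nQuestion: {question}"

def common_prefix_len(a: str, b: str) -> int:
    """Length of the identical leading span shared by two prompts."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

if __name__ == "__main__":
    # Hypothetical long, unchanging reference material (e.g. a product manual).
    context = "Long, unchanging reference material. " * 50
    p1 = build_prompt(context, "How do I reset my password?")
    p2 = build_prompt(context, "What is the refund policy?")
    # Both prompts begin with the same long prefix -- the part that
    # implicit caching can reuse and discount on the second request.
    print(common_prefix_len(p1, p2) >= len(context))
```

The same ordering principle applies when sending requests through the Gemini SDKs: whether a given request actually qualifies depends on Google's minimum prompt token count, which varies by model.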