Definition

What is Prompt Caching?

A technique that stores the computed key-value (KV) tensors for a prompt's shared prefix so they can be reused on subsequent API calls, reducing both cost and latency.

By default, AI agents reprocess the same system instructions, tool definitions, and conversation context on every API call. Prompt caching eliminates this redundancy by reusing the previously computed representations whenever the prompt prefix matches exactly. Major providers discount cached input tokens by up to 90%, making caching one of the most impactful optimizations for production agent workloads.
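The mechanics can be illustrated with a toy sketch. This is not how any real provider implements caching (providers cache attention KV tensors server-side, keyed on the tokenized prefix); the class and word-count "cost" below are hypothetical stand-ins that show why an identical prefix makes repeat calls cheap:

```python
import hashlib

class PromptCache:
    """Toy model of prefix-based prompt caching (illustrative only)."""

    def __init__(self):
        self._store = {}  # prefix hash -> marker for cached prefix state

    def _key(self, prefix: str) -> str:
        return hashlib.sha256(prefix.encode()).hexdigest()

    def process(self, prefix: str, suffix: str):
        """Return (cache_hit, units_of_work), using word count as a
        stand-in for tokens processed."""
        key = self._key(prefix)
        if key in self._store:
            # Exact prefix match: only the new suffix needs processing.
            return True, len(suffix.split())
        # Cache miss: process the full prompt, then store the prefix state.
        self._store[key] = True
        return False, len(prefix.split()) + len(suffix.split())

cache = PromptCache()
system_prompt = "You are a helpful assistant. " * 50  # long static prefix

hit1, cost1 = cache.process(system_prompt, "What is 2 + 2?")
hit2, cost2 = cache.process(system_prompt, "Summarize this document.")
print(hit1, cost1)  # first call misses and pays for the whole prompt
print(hit2, cost2)  # second call hits and pays only for the new suffix
```

The second call reuses the stored prefix state, so its cost is just the new suffix, which is why keeping system instructions and tool definitions byte-for-byte identical across calls matters: any change to the prefix invalidates the cache.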
