Definition
What is Context Compression?
The practice of reducing token count in an AI agent's context window while preserving the information needed to complete tasks.
As AI agents work through multi-step tasks, they accumulate conversation history, tool outputs, and observations that dilute attention. Context compression techniques like structured summarization, tool response offloading, and embedding-based reduction keep the working context focused. Research shows effective compression can reduce memory usage by 26-54% while preserving task performance.
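One of the techniques named above, tool response offloading, can be sketched in a few lines: oversized tool outputs are moved out of the context and replaced with a short reference plus a preview. All names here (`compress_context`, `MAX_TOOL_CHARS`, the `artifact-` reference scheme) are illustrative assumptions, not a real API.

```python
MAX_TOOL_CHARS = 200  # hypothetical threshold above which a tool output is offloaded

def compress_context(messages, store):
    """Return a smaller message list; large tool outputs are moved to `store`.

    `messages` is a list of {"role": ..., "content": ...} dicts;
    `store` is any dict-like external storage (illustrative).
    """
    compressed = []
    for i, msg in enumerate(messages):
        if msg["role"] == "tool" and len(msg["content"]) > MAX_TOOL_CHARS:
            ref = f"artifact-{i}"
            store[ref] = msg["content"]  # offload the full payload outside the context
            preview = msg["content"][:80]
            compressed.append({
                "role": "tool",
                "content": f"[offloaded to {ref}] preview: {preview}...",
            })
        else:
            compressed.append(msg)  # small messages pass through unchanged
    return compressed
```

A summarization pass over old conversation turns would slot in the same way: replace a run of early messages with a single structured summary message before returning the list.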
Related terms

Context Rot: The gradual degradation of an AI system's usefulness as the context it relies on becomes stale, incomplete, or outdated.

Context Window: The maximum amount of text (measured in tokens) that a language model can process in a single inference call.

Context Engineering: The practice of deliberately designing, structuring, and managing the information provided to AI models to improve output quality and relevance.

AI Agent: An autonomous software program that uses a large language model to plan and execute multi-step tasks.