Definition
What is Context Compression?
The practice of reducing token count in an AI agent's context window while preserving the information needed to complete tasks.
As AI agents work through multi-step tasks, they accumulate conversation history, tool outputs, and observations that dilute attention. Context compression techniques like structured summarization, tool response offloading, and embedding-based reduction keep the working context focused. Research shows effective compression can reduce memory usage by 26-54% while preserving task performance.
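One of the techniques named above, tool response offloading, can be sketched in a few lines: oversized tool outputs are moved out of the context and replaced with a short reference plus a preview. All names here (`compress_context`, `MAX_TOOL_CHARS`, the `artifact-` reference scheme) are illustrative assumptions, not a real API.

```python
MAX_TOOL_CHARS = 200  # hypothetical threshold above which a tool output is offloaded

def compress_context(messages, store):
    """Return a smaller message list; large tool outputs are moved to `store`.

    `messages` is a list of {"role": ..., "content": ...} dicts;
    `store` is any dict-like external storage (illustrative).
    """
    compressed = []
    for i, msg in enumerate(messages):
        if msg["role"] == "tool" and len(msg["content"]) > MAX_TOOL_CHARS:
            ref = f"artifact-{i}"
            store[ref] = msg["content"]  # offload the full payload outside the context
            preview = msg["content"][:80]
            compressed.append({
                "role": "tool",
                "content": f"[offloaded to {ref}] preview: {preview}...",
            })
        else:
            compressed.append(msg)  # small messages pass through unchanged
    return compressed
```

A summarization pass over old conversation turns would slot in the same way: replace a run of early messages with a single structured summary message before returning the list.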
Related terms

Context Rot: The gradual degradation of an AI system's usefulness as the context it relies on becomes stale, incomplete, or outdated.

Context Window: The maximum amount of text (measured in tokens) that a language model can process in a single inference call.

Context Engineering: The practice of deliberately designing, structuring, and managing the information provided to AI models to improve output quality and relevance.

AI Agent: An autonomous software program that uses a large language model to plan and execute multi-step tasks.