Definition
What is Context Engineering?
The practice of deliberately designing, structuring, and managing the information provided to AI models to improve output quality and relevance.
Where prompt engineering asks 'how should I phrase this?', context engineering asks 'what does the model need to know, and how do I make sure it has that?' It covers system instructions, retrieved knowledge, tool outputs, memory, and output structure. In production AI systems, the bottleneck is almost always context, not phrasing.
- Covers everything the model sees at inference time, not just the user's message.
- Most agent failures trace back to missing, stale, or overloaded context rather than phrasing.
- Dumping more data into a large context window tends to degrade accuracy; selective, structured context performs better.
- Production AI quality depends more on context pipelines than on prompt wording or model choice.
How context engineering works
At inference time, a language model can draw on several layers of information:
- System instructions: behavioral guidelines, role definitions, constraints.
- Conversation history: prior turns in the current session.
- Retrieved knowledge: external documents, database results, API responses.
- Long-term memory: information persisted across sessions.
- Tool definitions and outputs: descriptions of functions the model can invoke, and what they return.
- Structured outputs: format specifications for responses.
Context engineering is the practice of designing all of this deliberately, rather than letting it accumulate by default. In practical terms, that means building retrieval pipelines, shaping tool outputs, scoping access per task, compressing history, and verifying that what reaches the model is accurate, current, and relevant to the next step.
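As a rough illustration of designing context deliberately rather than letting it accumulate, the sketch below selects items from several layers under a token budget. It is a minimal sketch under stated assumptions, not a reference implementation: the `ContextItem` type, the priority convention, and the word-count token estimate are all hypothetical, and a real system would use the model's own tokenizer.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    layer: str     # e.g. "system", "history", "retrieved" (illustrative labels)
    text: str
    priority: int  # assumption: lower number means more important


def estimate_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


def assemble_context(items: list[ContextItem], budget: int) -> list[ContextItem]:
    """Keep the highest-priority items that fit within the token budget."""
    kept = []
    used = 0
    for item in sorted(items, key=lambda i: i.priority):
        cost = estimate_tokens(item.text)
        if used + cost <= budget:
            kept.append(item)
            used += cost
    return kept


items = [
    ContextItem("system", "You are a support agent. Cite the pricing source.", 0),
    ContextItem("retrieved", "Pro plan costs 20 USD per month as of this quarter.", 1),
    ContextItem("history", "User asked about the weather earlier in the session.", 5),
]
selected = assemble_context(items, budget=20)
print([item.layer for item in selected])  # ['system', 'retrieved']
```

The low-priority history turn is dropped because it does not fit the budget, which is the kind of decision that otherwise happens by accident.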
Why it matters
According to LangChain’s 2025 State of Agent Engineering report, 57% of organizations now have AI agents in production. The top-cited barrier is output quality. Rephrasing prompts rarely moves the needle on these failures because the problem is upstream: the model doesn’t have the right information at the right time.
Most agent failures trace back to one of three context problems:
- Missing information. The model answers confidently without knowing something relevant. It doesn’t ask because it doesn’t know it doesn’t know.
- Stale information. The model draws on data that was once accurate but is no longer. A customer support agent with last year’s pricing will confidently give wrong answers.
- Overloaded information. The model has too much context to reason over. Chroma’s context rot research shows that even on trivially simple tasks, accuracy drops from 95% to 60-70% as input length grows.
These are engineering problems, not prompting problems. They require designing systems around the model, not just better sentences.
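The three failure modes above can be checked for mechanically before a request reaches the model. The following is one possible sketch, assuming each context entry carries a fetch timestamp; the function name `diagnose_context`, the thresholds, and the word-count token estimate are hypothetical choices made for illustration.

```python
from datetime import datetime, timedelta, timezone


def diagnose_context(required_keys, context, max_age, max_tokens, now):
    """Return a list of problems found in a context dict.

    `context` maps a key to a (text, fetched_at) tuple.
    """
    problems = []
    # Missing: a required piece of information is absent.
    for key in required_keys:
        if key not in context:
            problems.append(f"missing: {key}")
    total_tokens = 0
    for key, (text, fetched_at) in context.items():
        # Stale: the entry is older than the allowed age.
        if now - fetched_at > max_age:
            problems.append(f"stale: {key}")
        total_tokens += len(text.split())  # crude token estimate
    # Overloaded: the combined context exceeds the budget.
    if total_tokens > max_tokens:
        problems.append("overloaded")
    return problems


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
context = {
    "pricing": ("Pro plan costs 20 USD per month.", now - timedelta(days=400)),
}
report = diagnose_context(
    required_keys=["pricing", "refund_policy"],
    context=context,
    max_age=timedelta(days=30),
    max_tokens=1000,
    now=now,
)
print(report)  # ['missing: refund_policy', 'stale: pricing']
```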
Common misconceptions about context engineering
- “It’s just prompt engineering with a new name.” Prompt engineering is one layer inside context engineering. The other layers (retrieval, memory, tools, structure, compression) sit outside the prompt and usually matter more.
- “Bigger context windows make context engineering obsolete.” Larger windows raise the ceiling on what fits, not on what the model attends to. Accuracy still degrades at long input lengths, and cost scales with every token you include.
- “It’s the same as RAG.” RAG is the retrieval slice. Context engineering also covers what the model sees before retrieval, how tool outputs are shaped, how memory is pruned, and how scope is enforced per user or task.
- “Demos prove context engineering doesn’t matter.” Demos work because the developer hand-crafts the context. Production fails because the same care isn’t applied to every query.
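One of the layers that sits outside retrieval itself is scope enforcement per user or task. A minimal sketch of the idea, assuming each document carries a scope label and that filtering happens before any matching; the `Document` type, the scope names, and the naive keyword match are hypothetical stand-ins for a real index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    scope: str  # hypothetical label such as "public" or "finance"
    text: str


def scoped_search(query: str, docs: list[Document], allowed_scopes: set[str]) -> list[Document]:
    """Filter by scope first, then apply a naive keyword match."""
    terms = query.lower().split()
    visible = [d for d in docs if d.scope in allowed_scopes]
    return [d for d in visible if any(t in d.text.lower() for t in terms)]


docs = [
    Document("d1", "public", "Pricing for the Pro plan."),
    Document("d2", "finance", "Internal pricing margins by region."),
]
hits = scoped_search("pricing", docs, allowed_scopes={"public"})
print([d.doc_id for d in hits])  # ['d1']
```

Filtering before matching means an out-of-scope document never becomes a candidate, so it cannot leak into the model's context through a ranking quirk.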
Context engineering and Wire
Wire handles the retrieval and storage slices of context engineering. Files land in a container, get chunked and indexed, and are exposed through five MCP tools (wire_explore, wire_search, wire_navigate, wire_write, wire_delete). Your agent decides when to call which tool; Wire returns scoped, typed results instead of raw document dumps. That leaves you to design the layers Wire doesn’t own: system instructions, orchestration, and what to do with the results.
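To make the contrast between typed results and raw document dumps concrete, here is a generic sketch of shaping search hits before they reach the model. It does not reflect Wire's actual schema or API: the `SearchResult` fields, the `shape_results` helper, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    source: str
    snippet: str
    score: float


def shape_results(raw_hits: list[dict], top_k: int, max_chars: int) -> list[SearchResult]:
    """Turn raw hits into a short, typed list the model can reason over."""
    # Keep only the top_k highest-scoring hits and truncate each snippet.
    ranked = sorted(raw_hits, key=lambda h: h["score"], reverse=True)[:top_k]
    return [
        SearchResult(source=h["source"], snippet=h["text"][:max_chars], score=h["score"])
        for h in ranked
    ]


raw_hits = [
    {"source": "faq.md", "text": "Refunds are issued within 14 days of purchase.", "score": 0.91},
    {"source": "blog.md", "text": "Our team offsite was a great success this year.", "score": 0.12},
]
results = shape_results(raw_hits, top_k=1, max_chars=30)
print(results[0].source, "|", results[0].snippet)
```

Capping both the number of results and the snippet length keeps tool output from crowding out the rest of the context.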
Frequently asked questions
Common questions about Context Engineering.
How is context engineering different from prompt engineering?
Why isn't a bigger context window enough?
What are the most common context failures in production agents?
Where does RAG fit into context engineering?
How do I know if my agent's problem is context or prompting?
Further reading
Articles about Context Engineering
MCP resources are the underused half of MCP
Every MCP discussion is about tools. The protocol's resources primitive is how you load context without paying for it every turn. Here's how to use it.
Agentic context engineering: how ACE evolves contexts
ACE (ICLR 2026) beats tuned prompts by 10.6% with self-evolving contexts that avoid brevity bias and context collapse, two real failures of prompt tuning.
Context engineering lives in substrates, not harnesses
Codex shipped codex-plugin-cc and AGENTS.md joined the Linux Foundation. The signal is consistent: context engineering is substrate work, not harness work.
Long context tripled hallucinations in 35 open models
A 172-billion-token study across 35 open models found hallucination rates triple from 32K to 128K context, and exceed 10% at 200K for every model tested.