Structured Context vs Raw Text for AI

JP · 6 min read

A team at ETH Zurich recently tested whether giving AI coding agents a context file about the repository they’re working in improves their performance. The result was counterintuitive: LLM-generated context files reduced task success rate by an average of 3% compared to giving the agent no context file at all. They also increased inference costs by over 20%, because agents took more steps to complete tasks while processing the extra noise.

Human-written context files fared better, improving success rate by about 4%. But the gap between “AI-generated context that hurts” and “human-written context that helps” isn’t about who wrote it. It’s about structure. The AI-generated files were verbose, generic, and full of information the agent could infer on its own. The human-written files were concise, specific, and focused on non-obvious details like custom build commands and project conventions.

The lesson generalizes far beyond coding agents. How you structure context matters more than how much context you provide.

The raw text problem

The default approach to giving AI systems context is to dump in everything that seems relevant. Paste the full document. Concatenate the conversation history. Retrieve 20 chunks from a vector database and stuff them into the prompt. a16z recently described this as the central bottleneck in enterprise AI: the gap between an organization’s messy data and the actionable context that agents actually need.

This approach fails for well-documented reasons. As context length grows, model accuracy degrades. Research from Chroma shows that even trivially simple tasks drop from 95% accuracy to 60-70% as input length increases, a phenomenon called context rot. (For a deeper look at the mechanism, see Context Rot: Why AI Performance Degrades With More Information.)

Augment Code put it well: your agent’s context is a junk drawer. Every irrelevant paragraph competes for the model’s finite attention budget, diluting the signal from the information that actually matters. The result is more hallucinations, more missed facts, and more wasted tokens.

Why structure matters

Three mechanisms explain why structured context outperforms raw text.

It reduces token waste

Raw documents contain formatting artifacts, boilerplate, repetitive headers, and filler text that consume tokens without adding information. A 50-page PDF might contain 10 pages of actual content relevant to the query. Structured representations strip the noise. Returning typed records like {ticket_id, component, status, summary} instead of the full ticket thread gives the model exactly what it needs in a fraction of the tokens.
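As a minimal sketch of this idea, here is what projecting a full ticket down to the {ticket_id, component, status, summary} shape from the paragraph above might look like. The ticket data and field names are illustrative, not from any particular tracker's API:

```python
def to_record(ticket: dict) -> dict:
    """Keep only the fields an agent needs to reason about a ticket."""
    return {
        "ticket_id": ticket["ticket_id"],
        "component": ticket["component"],
        "status": ticket["status"],
        "summary": ticket["summary"],
    }

ticket = {
    "ticket_id": "T-1042",
    "component": "billing",
    "status": "open",
    "summary": "Invoice totals off by one cent after currency conversion",
    "thread": "... " * 2000,  # stand-in for the long raw discussion thread
}

record = to_record(ticket)
# The typed record is a tiny fraction of the raw payload.
print(len(str(record)), "vs", len(str(ticket)))
```

The point isn't the projection itself; it's that the noisy 90% of the payload never reaches the model in the first place.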

It exploits attention patterns

Research on prompt formatting found that format choice alone can swing LLM accuracy by up to 40% on code translation tasks. Models aren’t format-agnostic. Clear section headers, consistent delimiters, and hierarchical organization help the model allocate attention to the right places. Anthropic’s context engineering guide recommends organizing context into distinct sections with XML tags or Markdown headers specifically because it improves model behavior.
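A sketch of the sectioning idea, assuming nothing beyond string formatting. The tag names (`instructions`, `records`, `query`) are illustrative, not a required schema:

```python
def format_context(sections: dict[str, str]) -> str:
    """Wrap each context piece in a clearly delimited XML-style section."""
    parts = []
    for name, body in sections.items():
        parts.append(f"<{name}>\n{body.strip()}\n</{name}>")
    return "\n\n".join(parts)

prompt_context = format_context({
    "instructions": "Answer using only the provided records.",
    "records": "ticket T-1042: billing, open",
    "query": "Which billing tickets are still open?",
})
print(prompt_context)
```

Consistent delimiters like these give the model unambiguous boundaries between instructions, data, and the question, which is exactly where attention tends to go astray in undifferentiated prose.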

It enables selective retrieval

Structured context is queryable. When information is organized into typed fields with metadata, you can retrieve precisely the 3-5 records relevant to the current query rather than the 20 loosely related chunks that naive RAG returns. Less context, higher relevance, better output.
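Because the records are typed, retrieval can be a precise metadata filter plus a hard cap, rather than "top 20 nearest chunks." A minimal sketch with illustrative field names:

```python
def retrieve(records, *, component=None, status=None, k=5):
    """Return at most k records matching the metadata filters."""
    hits = [
        r for r in records
        if (component is None or r["component"] == component)
        and (status is None or r["status"] == status)
    ]
    return hits[:k]

tickets = [
    {"ticket_id": "T-1", "component": "billing", "status": "open"},
    {"ticket_id": "T-2", "component": "auth", "status": "open"},
    {"ticket_id": "T-3", "component": "billing", "status": "closed"},
]

print(retrieve(tickets, component="billing", status="open"))
```

In practice you'd combine this with semantic search, but the cap and the typed filters are what keep the payload small and relevant.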

What doesn’t work

AI-generated context files

The ETH Zurich study tested this directly. Across 138 real-world Python tasks and four different coding agents (Claude 3.5 Sonnet, GPT-5.2, GPT-5.1 mini, and Qwen Code), LLM-generated AGENTS.md files consistently hurt performance. The generated files restated information the agent could infer from the codebase, adding tokens without adding knowledge.

Naively converting everything to JSON

Wrapping raw text in JSON syntax doesn’t make it structured. If your JSON object contains a single "content" field with a 5,000-word document pasted inside, you’ve added token overhead without improving the model’s ability to find relevant information. Structure means organizing information into meaningful fields that the model can reason over, not adding brackets around prose.
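The contrast is easy to see side by side. Both payloads below are valid JSON; only one is structured in the sense that matters. The field names are illustrative:

```python
import json

# Anti-pattern: prose wrapped in brackets. Still a wall of text.
bad = {"content": "Our refund policy states that ... " + "filler " * 500}

# Structured: discrete fields the model can reason over directly.
good = {
    "policy": "refunds",
    "window_days": 30,
    "exclusions": ["gift cards", "final sale"],
}

# Same syntax, very different signal per token.
print(len(json.dumps(bad)), "vs", len(json.dumps(good)))
```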

Over-structuring

There’s a point of diminishing returns. The ACE framework (ICLR 2026) found that representing context as a collection of structured, itemized bullets with metadata (unique IDs, helpfulness counters) outperformed monolithic prompts and matched top-ranked production agents using smaller open-source models. But each bullet was a small, self-contained unit: one strategy, one concept, one failure mode. The structure served retrieval and relevance, not complexity for its own sake.
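A simplified sketch of the itemized-bullet shape the ACE paper describes: each unit carries an ID and a helpfulness counter, and ranking by that counter decides what makes it into context. The bullet contents and scores here are made up:

```python
bullets = [
    {"id": "b1", "text": "Run `make ci`, not `pytest` directly.", "helpful": 7},
    {"id": "b2", "text": "Migrations live in db/migrations.", "helpful": 3},
    {"id": "b3", "text": "Prefer dataclasses over raw dicts.", "helpful": 5},
]

def top_bullets(bullets, k=2):
    """Select the k bullets that have proven most useful so far."""
    return sorted(bullets, key=lambda b: b["helpful"], reverse=True)[:k]

for b in top_bullets(bullets):
    print(b["id"], b["text"])
```

Note how little structure is actually needed: one sentence of content, an ID, a counter. The structure exists to serve selection, nothing more.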

What works

The research converges on a few principles.

Typed records over prose. When you can represent information as small, structured records with named fields, do it. A customer record with {name, plan, status, last_contact} is more useful to an agent than a paragraph describing the same information. The model can reason over fields directly instead of parsing natural language.

Metadata for relevance. The ACE framework attaches helpfulness scores to each context item, so the system can prioritize what’s been useful before. Even simple metadata like source, recency, and category helps retrieval systems select the right context for each query.

Minimal context, maximum signal. Anthropic’s guide emphasizes striving for “the minimal set of information that fully outlines your expected behavior.” The ETH Zurich researchers reached the same conclusion: limit context files to non-inferable details. If the model can figure it out from the input, don’t tell it twice.

Process at upload time, not query time. Rather than structuring context on every query, do the transformation work once when documents enter the system. Extract entities, categorize content, and build structured representations upfront. At query time, return pre-processed records instead of raw text. Tools like Wire take this approach, transforming files into structured context at upload time so agents receive clean, typed data on every query.
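The principles above can be sketched as a two-phase pipeline: pay the extraction cost once at upload, serve cheap filtered lookups at query time. `extract_entities` here is a hypothetical stand-in for whatever extraction you actually use (an NER model, an LLM pass, rules):

```python
STORE: dict[str, list[dict]] = {}

def extract_entities(text: str) -> list[dict]:
    """Stand-in extractor: one record per line, crudely categorized."""
    records = []
    for i, line in enumerate(filter(None, map(str.strip, text.splitlines()))):
        records.append({
            "id": i,
            "category": "build" if "make" in line else "general",
            "text": line,
        })
    return records

def upload(doc_id: str, text: str) -> None:
    STORE[doc_id] = extract_entities(text)  # expensive work, done once

def query(doc_id: str, category: str) -> list[dict]:
    # Query time is a cheap filter over pre-built records, not raw text.
    return [r for r in STORE[doc_id] if r["category"] == category]

upload("agents-md", "run make ci before pushing\nmigrations live in db/migrations")
print(query("agents-md", "build"))
```

The asymmetry is the point: uploads are rare and can afford slow, careful structuring; queries are frequent and should touch only pre-processed records.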

Practical checklist

If you’re building systems that deliver context to AI agents:

  1. Audit your context payload. Pull the actual text being sent to the model for real queries. How much of it is signal vs. noise?
  2. Convert documents to typed records where possible. Named fields beat paragraphs for factual content.
  3. Cap your retrieval. Return 3-5 highly relevant items, not 20 loosely related ones. Less context with higher relevance beats more context with lower relevance.
  4. Strip what’s inferable. If the model can determine something from the primary input, don’t repeat it in the context. The ETH Zurich data shows this actively hurts.
  5. Measure the difference. Run the same queries with raw text and structured context. Track accuracy, hallucination rate, and token usage. The delta is usually significant.
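For step 5, even a crude comparison is informative. The sketch below uses a whitespace token proxy; swap in your model's real tokenizer for accurate numbers. The payloads are made-up stand-ins:

```python
def count_tokens(text: str) -> int:
    """Crude proxy: whitespace-split word count, not a real tokenizer."""
    return len(text.split())

raw = "Full ticket thread discussion reply quote " * 200
structured = "ticket T-1042 | billing | open | invoice off by one cent"

savings = 1 - count_tokens(structured) / count_tokens(raw)
print(f"token savings: {savings:.0%}")
```

Run the same comparison on accuracy and hallucination rate with your real eval set; token count alone understates the benefit, since the accuracy gains come from removing distractors, not just shrinking the bill.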

The trend in context engineering is clear: the teams getting the best results from AI agents aren’t the ones with the most context. They’re the ones with the most structured context.

Ready to give your AI agents better context?

Wire transforms your documents into structured, AI-optimized context containers. Upload files, get MCP tools instantly.

Get Started