Why AI Coding Assistants Can't See Your Codebase

Jitpal Kocher · 7 min read

According to the 2025 Stack Overflow Developer Survey, 84% of developers now use or plan to use AI coding tools. In the same survey, only 29% said they trust the output, down from 40% the year before. The number-one frustration, cited by 66% of respondents: AI-generated code that is “almost right, but not quite.”

The instinct is to blame the model. If GPT-5.2 or Claude Opus were smarter, the thinking goes, the suggestions would be better. But the model already knows how to write code. What it doesn’t know is your codebase: your internal APIs, your naming conventions, your architectural decisions, the migration you ran last Tuesday that changed the database schema. The gap between “knows how to code” and “knows how to code in your project” is a context engineering problem.

What your coding assistant actually sees

Every AI coding tool works within a context window, the maximum amount of text it can process in a single request. Even the largest windows (Gemini 3.1 Pro’s 2M tokens, roughly 1.5 million words) can’t hold a meaningful enterprise codebase. A typical enterprise project spans hundreds of thousands of files and millions of lines of code. A 200K-token context window holds roughly 500 to 800 pages of code, depending on language and commenting style.
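A quick back-of-envelope calculation makes the mismatch concrete. The ratios below (4 characters per token, 40 characters per source line) are rough heuristics, not exact figures:

```python
# Back-of-envelope: can a codebase fit in a context window?
CHARS_PER_TOKEN = 4        # rough heuristic for code
AVG_CHARS_PER_LINE = 40    # rough assumption for a typical source line

def tokens_for_codebase(lines_of_code: int) -> int:
    return lines_of_code * AVG_CHARS_PER_LINE // CHARS_PER_TOKEN

needed = tokens_for_codebase(5_000_000)   # a 5M-line enterprise codebase
window = 2_000_000                        # a Gemini-class 2M-token window
print(f"needed ~{needed:,} tokens vs. window {window:,}")
print(needed <= window)                   # False: off by roughly 25x
```

At around 50 million tokens, even a mid-sized enterprise codebase overshoots the largest available window by more than an order of magnitude.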

So every coding assistant makes choices about what to include and what to leave out. The tool decides which files, which functions, which documentation to feed the model. Everything else is invisible.

Here’s how the major tools handle this:

File-level context. GitHub Copilot historically worked from the open file plus a few neighboring tabs. The model sees the function you’re editing but not the service it calls, the test that validates it, or the type definitions it depends on.

Repository indexing. Cursor and similar tools index your repository to build a searchable map of the codebase. When you ask a question, they retrieve relevant files and inject them into the context window. This is better, but it still compresses a million-line codebase into a fraction of what the model can hold.
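Under the hood, the retrieve-and-inject step looks roughly like the sketch below. The embedding model is abstracted away, and the `(path, vector, text)` index layout is an illustrative assumption, not any specific tool's internals:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve(query_vec, index, k=3):
    """index: list of (path, vector, source_text). Return the top-k files."""
    ranked = sorted(index, key=lambda entry: cosine(query_vec, entry[1]), reverse=True)
    return [(path, text) for path, _, text in ranked[:k]]

def build_prompt(question, retrieved, budget_chars=8_000):
    """Pack retrieved files into a bounded prompt; drop whatever doesn't fit."""
    parts, used = [], 0
    for path, text in retrieved:
        if used + len(text) > budget_chars:
            break
        parts.append(f"### {path}\n{text}")
        used += len(text)
    return "\n\n".join(parts) + f"\n\nQuestion: {question}"
```

The `budget_chars` cutoff is where the compression happens: everything that doesn't make the cut is invisible to the model, no matter how relevant it is.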

Agentic retrieval. Claude Code takes a different approach entirely: instead of pre-indexing, it uses tools like grep, glob, and file reads to search the codebase on the fly. The agent decides what to look for based on the task, pulls in relevant files as needed, and builds context dynamically. It trades pre-built indexes for real-time exploration, which means context is always current but depends on the agent asking the right questions.
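A stripped-down version of that loop might look like the following sketch. The tool names (`grep_files`, `read_file`, `gather_context`) are hypothetical stand-ins for an agent's real tool calls:

```python
from pathlib import Path

def grep_files(root: str, pattern: str):
    """Return paths of Python files under `root` whose text contains `pattern`."""
    hits = []
    for p in Path(root).rglob("*.py"):
        try:
            if pattern in p.read_text():
                hits.append(str(p))
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
    return hits

def read_file(path: str) -> str:
    return Path(path).read_text()

def gather_context(root: str, symbol: str, max_files: int = 5) -> str:
    """One 'agent step': find where a symbol appears, pull those files into context."""
    paths = grep_files(root, symbol)[:max_files]
    return "\n\n".join(f"### {p}\n{read_file(p)}" for p in paths)
```

The context assembled this way is always current, but it is only as good as the pattern the agent chooses to search for.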

The pattern is clear: each generation of tooling invests more in controlling what context reaches the model, not in making the model itself smarter.

Three context failures developers hit daily

The “almost right” code that frustrates 66% of developers comes from predictable context gaps.

Missing architectural context

Your codebase has conventions: how errors are handled, where state lives, which patterns are preferred. A model that sees a single file has no way to infer these. It generates code that is syntactically valid and functionally plausible but structurally wrong for your project. Fixing it often takes longer than writing the code from scratch.

Stale context

Codebases change constantly. A schema migration, a renamed module, a deprecated API. If the model’s understanding of your project is even a few days old, it generates code against an outdated reality. This is context rot applied to development: the context was accurate once, but the codebase moved on. (For more on this mechanism, see Context Rot: Why AI Performance Degrades With More Information.)

Cross-boundary blindness

Real features span multiple files, services, and layers. A model that can see your React component but not the API route it calls, the database query behind that route, or the validation middleware in between will generate code that works in isolation but breaks in integration. This is the same fragmented context problem that causes AI hallucinations: relevant information exists, but the model can’t see all of it at once.

Why bigger context windows don’t fix this

The obvious response is “just make the context window bigger.” Gemini 3.1 Pro already supports 2M tokens. Problem solved?

Not quite. Research consistently shows that more context often makes things worse. Chroma’s context rot study found accuracy drops from 95% to 60-70% as input length increases, even on trivially simple tasks. Stanford’s “lost in the middle” research showed a 15-20 percentage point accuracy gap between information at the edges versus the middle of long inputs.

Dumping your entire codebase into a million-token window doesn’t help if the model can’t find the one function signature that matters among thousands of irrelevant files. A focused 50K context with the right files consistently outperforms a 500K context packed with everything.

The problem was never window size. The problem is deciding what goes in.

What developers are building to work around this

The most interesting response to the codebase context problem is coming from developers themselves.

Context files

Developers have started creating configuration files, variously called CLAUDE.md, .cursorrules, or AGENTS.md, to provide coding agents with project-specific instructions. A recent paper from ETH Zurich found these files change agent behavior significantly: agents run more tests, read more files, and use project-specific tooling. The catch is that in well-documented repositories, the files added cost (20%+ more inference tokens) without reliably improving outcomes. Where they help most is in projects that lack other documentation, filling a context gap that nothing else addresses.
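A context file doesn't need to be elaborate to change agent behavior. The example below is hypothetical; the paths, commands, and table names are placeholders for your project's own conventions:

```markdown
# CLAUDE.md — project conventions (hypothetical example)

## Architecture
- API routes live in `src/routes/`; business logic in `src/services/`.

## Conventions
- Errors: raise `AppError` subclasses; never return error codes.
- Run `make test` before proposing a change; tests live next to source.

## Gotchas
- The `users` table was migrated in v2; use `account_id`, not `user_id`.
```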

Codified context infrastructure

A February 2026 paper documented a more systematic approach: building a three-tier context infrastructure for a 108,000-line C# distributed system. The architecture separates “hot memory” (conventions, protocols, retrieval hooks always loaded), domain-expert agents (19 specialized modules), and “cold memory” (34 on-demand specification documents). Across 283 development sessions, this structure propagated context across sessions, preventing repeated failures and maintaining consistency.

External context layers

A growing category of tools externalizes codebase context entirely, maintaining a structured, searchable layer that any coding agent can query on demand. Context-as-a-service platforms like Wire let teams upload documentation, architecture decisions, and project knowledge into containers that any MCP-compatible agent can query. The agent asks “how do we handle authentication?” and gets back the relevant files, patterns, and conventions rather than trying to hold the entire codebase in context.
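Conceptually, the interaction looks like the sketch below. `ContextStore` and its keyword matching are a deliberately naive illustration of the query-on-demand pattern, not Wire's or any MCP server's actual implementation, and the file paths in the snippets are made up:

```python
import re

class ContextStore:
    """Toy external context layer: keyword-tagged snippets, queried on demand."""

    def __init__(self):
        self._entries = []  # list of (keyword_set, content)

    def add(self, keywords, content):
        self._entries.append((set(keywords), content))

    def query(self, question, k=2):
        """Return the k best-matching snippets that share at least one keyword."""
        words = set(re.findall(r"\w+", question.lower()))
        ranked = sorted(self._entries, key=lambda e: len(words & e[0]), reverse=True)
        return [content for kws, content in ranked[:k] if words & kws]

store = ContextStore()
store.add({"auth", "authentication", "login"},
          "Auth is OAuth2; middleware lives in src/auth/middleware.py.")
store.add({"database", "schema", "migration"},
          "Schema changes go through db/migrations/; run `make migrate`.")

print(store.query("how do we handle authentication?"))
# only the auth snippet comes back; the migration note doesn't match
```

The point of the pattern is that the agent fetches a small, relevant slice per question instead of holding the whole codebase in its window.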

What this means for teams using AI to code

The Stack Overflow data tells a clear story. Developers aren’t abandoning AI coding tools. Usage is growing. But trust is declining because the tools keep generating code that doesn’t fit the project. The fix is the same one that works across every AI application domain: better context, not better models.

A few practical steps:

  1. Audit what your tool actually sees. Ask your coding assistant to explain your project’s architecture. If it can’t, it doesn’t have enough context to generate reliable code.
  2. Invest in context files. Even a basic CLAUDE.md or AGENTS.md with your project’s conventions, architecture, and key patterns helps, especially if your project lacks other documentation.
  3. Keep context fresh. Stale context generates stale code. If your coding assistant doesn’t sync with recent changes, it’s working against an outdated version of your project.
  4. Use selective retrieval over full-context loading. Focused, relevant context outperforms dumping everything into the window. Tools that index your codebase and retrieve relevant files per query (semantic search approaches) consistently produce better results.
  5. Treat context as infrastructure. The teams seeing the best results from AI coding are the ones treating context engineering as a first-class discipline, not an afterthought.

The model knows how to write code. Your job is to make sure it knows how to write code for your project.

Ready to give your AI agents better context?

Wire transforms your documents into structured, AI-optimized context containers. Upload files, get MCP tools instantly.

Create Your First Container