Long context tripled hallucinations in 35 open models
A 172-billion-token study across 35 open models found that hallucination rates triple as context grows from 32K to 128K and exceed 10% at 200K for every model tested.
Further reading
5 articles from the Wire blog, sorted newest first. Return to the AI Hallucination definition for context.
A 172-billion-token study across 35 open models found that hallucination rates triple as context grows from 32K to 128K and exceed 10% at 200K for every model tested.
OpenAI's GPT-5.5 system card reports 23% better claim-level accuracy, not the 60% hallucination reduction making the rounds in the press. Here's what actually changed.
Vectara's 2026 benchmark shows OpenAI's flagship GPT-5.4-pro hallucinates at 8.3% while its nano variant stays at 3.1%. The reasoning-model tradeoff, explained.
Most AI inaccuracies in production are context quality failures, not model fabrications. Here's the research on what context engineering actually changes.
GPT-5.2 hallucinates at 10.8%, o3-pro at 23.3%. The fix has less to do with better models and more to do with context engineering. Here's the research.