What is RAG (Retrieval-Augmented Generation)?
A technique that retrieves relevant documents or data at inference time and injects them into the model's context window before generating a response.
RAG is commonly used to give LLMs access to knowledge bases that are too large to fit in the context window. Wire implements RAG automatically using semantic search and structured retrieval: when an agent queries a container, Wire retrieves the most relevant context and returns it to the agent.
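The retrieve-then-inject flow described above can be sketched in a few lines of Python. This is a minimal illustration, not Wire's implementation: the function names (`retrieve`, `build_prompt`) and the sample `KNOWLEDGE_BASE` are hypothetical, and word-overlap cosine similarity stands in for the embedding-based semantic search a production system would more likely use. The sketch stops at building the prompt and does not call a model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=2):
    """Return up to k documents that share words with the query, best first.

    Assumption: word overlap is a stand-in for semantic search here.
    """
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, documents, k=2):
    """Inject the retrieved documents into the prompt ahead of the question."""
    context = retrieve(query, documents, k)
    context_block = "\n".join(f"- {doc}" for doc in context)
    return f"Context:\n{context_block}\n\nQuestion: {query}\nAnswer:"


# Hypothetical knowledge base, invented for this example.
KNOWLEDGE_BASE = [
    "Refunds are issued within 14 days of a return request.",
    "The office is closed on public holidays.",
    "Shipping to Europe takes 5 to 7 business days.",
]

if __name__ == "__main__":
    print(build_prompt("How long do refunds take?", KNOWLEDGE_BASE, k=1))
```

Only the retrieved passage reaches the prompt, so the rest of the knowledge base never has to fit in the context window. The resulting string would then be sent to whichever model generates the response.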
Related concepts
Semantic search: A search method that finds results based on meaning and intent rather than exact keyword matching.
Context window: The maximum amount of text (measured in tokens) that a language model can process in a single inference call.
Context engineering: The practice of deliberately designing, structuring, and managing the information provided to AI models to improve output quality and relevance.