AI glossary

What is retrieval-augmented generation (RAG)?

Retrieval-augmented generation (RAG) is a technique that connects a language model to your own knowledge sources. When a user asks a question, the system first retrieves the most relevant documents, then passes them to the model alongside the question. The answer is grounded in your data instead of relying only on what the model learned during training.

How a RAG pipeline works

Documents are ingested, split into passages and indexed, typically as vector embeddings that capture meaning rather than exact keywords. At query time the system retrieves the passages most relevant to the question, inserts them into the model’s context, and asks it to answer using that material, ideally citing its sources. The model supplies the language and reasoning; your repository supplies the facts.
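The steps above can be sketched in a few lines. This is a minimal illustration, not a production design: the `embed` function is a toy bag-of-words counter standing in for a real embedding model (so it matches keywords, not meaning), and every function name, the sample documents and the prompt wording are assumptions made for the example.

```python
import math
import re
from collections import Counter


def split_into_passages(text: str, max_words: int = 40) -> list[str]:
    """Split a document into passages of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(documents: dict[str, str]) -> list[tuple[str, str, Counter]]:
    """Ingest: split each document and store (doc_id, passage, vector)."""
    index = []
    for doc_id, text in documents.items():
        for passage in split_into_passages(text):
            index.append((doc_id, passage, embed(passage)))
    return index


def retrieve(index, question: str, k: int = 2) -> list[tuple[str, str]]:
    """Query time: return the k passages most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda entry: cosine(q, entry[2]), reverse=True)
    return [(doc_id, passage) for doc_id, passage, _ in ranked[:k]]


def build_prompt(question: str, passages: list[tuple[str, str]]) -> str:
    """Insert the retrieved passages into the model's context with their ids."""
    sources = "\n".join(f"[{doc_id}] {passage}" for doc_id, passage in passages)
    return (
        "Answer using only the sources below and cite them by id.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


# Hypothetical sample documents for illustration only.
documents = {
    "travel-policy": "Employees may book economy flights for trips under six hours.",
    "expense-policy": "Expense reports are due within thirty days of purchase.",
}
index = build_index(documents)
question = "When are expense reports due?"
top = retrieve(index, question, k=1)
prompt = build_prompt(question, top)  # this string would be sent to the model
```

In a real pipeline the `embed` function would call an embedding model and the index would live in a vector store; the shape of the flow (ingest, split, index, retrieve, prompt) stays the same.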

Why enterprises choose RAG

Company knowledge changes daily, and retraining a model for every policy update is neither practical nor economical. With RAG, updating the answer means updating the document. Answers can cite their sources, which builds the trust adoption depends on. Because retrieval happens at query time, a well-built pipeline can also enforce document permissions, so users only get answers drawn from material they are allowed to read. For injecting knowledge, RAG is almost always worth trying before fine-tuning.
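One way to enforce permissions at query time is to filter candidate passages against the user's groups before ranking. The sketch below assumes each indexed passage carries the set of groups allowed to read its source document; the `Passage` type, the group names and the scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    doc_id: str
    text: str
    allowed_groups: frozenset[str]  # groups permitted to read the source document
    score: float                    # relevance score from the retriever, higher is better


def retrieve_for_user(candidates: list[Passage], user_groups: set[str], k: int = 3) -> list[Passage]:
    """Drop passages the user may not read, then return the top k by score."""
    readable = [p for p in candidates if p.allowed_groups & user_groups]
    return sorted(readable, key=lambda p: p.score, reverse=True)[:k]


# Hypothetical candidates returned by a retriever for one question.
candidates = [
    Passage("salary-bands", "Salary bands by level.", frozenset({"hr"}), 0.92),
    Passage("handbook", "General employee handbook.", frozenset({"hr", "engineering"}), 0.71),
]

engineer_view = retrieve_for_user(candidates, {"engineering"})
hr_view = retrieve_for_user(candidates, {"hr"})
```

Filtering happens before the passages reach the prompt, so a restricted document never enters the model's context for a user who cannot read it, however relevant it scores.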

Security and quality considerations

Two design decisions dominate. First, permission-aware retrieval is non-negotiable: an index that ignores access controls will leak documents across the org chart. Second, retrieved content is untrusted input. A poisoned document can carry an indirect prompt injection: instructions hidden in its text that the model may follow once the document lands in its context. Beyond security, answer quality is capped by retrieval quality, because the model cannot use a passage that was never retrieved. RAG systems therefore need continuous evaluation and adversarial testing well beyond the launch demo.
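Continuous evaluation can start with a simple retrieval metric such as recall at k: of the documents known to be relevant to a question, what fraction appears in the top k results? The sketch below assumes you maintain a small labelled set of questions with their relevant document ids; the function names and the sample data are illustrative.

```python
def recall_at_k(retrieved_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Fraction of the relevant documents found in the top k retrieved ids."""
    if not relevant_ids:
        raise ValueError("relevant_ids must not be empty")
    hits = len(set(retrieved_ids[:k]) & relevant_ids)
    return hits / len(relevant_ids)


def mean_recall_at_k(runs: list[tuple[list[str], set[str]]], k: int) -> float:
    """Average recall at k over (retrieved_ids, relevant_ids) pairs."""
    return sum(recall_at_k(retrieved, relevant, k) for retrieved, relevant in runs) / len(runs)


# Hypothetical evaluation set: ranked retrieval output and labelled relevant ids.
eval_runs = [
    (["doc-a", "doc-b", "doc-c"], {"doc-a"}),  # relevant document ranked first
    (["doc-x", "doc-y", "doc-z"], {"doc-z"}),  # relevant document ranked third
]
score = mean_recall_at_k(eval_runs, k=2)
```

Tracking a number like this over time shows whether an index or chunking change helped or hurt retrieval before users notice it in the answers.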

Frequently asked questions

What is the difference between RAG and fine-tuning?

RAG changes what the model knows at question time; fine-tuning changes how the model behaves. Use RAG for facts and documents that change. Consider fine-tuning for tone, format or domain-specific behaviour. Many production systems combine both.

Does RAG eliminate hallucinations?

It reduces them substantially when retrieval works, because the model has the right material in front of it. It does not eliminate them: the model can still misread sources or fill gaps. Citations, grounding checks and ongoing evaluation remain necessary.
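A basic citation check can be sketched as follows. It assumes the model was asked to cite sources as bracketed ids such as `[expense-policy]`, which is a convention chosen for this example. It only verifies that cited ids were actually retrieved and flags sentences with no citation; it does not verify that the cited passage supports the claim, which needs a stronger grounding check.

```python
import re

# Matches bracketed source ids such as [expense-policy] (an assumed convention).
CITATION = re.compile(r"\[([A-Za-z0-9_-]+)\]")


def check_citations(answer: str, retrieved_ids: set[str]) -> dict:
    """Report citations that were never retrieved and sentences without any citation."""
    cited = set(CITATION.findall(answer))
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    uncited = [s for s in sentences if not CITATION.search(s)]
    return {
        "unknown_citations": sorted(cited - retrieved_ids),
        "uncited_sentences": uncited,
    }


# Hypothetical model answer and the ids that retrieval actually returned.
answer = (
    "Expense reports are due within thirty days [expense-policy]. "
    "Late reports need director approval [finance-memo]. "
    "Most teams submit weekly."
)
report = check_citations(answer, {"expense-policy", "travel-policy"})
```

Here the report flags `finance-memo` as a citation that was never retrieved and the last sentence as uncited, both of which are candidates for review or for a retry.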

What data sources can RAG use?

Almost anything you can index: wikis, document drives, tickets, CRM records, policies, contracts, databases. The practical constraints are data quality and access control. Both matter more than raw volume.

Deploy AI with confidence

Code75 implements production AI across enterprise teams, with the security testing and governance to match. When you get in touch, you will talk to an engineer.