Overview
Retrieval-augmented generation (RAG) is a four-stage pipeline: (1) decide whether to retrieve, (2) retrieve passages, (3) filter/rank retrieved context, and (4) generate with context, detecting hallucinations and deciding when to abstain.
Confidence signals can gate all four stages. Low confidence in parametric knowledge triggers retrieval (FLARE, SELF-RAG). Low confidence in retrieval quality gates which context is included (CRAG, FILCO). Low confidence that generated claims are grounded triggers refinement or abstention (HALT-RAG, Conformal-RAG). This section surveys confidence-gated approaches across the RAG pipeline.
When to Retrieve
Instead of always retrieving, confidence in parametric knowledge can gate retrieval, saving computation when the model is confident or the query is simple.
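As a rough illustration of confidence-gated retrieval, the sketch below triggers retrieval when any token in a draft continuation falls below a probability threshold, loosely in the spirit of token-confidence triggers such as FLARE. The function name `should_retrieve`, the threshold value, and the example probabilities are assumptions made for illustration, not details taken from any of the surveyed methods.

```python
from typing import Sequence


def should_retrieve(token_probs: Sequence[float], threshold: float = 0.4) -> bool:
    """Return True when the draft contains at least one low-confidence token.

    token_probs: per-token probabilities of a tentative continuation
                 (hypothetical input; how they are obtained depends on the model API).
    threshold:   illustrative cut-off, to be tuned on held-out data.
    """
    if not token_probs:
        # Assumption: with no confidence evidence at all, retrieve by default.
        return True
    return min(token_probs) < threshold


confident_draft = [0.95, 0.88, 0.91]
uncertain_draft = [0.95, 0.22, 0.91]

print(should_retrieve(confident_draft))  # no token below threshold, skip retrieval
print(should_retrieve(uncertain_draft))  # one weak token, trigger retrieval
```

Using the minimum rather than the mean makes the gate sensitive to a single uncertain token, which is often where a factual gap shows up; a mean or entropy-based variant would be a reasonable alternative.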
What Context to Keep
After retrieval, confidence in passage relevance and utility determines which passages to include in the context window.
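A minimal sketch of confidence-based context filtering follows. It assumes each passage already carries a relevance or utility score in [0, 1] from some scorer (an evaluator model, a reranker, or the generator itself); the function name `filter_passages`, the cut-off, and the passage budget are hypothetical choices for illustration.

```python
from typing import List, Tuple


def filter_passages(
    scored: List[Tuple[str, float]],
    min_score: float = 0.5,
    max_keep: int = 3,
) -> List[str]:
    """Keep the highest-scoring passages whose score clears min_score.

    scored:    (passage, score) pairs; the scorer itself is out of scope here.
    min_score: illustrative confidence cut-off.
    max_keep:  illustrative budget for the context window.
    """
    kept = [(passage, score) for passage, score in scored if score >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)  # most confident first
    return [passage for passage, _ in kept[:max_keep]]


candidates = [("passage A", 0.9), ("passage B", 0.2), ("passage C", 0.7)]
print(filter_passages(candidates))
```

An empty result is itself a signal: a caller might treat it as a retrieval failure and fall back to another source or abstain.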
Groundedness Detection
Even with context, models can hallucinate. These methods detect when generated claims are not grounded in retrieved passages.
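To make claim-level groundedness checking concrete, here is a small sketch that flags claims whose best support score against any retrieved passage falls below a threshold. The support function is pluggable; the `token_overlap` scorer is only a toy stand-in for a real verifier such as an NLI model, and all names and thresholds are assumptions for illustration.

```python
from typing import Callable, List


def token_overlap(claim: str, passage: str) -> float:
    """Toy support score: fraction of claim tokens that also occur in the passage."""
    claim_tokens = set(claim.lower().split())
    if not claim_tokens:
        return 0.0
    passage_tokens = set(passage.lower().split())
    return len(claim_tokens & passage_tokens) / len(claim_tokens)


def ungrounded_claims(
    claims: List[str],
    passages: List[str],
    support: Callable[[str, str], float] = token_overlap,
    threshold: float = 0.6,
) -> List[str]:
    """Return the claims whose best support over all passages is below threshold."""
    flagged = []
    for claim in claims:
        best = max((support(claim, p) for p in passages), default=0.0)
        if best < threshold:
            flagged.append(claim)
    return flagged


passages = ["paris is the capital of france"]
claims = ["paris is the capital of france", "berlin hosts the louvre museum"]
print(ungrounded_claims(claims, passages))
```

Lexical overlap cannot detect contradiction or paraphrase, so in practice the `support` argument would be replaced by an entailment or faithfulness scorer; the surrounding control flow stays the same.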
Abstention & Conformal Prediction
Rather than always generating an answer, these methods use confidence to decide when to abstain; the conformal variants additionally provide formal coverage guarantees.
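The sketch below shows the generic split-conformal recipe that such methods build on, under the assumptions that calibration examples are exchangeable with test queries and that each calibration score is one minus the confidence assigned to the known-correct answer. The names `conformal_quantile` and `prediction_set`, the calibration values, and the candidate confidences are hypothetical; individual methods (for example TRAQ or Conformal-RAG) define their own scores and units.

```python
import math
from typing import Dict, List, Sequence


def conformal_quantile(cal_scores: Sequence[float], alpha: float) -> float:
    """Split-conformal threshold: the ceil((n + 1)(1 - alpha))-th smallest score."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: every candidate is kept.
        return float("inf")
    return sorted(cal_scores)[rank - 1]


def prediction_set(candidates: Dict[str, float], qhat: float) -> List[str]:
    """Keep candidates whose nonconformity (1 - confidence) is at most qhat."""
    return [answer for answer, conf in candidates.items() if 1.0 - conf <= qhat]


# Illustrative calibration scores: 1 - confidence in the correct answer.
cal_scores = [0.05, 0.1, 0.2, 0.3, 0.4, 0.6, 0.9]
qhat = conformal_quantile(cal_scores, alpha=0.25)

answers = prediction_set({"Paris": 0.7, "Lyon": 0.2}, qhat)
print(qhat, answers)
```

Under exchangeability, the set contains the correct answer with probability at least 1 - alpha on average over queries. Any abstention rule layered on top, such as abstaining when the set is empty or holds more than one answer, is a separate design choice and does not by itself inherit that guarantee.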
Summary Table
| Method | Source | Signal | Unit | Role | Access |
|---|---|---|---|---|---|
| When to Retrieve | | | | | |
| FLARE | Self | Token confidence | Sentence/token | Trigger | Mid |
| DRAGIN | Self | Info need entropy | Context/step | Trigger, query | Mid |
| Adaptive-RAG | External | Complexity routing | Query | Route strategy | Pre, FT |
| SKR | Self | Self-knowledge | Query | Retrieve-or-skip | Pre |
| SELF-RAG | Self | Reflection tokens | Query/passage/answer | Trigger, critique | Pre, Ctx, FT |
| SEAKR | Mechanistic | Internal uncertainty | Query/snippet | Trigger, rerank, route | Pre, Ctx, WB |
| SUGAR | Self | Semantic uncertainty | Query | Trigger, depth | Pre, MS |
| PAIRS | Self | Parametric agreement | Query/doc | Trigger, filter | Pre, Ctx |
| What Context to Keep | | | | | |
| CRAG | External | Retrieval quality | Retrieval set | Correct, fallback | Ctx, Aux |
| FILCO | External | Usefulness score | Passage/sentence | Filter | Ctx, FT |
| InfoGain-RAG | Self | Info gain | Document | Rerank, filter | Ctx |
| SKILL-RAG | Self | Self-knowledge score | Sentence | Filter | Ctx, FT |
| Sparse-RAG | Self | Relevance confidence | Document | Select, sparsify | Ctx, Mid |
| UncertaintyRAG | Self | Span uncertainty | Span/chunk | Score, retrieve | Ctx, FT |
| Groundedness Detection & Abstention | | | | | |
| ReDeEP | Mechanistic | ECS + PKS | Answer/mechanism | Detect, mitigate | Post, WB |
| HALT-RAG | External | NLI ensemble | Claim/answer | Detect, abstain | Post, Aux |
| FRANQ | External | Faithfulness UQ | Claim | Detect | Post, Aux |
| TRAQ | Hybrid | Conformal confidence | Passage/answer set | Set-predict | Ctx, Post, CP |
| ConFLARE | External | Similarity threshold | Chunk/set | Calibrate | Ctx, CP |
| Principled Ctx Eng | External | Snippet relevance | Snippet | Filter | Ctx, CP |
| Conformal-RAG | Hybrid | Conformal factuality | Sub-claim | Filter | Post, CP |
| Divide-Then-Align | Hybrid | Knowledge boundary | Query | Abstain, align | Post, FT |
Discussion
How well a confidence signal works in RAG depends strongly on where it comes from. Self-confidence works well for detecting parametric knowledge gaps (FLARE, SKR) but is less reliable for judging retrieval quality. Auxiliary signals (CRAG, HALT-RAG) excel at post-hoc verification but add latency. Mechanistic probes (ReDeEP, SEAKR) offer interpretability but require white-box access.
The best systems combine signals:
- Trigger retrieval using self-confidence (token uncertainty) or external routers (Adaptive-RAG).
- Filter retrieved passages using auxiliary scorers (CRAG, FILCO) or mechanistic measures (SEAKR).
- Verify grounding using external verifiers (HALT-RAG) or conformal methods (Conformal-RAG) with formal coverage guarantees.
- Abstain strategically when retrieval fails or claims are ungrounded, preserving user trust.
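The four steps above can be sketched as a single control flow with pluggable components. Everything here is hypothetical scaffolding: the function `answer_with_gates`, its callable parameters, and the three thresholds are assumptions chosen for illustration and do not correspond to any specific system in the table.

```python
from typing import Callable, List, Optional


def answer_with_gates(
    query: str,
    draft_confidence: Callable[[str], float],
    retrieve: Callable[[str], List[str]],
    score_passage: Callable[[str, str], float],
    generate: Callable[[str, List[str]], str],
    grounding_score: Callable[[str, List[str]], float],
    retrieve_below: float = 0.6,
    keep_above: float = 0.5,
    ground_above: float = 0.7,
) -> Optional[str]:
    """Return an answer, or None to signal abstention."""
    passages: List[str] = []
    # 1. Trigger: retrieve only when parametric confidence is low.
    if draft_confidence(query) < retrieve_below:
        # 2. Filter: keep passages the scorer is confident about.
        passages = [p for p in retrieve(query) if score_passage(query, p) >= keep_above]
        if not passages:
            return None  # retrieval was needed but produced nothing usable
    answer = generate(query, passages)
    # 3. Verify: check grounding only when there is context to check against.
    if passages and grounding_score(answer, passages) < ground_above:
        return None  # 4. Abstain: the answer is not supported by the context
    return answer


# Stub components, purely for demonstration.
result = answer_with_gates(
    "example query",
    draft_confidence=lambda q: 0.2,
    retrieve=lambda q: ["good passage", "noise"],
    score_passage=lambda q, p: 0.9 if p.startswith("good") else 0.1,
    generate=lambda q, ps: "answer from " + ps[0] if ps else "parametric answer",
    grounding_score=lambda a, ps: 0.9,
)
print(result)
```

Returning `None` keeps abstention explicit at the call site. A production system would more likely return a structured result that records which gate fired, so that thresholds can be calibrated per stage.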