A Survey of Confidence Utilization in Large Language Models
Positioning confidence not just as an estimation target, but as a control primitive for building reliable LLM systems.
Most work on confidence in large language models has focused on estimation, uncertainty quantification, and calibration. In deployed systems, however, the key question is how confidence should be used to govern behavior. This survey studies confidence utilization: the use of confidence-related signals to control system decisions. We formalize this through a unified framework in which confidence is defined over decision units under a local state and then consumed by a policy to determine actions.
Using this lens, we organize the literature across the full LLM lifecycle: training, inference, model selection and cascading, retrieval-augmented generation, risk management, and agentic control. We compare methods by signal source, decision unit, and functional role, and highlight open challenges in confidence semantics, composition, source attribution, decision-aware evaluation, and robustness.
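The decision-unit-plus-policy abstraction can be made concrete in a few lines. Below is a minimal sketch, assuming an illustrative three-way accept/verify/abstain policy; the class and function names, the thresholds, and the action labels are our own illustrations, not the survey's notation:

```python
from dataclasses import dataclass

# A decision unit could be a token, a candidate answer, or an agent step.
# The framework scores each unit with a confidence signal, then a policy
# consumes that signal to determine the system's action.

@dataclass
class DecisionUnit:
    content: str       # e.g. a candidate answer
    confidence: float  # confidence signal in [0, 1], from any estimator

def policy(unit: DecisionUnit, accept_at: float = 0.8,
           verify_at: float = 0.5) -> str:
    """Consume a confidence signal and choose an action (illustrative)."""
    if unit.confidence >= accept_at:
        return "accept"    # emit the answer as-is
    if unit.confidence >= verify_at:
        return "verify"    # e.g. re-check, retrieve, or self-correct
    return "abstain"       # defer to a human or a stronger model

print(policy(DecisionUnit("Paris", 0.93)))  # accept
```

The same skeleton covers the lifecycle stages that follow: only the decision unit (token, candidate, model call, agent step) and the action set change.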
Six domains of the LLM lifecycle where confidence functions as a control signal.
Training: data curation, fine-tuning, distillation, preference optimization, and RL — confidence determines where learning should occur, where it should be damped, and when the model should learn to abstain.
Inference: output selection, adaptive stopping and refinement, and contrastive decoding — confidence becomes an online control variable at the candidate, state, and token levels.
Model selection and cascading: sequential cascading, pre-call routing, and hybrid systems — confidence governs which model to call, when to defer, and how to allocate capacity across model portfolios.
Retrieval-augmented generation: retrieval gating, context filtering, groundedness detection, and conformal guarantees — confidence becomes source-sensitive, spanning parametric and non-parametric knowledge.
Risk management: hallucination detection, conformal prediction with coverage guarantees, and abstention-oriented reliability alignment — confidence serves calibration, selectivity, and coverage.
Agentic control: selective escalation, self-correction and backtracking, verifier-guided search, and multi-agent deliberation — confidence propagates across tools, steps, and agents.
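As a concrete instance of the cascading pattern above, where confidence governs which model to call, the following sketch tries cheap models first and escalates only on low confidence. The model stand-ins and the 0.75 threshold are illustrative assumptions:

```python
from typing import Callable, List, Tuple

# A model maps a query to (answer, confidence). The cascade consults models
# in order of cost and stops as soon as one is confident enough; the last
# (strongest) model always answers, since no further escalation is possible.
Model = Callable[[str], Tuple[str, float]]

def cascade(query: str, models: List[Model], threshold: float = 0.75) -> str:
    for model in models[:-1]:
        answer, conf = model(query)
        if conf >= threshold:      # confident enough: stop early, save cost
            return answer
    return models[-1](query)[0]    # fall back to the strongest model

# Toy stand-ins for a small and a large model:
small = lambda q: ("maybe", 0.4)
large = lambda q: ("definitely", 0.9)
print(cascade("Is the sky blue?", [small, large]))  # definitely
```

Lowering the threshold trades reliability for cost: with `threshold=0.3`, the small model's answer would be accepted and the large model never called.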
Confidence as a control signal defined over decision units, consumed by a policy — a single abstraction covering the full lifecycle.
Comprehensive coverage across training, inference, routing, RAG, risk management, and agentic systems.
Cross-cutting comparison tables organized by signal source, decision unit, and functional role.
Open challenges: confidence semantics, composition, source attribution, decision-aware evaluation, and robustness.
@misc{li2026confidence,
  title  = {Confidence as Control: A Survey of Confidence Utilization
            in Large Language Models},
  author = {Yubo Li and Tianyang Zhou and Xiaobin Shen and
            Yidi Miao and Rema Padman and Ramayya Krishnan},
  year   = {2026},
  note   = {Preprint},
  url    = {https://yubol-bobo.github.io/assets/pdf/Conf_Survey.pdf}
}