A Survey of Confidence Utilization in Large Language Models
Positioning confidence not just as an estimation target, but as a control primitive for building reliable LLM systems.
Most work on confidence in large language models has focused on estimation, uncertainty quantification, and calibration. In deployed systems, however, the key question is how confidence should be used to govern behavior. This survey studies confidence utilization: the use of confidence-related signals to control system decisions. We formalize this through a unified framework in which confidence is defined over decision units under a local state and then consumed by a policy to determine actions.
Using this lens, we organize the literature across the full LLM lifecycle: training, inference, model selection and cascading, retrieval-augmented generation, risk management, and agentic control. We compare methods by signal source, decision unit, and functional role, and highlight open challenges in confidence semantics, composition, source attribution, decision-aware evaluation, and robustness.
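The decision-unit-plus-policy abstraction can be made concrete in a few lines. Below is a minimal sketch, assuming an illustrative three-way accept/verify/abstain policy; the class and function names, the thresholds, and the action labels are our own illustrations, not the survey's notation:

```python
from dataclasses import dataclass

# A decision unit could be a token, a candidate answer, or an agent step.
# The framework scores each unit with a confidence signal, then a policy
# consumes that signal to determine the system's action.

@dataclass
class DecisionUnit:
    content: str       # e.g. a candidate answer
    confidence: float  # confidence signal in [0, 1], from any estimator

def policy(unit: DecisionUnit, accept_at: float = 0.8,
           verify_at: float = 0.5) -> str:
    """Consume a confidence signal and choose an action (illustrative)."""
    if unit.confidence >= accept_at:
        return "accept"    # emit the answer as-is
    if unit.confidence >= verify_at:
        return "verify"    # e.g. re-check, retrieve, or self-correct
    return "abstain"       # defer to a human or a stronger model

print(policy(DecisionUnit("Paris", 0.93)))  # accept
```

The same skeleton covers the lifecycle stages that follow: only the decision unit (token, candidate, model call, agent step) and the action set change.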
Six domains of the LLM lifecycle where confidence functions as a control signal.
Training: data curation, fine-tuning, distillation, preference optimization, and RL — confidence determines where learning should occur, where it should be damped, and when the model should learn to abstain.
Inference: output selection, adaptive stopping and refinement, and contrastive decoding — confidence becomes an online control variable at the candidate, state, and token levels.
Model selection and cascading: sequential cascading, pre-call routing, and hybrid systems — confidence governs which model to call, when to defer, and how to allocate capacity across model portfolios.
Retrieval-augmented generation: retrieval gating, context filtering, groundedness detection, and conformal guarantees — confidence becomes source-sensitive, spanning parametric and non-parametric knowledge.
Risk management: hallucination detection, conformal prediction with coverage guarantees, and abstention-oriented reliability alignment — confidence serves calibration, selectivity, and coverage.
Agentic control: selective escalation, self-correction and backtracking, verifier-guided search, and multi-agent deliberation — confidence propagates across tools, steps, and agents.
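As a concrete instance of the cascading pattern above, where confidence governs which model to call, the following sketch tries cheap models first and escalates only on low confidence. The model stand-ins and the 0.75 threshold are illustrative assumptions:

```python
from typing import Callable, List, Tuple

# A model maps a query to (answer, confidence). The cascade consults models
# in order of cost and stops as soon as one is confident enough; the last
# (strongest) model always answers, since no further escalation is possible.
Model = Callable[[str], Tuple[str, float]]

def cascade(query: str, models: List[Model], threshold: float = 0.75) -> str:
    for model in models[:-1]:
        answer, conf = model(query)
        if conf >= threshold:      # confident enough: stop early, save cost
            return answer
    return models[-1](query)[0]    # fall back to the strongest model

# Toy stand-ins for a small and a large model:
small = lambda q: ("maybe", 0.4)
large = lambda q: ("definitely", 0.9)
print(cascade("Is the sky blue?", [small, large]))  # definitely
```

Lowering the threshold trades reliability for cost: with `threshold=0.3`, the small model's answer would be accepted and the large model never called.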
Confidence as a control signal defined over decision units, consumed by a policy — a single abstraction covering the full lifecycle.
Comprehensive coverage across training, inference, routing, RAG, risk management, and agentic systems.
Cross-cutting comparison tables organized by signal source, decision unit, and functional role.
Open challenges: confidence semantics, composition, source attribution, decision-aware evaluation, and robustness.
@misc{li2026confidence,
  title  = {Confidence as Control: A Survey of Confidence Utilization
            in Large Language Models},
  author = {Yubo Li and Tianyang Zhou and Xiaobin Shen and
            Yidi Miao and Rema Padman and Ramayya Krishnan},
  year   = {2026},
  note   = {Preprint},
  url    = {https://yubol-bobo.github.io/assets/pdf/Conf_Survey.pdf}
}