Large language models systematically fail when a salient surface cue conflicts with an unstated feasibility constraint. We diagnose the mechanism, measure it at scale across 14 frontier models, and show a zero-cost mitigation.
Code will be released upon acceptance.
A single sentence exposes the failure cleanly: no specialised knowledge, no multi-step arithmetic — just a conflict between a surface heuristic and an implicit constraint.
— @knowmadd, Mastodon, Feb 2026. In a follow-up sweep of 53 models, 42 recommended walking in a single pass.
Under a strict 10/10 consistency criterion across 500 instances, no model reliably overrides salient heuristics when they conflict with hidden constraints.
Best model (Gemini 3.1 Pro) tops out at 74.6% strict override accuracy — no frontier system exceeds 75%.
In the car-wash case study, distance exerts 9–38× more causal influence on the decision than the goal.
A single italicised hint recovers +15.3 pp on average — the knowledge is present; the bottleneck is inference.
12 of 14 models perform worse when the constraint is removed (accuracy drops of up to 38.5 pp).
C-pres (object must be co-located with goal) is the hardest family — mean 44.4% across all 14 models.
Prompting models to enumerate preconditions first recovers +6–9 pp on weaker models — a zero-cost fix.
The input decomposes into three spans that pull the model in competing directions. Across six open models, the distance span dominates the decision by 9–38×.
The correct answer is drive: you cannot wash a car that is not at the car wash. Yet every paraphrase, across every model we tested in Study 1, produces the wrong answer — 0% accuracy.
A four-stage arc: from a single viral example to a mechanistic diagnosis, a 500-instance benchmark, and a zero-cost mitigation.
Causal occlusion + monotonicity curves on six open models. Distance dominates by 9–38×; goal spans barely move the decision.
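The occlusion probe can be sketched as follows: ablate one input span at a time and measure how far the model's decision score moves. The span texts and the scorer below are illustrative stand-ins, not the paper's implementation; in the real probe, `score_fn` would be a model call (e.g. the log-probability of answering "drive").

```python
def occlusion_effects(spans, score_fn):
    """Return each span's causal effect: |score(full) - score(without span)|."""
    full = score_fn(" ".join(spans.values()))
    effects = {}
    for name in spans:
        ablated = " ".join(text for key, text in spans.items() if key != name)
        effects[name] = abs(full - score_fn(ablated))
    return effects

# Toy scorer (hypothetical): returns a score for "drive"; the distance
# cue pulls strongly toward "walk", the goal pulls weakly toward "drive".
def toy_score_drive(prompt):
    score = 0.0
    if "wash my car" in prompt:
        score += 0.1   # goal span: weak pull toward "drive"
    if "2 minutes away" in prompt:
        score -= 0.9   # distance span: strong pull toward "walk"
    return score

spans = {
    "goal": "I want to wash my car.",
    "distance": "The car wash is 2 minutes away on foot.",
    "question": "Should I walk or drive?",
}
effects = occlusion_effects(spans, toy_score_drive)
print(effects["distance"] / effects["goal"])  # distance dominates in this toy setup
```

In the toy setup the distance span's effect is 9× the goal span's, mirroring the low end of the 9–38× range reported above.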
500 instances across 4 heuristic × 5 constraint families, with minimal pairs and explicitness gradients, evaluated on 14 frontier models.
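Minimal-pair construction can be sketched as below: each benchmark instance pairs a constraint-present prompt with an otherwise identical constraint-absent one, so any accuracy gap isolates the hidden constraint. The templates and wording here are illustrative, not the paper's actual generation code; the example scenario follows the car-wash case study.

```python
# Hypothetical sketch of minimal-pair generation: toggle only the
# constraint sentence while keeping the heuristic cue and question fixed.
def make_minimal_pair(goal, heuristic_cue, constraint, question):
    absent = f"{goal} {heuristic_cue} {question}"
    present = f"{goal} {constraint} {heuristic_cue} {question}"
    return {"constraint_present": present, "constraint_absent": absent}

pair = make_minimal_pair(
    goal="I want to wash my car.",
    heuristic_cue="The car wash is a 2-minute walk away.",
    constraint="The car has to be at the car wash to be washed.",
    question="Should I walk or drive?",
)
```

Crossing 4 heuristic families with 5 constraint families and several explicitness levels per pair yields the 500-instance grid described above.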
Four probes extend the sigmoid analysis to cost, efficiency, and semantic-similarity heuristics across three constraint families.
A one-line prefix prompting the model to list preconditions before answering recovers +9 pp on Llama 4 Scout — no tuning required.
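The mitigation's shape can be sketched as a prompt-only prefix; the exact wording the paper uses may differ, and this phrasing is an assumption.

```python
# Hypothetical sketch of the precondition-enumeration mitigation: a
# one-line instruction prepended to the question, asking the model to
# list each option's preconditions before deciding. Zero cost: no
# fine-tuning, no extra calls.
PRECONDITION_PREFIX = (
    "Before answering, list the preconditions each option must satisfy, "
    "then check which options remain feasible.\n\n"
)

def with_precondition_prompt(question):
    return PRECONDITION_PREFIX + question

prompt = with_precondition_prompt(
    "I want to wash my car. The car wash is a 2-minute walk away. "
    "Should I walk or drive?"
)
```

Because the fix is a fixed string, it applies uniformly across models, which is what makes the +6–9 pp recovery on weaker models "zero-cost".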
@article{li2026model,
  title={The Model Says Walk: How Surface Heuristics Override Implicit Constraints in LLM Reasoning},
  author={Li, Yubo and Zhang, Lu and Jiang, Tianchong and Krishnan, Ramayya and Padman, Rema},
  journal={arXiv preprint arXiv:2603.29025},
  year={2026}
}