The Model Says Walk: How Surface Heuristics Override Implicit Constraints in LLM Reasoning
Event: PRODUCT_LAUNCH | Date: 2026-06-10 | Impact: MEDIUM
arXiv:2603.29025v3 (Announce Type: replace)
Abstract: Large language models fail when a salient surface cue conflicts with an unstated feasibility constraint. We introduce the Heuristic Override Benchmark (HOB): 500 instances spanning 4 heuristic families and 5 constraint families, with minimal pairs and explicitness gradients. We pair HOB with a falsifiable behavioral characterization following a diagnose-