When Do LLM Agents Treat Surface Noise Differently from Semantic Noise? A 68-Cell Measurement Study with a Held-Out Trace-Level Validation

PRODUCT_LAUNCH · 2026-05-26 · Impact: MEDIUM

arXiv:2605.25981v1. Announce type: new.

Abstract: We document an empirical phenomenon in chain-of-thought and ReAct agents driven by ten large language models from seven architecture families: meaning-bearing perturbations (e.g., paraphrase, synonym) alter final answers more often than presentation perturbations (e.g., formatting, reordering) of comparable sev

Related companies

SEC (GOVERNMENT)
SURF (COMPANY)
arXiv (NONPROFIT)
Span (NONPROFIT)
EAT (NONPROFIT)
ACT (NONPROFIT)
chain (COMPANY)
replicate (COMPANY)