When Do LLM Agents Treat Surface Noise Differently from Semantic Noise? A 68-Cell Measurement Study with a Held-Out Trace-Level Validation · Event

Name: When Do LLM Agents Treat Surface Noise Differently from Semantic Noise? A 68-Cell Measurement Study with a Held-Out Trace-Level Validation
Start: 2026-05-26

PRODUCT_LAUNCH · 2026-05-26 · Impact: MEDIUM

arXiv:2605.25981v1 · Announce Type: new

Abstract: We document an empirical phenomenon in chain-of-thought and ReAct agents driven by ten large language models from seven architecture families: meaning-bearing perturbations (e.g., paraphrase, synonym) alter final answers more often than presentation perturbations (e.g., formatting, reordering) of comparable sev

Artificial Intelligence

Relationship Graph

When Do LLM Agents Treat Surface Noise Differently from Semantic Noise? A 68-Cell Measurement Study with a Held-Out Trace-Level Validation · Related Companies

SEC · GOVERNMENT

Abstract

SURF · COMPANY

arXiv · NONPROFIT

Span · NONPROFIT

EAT · NONPROFIT

ACT · NONPROFIT

chain · COMPANY

SAP

replicate · COMPANY