When Skills Don't Help: A Negative Result on Procedural Knowledge for Tool-Grounded Agents in Offensive Cybersecurity

ArXiv CS.AI, 2026-05-26
Authors: Samuel Jacob Chacko, James Hugglestone, Chashi Mahiul Islam, Xiuwen Liu

Abstract

arXiv:2605.20023v2 (announce type: replace)

Agent Skills, structured packages of procedural knowledge loaded into an LLM agent at inference time, are widely reported to improve task pass rates by an average of 16.2 percentage points across diverse domains. Yet the same benchmarks show wide variance, with 16 of 84 tasks suffering negative deltas when Skills are introduced. The community has not yet articulated a clean mechanism for when Skills help and when they are merely redundant overhead. We re-analyze a recently published 180-run controlled study of an MCP-grounded autonomous Capture-the-Flag (CTF) agent under four documentation conditions of increasing richness (591, 12,865, 17,253, and 36,001 tokens) and show that these conditions correspond almost exactly to a No-Skills, Experiential-Skills, Curated-Skills, and Comprehensive-Skills ablation.
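To make the scale of the four conditions concrete, the figures quoted in the abstract can be put side by side. The following is a minimal sketch, not the authors' analysis code: the one-to-one pairing of token counts with condition names follows the order in which the abstract lists them, which is an assumption, and the function names are hypothetical.

```python
# Token budgets for the four documentation conditions, as quoted in the abstract.
# Assumption: the counts map to the condition names in the order listed.
CONDITIONS = {
    "No-Skills": 591,
    "Experiential-Skills": 12865,
    "Curated-Skills": 17253,
    "Comprehensive-Skills": 36001,
}


def relative_sizes(conditions):
    """Return each condition's token count as a multiple of the smallest one."""
    baseline = min(conditions.values())
    return {name: tokens / baseline for name, tokens in conditions.items()}


def negative_delta_share(negative_tasks=16, total_tasks=84):
    """Fraction of benchmark tasks whose pass rate dropped when Skills were added."""
    return negative_tasks / total_tasks


if __name__ == "__main__":
    for name, ratio in relative_sizes(CONDITIONS).items():
        print(f"{name:22s} {CONDITIONS[name]:>6d} tokens  ({ratio:.1f}x baseline)")
    print(f"Tasks with negative deltas: {negative_delta_share():.1%}")
```

Under that assumed pairing, the richest condition carries roughly 61 times as many documentation tokens as the leanest, and the 16 of 84 tasks with negative deltas amount to about 19% of the benchmark.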
