Abstract
arXiv:2605.29115v1 (announce type: cross)
Unix competence is the ability to use shell and operating-system primitives as first-class tools, not merely to write programs through a terminal. Current terminal benchmarks tend to blur this distinction: a solver fluent in Python but weak in Unix can pass a substantial fraction of Terminal-Bench 2.0, while the reverse skill profile is rarely exercised. We make the distinction operational and build a training surface for the Unix component. unix-ctf is a procedural generator of capture-the-flag tasks for shell agents. Each task hides a short token (a flag of the form flag(a3b1c9...)) inside a fresh Linux container using a single Unix feature, and the agent must recover it.
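The task loop described in the abstract (generate a flag, hide it with one Unix feature, have a solver recover it) can be illustrated with a minimal sketch. This is not the unix-ctf generator: a temporary directory stands in for the fresh Linux container, a hidden dotfile is assumed as the "single Unix feature", and every function name below is hypothetical.

```python
import os
import secrets
import tempfile


def make_flag() -> str:
    # Hypothetical flag format, mirroring the flag(...) shape quoted in the abstract.
    return f"flag({secrets.token_hex(8)})"


def hide_in_dotfile(root: str, flag: str) -> str:
    # Assumed "single Unix feature": a hidden file inside a hidden directory.
    nested = os.path.join(root, "var", "cache", ".store")
    os.makedirs(nested)
    path = os.path.join(nested, ".token")
    with open(path, "w") as fh:
        fh.write(flag + "\n")
    return path


def find_flag(root: str):
    # Stand-in solver: walk the whole tree (dotfiles included) and return
    # the first line shaped like a flag, or None if nothing matches.
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            with open(os.path.join(dirpath, name), errors="ignore") as fh:
                for line in fh:
                    line = line.strip()
                    if line.startswith("flag(") and line.endswith(")"):
                        return line
    return None


def run_episode() -> bool:
    # One episode: fresh environment, hide the flag, check the solver recovers it.
    with tempfile.TemporaryDirectory() as root:
        flag = make_flag()
        hide_in_dotfile(root, flag)
        return find_flag(root) == flag


if __name__ == "__main__":
    print("solved" if run_episode() else "failed")
```

In a shell, the same recovery would typically be a recursive search such as `grep -r 'flag(' .`; the Python solver here only mimics that behaviour so the sketch stays self-contained.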
Related events (2)
unix-ctf: Procedural Environments for Unix-Competence Reinforcement Learning
2026-05-29 | SHUTDOWN | Impact: LOW
unix-ctf: Procedural Environments for Unix-Competence Reinforcement Learning
2026-05-29 | PRODUCT_LAUNCH | Impact: MEDIUM