SentinelBench: A Benchmark for Long-Running Monitoring Agents

ArXiv CS.AI · 2026-06-06 · Authors: Matheus Kunzler Maldaner, Adam Fourney, Amanda Swearngin, Hussein Mozannar, Gagan Bansal, Maya Murad, Rafah Hosn, Saleema Amershi
