Event: Introduction of the AGENTREDBENCH benchmark

BREAKTHROUGH | Impact: medium

Researchers introduced AGENTREDBENCH, a dynamic, LLM-driven redteaming benchmark that covers 215 subtle, underspecified authorization scenarios across 24 enterprise integrations, and used it to evaluate eight models from Anthropic, OpenAI, and Google.

Related companies: 2
Related people: 0
Related products: 6
Related technologies: 0
Related coverage: 1

Details

Event type: BREAKTHROUGH
Impact level: Medium

Related companies (2)

Google (company)

A

Related people

No data

Related products (6)

Salesforce

Jira

PLATFORM

★ 49420 · TypeScript

Gemini 3 Flash

AI model

Claude Sonnet 4.6

AI model

Related technologies

No data

Related coverage (1)

AgentRedBench: Dynamic Redteaming and Integration-Aware Defense for LLM Agents over SaaS Integrations

arXiv cs.CL · 2026-07-20