TamperBench: Systematically Stress-Testing LLM Safety Under Fine-Tuning and Tampering (Event)

Name: TamperBench: Systematically Stress-Testing LLM Safety Under Fine-Tuning and Tampering
Start: 2026-06-04

BREAKTHROUGH | 2026-06-04 | Impact: HIGH

arXiv:2602.06911v2 (Announce Type: replace-cross)

Abstract: As increasingly capable open-weight large language models (LLMs) are deployed, improving their tamper resistance against unsafe modifications, whether accidental or intentional, becomes critical to minimize risks. However, there is no standard approach to evaluate tamper resistance. Varied datasets, metrics, and tampering configurations make it difficul

Artificial Intelligence
