TamperBench: Systematically Stress-Testing LLM Safety Under Fine-Tuning and Tampering

ArXiv CS.AI, 2026-06-04
Authors: Saad Hossain, Tom Tseng, Punya Syon Pandey, Samanvay Vajpayee, Matthew Kowal, Nayeema Nonta, Samuel Simko, Stephen Casper, Zhijing Jin, Kellin Pelrine, Sirisha Rambhatla

Related Technologies