TamperBench: Systematically Stress-Testing LLM Safety Under Fine-Tuning and Tampering
ArXiv CS.AI, 2026-06-04
Authors: Saad Hossain, Tom Tseng, Punya Syon Pandey, Samanvay Vajpayee, Matthew Kowal, Nayeema Nonta, Samuel Simko, Stephen Casper, Zhijing Jin, Kellin Pelrine, Sirisha Rambhatla