One Step to the Side: Why Defenses Against Malicious Finetuning Fail Under Adaptive Adversaries (Event)

Name: One Step to the Side: Why Defenses Against Malicious Finetuning Fail Under Adaptive Adversaries
Start: 2026-05-26

PRODUCT_LAUNCH | 2026-05-26 | Impact: MEDIUM

arXiv:2605.14605v2
Announce Type: replace-cross
Abstract: Model providers increasingly release open weights or allow users to fine-tune foundation models through APIs. Although these models are safety-aligned before release, their safeguards can often be removed by fine-tuning on harmful data. Recent defenses aim to make models robust to such malicious fine-tuning, but they are largely evaluated only

Artificial Intelligence
