CSULoRA: Closest Safe Update Low-Rank Adaptation (Event)

PRODUCT_LAUNCH | 2026-06-01 | Impact: MEDIUM

arXiv:2605.30640v1. Announce Type: cross. Abstract: Low-rank adaptation has become a standard method for parameter-efficient fine-tuning of large language models, but even small amounts of unsafe or adversarial fine-tuning data can substantially weaken the safety behavior of aligned models. Existing safety-preserving LoRA methods often rely on hard interventions such as projection, pruning, thresholding, or additional training objectives. While the