CSULoRA: Closest Safe Update Low-Rank Adaptation (Event)
PRODUCT_LAUNCH | 2026-06-01 | Impact: MEDIUM
arXiv:2605.30640v1 | Announce Type: cross
Abstract: Low-rank adaptation has become a standard method for parameter-efficient fine-tuning of large language models, but even small amounts of unsafe or adversarial fine-tuning data can substantially weaken the safety behavior of aligned models. Existing safety-preserving LoRA methods often rely on hard interventions such as projection, pruning, thresholding, or additional training objectives. While the
Related coverage (1)
CSULoRA: Closest Safe Update Low-Rank Adaptation
ArXiv CS.CL | 2026-06-01