SRA: Span Representation Alignment for Large Language Model Distillation
Event: PRODUCT_LAUNCH | 2026-06-03 | Impact: MEDIUM
arXiv:2605.01205v2 (Announce Type: replace)
Abstract: Cross-Tokenizer Knowledge Distillation (CTKD) enables knowledge transfer between a large language model and a smaller student, even when they employ different tokenizers. While existing approaches mainly focus on token-level alignment strategies, which are often brittle and sensitive to discrepancies between tokenizers, we argue that the method of aggregating tokens into
Related coverage (1)
SRA: Span Representation Alignment for Large Language Model Distillation
ArXiv CS.CL, 2026-06-03