MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers (paper)
2021, 233 citations
Topic Modeling, Natural Language Processing Techniques, Multimodal Machine Learning Applications