MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers (paper)

Year: 2021 · Citations: 233
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
