SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning (paper)
2021 · Cited by 364
Advanced Neural Network Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning