SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning (paper)

2021 · Citations: 364
Advanced Neural Network Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning

SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning · Related Techniques