Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper.
Stars: 805
Forks: 52