A fast communication-overlapping library for tensor/expert parallelism on GPUs.
1306
Stars
102
Forks
2
Tech stack
0
Alternatives
1
Related events