Running large language models on a single GPU for throughput-oriented scenarios.
9366 Stars
590 Forks
2 Tech stack
0 Alternatives
Related events