A high-throughput and memory-efficient inference and serving engine for LLMs
79898
Stars
16748
Forks
6
Tech stack
0
Alternatives
5
Related events