A high-throughput and memory-efficient inference and serving engine for LLMs
79898
Stars
16748
Forks
8
Tech stack
0
Alternatives
Related events