ASTRA: Communication-Efficient Acceleration for Multi-Device Transformer Inference

ArXiv CS.AI · 2026-05-28 · Authors: Xiao Liu, Lijun Zhang, Deepak Ganesan, Hui Guan
