AuTO Paper
2018, cited by 278
Software-Defined Networks and 5G; Cloud Computing and Resource Management; Advanced Optical Network Technologies
Abstract
Traffic optimizations (TO, e.g. flow scheduling, load balancing) in datacenters are difficult online decision-making problems. Previously, they are done with heuristics relying on operators' understanding of the workload and environment. Designing and implementing proper TO algorithms thus take at least weeks. Encouraged by recent successes in applying deep reinforcement learning (DRL) techniques to solve complex online control problems, we study if DRL can be used for automatic TO without human-intervention. However, our experiments show that the latency of current DRL systems cannot handle flow-level TO at the scale of current datacenters, because short flows (which constitute the majority of traffic) are usually gone before decisions can be made.
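The latency argument in the abstract can be illustrated with back-of-the-envelope arithmetic. The sketch below compares a flow's ideal transmission time with a per-flow decision latency. The link rate, decision latency, and flow sizes are hypothetical values chosen for illustration only, not figures from the paper, and the helper names are assumptions as well.

```python
def transmission_time_s(flow_size_bytes: int, link_rate_bps: float) -> float:
    """Ideal time to send a flow at line rate.

    Ignores RTT, queuing, and protocol overhead, so it is a lower bound.
    """
    return flow_size_bytes * 8 / link_rate_bps


def flow_outlives_decision(flow_size_bytes: int,
                           link_rate_bps: float,
                           decision_latency_s: float) -> bool:
    """True if the flow is still in flight when the decision arrives."""
    return transmission_time_s(flow_size_bytes, link_rate_bps) > decision_latency_s


if __name__ == "__main__":
    # Hypothetical values (assumptions, not measurements from the paper).
    LINK_RATE_BPS = 10e9        # 10 Gbps link
    DECISION_LATENCY_S = 10e-3  # 10 ms per decision

    for size_bytes in (10_000, 1_000_000, 1_000_000_000):
        duration = transmission_time_s(size_bytes, LINK_RATE_BPS)
        in_time = flow_outlives_decision(size_bytes, LINK_RATE_BPS, DECISION_LATENCY_S)
        print(f"{size_bytes:>13} B: {duration:.6f} s, decision arrives in time: {in_time}")
```

Under these assumed numbers, only the largest flow lasts longer than the decision latency, which is the shape of the problem the abstract describes for short flows.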