Autopilot paper
2020, 261 citations
Cloud Computing and Resource Management; IoT and Edge/Fog Computing; Distributed Systems and Fault Tolerance
Abstract
In many public and private Cloud systems, users need to specify a limit for the amount of resources (CPU cores and RAM) to provision for their workloads. A job that exceeds its limits might be throttled or killed, resulting in delaying or dropping end-user requests, so human operators naturally err on the side of caution and request a larger limit than the job needs. At scale, this results in massive aggregate resource wastage.