Reconciling high server utilization and sub-millisecond quality-of-service (paper)

2014, cited by 216
Cloud Computing and Resource Management; Parallel Computing and Optimization Techniques; Distributed systems and fault tolerance

Abstract

The simplest strategy to guarantee good quality of service (QoS) for a latency-sensitive workload with sub-millisecond latency in a shared cluster environment is to never run other workloads concurrently with it on the same server. Unfortunately, this inevitably leads to low server utilization, reducing both the capability and cost effectiveness of the cluster.