An optimal on-line algorithm for metrical task system 论文

1992Journal of the ACM引用 330
Optimization and Search ProblemsScheduling and Optimization AlgorithmsReinforcement Learning in Robotics

摘要

In practice, almost all dynamic systems require decisions to be made on-line, without full knowledge of their future impact on the system. A general model for the processing of sequences of tasks is introduced, and a general on-line decision algorithm is developed. It is shown that, for an important class of special cases, this algorithm is optimal among all on-line algorithms. Specifically, a task system ( S,d ) for processing sequences of tasks consists of a set S of states and a cost matrix d where d ( i, j is the cost of changing from state i to state j (we assume that d satisfies the triangle inequality and all diagonal entries are 0). The cost of processing a given task depends on the state of the system. A schedule for a sequence T 1 , T 2 ,…, T k of tasks is a sequence s 1 , s 2 ,…, s k of states where s i is the state in which T i is processed; the cost of a schedule is the sum of all task processing costs and the state transition costs incurred. An on-line scheduling algorithm is one that chooses s i only knowing T 1 T 2 … T i . Such an algorithm is w -competitive if, on any input task sequence, its cost is within an additive constant of w times the optimal offline schedule cost. The competitive ratio w ( S , d ) is the infimum w for which there is a w -competitive on-line scheduling algorithm for ( S , d ). It is shown that w ( S , d ) = 2|S|–1 for every task system in which d is symmetric, and w ( S, d ) = O (| S | 2 ) for every task system. Finally, randomized on-line scheduling algorithms are introduced. It is shown that for the uniform task system (in which d ( i,j ) = 1 for all i,j ), the expected competitive ratio w¯ ( S,d ) = O (log|S|).