Paper Title
Multi-level Explanation of Deep Reinforcement Learning-based Scheduling
Paper Authors
Paper Abstract
Dependency-aware job scheduling in clusters is NP-hard. Recent work shows that Deep Reinforcement Learning (DRL) is capable of solving it. However, the DRL-based policy is difficult for administrators to understand, even though it achieves remarkable performance gains. As a result, a scheduler built on a complex model does not easily gain trust in systems where simplicity is favored. In this paper, we present a multi-level explanation framework to interpret the policy of DRL-based scheduling. We dissect its decision-making process into the job level and the task level, and approximate each level with interpretable models and rules that align with operational practices. We show that the framework gives system administrators insights into the state-of-the-art scheduler and reveals a robustness issue with respect to its behavior pattern.
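As one way to picture the job-level approximation described in the abstract, the sketch below distills a stand-in scheduling policy into a shallow decision tree that an administrator can read. This is a minimal illustration, not the paper's method: the feature names and the `drl_policy` function are hypothetical placeholders for the trained DRL scheduler and its observed state.

```python
# Minimal sketch: approximate a black-box scheduling policy with an
# interpretable decision tree at the job level (surrogate-model idea).
# Feature names and drl_policy are hypothetical, for illustration only.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)

# Hypothetical job-level features the scheduler might observe.
feature_names = ["remaining_work", "num_ready_tasks", "avg_task_duration", "job_age"]
states = rng.random((1000, len(feature_names)))

def drl_policy(state):
    """Stand-in for querying the trained DRL scheduler: returns which of
    two candidate jobs it would pick (a toy rule, not the real policy)."""
    return int(state[0] < 0.5)

# Distill the policy: label each observed state with the policy's choice,
# then fit a shallow tree as an interpretable surrogate.
actions = np.array([drl_policy(s) for s in states])
surrogate = DecisionTreeClassifier(max_depth=3).fit(states, actions)

print("Fidelity to the black-box policy:", surrogate.score(states, actions))
print(export_text(surrogate, feature_names=feature_names))
```

The printed tree exposes threshold rules over job-level features, which is the kind of interpretable approximation the framework aims to give administrators; the task level could be treated analogously with its own features and rules.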