Policy Optimization with Linear Temporal Logic Constraints

Paper Title

Policy Optimization with Linear Temporal Logic Constraints

Authors

Cameron Voloshin, Hoang M. Le, Swarat Chaudhuri, Yisong Yue

Abstract

We study the problem of policy optimization (PO) with linear temporal logic (LTL) constraints. The language of LTL allows flexible description of tasks that may be unnatural to encode as a scalar cost function. We consider LTL-constrained PO as a systematic framework, decoupling task specification from policy selection, and as an alternative to the standard of cost shaping. With access to a generative model, we develop a model-based approach that enjoys a sample complexity analysis for guaranteeing both task satisfaction and cost optimality (through a reduction to a reachability problem). Empirically, our algorithm can achieve strong performance even in low-sample regimes.
