论文标题
部分可观测时空混沌系统的无模型预测
Stochastic Direct Search Method for Blind Resource Allocation
论文作者
论文摘要
通过程序化广告优化的激励,我们考虑了在一组资源上依次分配预算的任务。在每个时间步骤中,都会选择可行的分配,并且仅观察到相应的随机返回。目标是最大化累积的预期收益总和。这是一个在营销活动的细分之间进行预算分配的现实模型,其目的是最大化转化次数的数量。 我们研究了直接搜索(也称为模式搜索)方法,用于在存在噪声的情况下线性约束和无衍生的优化,这特别适用于顺序预算分配。这些不依赖资源领域的层次分区的算法很容易实现;他们通过避免在可行领域之外的评估来尊重资源分配的运营约束;它们也与(大约)下降算法兼容温暖的开始。但是,尚未从累积遗憾的角度对它们进行分析。我们表明,直接搜索方法在确定性和不受约束的情况下实现了有限的遗憾。在存在评估噪声和线性约束的情况下,我们提出了一个直接搜索的简单扩展,该扩展使$ t^{2/3} $的顺序遗憾。我们还提出了算法的加速版,依靠重复的顺序测试,可显着改善该方法的实际行为。
Motivated by programmatic advertising optimization, we consider the task of sequentially allocating budget across a set of resources. At every time step, a feasible allocation is chosen and only a corresponding random return is observed. The goal is to maximize the cumulative expected sum of returns. This is a realistic model for budget allocation across subdivisions of marketing campaigns, with the objective of maximizing the number of conversions. We study direct search (also known as pattern search) methods for linearly constrained and derivative-free optimization in the presence of noise, which apply in particular to sequential budget allocation. These algorithms, which do not rely on hierarchical partitioning of the resource space, are easy to implement; they respect the operational constraints of resource allocation by avoiding evaluation outside of the feasible domain; and they are also compatible with warm start by being (approximate) descent algorithms. However, they have not yet been analyzed from the perspective of cumulative regret. We show that direct search methods achieves finite regret in the deterministic and unconstrained case. In the presence of evaluation noise and linear constraints, we propose a simple extension of direct search that achieves a regret upper-bound of the order of $T^{2/3}$. We also propose an accelerated version of the algorithm, relying on repeated sequential testing, that significantly improves the practical behavior of the approach.