论文标题
使用输出敏感技术加速ERM进行数据驱动的算法设计
Accelerating ERM for data-driven algorithm design using output-sensitive techniques
论文作者
论文摘要
数据驱动的算法设计是一种具有可调参数的算法的最终案例分析的有前途的,基于学习的方法。一个重要的开放问题是针对具有多个参数的组合算法家族的计算高效数据驱动算法的设计。由于解决了问题实例并改变了参数,因此“双重”损耗函数通常具有分段可解释的结构,即在某些尖锐的过渡边界处得到很好的行为。在这项工作中,我们启动了对技术驱动算法设计有效学习算法的技术研究,通过枚举总和双重损失函数以收集问题实例。我们进近的运行时间缩放,与最坏情况相对的实际碎片数量与碎片数量上的上限相反。我们的方法涉及两种新颖的成分 - 一种使用计算几何形状工具诱导的枚举多型的输出敏感算法,以及一个执行图,该执行图代表算法可以达到所有可能参数值的算法可以达到的所有状态。我们通过提供定价问题,基于链接的聚类和基于动态编程的序列比对来说明我们的技术。
Data-driven algorithm design is a promising, learning-based approach for beyond worst-case analysis of algorithms with tunable parameters. An important open problem is the design of computationally efficient data-driven algorithms for combinatorial algorithm families with multiple parameters. As one fixes the problem instance and varies the parameters, the "dual" loss function typically has a piecewise-decomposable structure, i.e. is well-behaved except at certain sharp transition boundaries. In this work we initiate the study of techniques to develop efficient ERM learning algorithms for data-driven algorithm design by enumerating the pieces of the sum dual loss functions for a collection of problem instances. The running time of our approach scales with the actual number of pieces that appear as opposed to worst case upper bounds on the number of pieces. Our approach involves two novel ingredients -- an output-sensitive algorithm for enumerating polytopes induced by a set of hyperplanes using tools from computational geometry, and an execution graph which compactly represents all the states the algorithm could attain for all possible parameter values. We illustrate our techniques by giving algorithms for pricing problems, linkage-based clustering and dynamic-programming based sequence alignment.