论文标题

项目与忘记:解决大规模度量约束问题

Project and Forget: Solving Large-Scale Metric Constrained Problems

论文作者

Sonthalia, Rishi, Gilbert, Anna C.

论文摘要

给定数据点之间的一组差异测量值,确定哪种度量表示与输入测量值最“一致”或最能捕获数据相关几何特征的度量是许多机器学习算法的关键步骤。现有方法仅限于特定类型的指标或小问题大小,因为这些问题中有大量的度量约束。在本文中,我们提供了一种活跃的集合算法,即项目和忘记,该算法使用Bregman的预测来解决许多(可能是指数)不平等约束的度量约束问题。我们提供了\ textsc {project and忘记}的理论分析,并证明我们的算法将电流迭代的$ L_2 $距离与最佳解决方案的$ L_2 $距离渐近衰减。我们证明,使用我们的方法,我们可以解决三种类型的度量约束问题的大型问题实例:一般体重相关聚类,度量近距离和度量学习;在每种情况下,就CPU时间和问题大小而言,超越了最先进的方法。

Given a set of dissimilarity measurements amongst data points, determining what metric representation is most "consistent" with the input measurements or the metric that best captures the relevant geometric features of the data is a key step in many machine learning algorithms. Existing methods are restricted to specific kinds of metrics or small problem sizes because of the large number of metric constraints in such problems. In this paper, we provide an active set algorithm, Project and Forget, that uses Bregman projections, to solve metric constrained problems with many (possibly exponentially) inequality constraints. We provide a theoretical analysis of \textsc{Project and Forget} and prove that our algorithm converges to the global optimal solution and that the $L_2$ distance of the current iterate to the optimal solution decays asymptotically at an exponential rate. We demonstrate that using our method we can solve large problem instances of three types of metric constrained problems: general weight correlation clustering, metric nearness, and metric learning; in each case, out-performing the state of the art methods with respect to CPU times and problem sizes.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源