通过嵌套的Quasi独立套件改善了Euclidean $ K $ -MEANS和$ K $ -Median的近似值

论文标题

通过嵌套的Quasi独立套件改善了Euclidean $ K $ -MEANS和$ K $ -Median的近似值

Improved Approximations for Euclidean $k$-means and $k$-median, via Nested Quasi-Independent Sets

论文作者

Cohen-Addad, Vincent, Esfandiari, Hossein, Mirrokni, Vahab, Narayanan, Shyam

论文摘要

在数据分析和机器学习应用程序中，我们考虑了流行的高维欧几里得$ k $ -Median和$ k $ -MEANS问题。我们提出了一种新的原始偶偶有算法，灵感来自Jain和Vazirani的经典算法以及艾哈迈迪安，诺鲁兹·菲德，斯文森和沃德的最新算法。我们的算法的近似值分别为$ 2.406 $和$ 5.912 $的欧几里得$ k $ -Median和$ k $ -Means，在Ahmadian等人的2.633近似值上提高了。以及Grandoni，Ostrovsky，Rabani，Schulman和Venkat的6.1291近似值。我们的技术涉及对欧几里得度量的强化比以前在欧几里得聚类方面的工作要强得多。此外，我们引入了一种新的方法，该方法使用一系列独立集的变体在图表上删除了多余的中心，我们将“嵌套的准独立集合”配音。反过来，这种技术可能引起欧几里得和$ \ ell_p $公制空间的其他优化问题的关注。

Motivated by data analysis and machine learning applications, we consider the popular high-dimensional Euclidean $k$-median and $k$-means problems. We propose a new primal-dual algorithm, inspired by the classic algorithm of Jain and Vazirani and the recent algorithm of Ahmadian, Norouzi-Fard, Svensson, and Ward. Our algorithm achieves an approximation ratio of $2.406$ and $5.912$ for Euclidean $k$-median and $k$-means, respectively, improving upon the 2.633 approximation ratio of Ahmadian et al. and the 6.1291 approximation ratio of Grandoni, Ostrovsky, Rabani, Schulman, and Venkat. Our techniques involve a much stronger exploitation of the Euclidean metric than previous work on Euclidean clustering. In addition, we introduce a new method of removing excess centers using a variant of independent sets over graphs that we dub a "nested quasi-independent set". In turn, this technique may be of interest for other optimization problems in Euclidean and $\ell_p$ metric spaces.

下载PDF全文

下载文献需遵守相关版权规定

论文标题