Principled approach to the selection of the embedding dimension of networks

Paper title

Principled approach to the selection of the embedding dimension of networks

Authors

Weiwei Gu, Aditya Tandon, Yong-Yeol Ahn, Filippo Radicchi

Abstract

Network embedding is a general-purpose machine learning technique that encodes network structure in vector spaces with tunable dimension. Choosing an appropriate embedding dimension, small enough to be efficient and large enough to be effective, is challenging but necessary to generate embeddings applicable to a multitude of tasks. Existing strategies for the selection of the embedding dimension rely on performance maximization in downstream tasks. Here, we propose a principled method such that all structural information of a network is parsimoniously encoded. The method is validated on various embedding algorithms and a large corpus of real-world networks. The embedding dimensions selected by our method in real-world networks suggest that efficient encoding in low-dimensional spaces is usually possible.
