Paper Title
Quantized Wasserstein Procrustes Alignment of Word Embedding Spaces
Paper Authors
Paper Abstract
Optimal Transport (OT) provides a useful geometric framework to estimate the permutation matrix under unsupervised cross-lingual word embedding (CLWE) models that pose the alignment task as a Wasserstein-Procrustes problem. However, linear programming algorithms and approximate OT solvers via Sinkhorn for computing the permutation matrix come with a significant computational burden, since they scale cubically and quadratically, respectively, in the input size. This makes it slow and infeasible to compute OT distances exactly for large input sizes, resulting in a poor approximation quality of the permutation matrix and subsequently a less robust learned transfer function or mapper. This paper proposes an unsupervised projection-based CLWE model called quantized Wasserstein Procrustes (qWP). qWP relies on a quantization step of both the source and target monolingual embedding spaces to estimate the permutation matrix via a cheap sampling procedure. This approach substantially improves the approximation quality of empirical OT solvers at a fixed computational cost. We demonstrate that qWP achieves state-of-the-art results on the Bilingual Lexicon Induction (BLI) task.
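The abstract describes a pipeline of three standard building blocks: quantizing each monolingual embedding space (e.g. by k-means), solving an entropy-regularized OT problem between the quantized spaces with Sinkhorn iterations, and fitting an orthogonal Procrustes map from the resulting soft correspondence. Below is a minimal NumPy sketch of one such alternation on toy data; it is an illustration of these generic components under assumed settings (k-means quantization, squared-Euclidean cost, barycentric matching), not the paper's actual qWP implementation.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means: quantize the rows of X into k centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centroids[j] = members.mean(0)
    return centroids

def sinkhorn(C, reg=0.05, iters=200):
    """Entropy-regularized OT plan between uniform marginals, cost matrix C."""
    n, m = C.shape
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    K = np.exp(-C / reg)
    u = np.ones(n)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]   # transport plan P

def procrustes(X, Y):
    """Orthogonal W minimizing ||X W - Y||_F (closed form via SVD)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Toy setup: the "target" space is a rotated, shuffled copy of the source.
d, n, k = 5, 200, 10
rng = np.random.default_rng(0)
X = rng.normal(size=(n, d))
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))   # ground-truth rotation
Y = (X @ Q)[rng.permutation(n)]

Cx, Cy = kmeans(X, k), kmeans(Y, k)            # quantization step
C = ((Cx[:, None, :] - Cy[None, :, :]) ** 2).sum(-1)
P = sinkhorn(C / C.max())                      # soft centroid correspondence
W = procrustes(Cx, k * P @ Cy)                 # map fitted on matched barycenters
```

Working on k centroids instead of n words is what keeps the Sinkhorn step cheap: its per-iteration cost drops from O(n^2) to O(k^2). In practice one would alternate the OT and Procrustes steps until convergence rather than run a single pass as here.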