Paper Title

Asymmetric Hashing for Fast Ranking via Neural Network Measures

Paper Authors

Khoa Doan, Shulong Tan, Weijie Zhao, Ping Li

Paper Abstract

Fast item ranking is an important task in recommender systems. In previous works, graph-based Approximate Nearest Neighbor (ANN) approaches have demonstrated good performance on item ranking tasks with generic searching/matching measures (including complex measures such as neural network measures). However, since these ANN approaches must go through the neural measures many times during ranking, the computation is not practical if the neural measure is a large network. On the other hand, fast item ranking using existing hashing-based approaches, such as Locality Sensitive Hashing (LSH), only works with a limited set of measures. Previous learning-to-hash approaches are also not suitable for the fast item ranking problem, since they can take a significant amount of time and computation to train the hash functions. Hashing approaches, however, are attractive because they provide a principled and efficient way to retrieve candidate items. In this paper, we propose a simple and effective learning-to-hash approach for the fast item ranking problem that can be used with any type of measure, including neural network measures. Specifically, we solve this problem with an asymmetric hashing framework based on discrete inner product fitting. We learn a pair of related hash functions that map heterogeneous objects (e.g., users and items) into a common discrete space, where the inner product of their binary codes reveals their true similarity as defined by the original searching measure. The fast ranking problem is reduced to an ANN search via this asymmetric hashing scheme. We then propose a sampling strategy to efficiently select relevant and contrastive samples to train the hashing model. We empirically validate the proposed method against existing state-of-the-art fast item ranking methods on several combinations of non-linear searching functions and prominent datasets.
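To make the asymmetric hashing scheme in the abstract concrete, below is a minimal, hypothetical PyTorch sketch, not the authors' implementation. Two separate networks f and g stand in for the pair of related hash functions for users and items, tanh serves as a smooth surrogate for sign() binarization during training, and the inner products of the relaxed codes are regressed onto scores produced by the original neural measure on sampled pairs. The layer sizes, the MSE fitting objective, and the random data in the usage example are assumptions; the paper's actual architecture and its relevant/contrastive sampling strategy are not specified in the abstract.

```python
import torch
import torch.nn as nn


class AsymmetricHashing(nn.Module):
    """Sketch of asymmetric hashing: two hash functions map users (queries)
    and items into a common binary code space; the inner product of the
    codes is fit to the scores of an expensive neural network measure."""

    def __init__(self, query_dim, item_dim, code_bits=64):
        super().__init__()
        # Asymmetric: separate hash networks for the two object types.
        self.f = nn.Sequential(nn.Linear(query_dim, 256), nn.ReLU(),
                               nn.Linear(256, code_bits))  # hashes users/queries
        self.g = nn.Sequential(nn.Linear(item_dim, 256), nn.ReLU(),
                               nn.Linear(256, code_bits))  # hashes items

    def forward(self, q, x):
        # tanh relaxes sign() so the model is differentiable during training;
        # at indexing/search time the outputs would be binarized with sign().
        bq, bx = torch.tanh(self.f(q)), torch.tanh(self.g(x))
        # Inner product of the codes approximates the original measure's score.
        return (bq * bx).sum(dim=-1)


def fitting_loss(model, q, x, teacher_scores):
    """Discrete inner-product fitting, here relaxed to an MSE regression:
    match code inner products to the original measure's scores on sampled pairs."""
    return nn.functional.mse_loss(model(q, x), teacher_scores)


# Usage with random placeholder data (hypothetical dimensions):
model = AsymmetricHashing(query_dim=32, item_dim=64, code_bits=64)
q, x = torch.randn(8, 32), torch.randn(8, 64)
teacher = torch.randn(8)  # scores from the expensive neural measure on these pairs
loss = fitting_loss(model, q, x, teacher)
loss.backward()
```

At search time the item codes can be precomputed and indexed once, so ranking a query only requires hashing the query and computing cheap binary inner products instead of repeatedly evaluating the large neural measure.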
