Learning Similarity Preserving Binary Codes for Recommender Systems

Paper Title

Learning Similarity Preserving Binary Codes for Recommender Systems

Authors

Shi, Yang; Chung, Young-joo

Abstract

Hashing-based recommender systems (RSs) are widely studied as a way to provide scalable services. Existing methods combine three modules to achieve efficiency: feature extraction, interaction modeling, and binarization. In this paper, we study an unexplored module combination for hashing-based recommender systems, namely the Compact Cross-Similarity Recommender (CCSR). Inspired by cross-modal retrieval, CCSR uses Maximum a Posteriori similarity instead of matrix factorization and rating reconstruction to model interactions between users and items. We conducted experiments on the MovieLens1M, Amazon product review, and Ichiba purchase datasets and confirmed that CCSR outperforms the existing matrix factorization-based methods. On the MovieLens1M dataset, the absolute performance improvements are up to 15.69% in NDCG and 4.29% in Recall. In addition, we extensively studied three binarization modules: $sign$, scaled $tanh$, and sign-scaled $tanh$. The results demonstrate that although differentiable scaled $tanh$ is popular in recent discrete feature learning literature, a large performance drop occurs when the outputs of scaled $tanh$ are forced to be binary.
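
The gap between relaxed and binary outputs mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the scale parameter `beta`, and the toy user and item vectors are all assumptions made for illustration, and sign-scaled $tanh$ is left out because the abstract does not define it. The sketch only shows that a score computed from scaled $tanh$ outputs can differ noticeably from the score obtained once those outputs are forced to -1 or +1.

```python
import math


def sign_binarize(x):
    """Hard binarization: map each value to -1 or +1 (0 is mapped to +1 here)."""
    return [1 if v >= 0 else -1 for v in x]


def scaled_tanh(x, beta=1.0):
    """Differentiable relaxation tanh(beta * v); approaches sign(v) as beta grows."""
    return [math.tanh(beta * v) for v in x]


def dot(a, b):
    """Inner product, used here as a simple similarity score."""
    return sum(p * q for p, q in zip(a, b))


def hamming_distance(a, b):
    """Number of positions where two codes disagree."""
    return sum(1 for p, q in zip(a, b) if p != q)


# Hypothetical real-valued user and item features (illustrative only).
user = [0.05, -0.8, 0.3, -0.02]
item = [0.9, -0.1, -0.04, -0.7]

# Relaxed codes, as a model trained with scaled tanh would see them.
relaxed_user = scaled_tanh(user, beta=2.0)
relaxed_item = scaled_tanh(item, beta=2.0)
relaxed_score = dot(relaxed_user, relaxed_item)

# The same codes after being forced to binary for retrieval.
binary_user = sign_binarize(relaxed_user)
binary_item = sign_binarize(relaxed_item)
binary_score = dot(binary_user, binary_item)

print("relaxed score:", round(relaxed_score, 4))
print("binary score:", binary_score)
print("Hamming distance:", hamming_distance(binary_user, binary_item))
```

For codes in {-1, +1} of length K, the inner product equals K minus twice the Hamming distance, which is why binary codes allow fast similarity search. Values near zero contribute little to the relaxed score but count as a full -1 or +1 after binarization, which is one plausible source of the mismatch.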
