Zero-Shot Heterogeneous Transfer Learning from Recommender Systems to Cold-Start Search Retrieval

Paper Title

Zero-Shot Heterogeneous Transfer Learning from Recommender Systems to Cold-Start Search Retrieval

Authors

Wu, Tao; Chio, Ellie Ka-In; Cheng, Heng-Tze; Du, Yu; Rendle, Steffen; Kuzmin, Dima; Agarwal, Ritesh; Zhang, Li; Anderson, John; Singh, Sarvjeet; Chandra, Tushar; Chi, Ed H.; Li, Wen; Kumar, Ankit; Ma, Xiang; Soares, Alex; Jindal, Nitin; Cao, Pei

Abstract

Many recent neural information retrieval models, which predict the top-K items for a given query, learn directly from a large training set of (query, item) pairs. However, they are often insufficient when there are many previously unseen (query, item) combinations, often referred to as the cold-start problem. Furthermore, the search system can be biased towards items that were frequently shown for a query in the past, also known as the 'rich get richer' (a.k.a. feedback loop) problem. In light of these problems, we observe that most online content platforms have both a search and a recommender system that, while having heterogeneous input spaces, can be connected through their common output item space and a shared semantic representation. In this paper, we propose a new Zero-Shot Heterogeneous Transfer Learning framework that transfers learned knowledge from the recommender system component to improve the search component of a content platform. First, it learns representations of items and their natural-language features by predicting (item, item) correlation graphs derived from the recommender system as an auxiliary task. Then, the learned representations are transferred to solve the target search retrieval task, performing query-to-item prediction without having seen any (query, item) pairs in training. We conduct online and offline experiments on one of the world's largest search and recommender systems from Google, and present the results and lessons learned. We demonstrate that the proposed approach achieves high performance on offline search retrieval tasks and, more importantly, yields significant improvements in relevance and user interactions over the highly optimized production system in online experiments.
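
The two-step idea in the abstract can be illustrated with a toy sketch. This is not the authors' model: the catalog, the (item, item) pairs, the mean-of-token-embeddings text tower, the embedding size, and the training settings below are all assumptions made for illustration. The sketch trains a text tower to predict a paired item from an item's title tokens (the auxiliary task), then feeds a query through the same text tower to rank items, without ever training on a (query, item) pair.

```python
import numpy as np

# Hypothetical toy catalog: each item is described only by its title tokens.
ITEM_TOKENS = [
    ["python", "tutorial"], ["python", "course"], ["python", "basics"],
    ["guitar", "lesson"], ["guitar", "chords"], ["guitar", "solo"],
]
# Hypothetical (item, item) pairs standing in for the recommender's
# correlation graph: items in the same group are assumed to be co-engaged.
PAIRS = [(i, j) for g in ([0, 1, 2], [3, 4, 5]) for i in g for j in g if i != j]

VOCAB = {t: n for n, t in enumerate(sorted({t for toks in ITEM_TOKENS for t in toks}))}
DIM = 8  # assumed embedding size


def encode(tokens, token_emb):
    """Text tower: mean of token embeddings. Assumes at least one known token."""
    ids = [VOCAB[t] for t in tokens if t in VOCAB]
    return token_emb[ids].mean(axis=0), ids


def loss_and_grads(token_emb, item_emb):
    """Softmax cross-entropy for predicting the paired item from item text."""
    g_tok, g_item, loss = np.zeros_like(token_emb), np.zeros_like(item_emb), 0.0
    for src, dst in PAIRS:
        u, ids = encode(ITEM_TOKENS[src], token_emb)
        logits = item_emb @ u
        p = np.exp(logits - logits.max())
        p /= p.sum()
        loss -= np.log(p[dst])
        dz = p.copy()
        dz[dst] -= 1.0  # gradient of the loss with respect to the logits
        g_item += np.outer(dz, u)
        # Each token in the source text receives an equal share of the gradient.
        np.add.at(g_tok, ids, (item_emb.T @ dz) / len(ids))
    n = len(PAIRS)
    return loss / n, g_tok / n, g_item / n


def train(steps=500, lr=0.5, seed=0):
    """Full-batch gradient descent on the auxiliary (item, item) task."""
    rng = np.random.default_rng(seed)
    token_emb = 0.1 * rng.standard_normal((len(VOCAB), DIM))
    item_emb = 0.1 * rng.standard_normal((len(ITEM_TOKENS), DIM))
    history = []
    for _ in range(steps):
        loss, g_tok, g_item = loss_and_grads(token_emb, item_emb)
        history.append(loss)
        token_emb -= lr * g_tok
        item_emb -= lr * g_item
    return token_emb, item_emb, history


def retrieve(query_tokens, token_emb, item_emb, k=3):
    """Zero-shot search: pass the query through the text tower, rank all items."""
    q, _ = encode(query_tokens, token_emb)
    return [int(i) for i in np.argsort(-(item_emb @ q))[:k]]


token_emb, item_emb, history = train()
top = retrieve(["guitar"], token_emb, item_emb)
print(top)
```

The transfer works in this sketch only because queries and item titles share a vocabulary and the output side is the same item space, which mirrors the "shared semantic representation" and "common output item space" named in the abstract. A production system would need a far richer text encoder and a sampled softmax over a large catalog.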
