Paper Title

Unified Generative & Dense Retrieval for Query Rewriting in Sponsored Search

Authors

Mohankumar, Akash Kumar; Dodla, Bhargav; K, Gururaj; Singh, Amit

Abstract

Sponsored search is a key revenue source for search engines, where advertisers bid on keywords to target users or search queries of interest. However, finding relevant keywords for a given query is challenging due to the large and dynamic keyword space, ambiguous user/advertiser intents, and diverse possible topics and languages. In this work, we present a comprehensive comparison between two paradigms for online query rewriting: Generative (NLG) and Dense Retrieval (DR) methods. We observe that both methods offer complementary benefits that are additive. As a result, we show that around 40% of the high-quality keywords retrieved by the two approaches are unique and not retrieved by the other. To leverage the strengths of both methods, we propose CLOVER-Unity, a novel approach that unifies generative and dense retrieval methods in one single model. Through offline experiments, we show that the NLG and DR components of CLOVER-Unity consistently outperform individually trained NLG and DR models on public and internal benchmarks. Furthermore, we show that CLOVER-Unity achieves 9.8% higher good keyword density than the ensemble of two separate DR and NLG models while reducing computational costs by almost half. We conduct extensive online A/B experiments on Microsoft Bing in 140+ countries and achieve improved user engagement, with an average increase in total clicks by 0.89% and increased revenue by 1.27%. We also share our practical lessons and optimization tricks for deploying such unified models in production.
