Paper title
DADNN: Multi-Scene CTR Prediction via Domain-Aware Deep Neural Network
Paper authors
Abstract
Click-through rate (CTR) prediction is a core task in advertising systems. The booming e-commerce business in our company has resulted in a growing number of scenes. Most of them are so-called long-tail scenes, meaning that the traffic of any single scene is limited while the overall traffic is considerable. Typical studies mainly focus on serving a single scene with a well-designed model. However, this approach consumes excessive resources in both offline training and online serving. Moreover, simply training a single model on data from multiple scenes ignores the characteristics of each individual scene. To address these challenges, we propose a novel but practical model named Domain-Aware Deep Neural Network (DADNN), which serves multiple scenes with only one model. Specifically, a shared bottom block across all scenes learns a common representation, while domain-specific heads maintain the characteristics of each scene. In addition, knowledge transfer is introduced to increase the opportunity for knowledge sharing among different scenes. In this paper, we study two instances of DADNN in which the shared bottom block is a multilayer perceptron (MLP) and a Multi-gate Mixture-of-Experts (MMoE), respectively; we denote them DADNN-MLP and DADNN-MMoE. Comprehensive offline experiments on a real production dataset from our company show that DADNN outperforms several state-of-the-art methods for multi-scene CTR prediction. Extensive online A/B tests reveal that DADNN-MLP contributes up to a 6.7% CTR and 3.0% CPM (Cost Per Mille) improvement compared with a well-engineered DCN model. Furthermore, DADNN-MMoE outperforms DADNN-MLP with relative improvements of 2.2% and 2.7% on CTR and CPM, respectively. More importantly, DADNN uses a single model for multiple scenes, which saves a large amount of offline training and online serving resources.
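The shared-bottom-plus-heads structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `SharedBottomModel`, the layer sizes, the scene identifiers, and the random initialization are all assumptions chosen for illustration. It covers only the MLP-style shared bottom with one head per scene, and leaves out the knowledge transfer component and the MMoE variant.

```python
import math
import random


def relu(v):
    return max(0.0, v)


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def dense(inputs, weights, biases, activation):
    """Apply one fully connected layer: activation(W x + b)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def init_layer(rng, n_in, n_out):
    """Hypothetical initializer: small uniform weights, zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


class SharedBottomModel:
    """Illustrative model: one bottom layer shared by all scenes,
    plus a separate single-output head for each scene."""

    def __init__(self, n_features, n_hidden, scene_ids, seed=0):
        rng = random.Random(seed)
        # Shared parameters: used for every scene.
        self.bottom = init_layer(rng, n_features, n_hidden)
        # Scene-specific parameters: one head per scene.
        self.heads = {s: init_layer(rng, n_hidden, 1) for s in scene_ids}

    def predict(self, features, scene_id):
        bottom_w, bottom_b = self.bottom
        shared = dense(features, bottom_w, bottom_b, relu)
        head_w, head_b = self.heads[scene_id]
        # The sigmoid output is read as a predicted click probability.
        return dense(shared, head_w, head_b, sigmoid)[0]


# Assumed toy setup: 4 input features, 8 hidden units, two scenes.
model = SharedBottomModel(n_features=4, n_hidden=8, scene_ids=["scene_a", "scene_b"])
example = [0.2, -1.0, 0.5, 1.5]
p_a = model.predict(example, "scene_a")
p_b = model.predict(example, "scene_b")
print(p_a, p_b)
```

In this sketch the same input passes through the same bottom layer for both scenes, and only the head differs, which is how one model can serve several scenes while keeping a scene-specific output. Training is omitted; the weights here are random.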