Paper Title
Adversarial Counterfactual Learning and Evaluation for Recommender System
Paper Authors
Paper Abstract
The feedback data of recommender systems are often subject to what was exposed to the users; however, most learning and evaluation methods do not account for the underlying exposure mechanism. We first show in theory that applying supervised learning to detect user preferences may end up with inconsistent results in the absence of exposure information. The counterfactual propensity-weighting approach from causal inference can account for the exposure mechanism; nevertheless, the partial-observation nature of the feedback data can cause identifiability issues. We propose a principled solution by introducing a minimax empirical risk formulation. We show that the relaxation of the dual problem can be converted to an adversarial game between two recommendation models, where the opponent of the candidate model characterizes the underlying exposure mechanism. We provide learning bounds and conduct extensive simulation studies to illustrate and justify the proposed approach over a broad range of recommendation settings, which shed light on the various benefits of the proposed approach.
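For reference, a minimal sketch of the counterfactual propensity-weighting and minimax ideas named in the abstract, written in standard inverse-propensity-scoring (IPS) notation; the symbols $\delta$, $p_{u,i}$, and the uncertainty set $\mathcal{P}$ are illustrative assumptions, not definitions taken from the paper:

$$
\hat{R}_{\mathrm{IPS}}(f; p) = \frac{1}{|\mathcal{D}|}\sum_{(u,i)\in\mathcal{D}} \frac{\delta\big(y_{u,i},\, f(u,i)\big)}{p_{u,i}},
\qquad
\min_{f}\ \max_{p \in \mathcal{P}}\ \hat{R}_{\mathrm{IPS}}(f; p),
$$

where $\mathcal{D}$ is the set of observed (exposed) user-item pairs, $y_{u,i}$ the recorded feedback, $\delta$ a pointwise loss, and $p_{u,i}$ the unknown exposure propensity. The minimax form mirrors the abstract's point that the propensities are not identifiable from partially observed feedback alone, so the risk is taken against the worst case over a plausible set $\mathcal{P}$.

The abstract's adversarial-game view can likewise be sketched as alternating updates between a candidate recommender and an opponent that re-weights the observed data to act as the worst-case exposure mechanism. The snippet below is a hypothetical PyTorch illustration under assumed choices (matrix-factorization scorers, squared loss, a clipped sigmoid propensity, arbitrary learning rates); it is not the paper's actual algorithm.

```python
# Hypothetical sketch of the adversarial game described in the abstract:
# a candidate recommender f is trained against an exposure model g whose
# inverse propensities re-weight the observed feedback. All names and
# hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn

n_users, n_items, dim = 100, 50, 16

class MF(nn.Module):
    """Simple matrix-factorization scorer used for both players."""
    def __init__(self):
        super().__init__()
        self.U = nn.Embedding(n_users, dim)
        self.V = nn.Embedding(n_items, dim)
    def forward(self, u, i):
        return (self.U(u) * self.V(i)).sum(-1)

f = MF()  # candidate recommendation model (minimizer)
g = MF()  # adversarial exposure model (maximizer)
opt_f = torch.optim.Adam(f.parameters(), lr=1e-2)
opt_g = torch.optim.Adam(g.parameters(), lr=1e-2)

# Toy observed feedback: (user, item, rating) triples.
u = torch.randint(0, n_users, (512,))
i = torch.randint(0, n_items, (512,))
y = torch.rand(512)

for step in range(200):
    # Candidate step: minimize the inverse-propensity-weighted risk,
    # with propensities given by the current exposure model g.
    prop = torch.sigmoid(g(u, i)).clamp(min=0.05)
    w = 1.0 / prop
    w = w / w.mean()  # normalize weights to keep the risk on a stable scale
    loss_f = (w.detach() * (f(u, i) - y) ** 2).mean()
    opt_f.zero_grad(); loss_f.backward(); opt_f.step()

    # Adversary step: g increases the weighted risk of the current f
    # (gradient ascent), playing the worst-case exposure mechanism.
    prop = torch.sigmoid(g(u, i)).clamp(min=0.05)
    w = 1.0 / prop
    w = w / w.mean()
    loss_g = -(w * (f(u, i).detach() - y) ** 2).mean()
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

The alternating updates correspond to the min and max players of the relaxed dual problem mentioned in the abstract; the paper itself specifies the exact objective, constraint set, and learning bounds.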