Paper Title
Beyond Double Ascent via Recurrent Neural Tangent Kernel in Sequential Recommendation
Paper Authors
Paper Abstract
Overfitting has long been considered a common issue for large neural network models in sequential recommendation. In our study, we observe an interesting phenomenon: overfitting is temporary. As the model scale increases, performance first ascends, then descends (i.e., overfitting), and finally ascends again, which we name double ascent in this paper. We therefore hypothesise that a considerably larger model will generalise better and achieve higher performance. In the extreme case of infinite width, performance is expected to reach the limit of that specific structure. Unfortunately, it is impractical to directly build such a huge model due to resource limits. In this paper, we propose the Overparameterised Recommender (OverRec), which utilises the recurrent neural tangent kernel (RNTK) as a similarity measure over user sequences to bypass the hardware restrictions of huge models. We further prove that the RNTK for the tied input-output embeddings used in recommendation is the same as the RNTK for general untied input-output embeddings, which makes the RNTK theoretically suitable for recommendation. Since the RNTK is analytically derived, OverRec does not require any training, avoiding physically building the huge model. Extensive experiments on four datasets verify the state-of-the-art performance of OverRec.
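To make the "no training" claim concrete, below is a minimal, hypothetical sketch (not the authors' code) of how an analytically derived sequence kernel such as the RNTK could be used for next-item recommendation via closed-form kernel regression. The `rntk` placeholder, the `score_items` helper, and the ridge parameter `reg` are illustrative assumptions; the actual RNTK derivation and recommendation pipeline are defined in the paper.

```python
# Sketch: kernel-based next-item scoring with a precomputed sequence-similarity
# kernel. No gradient-based training is involved; prediction is closed-form.
import numpy as np

def rntk(seq_a, seq_b):
    """Placeholder for the analytically derived RNTK between two user sequences.
    Only the interface is fixed here; the real computation is in the paper."""
    raise NotImplementedError

def score_items(train_seqs, train_next_items, test_seq, num_items, reg=1e-3):
    """Rank candidate items for `test_seq` by kernel regression on training sequences."""
    n = len(train_seqs)
    # Kernel matrix over training sequences (symmetric, computed analytically).
    K = np.array([[rntk(a, b) for b in train_seqs] for a in train_seqs])
    # Kernel vector between the test sequence and every training sequence.
    k = np.array([rntk(test_seq, a) for a in train_seqs])
    # One-hot targets: the ground-truth next item of each training sequence.
    Y = np.zeros((n, num_items))
    Y[np.arange(n), train_next_items] = 1.0
    # Closed-form kernel ridge regression; `reg` stabilises the solve.
    alpha = np.linalg.solve(K + reg * np.eye(n), Y)
    return k @ alpha  # scores over all items; sort descending to recommend
```

Because the kernel stands in for an infinitely wide recurrent model, the only memory and compute costs are those of the kernel matrices, which is how a construction of this kind can sidestep physically instantiating a huge network.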