Learning to Match Jobs with Resumes from Sparse Interaction Data using Multi-View Co-Teaching Network

Paper Title

Learning to Match Jobs with Resumes from Sparse Interaction Data using Multi-View Co-Teaching Network

Authors

Shuqing Bian, Xu Chen, Wayne Xin Zhao, Kun Zhou, Yupeng Hou, Yang Song, Tao Zhang, Ji-Rong Wen

Abstract

With the ever-increasing growth of online recruitment data, job-resume matching has become an important task for automatically matching jobs with suitable resumes. This task is typically cast as a supervised text matching problem. Supervised learning is powerful when labeled data is sufficient. However, on online recruitment platforms, job-resume interaction data is sparse and noisy, which degrades the performance of job-resume matching algorithms. To alleviate these problems, we propose a novel multi-view co-teaching network that learns job-resume matching from sparse interaction data. Our network consists of two major components: a text-based matching model and a relation-based matching model. The two components capture semantic compatibility from two different views and complement each other. To address the challenges of sparse and noisy data, we design two specific strategies for combining the two components. First, the two components share learned parameters or representations, so that the original representations of each component can be enhanced. More importantly, we adopt a co-teaching mechanism to reduce the influence of noise in the training data. The core idea is to let the two components help each other by selecting more reliable training instances. The two strategies focus on representation enhancement and data enhancement, respectively. Compared with pure text-based matching models, the proposed approach is able to learn better data representations from limited or even sparse interaction data, and is more robust to noise in the training data. Experimental results demonstrate that our model outperforms state-of-the-art methods for job-resume matching.
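
The co-teaching idea in the abstract, where each component selects more reliable training instances for the other, can be illustrated with a minimal sketch. The abstract does not say how reliability is measured, so the code below assumes the small-loss criterion commonly used in co-teaching: an instance with a low loss under one model is treated as likely clean and passed to the other model. The function names `select_small_loss` and `co_teaching_exchange`, and the `keep_ratio` parameter, are hypothetical and not taken from the paper.

```python
from typing import List, Sequence, Tuple


def select_small_loss(losses: Sequence[float], keep_ratio: float) -> List[int]:
    """Return the indices of the keep_ratio fraction of instances with the smallest loss.

    Assumption: a small loss is used as a proxy for a reliable (clean) label.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    num_keep = max(1, int(len(losses) * keep_ratio))
    # Rank instance indices by ascending loss, then keep the first num_keep.
    ranked = sorted(range(len(losses)), key=lambda i: losses[i])
    return sorted(ranked[:num_keep])


def co_teaching_exchange(
    text_losses: Sequence[float],
    relation_losses: Sequence[float],
    keep_ratio: float,
) -> Tuple[List[int], List[int]]:
    """Cross-select training instances between the two views.

    Returns (indices_for_text_model, indices_for_relation_model):
    each model is updated on the instances its peer considers reliable.
    """
    # Instances the text-based model trusts are handed to the relation-based model.
    for_relation = select_small_loss(text_losses, keep_ratio)
    # Instances the relation-based model trusts are handed to the text-based model.
    for_text = select_small_loss(relation_losses, keep_ratio)
    return for_text, for_relation
```

In a training loop, each mini-batch would be scored by both components, and each component would then be updated only on the indices chosen by its peer. A schedule that lowers `keep_ratio` over epochs is one common design choice, though whether the paper uses one is not stated in the abstract.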
