Paper Title

Information Gain Propagation: a new way to Graph Active Learning with Soft Labels

Authors

Wentao Zhang, Yexin Wang, Zhenbang You, Meng Cao, Ping Huang, Jiulong Shan, Zhi Yang, Bin Cui

Abstract

Graph Neural Networks (GNNs) have achieved great success in various tasks, but their performance relies heavily on a large number of labeled nodes, which typically requires considerable human effort. GNN-based Active Learning (AL) methods have been proposed to improve labeling efficiency by selecting the most valuable nodes to label. Existing methods assume that an oracle can correctly categorize all the selected nodes, and thus focus only on node selection. However, such an exact labeling task is costly, especially when the categorization falls outside the domain of an individual expert (oracle). The paper goes further, presenting a soft-label approach to AL on GNNs. Our key innovations are: i) relaxed queries, where a domain expert (oracle) only judges the correctness of the predicted labels (a binary question) rather than identifying the exact class (a multi-class question), and ii) new criteria of maximizing information gain propagation for an active learner with relaxed queries and soft labels. Empirical studies on public datasets demonstrate that our method significantly outperforms the state-of-the-art GNN-based AL methods in terms of both accuracy and labeling cost.
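
The relaxed query described in the abstract can be illustrated with a minimal sketch for a single node. The abstract does not give the update rule or the selection criterion, so everything below is an assumption made for illustration: the function names are hypothetical, a "no" answer is assumed to zero out the predicted class and renormalize the remaining probabilities, and the gain is computed for one node only, without the propagation over the graph that the paper's criterion involves. At least two classes are assumed.

```python
import math


def relaxed_query_update(probs, oracle_says_correct):
    """Combine a predicted distribution with a yes/no answer into a soft label.

    Assumption (not from the abstract): "no" rules out only the predicted
    class, and the remaining mass is renormalized.
    """
    predicted = max(range(len(probs)), key=probs.__getitem__)
    if oracle_says_correct:
        # "Yes": the predicted class is confirmed, so the label is one-hot.
        return [1.0 if i == predicted else 0.0 for i in range(len(probs))]
    remaining = 1.0 - probs[predicted]
    if remaining <= 0.0:
        # The model put all mass on a wrong class: spread it uniformly.
        k = len(probs) - 1
        return [0.0 if i == predicted else 1.0 / k for i in range(len(probs))]
    # "No": drop the predicted class and renormalize the rest.
    return [0.0 if i == predicted else p / remaining for i, p in enumerate(probs)]


def entropy(dist):
    """Shannon entropy in nats, skipping zero-probability classes."""
    return -sum(p * math.log(p) for p in dist if p > 0.0)


def expected_information_gain(probs):
    """Expected entropy reduction from asking "is the predicted class right?"."""
    p_yes = max(probs)
    entropy_after_no = entropy(relaxed_query_update(probs, False))
    # A "yes" answer leaves a one-hot label, whose entropy is zero.
    expected_entropy_after = (1.0 - p_yes) * entropy_after_no
    return entropy(probs) - expected_entropy_after


if __name__ == "__main__":
    prediction = [0.5, 0.3, 0.2]
    print(relaxed_query_update(prediction, True))
    print(relaxed_query_update(prediction, False))
    print(expected_information_gain(prediction))
```

Under these assumptions, a binary answer is most informative when the model's top prediction is about as likely to be right as wrong, which is one way to see why a selection criterion for relaxed queries can differ from one designed for exact labels.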
