Paper title

Unifying Multimodal Source and Propagation Graph for Rumour Detection on Social Media with Missing Features

Authors

Cheung, Tsun-Hin; Lam, Kin-Man

Abstract

With the rapid development of online social media platforms, the spread of rumours has become a critical societal concern. Current methods for rumour detection fall into two groups: image-text pair classification and source-reply graph classification. In this paper, we propose a novel approach that combines multimodal source and propagation graph features for rumour classification. We introduce the Unified Multimodal Graph Transformer Network (UMGTN), which integrates Transformer encoders to fuse these features. Because not every message on social media is associated with an image, and community responses in propagation graphs do not immediately follow source messages, our aim is to build a network architecture that handles missing features such as images or replies. To enhance the model's robustness to data with missing features, we adopt a multitask learning framework that simultaneously learns representations between samples with complete and missing features. We evaluate our proposed method on four real-world datasets, augmenting them by recovering images and replies from Twitter and Weibo. Experimental results demonstrate that our UMGTN with multitask learning achieves state-of-the-art performance, improving F1-score by 1.0% to 4.0%, while maintaining detection robustness to missing features within 2% accuracy and F1-score compared to models trained without the multitask learning framework. We have made our models and datasets publicly available at: https://thcheung.github.io/umgtn/.
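
The multitask idea in the abstract (learning from the same sample with complete and with missing features) can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `drop_modalities`, `toy_model`, and `multitask_loss`, the weight `alpha`, and the additive toy classifier are all assumptions made for illustration, and the real UMGTN fuses features with Transformer encoders rather than by summation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-likelihood of the true class index."""
    return -math.log(softmax(logits)[label])


def drop_modalities(sample, drop):
    """Return a copy of the sample with the named modalities marked missing (None)."""
    return {name: (None if name in drop else vec) for name, vec in sample.items()}


def toy_model(sample):
    """Hypothetical stand-in classifier: sums the per-modality class scores
    of every modality that is present and skips missing ones."""
    logits = [0.0, 0.0]
    for vec in sample.values():
        if vec is not None:
            logits = [a + b for a, b in zip(logits, vec)]
    return logits


def multitask_loss(model, sample, label, drop=("image",), alpha=0.5):
    """Weighted sum of the loss on the complete sample and the loss on a
    copy with some modalities removed (alpha is an assumed weighting)."""
    loss_full = cross_entropy(model(sample), label)
    loss_missing = cross_entropy(model(drop_modalities(sample, drop)), label)
    return alpha * loss_full + (1.0 - alpha) * loss_missing


# Example: a source post with text, image, and reply scores for
# the classes [non-rumour, rumour]; the true label is 1 (rumour).
sample = {"text": [0.2, 1.0], "image": [0.0, 0.5], "replies": [0.1, 0.3]}
print(multitask_loss(toy_model, sample, label=1, drop=("image",)))
print(multitask_loss(toy_model, sample, label=1, drop=("image", "replies")))
```

Training on both terms at once penalizes a model that is accurate only when every modality is available, which is one plausible reading of how robustness to missing images or replies could be encouraged.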
