MM-COVID: A Multilingual and Multimodal Data Repository for Combating COVID-19 Disinformation

Paper Title

MM-COVID: A Multilingual and Multimodal Data Repository for Combating COVID-19 Disinformation

Authors

Yichuan Li, Bohan Jiang, Kai Shu, Huan Liu

Abstract

The COVID-19 epidemic is considered a global health crisis for the whole of society and the greatest challenge humankind has faced since World War Two. Unfortunately, fake news about COVID-19 is spreading as fast as the virus itself. Incorrect health measures, anxiety, and hate speech will harm people's physical and mental health worldwide. To help combat COVID-19 fake news, we propose a new fake news detection dataset, MM-COVID (Multilingual and Multidimensional COVID-19 Fake News Data Repository). The dataset provides multilingual fake news and the relevant social context. We collect 3981 pieces of fake news content and 7192 pieces of trustworthy information in 6 languages: English, Spanish, Portuguese, Hindi, French, and Italian. We present a detailed exploratory analysis of MM-COVID from different perspectives and demonstrate its utility in several potential applications of COVID-19 fake news research on multilingual content and social media.
