Paper Title
Codes, Patterns and Shapes of Contemporary Online Antisemitism and Conspiracy Narratives -- an Annotation Guide and Labeled German-Language Dataset in the Context of COVID-19
Paper Authors
Abstract
Over the course of the COVID-19 pandemic, existing conspiracy theories were refreshed and new ones were created, often interwoven with antisemitic narratives, stereotypes and codes. The sheer volume of antisemitic and conspiracy theory content on the Internet makes data-driven algorithmic approaches essential for anti-discrimination organizations and researchers alike. However, the manifestation and dissemination of these two interrelated phenomena are still under-researched in empirical studies of large text corpora. Algorithmic approaches for the detection and classification of specific content usually require labeled datasets, annotated according to conceptually sound guidelines. While there is a growing number of datasets for the more general phenomenon of hate speech, the development of corpora and annotation guidelines for antisemitic and conspiracy content is still in its infancy, especially for languages other than English. We contribute to closing this gap by developing an annotation guide for antisemitic and conspiracy theory online content in the context of the COVID-19 pandemic. We provide working definitions, including specific forms of antisemitism such as encoded and post-Holocaust antisemitism. We use these definitions to annotate a German-language dataset consisting of ~3,700 Telegram messages sent between 03/2020 and 12/2021.