Characterizing, Detecting, and Predicting Online Ban Evasion

Paper Title

Characterizing, Detecting, and Predicting Online Ban Evasion

Authors

Manoj Niverthi, Gaurav Verma, Srijan Kumar

Abstract

Moderators and automated methods enforce bans on malicious users who engage in disruptive behavior. However, malicious users can easily create a new account to evade such bans. Previous research has focused on other forms of online deception, like the simultaneous operation of multiple accounts by the same entities (sockpuppetry), impersonation of other individuals, and studying the effects of de-platforming individuals and communities. Here we conduct the first data-driven study of ban evasion, i.e., the act of circumventing bans on an online platform, leading to temporally disjoint operation of accounts by the same user. We curate a novel dataset of 8,551 ban evasion pairs (parent, child) identified on Wikipedia and contrast their behavior with benign users and non-evading malicious users. We find that evasion child accounts demonstrate similarities with respect to their banned parent accounts on several behavioral axes - from similarity in usernames and edited pages to similarity in content added to the platform and its psycholinguistic attributes. We reveal key behavioral attributes of accounts that are likely to evade bans. Based on the insights from the analyses, we train logistic regression classifiers to detect and predict ban evasion at three different points in the ban evasion lifecycle. Results demonstrate the effectiveness of our methods in predicting future evaders (AUC = 0.78), early detection of ban evasion (AUC = 0.85), and matching child accounts with parent accounts (MRR = 0.97). Our work can aid moderators by reducing their workload and identifying evasion pairs faster and more efficiently than current manual and heuristic-based approaches. The dataset is available at https://github.com/srijankr/ban_evasion.
