Paper title
1Cademy @ Causal News Corpus 2022: Leveraging Self-Training in Causality Classification of Socio-Political Event Data
Paper authors
Paper abstract
This paper details our participation in the Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE) workshop @ EMNLP 2022, where we take part in Subtask 1 of Shared Task 3. We approach the given task of event causality detection by proposing a self-training pipeline that follows a teacher-student classifier method. More specifically, we initially train a teacher model on the true, original task data, and use that teacher model to self-label data to be used in the training of a separate student model for the final task prediction. We test how restricting the number of positive or negative self-labeled examples in the self-training process affects classification performance. Our final results show that using self-training produces a comprehensive performance improvement across all models and self-labeled training sets tested within the task of event causality sequence classification. On top of that, we find that self-training performance did not diminish even when restricting either the positive or negative examples used in training. Our code is publicly available at https://github.com/Gzhang-umich/1CademyTeamOfCASE.
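The teacher-student self-training loop described above can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' implementation: the paper's models are transformer classifiers, whereas here a toy nearest-centroid classifier on scalar features stands in, and the `max_pos`/`max_neg` caps correspond to the paper's restriction on positive/negative self-labeled examples.

```python
# Toy teacher-student self-training sketch (illustrative only).
# A nearest-centroid classifier over scalar features stands in for the
# paper's transformer-based causality classifiers.

def train_centroid(examples):
    """Fit a nearest-centroid 'model' from (feature, label) pairs."""
    sums = {0: 0.0, 1: 0.0}
    counts = {0: 0, 1: 0}
    for x, y in examples:
        sums[y] += x
        counts[y] += 1
    return {y: sums[y] / counts[y] for y in (0, 1) if counts[y]}

def predict(model, x):
    """Assign the class whose centroid is nearest to x."""
    return min(model, key=lambda y: abs(x - model[y]))

def self_train(labeled, unlabeled, max_pos=None, max_neg=None):
    """Teacher labels the unlabeled pool; pseudo-labels are optionally
    capped per class; a fresh student trains on the combined set."""
    teacher = train_centroid(labeled)
    pseudo = [(x, predict(teacher, x)) for x in unlabeled]
    pos = [p for p in pseudo if p[1] == 1][:max_pos]
    neg = [p for p in pseudo if p[1] == 0][:max_neg]
    return train_centroid(labeled + pos + neg)

# Usage: a small labeled seed set plus an unlabeled pool.
student = self_train([(0.0, 0), (1.0, 1)], [0.1, 0.9])
```

Passing `max_pos` or `max_neg` mimics the restricted-example experiments; with both left as `None`, every pseudo-labeled example reaches the student.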