Title
Efficient Argument Structure Extraction with Transfer Learning and Active Learning
Authors
Abstract
The automation of extracting argument structures faces a pair of challenges: (1) encoding long-term contexts to facilitate comprehensive understanding, and (2) improving data efficiency, since constructing high-quality argument structures is time-consuming. In this work, we propose a novel context-aware Transformer-based argument structure prediction model which, on five different domains, significantly outperforms models that rely on features or only encode limited contexts. To tackle the difficulty of data annotation, we examine two complementary methods: (i) transfer learning to leverage existing annotated data to boost model performance in a new target domain, and (ii) active learning to strategically identify a small number of samples for annotation. We further propose model-independent sample acquisition strategies, which can be generalized to diverse domains. With extensive experiments, we show that our simple-yet-effective acquisition strategies yield competitive results against three strong comparison methods. Combined with transfer learning, a substantial F1 score boost (5-25) can be further achieved during the early iterations of active learning across domains.
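To make the active-learning setup concrete, the sketch below shows a generic pool-based loop driven by a model-independent acquisition function. Everything here is illustrative: the heuristic (`acquire_longest`, which simply prefers longer unlabeled samples), the `oracle` and `train_fn` callbacks, and all parameter names are assumptions for exposition, not the paper's actual acquisition strategies or training code.

```python
def acquire_longest(pool, k):
    """Hypothetical model-independent acquisition: pick the k longest
    unlabeled samples. Stands in for the paper's own strategies, which
    are not reproduced here."""
    return sorted(pool, key=len, reverse=True)[:k]

def active_learning_loop(unlabeled, oracle, train_fn, budget, batch_size):
    """Generic pool-based active learning: repeatedly acquire a batch,
    query the annotator (oracle), and retrain on all labeled data."""
    pool = list(unlabeled)
    labeled = []
    model = None
    while len(labeled) < budget and pool:
        batch = acquire_longest(pool, min(batch_size, budget - len(labeled)))
        for sample in batch:
            pool.remove(sample)
            labeled.append((sample, oracle(sample)))  # annotation step
        model = train_fn(labeled)  # retrain from the growing labeled set
    return model, labeled
```

In a transfer-learning setting, `train_fn` would fine-tune a model already trained on an existing source domain rather than training from scratch, which is what yields the early-iteration gains described above.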