Paper title
Text and author-level political inference using heterogeneous knowledge representations
Paper authors
Paper abstract
The inference of politically-charged information from text data is a popular research topic in Natural Language Processing (NLP) at both text- and author-level. In recent years, studies of this kind have been implemented with the aid of representations from transformers such as BERT. Despite considerable success, however, we may ask whether results can be improved even further by combining transformer-based models with additional knowledge representations. To shed light on this issue, the present work describes a series of experiments comparing alternative model configurations for political inference from text in both English and Portuguese. Results suggest that certain text representations - in particular, the combined use of BERT pre-trained language models with a syntactic dependency model - may outperform the alternatives across multiple experimental settings, making a potentially strong case for further research into the use of heterogeneous text representations in these and possibly other NLP tasks.
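
The abstract does not spell out how the heterogeneous representations are combined. As a rough illustration of the general idea only, the sketch below concatenates a BERT [CLS] sentence embedding with simple dependency-relation count features and trains an off-the-shelf classifier on the result; the checkpoint names, the dependency-label subset, the toy data, and the logistic-regression classifier are assumptions for illustration, not the configuration evaluated in the paper.

```python
# Illustrative sketch only: not the paper's pipeline. Model names, the
# dependency-label subset, the toy data and the classifier are assumptions.
import numpy as np
import torch
import spacy
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

# BERT encoder (English; a Portuguese checkpoint could be swapped in).
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
bert.eval()

# Syntactic dependency parser.
nlp = spacy.load("en_core_web_sm")

# A small, fixed inventory of dependency relations used as count features
# (illustrative subset of spaCy's English dependency labels).
DEP_LABELS = ["nsubj", "dobj", "amod", "advmod", "aux", "neg",
              "prep", "pobj", "conj", "mark"]

def bert_embedding(text: str) -> np.ndarray:
    """Return the [CLS] vector of the final BERT layer as a sentence representation."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        outputs = bert(**inputs)
    return outputs.last_hidden_state[:, 0, :].squeeze(0).numpy()

def dependency_features(text: str) -> np.ndarray:
    """Return length-normalized counts of the selected dependency relations."""
    doc = nlp(text)
    counts = np.array([sum(tok.dep_ == lab for tok in doc) for lab in DEP_LABELS],
                      dtype=float)
    return counts / max(len(doc), 1)

def heterogeneous_features(text: str) -> np.ndarray:
    """Concatenate contextual (BERT) and syntactic (dependency) representations."""
    return np.concatenate([bert_embedding(text), dependency_features(text)])

# Toy usage: train a simple classifier on the combined representation.
texts = ["Taxes on the wealthy should rise.", "Government spending must be cut."]
labels = [0, 1]  # hypothetical political-leaning labels
X = np.vstack([heterogeneous_features(t) for t in texts])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(X))
```

In this kind of setup, the late fusion by feature concatenation is only one of several plausible ways to combine the representations; the paper's comparison of alternative configurations presumably explores such choices.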