Paper title
Challenges for Open-domain Targeted Sentiment Analysis
Paper authors
Abstract
Previous studies on open-domain targeted sentiment analysis are limited in the variety of domains their datasets cover and are restricted to the sentence level. We therefore propose a novel dataset of 6,013 human-labeled instances that broadens the covered domains across topics of interest and extends the task to the document level. We also offer a nested target annotation schema for extracting the complete sentiment information in a document, which makes open-domain targeted sentiment analysis more practical and effective. For the task itself, we use the pre-trained model BART in a sequence-to-sequence generation method. Benchmark results show that there is substantial room for improvement in open-domain targeted sentiment analysis. Our experiments further show that challenges remain in the effective use of open-domain data, in handling long documents, in the complexity of target structure, and in domain variance.