Paper Title
Cross-Lingual Query-Based Summarization of Crisis-Related Social Media: An Abstractive Approach Using Transformers
Paper Authors
Abstract
Relevant and timely information collected from social media during crises can be an invaluable resource for emergency management. However, extracting this information remains a challenging task, particularly when dealing with social media postings in multiple languages. This work proposes a cross-lingual method for retrieving and summarizing crisis-relevant information from social media postings. We describe a uniform way of expressing various information needs through structured queries and a way of creating summaries that answer those information needs. The method is based on multilingual transformer embeddings. Queries are written in one of the languages supported by the embeddings, and the extracted sentences can be in any of the other supported languages. Abstractive summaries are created by transformers. The evaluation, conducted by crowdsourced evaluators and emergency management experts on collections extracted from Twitter during five large-scale disasters spanning ten languages, shows the flexibility of our approach. The generated summaries are regarded as more focused, structured, and coherent than those produced by existing state-of-the-art methods, and emergency management experts also compare them favorably against those summaries.
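The cross-lingual retrieval idea in the abstract (a query in one language matched against posts in other languages through a shared embedding space) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure or the ranking procedure, so cosine similarity and top-k selection are assumptions here. The three-dimensional vectors are hand-made toy values standing in for the output of a multilingual encoder, and the names `cosine`, `retrieve`, and `posts` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve(query_vec, posts, k=2):
    """Return the k posts whose vectors are most similar to the query vector."""
    ranked = sorted(posts, key=lambda p: cosine(query_vec, p["vec"]), reverse=True)
    return ranked[:k]


# ASSUMPTION: toy vectors, not produced by any real model. In a real system a
# multilingual encoder would map every post and the query into one shared space.
posts = [
    {"lang": "es", "text": "El puente está cerrado por la inundación", "vec": [0.9, 0.1, 0.0]},
    {"lang": "de", "text": "Die Brücke ist wegen Hochwasser gesperrt", "vec": [0.8, 0.2, 0.1]},
    {"lang": "en", "text": "Great concert last night", "vec": [0.0, 0.1, 0.9]},
]

# Stand-in for the embedding of an English query about flooded infrastructure.
query_vec = [1.0, 0.0, 0.0]

top = retrieve(query_vec, posts, k=2)
for post in top:
    print(post["lang"], post["text"])
```

In this toy setup the Spanish and German flood posts rank above the unrelated English post even though the query is conceptually in English, which is the property a shared multilingual embedding space is meant to provide. The retrieved sentences would then be passed to an abstractive summarizer, a step this sketch leaves out.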