Paper title
Complex networks for event detection in heterogeneous high volume news streams
Paper authors
Paper abstract
Detecting important events in high-volume news streams is an important task for a variety of purposes. The volume and rate of online news increase the need for automated event detection methods that can operate in real time. In this paper we develop a network-based approach that makes the working assumption that important news events always involve named entities (such as persons, locations and organizations) that are linked in news articles. Our approach uses natural language processing techniques to detect these entities in a stream of news articles and then creates a time-stamped series of networks in which the detected entities are linked by co-occurrence in articles and sentences. In this prototype, weighted node degree is tracked over time and change-point detection is used to locate important events. Potential events are characterized and distinguished using community detection on KeyGraphs that relate named entities and informative noun-phrases from related articles. This methodology already produces promising results and will be extended in future to include a wider variety of complex network analysis techniques.
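The core of the pipeline described in the abstract (co-occurrence networks per time window, weighted node degree tracked over time, and a detector that flags sudden changes) can be illustrated with a minimal sketch. Several things here are assumptions rather than details from the paper: entity extraction is taken as already done, so each article is just a list of entity names; co-occurrence is counted at article level only; the function names `weighted_degrees`, `degree_series` and `flag_changes` are hypothetical; and the change detector is a simple trailing-window threshold rule standing in for whatever change-point method the authors actually use. The entity names in the sample data are invented placeholders.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean, pstdev


def weighted_degrees(articles):
    """Build one co-occurrence network and return each entity's weighted degree.

    `articles` is a list of entity-name lists, one per article in a time window
    (assumed to come from an upstream NER step). An edge's weight is the number
    of articles in which its two entities appear together.
    """
    edge_weight = defaultdict(int)
    for entities in articles:
        for a, b in combinations(sorted(set(entities)), 2):
            edge_weight[(a, b)] += 1
    degree = defaultdict(int)
    for (a, b), weight in edge_weight.items():
        degree[a] += weight
        degree[b] += weight
    return dict(degree)


def degree_series(windows, entity):
    """Weighted degree of `entity` in each consecutive time window."""
    return [weighted_degrees(articles).get(entity, 0) for articles in windows]


def flag_changes(series, history=3, threshold=3.0):
    """Flag indices where the value jumps well above the trailing window.

    Illustrative stand-in for a change-point detector: index t is flagged when
    series[t] exceeds the mean of the previous `history` values by more than
    `threshold` times their standard deviation (floored at 1.0).
    """
    flagged = []
    for t in range(history, len(series)):
        past = series[t - history:t]
        spread = max(pstdev(past), 1.0)
        if series[t] - mean(past) > threshold * spread:
            flagged.append(t)
    return flagged


# Five consecutive time windows of invented articles (entity lists).
windows = [
    [["Alice", "Paris"], ["Bob", "Acme"]],
    [["Alice", "Paris"], ["Bob", "Paris"]],
    [["Alice", "Acme"]],
    [
        ["Alice", "Paris", "Acme"],
        ["Alice", "Bob", "Paris"],
        ["Alice", "Bob", "Acme"],
        ["Alice", "Paris"],
        ["Alice", "Bob", "Paris", "Acme"],
    ],
    [["Bob", "Acme"]],
]

series = degree_series(windows, "Alice")
print(series)                # [1, 1, 1, 10, 0]
print(flag_changes(series))  # [3]
```

In this toy stream the entity "Alice" suddenly co-occurs with many other entities in the fourth window, so its weighted degree spikes and that window is flagged as a candidate event. Characterizing the event (the KeyGraph and community detection step) is outside the scope of this sketch.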