Paper Title

Survey Data and Human Computation for Improved Flu Tracking

Authors

Stefan Wojcik, Avleen Bijral, Richard Johnston, Juan Miguel Lavista, Gary King, Ryan Kennedy, Alessandro Vespignani, David Lazer

Abstract

While digital trace data from sources like search engines hold enormous potential for tracking and understanding human behavior, these streams of data lack information about the actual experiences of the individuals generating the data. Moreover, most current methods ignore or under-utilize human processing capabilities that allow humans to solve problems not yet solvable by computers (human computation). We demonstrate how behavioral research, linking digital and real-world behavior, along with human computation, can be utilized to improve the performance of studies using digital data streams. This study looks at the use of search data to track the prevalence of Influenza-Like Illness (ILI). We build a behavioral model of flu search based on survey data linked to users' online browsing data. We then utilize human computation for classifying search strings. Leveraging these resources, we construct a tracking model of ILI prevalence that outperforms strong historical benchmarks using only a limited stream of search data and lends itself to tracking ILI in smaller geographic units. While this paper only addresses searches related to ILI, the method we describe has potential for tracking a broad set of phenomena in near real-time.
