CAVA: A Visual Analytics System for Exploratory Columnar Data Augmentation Using Knowledge Graphs

Paper Title

CAVA: A Visual Analytics System for Exploratory Columnar Data Augmentation Using Knowledge Graphs

Authors

Dylan Cashman, Shenyu Xu, Subhajit Das, Florian Heimerl, Cong Liu, Shah Rukh Humayoun, Michael Gleicher, Alex Endert, Remco Chang

Abstract

Most visual analytics systems assume that all foraging for data happens before the analytics process; once analysis begins, the set of data attributes considered is fixed. Such separation of data construction from analysis precludes iteration that can enable foraging informed by the needs that arise in-situ during the analysis. The separation of the foraging loop from the data analysis tasks can limit the pace and scope of analysis. In this paper, we present CAVA, a system that integrates data curation and data augmentation with the traditional data exploration and analysis tasks, enabling information foraging in-situ during analysis. Identifying attributes to add to the dataset is difficult because it requires human knowledge to determine which available attributes will be helpful for the ensuing analytical tasks. CAVA crawls knowledge graphs to provide users with a broad set of attributes drawn from external data to choose from. Users can then specify complex operations on knowledge graphs to construct additional attributes. CAVA shows how visual analytics can help users forage for attributes by letting users visually explore the set of available data, and by serving as an interface for query construction. It also provides visualizations of the knowledge graph itself to help users understand complex joins such as multi-hop aggregations. We assess the ability of our system to enable users to perform complex data combinations without programming in a user study over two datasets. We then demonstrate the generalizability of CAVA through two additional usage scenarios. The results of the evaluation confirm that CAVA is effective in helping the user perform data foraging that leads to improved analysis outcomes, and offer evidence in support of integrating data augmentation as a part of the visual analytics pipeline.
