Paper title
Sigma Worksheet: Interactive Construction of OLAP Queries
Paper authors
Paper abstract
The new generation of cloud data warehouses (CDWs) brings large amounts of data and compute power closer to users in enterprises. The ability to directly access warehouse data and to analyze and explore it interactively at scale can empower users to improve their decision-making cycles. However, existing tools for analyzing data in CDWs are either limited in ad-hoc transformations or difficult to use for business users, the largest user segment in enterprises. Here we introduce Sigma Worksheet, a new interactive system that enables users to easily perform ad-hoc visual analysis of data in CDWs at scale. To this end, Sigma Worksheet provides an accessible spreadsheet-like interface for data analysis through direct manipulation. Sigma Worksheet dynamically constructs matching SQL queries from user interactions on this familiar interface, building on the versatility and expressivity of SQL. Sigma Worksheet executes the constructed queries directly on CDWs, leveraging the superior characteristics of the new generation of CDWs, including scalability. To evaluate Sigma Worksheet, we first demonstrate its expressivity through two real-life use cases, cohort analysis and sessionization. We then measure the performance of the Worksheet-generated queries with a set of experiments using the TPC-H benchmark. Results show that the performance of our compiled SQL queries is comparable to that of the benchmark's reference queries. Finally, to assess the usefulness of Sigma Worksheet in deployment, we elicit feedback through a 100-person survey followed by a semi-structured interview study with 70 participants. We find that Sigma Worksheet is easier to use and learn, improving the productivity of users. Our findings also suggest that Sigma Worksheet can further improve user experience by providing guidance to users at various steps of data analysis.