Paper Title
Exploring Network-Wide Flow Data with Flowyager
Paper Authors
Paper Abstract
Many network operations, ranging from attack investigation and mitigation to traffic management, require answering network-wide flow queries in seconds. Although flow records are collected at each router using available traffic capture utilities, querying the resulting datasets from hundreds of routers across sites and over time remains a significant challenge due to the sheer traffic volume and the distributed nature of flow records. In this paper, we investigate how to improve the response time for a priori unknown network-wide queries. We present Flowyager, a system built on top of existing traffic capture utilities. Flowyager generates and analyzes tree data structures, which we call Flowtrees, that succinctly summarize the raw flow data available from capture utilities. Flowtrees are self-adjusting data structures that drastically reduce space and transfer requirements, by 75% to 95%, compared to raw flow records. Flowyager manages the storage and transfer of Flowtrees, supports Flowtree operators, and provides a structured query language for answering flow queries across sites and time periods. By deploying a Flowyager prototype at both a large Internet Exchange Point and a Tier-1 Internet Service Provider, we showcase its capabilities for networks with hundreds of router interfaces. Our results show that the query response time can be reduced by an order of magnitude compared with alternative data analytics platforms. Thus, Flowyager enables interactive network-wide queries and offers unprecedented drill-down capabilities to, e.g., identify DDoS culprits, pinpoint the involved sites, and determine the duration of the attack.
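To make the idea of a size-bounded, mergeable flow summary concrete, the following is a minimal Python sketch. It is not the Flowtree algorithm from the paper: the class name `PrefixSummary`, the `max_nodes` budget, the `step` parameter, and the "fold the most specific, lightest node first" policy are all assumptions made for illustration, and the sketch tracks only source IPv4 prefixes rather than full flow records.

```python
import ipaddress
from collections import defaultdict


class PrefixSummary:
    """Toy hierarchical summary of bytes per source IPv4 prefix (hypothetical)."""

    def __init__(self, max_nodes=1000, step=8):
        self.max_nodes = max_nodes      # assumed node budget for the summary
        self.step = step                # prefix bits dropped per generalization
        self.counts = defaultdict(int)  # network -> bytes attributed to it

    def add(self, src_ip, nbytes):
        """Record nbytes for one source address, then enforce the budget."""
        self.counts[ipaddress.ip_network(f"{src_ip}/32")] += nbytes
        self._compress()

    def _compress(self):
        """Fold the most specific, lightest node into a shorter prefix."""
        while len(self.counts) > self.max_nodes:
            candidates = [n for n in self.counts if n.prefixlen > 0]
            if not candidates:
                break  # only the root (0.0.0.0/0) is left
            victim = min(candidates,
                         key=lambda n: (-n.prefixlen, self.counts[n]))
            new_len = max(victim.prefixlen - self.step, 0)
            parent = victim.supernet(new_prefix=new_len)
            self.counts[parent] += self.counts.pop(victim)

    def merge(self, other):
        """Combine two summaries, e.g. from two sites or two time bins."""
        merged = PrefixSummary(max(self.max_nodes, other.max_nodes), self.step)
        for summary in (self, other):
            for net, nbytes in summary.counts.items():
                merged.counts[net] += nbytes
        merged._compress()
        return merged

    def query(self, prefix):
        """Lower bound on bytes from sources inside the given prefix."""
        q = ipaddress.ip_network(prefix)
        return sum(b for net, b in self.counts.items() if net.subnet_of(q))


# Usage: two per-site summaries with a deliberately tiny budget.
site_a = PrefixSummary(max_nodes=2)
site_a.add("10.0.0.1", 100)
site_a.add("10.0.0.2", 50)
site_a.add("192.0.2.7", 900)

site_b = PrefixSummary(max_nodes=2)
site_b.add("10.0.0.9", 25)

network_wide = site_a.merge(site_b)
print(network_wide.query("10.0.0.0/8"))    # 175
print(network_wide.query("192.0.2.0/24"))  # 900
```

In this sketch the two 10.0.0.x hosts are collapsed into 10.0.0.0/24 once the budget is exceeded, so a query for 10.0.0.1/32 returns 0: per-host detail is traded for bounded size, and `query` reports only a lower bound. A real system would need a smarter folding policy (for example, one that preserves heavy hitters) and would cover more flow fields than the source address.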