Paper Title
EGEON: Software-Defined Data Protection for Object Storage
Paper Authors
Abstract
With the growing popularity of cloud computing, object storage systems (e.g., Amazon S3, OpenStack Swift, Ceph) have gained momentum for their relatively low per-GB costs and high availability. However, as more and more sensitive data is accrued, the need to natively integrate privacy controls into the storage layer is growing in relevance. Today, because the object storage interface is so limited, privacy controls are enforced by data curators with full access to the data in the clear. This motivates the need for a new approach to data privacy that can provide strong assurance and control to data owners. To fulfill this need, this paper presents EGEON, a novel software-defined data protection framework for object storage. EGEON enables users to declaratively set privacy policies on how their data can be shared. In the privacy policies, users can build complex data protection services through the composition of data transformations, which are invoked inline by EGEON upon a read request. As a result, data owners can trivially expose multiple views of the same piece of data, and modify these views simply by updating the policies, all without restructuring the internals of the underlying object storage system. The EGEON prototype has been built atop OpenStack Swift. Evaluation results show promise in developing data protection services with little overhead directly in the object store. Further, depending on the amount of data filtered out in the transformed views, end-to-end latency can be low due to the savings in network communication.
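The idea of composing data transformations in a policy and applying them inline on a read can be illustrated with a minimal sketch. It is not EGEON's actual policy language or API, which the abstract does not describe. Every name here (`redact`, `mask`, `POLICY`, `read_object`, the role names, and the JSON record layout) is a hypothetical chosen for illustration only.

```python
import json
from typing import Callable, Dict, List

# Hypothetical illustration only: none of these names come from EGEON.
Transform = Callable[[dict], dict]


def redact(field: str) -> Transform:
    """Return a transformation that drops one field from a record."""
    def apply(record: dict) -> dict:
        return {k: v for k, v in record.items() if k != field}
    return apply


def mask(field: str, keep: int = 2) -> Transform:
    """Return a transformation that keeps only the first `keep` characters of a field."""
    def apply(record: dict) -> dict:
        out = dict(record)
        if field in out:
            value = str(out[field])
            out[field] = value[:keep] + "*" * max(len(value) - keep, 0)
        return out
    return apply


# Assumed policy shape: a requester role maps to an ordered list of transformations.
POLICY: Dict[str, List[Transform]] = {
    "owner": [],                               # full view, no transformation
    "analyst": [redact("ssn"), mask("name")],  # protected view
}


def read_object(stored: bytes, role: str,
                policy: Dict[str, List[Transform]] = POLICY) -> bytes:
    """Apply the role's transformations in order on a read; deny unknown roles."""
    if role not in policy:
        raise PermissionError(f"no policy entry for role {role!r}")
    record = json.loads(stored)
    for transform in policy[role]:
        record = transform(record)
    return json.dumps(record, sort_keys=True).encode()
```

In this sketch the stored object is never rewritten. Each role gets its own view of the same bytes, and changing a view only requires editing that role's entry in the policy dictionary.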