Paper Title
Deep Reinforcement Learning and Permissioned Blockchain for Content Caching in Vehicular Edge Computing and Networks
Authors
Abstract
Vehicular Edge Computing (VEC) is a promising paradigm that enables a huge amount of data and multimedia content to be cached in proximity to vehicles. However, the high mobility of vehicles and dynamic wireless channel conditions make it challenging to design an optimal content caching policy. Further, since their contents may carry sensitive personal information, vehicles may be unwilling to cache them with an untrusted caching provider. Deep Reinforcement Learning (DRL) is an emerging technique for solving problems with high-dimensional and time-varying features. Permissioned blockchain is able to establish a secure and decentralized peer-to-peer transaction environment. In this paper, we integrate DRL and permissioned blockchain into vehicular networks for intelligent and secure content caching. We first propose a blockchain-empowered distributed content caching framework in which vehicles perform content caching and base stations maintain the permissioned blockchain. Then, we exploit an advanced DRL approach to design an optimal content caching scheme that takes mobility into account. Finally, we propose a new block verifier selection method, Proof-of-Utility (PoU), to accelerate the block verification process. Security analysis shows that our proposed blockchain-empowered content caching achieves security and privacy protection. Numerical results based on a real dataset from Uber indicate that the DRL-inspired content caching scheme significantly outperforms two benchmark policies.