Paper Title


Optimizing Wireless Systems Using Unsupervised and Reinforced-Unsupervised Deep Learning

Authors

Dong Liu, Chengjian Sun, Chenyang Yang, Lajos Hanzo

Abstract


Resource allocation and transceivers in wireless networks are usually designed by solving optimization problems subject to specific constraints, which can be formulated as variable or functional optimization problems. If the objective and constraint functions of a variable optimization problem can be derived, standard numerical algorithms can be applied to find the optimal solution, but they incur high computational cost when the dimension of the variables is high. To reduce the online computational complexity, learning the optimal solution as a function of the environment's status with deep neural networks (DNNs) is an effective approach. DNNs can be trained under the supervision of optimal solutions, but this is not applicable to model-free scenarios or to functional optimization problems whose optimal solutions are hard to obtain. If the objective and constraint functions are unavailable, reinforcement learning can be applied to find the solution of a functional optimization problem, but it is not tailored to optimization problems in wireless networks. In this article, we introduce unsupervised and reinforced-unsupervised learning frameworks for solving both variable and functional optimization problems without the supervision of optimal solutions. When the mathematical model of the environment is completely known and the distribution of the environment's status is either known or unknown, we can invoke unsupervised learning algorithms. When the mathematical model of the environment is incomplete, we introduce reinforced-unsupervised learning algorithms that learn the model by interacting with the environment. Our simulation results confirm the applicability of these learning frameworks, taking a user association problem as an example.
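To make the unsupervised idea in the abstract concrete: instead of training a DNN on labeled optimal solutions, one can use the (known) objective itself as the training loss, sampling environment states and updating the network to maximize the objective directly. The minimal sketch below does this for a toy two-user power-allocation problem, not the paper's user association example. The softmax parameterization (which enforces the power budget by construction), the linear policy, the finite-difference gradient loop (a stand-in for backpropagation), and all names and constants are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
P_TOT = 10.0   # total power budget (assumed toy constant)
N_USERS = 2

def allocate(theta, h):
    """Policy: map channel gains h to powers via softmax(h @ theta) * P_TOT.
    The softmax makes the powers non-negative and sum exactly to P_TOT,
    so the budget constraint holds by construction (no penalty term needed)."""
    logits = h @ theta
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return P_TOT * e / e.sum(axis=-1, keepdims=True)

def loss(theta, h):
    """Unsupervised loss: negative average sum-rate over a batch of states.
    No optimal solutions are used as labels; the objective is the loss."""
    p = allocate(theta, h)
    return -np.mean(np.sum(np.log2(1.0 + h * p), axis=-1))

# Train a tiny linear policy by central finite differences
# (a stand-in for backpropagation, kept dependency-free).
theta = np.zeros((N_USERS, N_USERS))
eps, lr = 1e-4, 0.5
for step in range(300):
    h = rng.uniform(0.1, 2.0, size=(64, N_USERS))  # sampled channel states
    grad = np.zeros_like(theta)
    for i in range(N_USERS):
        for j in range(N_USERS):
            d = np.zeros_like(theta)
            d[i, j] = eps
            grad[i, j] = (loss(theta + d, h) - loss(theta - d, h)) / (2 * eps)
    theta -= lr * grad

# After training, the policy should favour the stronger channel,
# as water-filling would.
h_test = np.array([[1.5, 0.3]])
p = allocate(theta, h_test)[0]
```

The key design choice mirrored from the abstract is that the constraint is handled inside the network's output layer rather than in the loss, so every forward pass yields a feasible allocation; harder constraints would instead need a penalty or Lagrangian term in the unsupervised loss.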
