Paper Title
Can Offline Reinforcement Learning Help Natural Language Understanding?
Paper Authors
Paper Abstract
Pre-training has been a useful method for learning implicit transferable knowledge, and it has shown the benefit of offering complementary features across different modalities. Recent work mainly focuses on modalities such as image and text; for example, studies show that visual features learned from images can help visually grounded language understanding. In this paper, we investigate the potential connection between offline reinforcement learning (RL) and language modeling (LM). Intuitively, RL and LM are similar in that both predict the next state based on the current and previous states, relying on both local and long-range dependencies across states. To validate this hypothesis, we pre-train Transformer models on different offline RL tasks and then evaluate them on various language-related tasks. Experimental results show that our RL pre-trained models achieve performance close to that of models trained with the LM objective, suggesting that common useful features exist across the two modalities. To further explore this potential relationship, we investigate factors such as the Markov property and the sequential nature of RL trajectories.
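The abstract gives no implementation details, so the following is only a minimal sketch of the analogy it draws: offline RL trajectories flattened into sequences, with a causal Transformer trained to predict the next step, mirroring the next-token LM objective. All module names, dimensions, and the state-only simplification (real trajectories would interleave states, actions, and returns) are illustrative assumptions, not the authors' method.

```python
# Sketch (not the paper's code): pre-train a causal Transformer on offline RL
# trajectories by predicting s_{t+1} from s_{<=t}, the RL analogue of the
# next-token LM loss. Hyperparameters and the state-only setup are assumptions.
import torch
import torch.nn as nn

class TrajectoryTransformer(nn.Module):
    def __init__(self, state_dim, d_model=128, n_layers=2, n_heads=4, max_len=64):
        super().__init__()
        self.embed = nn.Linear(state_dim, d_model)   # project states to model dim
        self.pos = nn.Embedding(max_len, d_model)    # learned position embeddings
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, state_dim)    # predict the next state

    def forward(self, states):                       # states: (B, T, state_dim)
        B, T, _ = states.shape
        h = self.embed(states) + self.pos(torch.arange(T, device=states.device))
        mask = nn.Transformer.generate_square_subsequent_mask(T).to(states.device)
        h = self.encoder(h, mask=mask)               # causal attention, like an LM
        return self.head(h)

# One training step on a fake offline RL batch: shift targets by one step,
# exactly as in next-token language-model training.
model = TrajectoryTransformer(state_dim=8)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
traj = torch.randn(4, 16, 8)                         # (batch, horizon, state_dim)
opt.zero_grad()
pred = model(traj[:, :-1])                           # predict from prefix s_{<=t}
loss = nn.functional.mse_loss(pred, traj[:, 1:])     # regress the next state
loss.backward()
opt.step()
```

Under this framing, swapping the trajectory batch for token embeddings and the regression head for a vocabulary softmax recovers the standard LM objective, which is the shared structure the paper's experiments probe.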