Paper Title

Language Models as Agent Models

Author

Andreas, Jacob

Abstract

Language models (LMs) are trained on collections of documents, written by individual human agents to achieve specific goals in an outside world. During training, LMs have access only to the text of these documents, with no direct evidence of the internal states of the agents that produced them -- a fact often used to argue that LMs are incapable of modeling goal-directed aspects of human language production and comprehension. Can LMs trained on text learn anything at all about the relationship between language and use? I argue that LMs are models of intentional communication in a specific, narrow sense. When performing next-word prediction given a textual context, an LM can infer and represent properties of an agent likely to have produced that context. These representations can in turn influence subsequent LM generation in the same way that agents' communicative intentions influence their language. I survey findings from the recent literature showing that -- even in today's non-robust and error-prone models -- LMs infer and use representations of fine-grained communicative intentions and more abstract beliefs and goals. Despite the limited nature of their training data, they can thus serve as building blocks for systems that communicate and act intentionally.
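As a loose illustration of the abstract's central claim -- that next-word prediction conditions on context in a way that reflects inferred properties of the author -- the following sketch contrasts completions of the same sentence stem under two contexts implying different communicative intents. It is not from the paper; the model choice (gpt2), the prompts, and greedy decoding are illustrative assumptions.

```python
# Minimal sketch (assumptions: "gpt2" as the model, toy prompts):
# probe how a small causal LM's continuations shift when the context
# implies authors with different goals and attitudes.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Two contexts suggesting different agents, followed by the same stem.
contexts = [
    "Review by a delighted customer: The restaurant was",
    "Review by a furious customer: The restaurant was",
]

for prompt in contexts:
    inputs = tokenizer(prompt, return_tensors="pt")
    # Greedy decoding keeps the comparison deterministic.
    output = model.generate(
        **inputs,
        max_new_tokens=15,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Under this setup, the two prompts typically yield completions with opposite sentiment, consistent with the idea that the LM's predictions track an inferred agent rather than the sentence stem alone.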
