Paper Title

Memory in humans and deep language models: Linking hypotheses for model augmentation

Paper Authors

Omri Raccah, Phoebe Chen, Ted L. Willke, David Poeppel, Vy A. Vo

Paper Abstract

The computational complexity of the self-attention mechanism in Transformer models significantly limits their ability to generalize over long temporal durations. Memory-augmentation, or the explicit storing of past information in external memory for subsequent predictions, has become a constructive avenue for mitigating this limitation. We argue that memory-augmented Transformers can benefit substantially from considering insights from the memory literature in humans. We detail an approach for integrating evidence from the human memory system through the specification of cross-domain linking hypotheses. We then provide an empirical demonstration to evaluate the use of surprisal as a linking hypothesis, and further identify the limitations of this approach to inform future research.
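The abstract proposes surprisal as a cross-domain linking hypothesis; surprisal is simply a token's negative log-probability under a language model, given its preceding context. The snippet below is a minimal sketch of how per-token surprisal might be computed. The choice of GPT-2 via the Hugging Face transformers library, the helper name token_surprisals, and the example sentence are illustrative assumptions, not the authors' actual setup.

```python
# Hypothetical sketch: per-token surprisal under a pretrained language model.
# Surprisal(t) = -log2 p(token_t | token_<t). Model choice (GPT-2) is an assumption.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def token_surprisals(text: str):
    """Return (token, surprisal in bits) pairs for every token after the first."""
    enc = tokenizer(text, return_tensors="pt")
    input_ids = enc["input_ids"]
    with torch.no_grad():
        logits = model(input_ids).logits            # (1, seq_len, vocab_size)
    # Log-probability assigned to each token given its preceding context.
    log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
    targets = input_ids[:, 1:]
    token_log_probs = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    surprisal_bits = (-token_log_probs / torch.log(torch.tensor(2.0))).squeeze(0)
    tokens = tokenizer.convert_ids_to_tokens(targets.squeeze(0))
    return list(zip(tokens, surprisal_bits.tolist()))

if __name__ == "__main__":
    for tok, s in token_surprisals("The cat sat on the mat."):
        print(f"{tok!r}: {s:.2f} bits")
```

Under a linking hypothesis like the one evaluated in the paper, such per-token surprisal values could be used to decide which past items are worth writing to (or retaining in) an external memory.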
