Paper Title

A journey in ESN and LSTM visualisations on a language task

Authors

Alexandre Variengien, Xavier Hinaut

Abstract

Echo State Networks (ESN) and Long Short-Term Memory networks (LSTM) are two popular Recurrent Neural Network (RNN) architectures for solving machine learning tasks involving sequential data. However, little has been done to compare their performance and internal mechanisms on a common task. In this work, we trained ESNs and LSTMs on a Cross-Situational Learning (CSL) task. This task aims at modelling how infants learn language: they create associations between words and visual stimuli in order to extract meaning from words and sentences. The results are of three kinds: performance comparison, internal dynamics analyses, and visualisation of the latent space. (1) We found that both models were able to successfully learn the task: the LSTM reached the lowest error for the basic corpus, but the ESN was quicker to train. Furthermore, the ESN was able to outperform LSTMs on more challenging datasets without any further tuning. (2) We also conducted an analysis of the internal unit activations of LSTMs and ESNs. Despite the deep differences between the two models (trained vs. fixed internal weights), we were able to uncover similar inner mechanisms: both put emphasis on units encoding aspects of the sentence structure. (3) Moreover, we present Recurrent States Space Visualisation (RSSviz), a method to visualise the structure of the latent state space of RNNs, based on dimensionality reduction (using UMAP). This technique enables us to observe a fractal embedding of sequences in the LSTM. RSSviz is also useful for the analysis of ESNs: (i) to spot difficult examples and (ii) to generate animated plots showing the evolution of activations across learning stages. Finally, we explore qualitatively how RSSviz can provide an intuitive visualisation of the influence of hyperparameters on reservoir dynamics prior to ESN training.
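The following is a minimal sketch (not the authors' implementation) of the idea behind an RSSviz-style visualisation: run a small ESN-style reservoir with fixed random weights over toy input sequences, collect the recurrent states at every time step, and project them to 2D with UMAP. The reservoir size, spectral radius, leak rate, and toy one-hot inputs are illustrative assumptions, not values or data from the paper.

```python
# RSSviz-style sketch: project recurrent states of a toy ESN with UMAP.
# All hyperparameters below are illustrative assumptions.
import numpy as np
import umap                      # pip install umap-learn
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n_inputs, n_reservoir = 10, 300

# Fixed random weights; the recurrent matrix is rescaled to a chosen
# spectral radius (a standard ESN construction).
W_in = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_inputs))
W = rng.uniform(-1.0, 1.0, size=(n_reservoir, n_reservoir))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # spectral radius = 0.9

def run_reservoir(inputs, leak=0.3):
    """Collect one reservoir state per time step (leaky tanh units)."""
    x = np.zeros(n_reservoir)
    states = []
    for u in inputs:
        x = (1 - leak) * x + leak * np.tanh(W_in @ u + W @ x)
        states.append(x.copy())
    return np.array(states)

# Toy input: random one-hot "word" sequences standing in for sentences.
sentences = [np.eye(n_inputs)[rng.integers(0, n_inputs, size=12)]
             for _ in range(50)]
all_states = np.vstack([run_reservoir(s) for s in sentences])

# Reduce the high-dimensional recurrent states to 2D and plot them.
embedding = umap.UMAP(n_components=2, random_state=0).fit_transform(all_states)
plt.scatter(embedding[:, 0], embedding[:, 1], s=5)
plt.title("UMAP projection of reservoir states (RSSviz-style sketch)")
plt.show()
```

Because the reservoir weights are fixed, a plot like this can be produced before any training, which is how the abstract's last point (inspecting the effect of hyperparameters on reservoir dynamics prior to ESN training) can be explored: rerun the sketch with different spectral radii or leak rates and compare the resulting state-space structure.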
