Paper Title

Learning useful representations for shifting tasks and distributions

Paper Authors

Jianyu Zhang, Léon Bottou

Paper Abstract

Does the dominant approach to learn representations (as a side effect of optimizing an expected cost for a single training distribution) remain a good approach when we are dealing with multiple distributions? Our thesis is that such scenarios are better served by representations that are richer than those obtained with a single optimization episode. We support this thesis with simple theoretical arguments and with experiments utilizing an apparently naïve ensembling technique: concatenating the representations obtained from multiple training episodes using the same data, model, algorithm, and hyper-parameters, but different random seeds. These independently trained networks perform similarly. Yet, in a number of scenarios involving new distributions, the concatenated representation performs substantially better than an equivalently sized network trained with a single training run. This proves that the representations constructed by multiple training episodes are in fact different. Although their concatenation carries little additional information about the training task under the training distribution, it becomes substantially more informative when tasks or distributions change. Meanwhile, a single training episode is unlikely to yield such a redundant representation because the optimization process has no reason to accumulate features that do not incrementally improve the training performance.

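The ensembling technique described in the abstract amounts to freezing several networks trained with identical data and hyper-parameters but different random seeds, then using the concatenation of their features as the representation for a new task or distribution. The following is a minimal sketch of that idea, not the authors' implementation; the backbone architecture, feature dimensions, and linear probe are hypothetical choices for illustration.

```python
import torch
import torch.nn as nn

def make_backbone(seed: int, in_dim: int = 784, feat_dim: int = 128) -> nn.Module:
    # Hypothetical small backbone; each seed stands for an independent training episode
    # run with the same data, model, algorithm, and hyper-parameters.
    torch.manual_seed(seed)
    return nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, feat_dim))

class ConcatRepresentation(nn.Module):
    """Concatenate the frozen features of independently trained backbones."""
    def __init__(self, backbones):
        super().__init__()
        self.backbones = nn.ModuleList(backbones)
        for p in self.backbones.parameters():
            p.requires_grad_(False)  # representations stay fixed; only a new head is trained

    def forward(self, x):
        return torch.cat([b(x) for b in self.backbones], dim=-1)

# Usage sketch: K = 3 training episodes differing only in the random seed.
backbones = [make_backbone(seed) for seed in range(3)]  # each would be trained separately
features = ConcatRepresentation(backbones)
head = nn.Linear(3 * 128, 10)  # linear probe on the concatenated representation for the new task
logits = head(features(torch.randn(4, 784)))
print(logits.shape)  # torch.Size([4, 10])
```

In this sketch each backbone alone would perform similarly on the training distribution, while the concatenated (3 × 128)-dimensional representation is what is evaluated under shifted tasks or distributions.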