Title

Improving the Robustness of Neural Multiplication Units with Reversible Stochasticity

Authors

Bhumika Mistry, Katayoun Farrahi, Jonathon Hare

Abstract

Multilayer Perceptrons struggle to learn certain simple arithmetic tasks. Specialist neural modules for arithmetic can outperform classical architectures with gains in extrapolation, interpretability and convergence speeds, but are highly sensitive to the training range. In this paper, we show that Neural Multiplication Units (NMUs) are unable to reliably learn tasks as simple as multiplying two inputs when given different training ranges. Causes of failure are linked to inductive and input biases which encourage convergence to solutions in undesirable optima. A solution, the stochastic NMU (sNMU), is proposed to apply reversible stochasticity, encouraging avoidance of such optima whilst converging to the true solution. Empirically, we show that stochasticity provides improved robustness with the potential to improve learned representations of upstream networks for numerical and image tasks.
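Since the abstract only names the mechanism, here is a minimal sketch of reversible stochasticity applied to an NMU, assuming the standard NMU formulation of Madsen & Johansen (out_o = prod_i (W_io * x_i + 1 - W_io)) and uniform multiplicative input noise. The class names, weight initialisation, and noise range below are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch (PyTorch); hyperparameters and names are assumptions.
import torch
import torch.nn as nn


class NMU(nn.Module):
    """Neural Multiplication Unit: out_o = prod_i (W_io * x_i + 1 - W_io)."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # Assumed init; the weights are intended to converge to {0, 1}.
        self.W = nn.Parameter(0.25 + 0.5 * torch.rand(in_features, out_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        W = self.W.clamp(0.0, 1.0)
        # (batch, in, 1) * (in, out) -> (batch, in, out); product over inputs.
        return (x.unsqueeze(-1) * W + (1.0 - W)).prod(dim=1)


class StochasticNMU(nn.Module):
    """sNMU sketch: scale the inputs with random noise before the NMU and
    divide the noise back out afterwards. For a unit that has learned to
    multiply all of its inputs (weights of 1), the noise cancels exactly,
    so the stochasticity is reversible at the true solution; during
    training it perturbs optimisation away from undesirable optima."""

    def __init__(self, in_features: int, out_features: int,
                 noise_range=(1.0, 5.0)):  # assumed noise range
        super().__init__()
        self.nmu = NMU(in_features, out_features)
        self.noise_range = noise_range

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not self.training:
            return self.nmu(x)
        lo, hi = self.noise_range
        s = torch.empty_like(x).uniform_(lo, hi)
        # Reverse the injected noise: a true product absorbs prod(s).
        return self.nmu(s * x) / s.prod(dim=1, keepdim=True)
```

At evaluation time the module falls back to the plain NMU, so inference stays deterministic; the noise only reshapes the training-time loss surface, which is consistent with the abstract's claim that the stochasticity is avoided at convergence to the true solution.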
