Paper Title
An Experimental Study of Semantic Continuity for Deep Learning Models
Paper Authors
Paper Abstract
Deep learning models suffer from the problem of semantic discontinuity: small perturbations in the input space tend to cause semantic-level interference in the model output. We argue that semantic discontinuity results from inappropriate training targets and contributes to notorious issues such as adversarial vulnerability and poor interpretability. We first conduct data analysis to provide evidence of semantic discontinuity in existing deep learning models, and then design a simple semantic continuity constraint that theoretically enables models to obtain smooth gradients and learn semantics-oriented features. Qualitative and quantitative experiments demonstrate that semantically continuous models successfully reduce their reliance on non-semantic information, which in turn improves adversarial robustness, interpretability, and model transfer, and mitigates machine bias.
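The abstract does not spell out the exact form of the semantic continuity constraint, but the idea it describes, penalizing changes in model output under small input perturbations, can be sketched as a training-time regularizer. The following is a minimal PyTorch sketch under that assumption; the perturbation scheme (uniform noise of magnitude `epsilon`) and the weighting hyperparameter `lambda_sc` are illustrative choices, not the paper's specification.

```python
import torch
import torch.nn.functional as F

def semantic_continuity_loss(model, x, y, epsilon=0.01, lambda_sc=1.0):
    """Task loss plus a continuity penalty that discourages output changes
    under small input perturbations (a sketch of the abstract's idea, not
    the paper's exact formulation)."""
    logits = model(x)
    task_loss = F.cross_entropy(logits, y)

    # Draw a small random perturbation within an epsilon-ball.
    # The paper's actual perturbation scheme is not given in the abstract.
    delta = torch.empty_like(x).uniform_(-epsilon, epsilon)
    logits_perturbed = model(x + delta)

    # Penalize divergence between clean and perturbed outputs, so that
    # nearby inputs map to nearby (semantically consistent) outputs.
    continuity_loss = F.mse_loss(logits_perturbed, logits)

    return task_loss + lambda_sc * continuity_loss
```

Minimizing such a penalty encourages locally smooth input-output behavior, which is consistent with the abstract's claim that the constraint yields smooth gradients and reduces reliance on non-semantic information.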