Paper Title
Reliable Local Explanations for Machine Listening
Paper Authors
Paper Abstract
One way to analyse the behaviour of machine learning models is through local explanations that highlight the input features that maximally influence model predictions. Sensitivity analysis, which examines the effect of input perturbations on model predictions, is one method for generating local explanations. Meaningful input perturbations are essential for generating reliable explanations, but there is limited work on what constitutes such perturbations and how to perform them. This work investigates these questions in the context of machine listening models that analyse audio. Specifically, we use a state-of-the-art deep singing voice detection (SVD) model to analyse whether explanations from SoundLIME (a local explanation method) are sensitive to how the method perturbs model inputs. The results demonstrate that SoundLIME explanations are sensitive to the content in the occluded input regions. We further propose and demonstrate a novel method for quantitatively identifying suitable content type(s) for reliably occluding the inputs of machine listening models. The results for the SVD model suggest that the average magnitude of the input mel-spectrogram bins is the most suitable content type for temporal explanations.
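The occlusion idea described in the abstract can be illustrated with a minimal sketch, assuming a mel spectrogram laid out as bins × frames: a temporal segment is replaced by each bin's average magnitude, and the resulting change in a model's prediction indicates how much that segment influenced the output. The `predict` function and array shapes below are hypothetical placeholders, not the paper's actual SVD model or pipeline.

```python
# Hedged sketch of occlusion-based sensitivity analysis on a mel spectrogram.
# Assumption: `mel` is a 2-D array (mel bins x time frames); the predictor is
# a stand-in scalar function, not the paper's singing voice detection model.
import numpy as np

def occlude_with_mean(mel, start, end):
    """Return a copy of `mel` with frames [start, end) replaced by each
    mel bin's average magnitude over the whole clip."""
    occluded = mel.copy()
    occluded[:, start:end] = mel.mean(axis=1, keepdims=True)
    return occluded

def sensitivity(predict, mel, start, end):
    """Change in prediction when the segment is occluded; a larger change
    suggests the segment influenced the prediction more."""
    return float(predict(mel) - predict(occlude_with_mean(mel, start, end)))

# Toy usage with a placeholder "model" that scores energy in a mid band.
rng = np.random.default_rng(0)
mel = rng.random((80, 100))            # 80 mel bins, 100 time frames (hypothetical)
predict = lambda m: float(m[20:40].mean())
score = sensitivity(predict, mel, 10, 30)
```

Other occlusion content types (e.g. zeros or noise) would only change the body of `occlude_with_mean`; the paper's contribution is a quantitative way to decide which such content type yields reliable explanations.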