Paper title
Gini in a Bottleneck: Sparse Molecular Representations for Graph Convolutional Neural Networks
Paper authors
Paper abstract
Due to the nature of deep learning approaches, it is inherently difficult to understand which aspects of a molecular graph drive the predictions of the network. As a mitigation strategy, we constrain certain weights in a multi-task graph convolutional neural network according to the Gini index to maximize the "inequality" of the learned representations. We show that this constraint does not degrade evaluation metrics for some targets, and that it allows us to combine the outputs of the graph convolutional operation in a visually interpretable way. We then perform a proof-of-concept experiment on quantum chemistry targets from the public QM9 dataset, and a larger experiment on ADMET targets for proprietary drug-like molecules. Because explainability is difficult to benchmark in the latter case, we informally surveyed medicinal chemists within our organization to check for agreement between the regions of the molecule that they and the model identified as relevant to the properties in question.
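The abstract does not spell out how the Gini index is computed or how it enters the training objective. As a rough illustration only, the sketch below uses the standard Gini coefficient of a vector of non-negative values, applied to the magnitudes of a weight vector. The function name `gini_index`, the use of absolute values, and the `eps` stabilizer are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def gini_index(weights, eps=1e-12):
    """Standard Gini coefficient of the magnitudes of `weights`.

    Hypothetical helper for illustration; the paper's exact
    formulation may differ. Returns 0 for perfectly equal values
    and approaches 1 as the mass concentrates in a single entry.
    """
    # Flatten, take magnitudes, and sort in ascending order.
    x = np.sort(np.abs(np.asarray(weights, dtype=float).ravel()))
    n = x.size
    ranks = np.arange(1, n + 1)
    # G = sum_i (2i - n - 1) * x_i / (n * sum_i x_i), with x sorted ascending.
    return float(np.sum((2 * ranks - n - 1) * x) / (n * np.sum(x) + eps))


uniform = gini_index([1.0, 1.0, 1.0, 1.0])  # equal weights: no inequality
sparse = gini_index([0.0, 0.0, 0.0, 1.0])   # one dominant weight: (n - 1) / n
```

Under this definition, a weight vector dominated by a few entries scores higher than a uniform one, which is the sense in which maximizing the index would push a representation toward sparsity.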