Paper Title

Improving Molecular Representation Learning with Metric Learning-enhanced Optimal Transport

Authors

Wu, Fang; Courty, Nicolas; Jin, Shuting; Li, Stan Z.

Abstract

Training data are usually limited or heterogeneous in many chemical and biological applications. Existing machine learning models for chemistry and materials science fail to consider generalizing beyond training domains. In this article, we develop a novel optimal transport-based algorithm termed MROT to enhance their generalization capability for molecular regression problems. MROT learns a continuous label of the data by measuring a new metric of domain distances and a posterior variance regularization over the transport plan to bridge the chemical domain gap. Among downstream tasks, we consider basic chemical regression tasks in unsupervised and semi-supervised settings, including chemical property prediction and materials adsorption selection. Extensive experiments show that MROT significantly outperforms state-of-the-art models, showing promising potential in accelerating the discovery of new substances with desired properties.
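
To make the transport-plan idea in the abstract concrete, the following is a minimal sketch of generic entropy-regularized optimal transport (Sinkhorn iterations) used to carry continuous labels from a source domain to a target domain. It is not the MROT algorithm: the learned domain metric and the posterior variance regularization are not reproduced here, and the function names, the squared Euclidean cost, the synthetic data, and the `reg` and `n_iters` values are all assumptions chosen for illustration.

```python
import numpy as np


def sinkhorn_plan(cost, reg=0.1, n_iters=500):
    """Entropy-regularized OT plan between two uniform empirical distributions.

    Generic Sinkhorn iterations (an assumption for illustration, not MROT).
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)      # uniform weights on source samples
    b = np.full(m, 1.0 / m)      # uniform weights on target samples
    kernel = np.exp(-cost / reg)  # Gibbs kernel of the cost matrix
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (kernel @ v)      # rescale rows toward marginal a
        v = b / (kernel.T @ u)    # rescale columns to match marginal b
    return u[:, None] * kernel * v[None, :]


def transport_labels(plan, y_source):
    """Estimate each target label as a plan-weighted average of source labels."""
    column_mass = plan.sum(axis=0)
    return (plan.T @ y_source) / column_mass


# Synthetic stand-ins for molecular features and a continuous property.
rng = np.random.default_rng(0)
x_src = rng.normal(0.0, 1.0, size=(50, 2))   # labeled source domain
y_src = x_src[:, 0] + 0.5 * x_src[:, 1]      # hypothetical regression target
x_tgt = rng.normal(0.5, 1.0, size=(40, 2))   # unlabeled, shifted target domain

# Squared Euclidean cost, scaled to [0, 1] so the kernel stays well conditioned.
diff = x_src[:, None, :] - x_tgt[None, :, :]
cost = (diff ** 2).sum(axis=-1)
cost = cost / cost.max()

plan = sinkhorn_plan(cost)
y_tgt_hat = transport_labels(plan, y_src)
```

Each estimated target label is a convex combination of source labels, so it always lies within the range of the source labels. A method in the spirit of the abstract would replace the fixed Euclidean cost with a learned metric and add a regularizer on the plan.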
