Paper Title
Semi-Supervised Junction Tree Variational Autoencoder for Molecular Property Prediction
Paper Authors
Paper Abstract
Molecular representation learning is essential to solving many drug discovery and computational chemistry problems. It is challenging because of the complex structure of molecules and the vast chemical space. Graph representations of molecules are more expressive than traditional representations such as molecular fingerprints, and can therefore improve the performance of machine learning models. We propose SeMole, a method that augments the Junction Tree Variational Autoencoder, a state-of-the-art generative model for molecular graphs, with semi-supervised learning. SeMole aims to improve the accuracy of molecular property prediction when labeled data is limited by exploiting unlabeled data. We condition the generation of molecular graphs on target properties by incorporating the property into the latent representation. We also propose an additional pre-training phase to improve the training process of our semi-supervised generative model. We perform an experimental evaluation on the ZINC dataset using three different molecular properties and demonstrate the benefits of semi-supervision.
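The abstract does not say how the property enters the latent representation or how labeled and unlabeled molecules are treated differently. As a rough illustration only, and not the authors' implementation, the Python sketch below assumes one common semi-supervised pattern: a property value is appended to the latent vector before decoding, labeled molecules use their known property and contribute a squared-error term, and unlabeled molecules fall back on a predicted property. The linear predictor, the function names, and all numeric values are hypothetical.

```python
def predict_property(z, w, b):
    """Hypothetical linear property predictor applied to a latent vector z."""
    return sum(wi * zi for wi, zi in zip(w, z)) + b


def decoder_input(z, y):
    """Condition the decoder by appending the property value y to z (assumed scheme)."""
    return list(z) + [y]


def semi_supervised_step(z, w, b, y_true=None):
    """Return (decoder input, supervised loss) for a single molecule.

    Labeled molecule: condition on the known property and add a squared error.
    Unlabeled molecule: condition on the predicted property, no supervised loss.
    """
    y_pred = predict_property(z, w, b)
    if y_true is None:
        return decoder_input(z, y_pred), 0.0
    return decoder_input(z, y_true), (y_pred - y_true) ** 2


# Stand-in values; a real model would obtain z from the graph/tree encoder.
z = [0.5, -1.0, 2.0]
w, b = [1.0, 0.0, 0.5], 0.1

labeled_input, labeled_loss = semi_supervised_step(z, w, b, y_true=2.0)
unlabeled_input, unlabeled_loss = semi_supervised_step(z, w, b)
```

In a full model, the supervised term would be added to the usual VAE reconstruction and KL terms, which are omitted here to keep the sketch short.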