Multiple Imputation for Biomedical Data using Monte Carlo Dropout Autoencoders

Paper title

Multiple Imputation for Biomedical Data using Monte Carlo Dropout Autoencoders

Paper authors

Miok, Kristian, Nguyen-Doan, Dong, Robnik-Šikonja, Marko, Zaharie, Daniela

Abstract

Due to complex experimental settings, missing values are common in biomedical data. To handle this issue, many methods have been proposed, from ignoring incomplete instances to various data imputation approaches. With the recent rise of deep neural networks, the field of missing data imputation has oriented towards modelling of the data distribution. This paper presents an approach based on Monte Carlo dropout within (Variational) Autoencoders which offers not only very good adaptation to the distribution of the data but also allows generation of new data, adapted to each specific instance. The evaluation shows that the imputation error and predictive similarity can be improved with the proposed approach.
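The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of multiple imputation with Monte Carlo dropout, not the authors' method. It assumes a plain (non-variational) autoencoder with one tanh hidden layer, zero-filled missing entries, a reconstruction loss computed on observed entries only, and synthetic toy data. The names `forward`, `train`, and `mc_dropout_impute`, and all hyperparameters, are hypothetical. The key point is that dropout stays active at imputation time, so each forward pass yields a different plausible completion of the same instance.

```python
import numpy as np

def forward(X, params, p_drop, rng):
    """One stochastic pass; dropout stays active, so repeated calls differ."""
    W1, b1, W2, b2 = params
    H = np.tanh(X @ W1 + b1)
    # Inverted dropout mask: kept units are rescaled by 1 / (1 - p_drop).
    D = (rng.random(H.shape) >= p_drop) / (1.0 - p_drop)
    return H, D, (H * D) @ W2 + b2

def train(X, M, n_hidden=16, p_drop=0.2, lr=0.05, epochs=500, seed=0):
    """Fit the autoencoder by gradient descent on observed entries only.

    X has missing entries zero-filled; M is 1.0 where observed, else 0.0.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    params = [rng.normal(0, 0.1, (d, n_hidden)), np.zeros(n_hidden),
              rng.normal(0, 0.1, (n_hidden, d)), np.zeros(d)]
    for _ in range(epochs):
        W2 = params[2]
        H, D, Y = forward(X, params, p_drop, rng)
        dY = 2.0 * M * (Y - X) / M.sum()          # masked MSE gradient
        dZ = (dY @ W2.T) * D * (1.0 - H ** 2)     # back through dropout, tanh
        grads = [X.T @ dZ, dZ.sum(0), (H * D).T @ dY, dY.sum(0)]
        params = [p - lr * g for p, g in zip(params, grads)]
    return params

def mc_dropout_impute(X, M, params, n_samples=10, p_drop=0.2, seed=1):
    """Return n_samples completed copies of X, one per dropout pass."""
    rng = np.random.default_rng(seed)
    draws = []
    for _ in range(n_samples):
        _, _, Y = forward(X, params, p_drop, rng)
        draws.append(np.where(M == 1, X, Y))      # keep observed values
    return np.stack(draws)

# Toy data (an assumption): 5 correlated features driven by 2 latent factors,
# with roughly 20% of the entries removed at random.
rng = np.random.default_rng(42)
X_full = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 5))
M = (rng.random(X_full.shape) > 0.2).astype(float)
X_obs = X_full * M                                # zero-fill missing entries

params = train(X_obs, M)
imputations = mc_dropout_impute(X_obs, M, params)

point_estimate = imputations.mean(axis=0)         # pooled imputation
uncertainty = imputations.std(axis=0)             # spread across draws
```

In this sketch the spread across draws is zero for observed entries and positive for imputed ones, which is what allows the set of completed datasets to be used for multiple imputation rather than as a single point estimate.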
