Paper title
Robustness to Missing Features using Hierarchical Clustering with Split Neural Networks
Paper authors
Paper abstract
Missing data is a long-standing problem and a major obstacle in machine learning and statistical data analysis. Past work in this field has either filled in the missing values with various data imputation techniques or trained neural networks (NNs) on the incomplete data. In this work, we propose a simple yet effective approach that clusters similar input features together using hierarchical clustering and then trains proportionately split neural networks with a joint loss. We evaluate this approach on a series of benchmark datasets and show promising improvements even with simple imputation techniques. We attribute this to learning through clusters of similar features in our model architecture. The source code is available at https://github.com/usarawgi911/Robustness-to-Missing-Features
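
As a rough illustration of the two structural steps the abstract names (grouping similar input features by hierarchical clustering, then sizing one sub-network per group in proportion to its feature count), the following minimal sketch uses only the Python standard library. The distance measure (1 - |Pearson r|), the average-linkage rule, the helper names `pearson`, `cluster_features`, and `split_hidden_units`, and the rounding rule are all assumptions made for illustration; none of them is taken from the paper or its repository. The joint loss and the training loop are not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def cluster_features(columns, n_clusters):
    """Average-linkage agglomerative clustering of feature columns.

    Assumed distance between two features: 1 - |Pearson r|.
    Returns a sorted list of clusters, each a sorted list of feature indices.
    """
    d = [[1.0 - abs(pearson(a, b)) for b in columns] for a in columns]
    clusters = [[i] for i in range(len(columns))]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        return sum(d[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find and merge the closest pair of clusters.
        p, q = min(
            ((p, q) for p in range(len(clusters)) for q in range(p + 1, len(clusters))),
            key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]),
        )
        merged = sorted(clusters[p] + clusters[q])
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)] + [merged]
    return sorted(clusters)


def split_hidden_units(clusters, total_units):
    """Assign hidden units to each cluster's sub-network in proportion to its size.

    Rounding means the parts may not sum exactly to total_units.
    """
    n_features = sum(len(c) for c in clusters)
    return [max(1, round(total_units * len(c) / n_features)) for c in clusters]


# Toy data: features 0 and 1 are linearly related, as are features 2 and 3.
columns = [
    [1, 2, 3, 4, 5],
    [3, 5, 7, 9, 11],
    [5, 1, 4, 2, 3],
    [10, 2, 8, 4, 6],
]
clusters = cluster_features(columns, n_clusters=2)
print(clusters)                          # [[0, 1], [2, 3]]
print(split_hidden_units(clusters, 16))  # [8, 8]
```

In a full pipeline each cluster's features would feed their own sub-network of the allotted width, and the sub-network outputs would be combined under a single loss. How that combination is done is left open here, since the abstract does not specify it.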