Paper Title

Deep Forest with Hashing Screening and Window Screening

Authors

Ma, Pengfei; Wu, Youxi; Li, Yan; Guo, Lei; Jiang, He; Zhu, Xingquan; Wu, Xindong

Abstract

As a novel deep learning model, gcForest has been widely used in various applications. However, its multi-grained scanning produces many redundant feature vectors, which increases the model's time cost. To screen out these redundant feature vectors, we introduce a hashing screening mechanism for multi-grained scanning and propose a model called HW-Forest, which adopts two strategies: hashing screening and window screening. In the hashing screening strategy, HW-Forest employs a perceptual hashing algorithm to calculate the similarity between feature vectors; this similarity is used to remove the redundant feature vectors produced by multi-grained scanning, which can significantly decrease time cost and memory consumption. Furthermore, we adopt a self-adaptive instance screening strategy, called window screening, to improve the performance of our approach; it can achieve higher accuracy without hyperparameter tuning on different datasets. Our experimental results show that HW-Forest achieves higher accuracy than other models while also reducing the time cost.
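The idea of screening redundant feature vectors by hash similarity can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the hash or the similarity rule, so the mean-threshold "average hash", the Hamming-distance criterion, and the names `sliding_windows`, `average_hash`, `hashing_screen`, and `max_distance` are all assumptions made for illustration.

```python
import numpy as np


def sliding_windows(x, window, stride=1):
    """Return every length-`window` slice of the 1-D vector `x`.

    This mimics one grain of multi-grained scanning on a single instance.
    """
    x = np.asarray(x, dtype=float)
    starts = range(0, len(x) - window + 1, stride)
    return np.stack([x[s:s + window] for s in starts])


def average_hash(v):
    """Binary fingerprint: 1 where an entry exceeds the vector's mean, else 0.

    Assumption: a simple stand-in for a perceptual hash, not the paper's choice.
    """
    v = np.asarray(v, dtype=float)
    return (v > v.mean()).astype(np.uint8)


def hashing_screen(windows, max_distance=0):
    """Return indices of windows kept after screening.

    A window is kept only if its hash differs from every already-kept hash
    by more than `max_distance` bits (Hamming distance).
    """
    kept_idx, kept_hashes = [], []
    for i, w in enumerate(windows):
        h = average_hash(w)
        if all(np.count_nonzero(h != k) > max_distance for k in kept_hashes):
            kept_idx.append(i)
            kept_hashes.append(h)
    return kept_idx


x = [1, 5, 1, 5, 1, 5, 9, 9]
windows = sliding_windows(x, window=4)
kept = hashing_screen(windows)
print(len(windows), kept)
```

In this toy input, the third window repeats the first, so it is screened out before any forest would be trained on it. Raising `max_distance` makes the screening more aggressive, trading retained information for fewer feature vectors.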
