Prayatul Matrix: A Direct Comparison Approach to Evaluate Performance of Supervised Machine Learning Models

Paper Title

Prayatul Matrix: A Direct Comparison Approach to Evaluate Performance of Supervised Machine Learning Models

Authors

Biswas, Anupam

Abstract

Performance comparison of supervised machine learning (ML) models is widely done in terms of different confusion-matrix-based scores obtained on test datasets. However, a dataset comprises several instances with different difficulty levels. Therefore, it is more logical to compare the effectiveness of ML models on individual instances than to compare scores obtained for the entire dataset. In this paper, an alternative approach is proposed for direct comparison of supervised ML models in terms of individual instances within the dataset. A direct comparison matrix called the Prayatul Matrix is introduced, which accounts for the comparative outcome of two ML algorithms on different instances of a dataset. Five different performance measures are designed based on the Prayatul Matrix. The efficacy of the proposed approach and of the designed measures is analyzed with four classification techniques on three datasets. The approach is also analyzed on four large-scale complex image datasets with four deep learning models, namely ResNet50V2, MobileNetV2, EfficientNet, and XceptionNet. The results show that the newly designed measures give insight into the compared ML algorithms that was not possible with existing confusion-matrix-based scores such as accuracy, precision, and recall.
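
To make the idea of instance-level comparison concrete, the following is a minimal sketch of a per-instance cross-tabulation of two classifiers. It is an illustration only: the abstract does not give the exact definition of the Prayatul Matrix or of the five measures, so the function name `compare_per_instance`, the four cell names, and the toy labels are all assumptions made for this example rather than the paper's formulation.

```python
from collections import Counter


def compare_per_instance(y_true, pred_a, pred_b):
    """Cross-tabulate the correctness of two classifiers on each instance.

    Hypothetical helper: the cell names below are assumptions chosen for
    illustration, not the paper's definition of the Prayatul Matrix.
    """
    if not (len(y_true) == len(pred_a) == len(pred_b)):
        raise ValueError("all inputs must have the same length")

    counts = Counter()
    for truth, a, b in zip(y_true, pred_a, pred_b):
        # Key is (is model A correct?, is model B correct?) for this instance.
        counts[(a == truth, b == truth)] += 1

    return {
        "both_correct": counts[(True, True)],
        "only_a_correct": counts[(True, False)],
        "only_b_correct": counts[(False, True)],
        "both_wrong": counts[(False, False)],
    }


# Toy data (made up): both models get 4 of 6 instances right,
# but they succeed on different instances.
y_true = [0, 1, 1, 0, 1, 0]
pred_a = [0, 1, 0, 0, 1, 1]
pred_b = [0, 0, 1, 0, 1, 1]

table = compare_per_instance(y_true, pred_a, pred_b)
print(table)
```

In this toy case the two models have identical accuracy, yet the table shows one instance that only the first model classifies correctly and one that only the second does. That is the kind of difference an aggregate score computed over the whole dataset cannot reveal.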
