Paper Title
Applying Machine Learning for Duplicate Detection, Throttling and Prioritization of Equipment Commissioning Audits at Fulfillment Network
Authors
Abstract
VQ (Vendor Qualification) and IOQ (Installation and Operation Qualification) audits are conducted in warehouses to ensure that all equipment being turned over in the fulfillment network meets quality standards. Audit checks are likely to be skipped when many checks must be performed in a short time. In addition, exploratory data analysis reveals several instances of similar checks being performed on the same assets, which duplicates effort. In this work, Natural Language Processing (NLP) and Machine Learning (ML) are applied to trim a large checklist dataset for a network of warehouses by identifying similar and duplicate checks and by predicting the non-critical checks that have a high passing rate. The study proposes ML classifiers that identify IOQ and VQ checks with a high probability of passing and assign priorities to checks, so that higher-priority checks are performed first when there is not enough time to perform all of them. This research proposes using the NLP-based BlazingText classifier to throttle checks with a high passing rate, which can eliminate 10%-37% of the checks and achieve a significant cost reduction. The applied algorithm outperforms Random Forest and Neural Network classifiers and achieves an area under the curve (AUC) of 90%. Because the data are imbalanced, down-sampling and upweighting have a positive impact on model performance as measured by the F1 score, which improves from 8% to 75%. In addition, the proposed duplicate detection process identifies 17% of the checks as possibly redundant and eligible for trimming.
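The abstract names down-sampling and upweighting as the remedy for class imbalance but does not describe the procedure. The following is a minimal sketch of the general technique, not the authors' implementation. It assumes that passing checks form the majority class, and the field names (`check_id`, `passed`), the function name, and the down-sampling factor are all hypothetical choices made for illustration.

```python
import random


def downsample_and_upweight(rows, label_key="passed", majority_label=True,
                            factor=5, seed=0):
    """Return (row, weight) pairs for training on imbalanced data.

    Roughly 1/factor of the majority-class rows are kept, and each kept
    row receives weight == factor. Minority rows are all kept with
    weight 1.0, so the weighted class totals stay close to the original
    class totals.
    """
    rng = random.Random(seed)
    majority = [r for r in rows if r[label_key] == majority_label]
    minority = [r for r in rows if r[label_key] != majority_label]

    # Down-sample: keep only a fraction of the majority class.
    keep_n = max(1, len(majority) // factor) if majority else 0
    kept = rng.sample(majority, keep_n)

    # Upweight: each kept majority row stands in for `factor` original rows.
    weighted = [(r, float(factor)) for r in kept]
    weighted += [(r, 1.0) for r in minority]
    rng.shuffle(weighted)
    return weighted


# Hypothetical audit data: 10 failed checks and 90 passed checks.
checks = [{"check_id": i, "passed": i >= 10} for i in range(100)]
sample = downsample_and_upweight(checks, factor=5)

pass_weight = sum(w for r, w in sample if r["passed"])
fail_weight = sum(w for r, w in sample if not r["passed"])
print(len(sample), pass_weight, fail_weight)
```

The resulting weights can be passed to any classifier that accepts per-example weights. The model then trains on a more balanced sample, while the weights preserve the original class proportions in the loss.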