Paper title
Data Quality Over Quantity: Pitfalls and Guidelines for Process Analytics
Paper authors
Abstract
A significant portion of the effort in advanced process control, process analytics, and machine learning goes into acquiring and preparing data. The literature often emphasizes increasingly complex modelling techniques that yield incremental performance improvements. However, when industrial case studies are published, they often lack important details on data acquisition and preparation. Although data pre-processing is unfairly maligned as trivial and technically uninteresting, in practice it has an outsized influence on the success of real-world artificial intelligence applications. This work describes best practices for acquiring and preparing operating data to pursue data-driven modelling and control opportunities in industrial processes. We present practical considerations for pre-processing industrial time series data to inform the efficient development of reliable soft sensors that provide valuable process insights.