Paper Title
Generating Hidden Markov Models from Process Models Through Nonnegative Tensor Factorization
Paper Authors
Paper Abstract
Monitoring industrial processes is a critical capability in industry and government for ensuring reliable production cycles, quick emergency response, and national security. Process monitoring allows users to gauge an organization's progress through an industrial process, or to predict the degradation or aging of machine parts in processes taking place at a remote location. As in many data science applications, we usually have access only to limited raw data, such as satellite imagery, short video clips, event logs, and signatures captured by a small set of sensors. To combat data scarcity, we leverage the knowledge of Subject Matter Experts (SMEs) who are familiar with the actions of interest. SMEs provide expert knowledge of the essential activities required to complete a task and the resources necessary to carry out each of these activities. Various process mining techniques have been developed for this type of analysis; typically, such approaches combine theoretical process models built from domain-expert insights with ad hoc integration of the available pieces of raw data. Here, we introduce a novel, mathematically sound method that integrates theoretical process models (as proposed by SMEs) with interrelated minimal Hidden Markov Models (HMMs), built via nonnegative tensor factorization. Our method consolidates: (a) theoretical process models, (b) HMMs, (c) coupled nonnegative matrix-tensor factorizations, and (d) custom model selection. To demonstrate our methodology and its capabilities, we apply it to simple synthetic and real-world process models.