Paper Title
A Framework for Checkpointing and Recovery of Hierarchical Cyber-Physical Systems
Paper Authors
Paper Abstract
This paper tackles the problem of making complex, resource-constrained cyber-physical systems (CPS) resilient to sensor anomalies. In particular, we present a framework for checkpointing and roll-forward recovery of state estimates in nonlinear, hierarchical CPS with anomalous sensor data. We introduce three checkpointing paradigms for ensuring different levels of checkpointing consistency across the hierarchy. Our framework provides algorithms that implement the consistent paradigm to perform accurate recovery in a time-efficient manner while managing the tradeoff with system resources and handling the interplay between diverse anomaly detection systems across the hierarchy. We further detail bounds on the recovered state-estimate error, the maximum tolerable anomaly duration, and the accuracy-resource gap that results from the aforementioned tradeoff. We explore use cases for our framework and evaluate it on a case study of a simulated ground robot to show that it scales to multiple hierarchies and performs better than an extended Kalman filter (EKF) that does not incorporate a checkpointing procedure during sensor anomalies. We conclude with a discussion of extending the proposed framework to distributed systems.
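To make the idea of checkpointing and roll-forward recovery concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a scalar linear Kalman filter rather than a nonlinear, hierarchical EKF, a single checkpoint taken at the last step known to be clean, and an anomaly detector (not modeled here) that reports when the anomaly began. All names (`Checkpoint`, `predict`, `update`, `roll_forward`, `demo`) and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    step: int   # time step at which the estimate was saved
    x: float    # state estimate at that step
    p: float    # estimate variance at that step


def predict(x, p, u, q):
    """Propagate the assumed scalar model x' = x + u with process-noise variance q."""
    return x + u, p + q


def update(x, p, z, r):
    """Scalar Kalman measurement update for z = x + noise with variance r."""
    k = p / (p + r)
    return x + k * (z - x), (1.0 - k) * p


def roll_forward(ckpt, inputs, q):
    """Restore the checkpoint, then re-propagate with buffered inputs only.

    inputs[k] is the input that moves the system from step k to step k + 1,
    so inputs[ckpt.step:] covers every step since the checkpoint.
    """
    x, p = ckpt.x, ckpt.p
    for u in inputs[ckpt.step:]:
        x, p = predict(x, p, u, q)
    return x, p


def demo():
    q, r = 0.01, 0.1                        # assumed noise variances
    inputs = [1.0] * 10                     # buffered control inputs
    truth = [float(k) for k in range(11)]   # noise-free ground truth
    # Measurements for steps 1..10; a bias of 50 corrupts steps 6..10.
    z = [truth[k] + (50.0 if k >= 6 else 0.0) for k in range(1, 11)]

    x, p = 0.0, 1.0
    ckpt = Checkpoint(0, x, p)
    for k in range(10):
        x, p = predict(x, p, inputs[k], q)
        x, p = update(x, p, z[k], r)        # filter unknowingly fuses bad data
        if k + 1 == 5:                      # assumed last clean step
            ckpt = Checkpoint(k + 1, x, p)

    # The detector is assumed to report, at step 10, that the anomaly began
    # at step 6, so the estimate is rebuilt from the step-5 checkpoint.
    rec_x, rec_p = roll_forward(ckpt, inputs, q)
    return truth[-1], x, rec_x, rec_p, ckpt


if __name__ == "__main__":
    true_x, corrupted_x, rec_x, rec_p, ckpt = demo()
    print(f"truth={true_x} corrupted={corrupted_x:.2f} recovered={rec_x:.2f}")
```

In this toy setting, the recovered estimate discards every measurement taken after the checkpoint, so its variance grows with each re-propagated step. This gives a rough sense of why the tolerable anomaly duration is bounded, and why storing more checkpoints trades memory for accuracy.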