Paper Title
How Does Data Freshness Affect Real-time Supervised Learning?
Paper Authors
Paper Abstract
In this paper, we analyze the impact of data freshness on real-time supervised learning, where a neural network is trained to infer a time-varying target (e.g., the position of the vehicle in front) based on features (e.g., video frames) observed at a sensing node (e.g., camera or lidar). One might expect that the performance of real-time supervised learning degrades monotonically as the feature becomes stale. Using an information-theoretic analysis, we show that this is true if the feature and target data sequence can be closely approximated as a Markov chain; it is not true if the data sequence is far from Markovian. Hence, the prediction error of real-time supervised learning is a function of the Age of Information (AoI), and this function can be non-monotonic. Several experiments are conducted to illustrate the monotonic and non-monotonic behaviors of the prediction error. To minimize the inference error in real time, we propose a new "selection-from-buffer" model for sending the features, which is more general than the "generate-at-will" model used in earlier studies. By using Gittins and Whittle indices, low-complexity scheduling strategies are developed to minimize the inference error, and a new connection between Gittins index theory and AoI minimization is discovered. These scheduling results hold (i) for minimizing general AoI functions (monotonic or non-monotonic) and (ii) for general feature transmission time distributions. Data-driven evaluations are presented to illustrate the benefits of the proposed scheduling algorithms.
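The monotonic case described in the abstract can be illustrated with a small numerical sketch. It is not taken from the paper: it assumes the target equals the feature, uses the conditional entropy H(X_t | X_{t-δ}) as a stand-in for inference error under log loss, and picks a hypothetical two-state symmetric Markov chain with flip probability 0.1. For a stationary Markov chain, the data processing inequality implies this quantity cannot decrease as the age δ grows.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def conditional_entropy(stationary, transition, delta):
    """Return H(X_t | X_{t-delta}) in bits for a stationary Markov chain."""
    n = len(transition)
    # Build the delta-step transition matrix P^delta, starting from identity.
    p_delta = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(delta):
        p_delta = mat_mul(p_delta, transition)
    # Average the row entropies of P^delta under the stationary distribution.
    h = 0.0
    for i in range(n):
        for j in range(n):
            p = p_delta[i][j]
            if p > 0.0:
                h -= stationary[i] * p * math.log2(p)
    return h


# Hypothetical example chain (an assumption, not from the paper).
FLIP = 0.1
TRANSITION = [[1 - FLIP, FLIP], [FLIP, 1 - FLIP]]
STATIONARY = [0.5, 0.5]  # stationary distribution of the symmetric chain

# Entropy proxy for the inference error at ages 0..7.
entropy_by_age = [conditional_entropy(STATIONARY, TRANSITION, d)
                  for d in range(8)]

if __name__ == "__main__":
    for age, h in enumerate(entropy_by_age):
        print(f"AoI={age}: H(X_t | X_(t-AoI)) = {h:.4f} bits")
```

In this Markov example the printed values rise with the age and approach 1 bit. A non-monotonic curve, as the abstract describes for data that is far from Markovian, would require a source with longer memory, which this sketch does not model.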