How Does Data Freshness Affect Real-time Supervised Learning?

Paper Title

How Does Data Freshness Affect Real-time Supervised Learning?

Authors

Md Kamran Chowdhury Shisher, Yin Sun

Abstract

In this paper, we analyze the impact of data freshness on real-time supervised learning, where a neural network is trained to infer a time-varying target (e.g., the position of the vehicle in front) based on features (e.g., video frames) observed at a sensing node (e.g., camera or lidar). One might expect that the performance of real-time supervised learning degrades monotonically as the feature becomes stale. Using an information-theoretic analysis, we show that this is true if the feature and target data sequence can be closely approximated as a Markov chain; it is not true if the data sequence is far from Markovian. Hence, the prediction error of real-time supervised learning is a function of the Age of Information (AoI), where the function could be non-monotonic. Several experiments are conducted to illustrate the monotonic and non-monotonic behaviors of the prediction error. To minimize the inference error in real-time, we propose a new "selection-from-buffer" model for sending the features, which is more general than the "generate-at-will" model used in earlier studies. By using Gittins and Whittle indices, low-complexity scheduling strategies are developed to minimize the inference error, where a new connection between the Gittins index theory and Age of Information (AoI) minimization is discovered. These scheduling results hold (i) for minimizing general AoI functions (monotonic or non-monotonic) and (ii) for general feature transmission time distributions. Data-driven evaluations are presented to illustrate the benefits of the proposed scheduling algorithms.
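The monotonic versus non-monotonic behavior described in the abstract can be illustrated with a toy model. The sketch below is not taken from the paper: it assumes a symmetric binary Markov chain as the feature sequence, uses the conditional entropy H(Y_t | X_{t-age}) as a stand-in for inference error, and introduces a hypothetical `delay` parameter so that the target equals an older feature (Y_t = X_{t-delay}). With `delay=0` the target and feature form a Markov chain and the uncertainty grows with the age; with `delay=4` the Markov property fails and the uncertainty is lowest at an age of 4, not at the freshest feature.

```python
import math


def binary_entropy(q: float) -> float:
    """Entropy in bits of a Bernoulli(q) random variable."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)


def flip_probability(steps: int, p: float) -> float:
    """P(X_a != X_b) for a symmetric binary Markov chain with |a - b| = steps.

    p is the per-step probability that the state flips (toy assumption).
    """
    return (1 - (1 - 2 * p) ** steps) / 2


def inference_uncertainty(age: int, p: float, delay: int = 0) -> float:
    """H(Y_t | X_{t-age}) in bits, where Y_t = X_{t-delay} (toy assumption)."""
    return binary_entropy(flip_probability(abs(age - delay), p))


ages = range(0, 9)
markov = [inference_uncertainty(a, p=0.1) for a in ages]
delayed = [inference_uncertainty(a, p=0.1, delay=4) for a in ages]

for a, m, d in zip(ages, markov, delayed):
    print(f"AoI={a}: Markov target={m:.3f} bits, delayed target={d:.3f} bits")
```

In this toy setting, a scheduler that always sends the freshest feature is the best choice only for the first case, which is the kind of situation the abstract gives as motivation for handling general, possibly non-monotonic, AoI functions.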
