Minimum mean-squared error estimation with bandit feedback

Paper title

Minimum mean-squared error estimation with bandit feedback

Authors

Ayon Ghosh, L. A. Prashanth, Dipayan Sen, Aditya Gopalan

Abstract

We consider the problem of sequentially learning to estimate, in the mean squared error (MSE) sense, a Gaussian $K$-vector of unknown covariance by observing only $m < K$ of its entries in each round. We propose two MSE estimators, and analyze their concentration properties. The first estimator is non-adaptive, as it is tied to a predetermined $m$-subset and lacks the flexibility to transition to alternative subsets. The second estimator, which is derived using a regression framework, is adaptive and exhibits better concentration bounds in comparison to the first estimator. We frame the MSE estimation problem with bandit feedback, where the objective is to find the MSE-optimal subset with high confidence. We propose a variant of the successive elimination algorithm to solve this problem. We also derive a minimax lower bound to understand the fundamental limit on the sample complexity of this problem.
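
For orientation, the quantity being minimized can be written down in the idealized case where the covariance is known. The notation here is assumed for illustration and does not come from the abstract: $X$ is taken to be zero-mean with covariance $\Sigma$, $S$ is an observed $m$-subset of the $K$ indices, and $S^c$ is its complement. This is the standard conditional-Gaussian identity, not either of the paper's two estimators, which must work from samples because $\Sigma$ is unknown.

$$
\hat{X}_{S^c} = \mathbb{E}\left[X_{S^c} \mid X_S\right] = \Sigma_{S^c S}\,\Sigma_{SS}^{-1}\,X_S,
\qquad
\mathrm{MSE}(S) = \operatorname{tr}\left(\Sigma_{S^c S^c} - \Sigma_{S^c S}\,\Sigma_{SS}^{-1}\,\Sigma_{S S^c}\right).
$$

Under these assumptions, an MSE-optimal subset is any $S$ with $|S| = m$ that minimizes $\mathrm{MSE}(S)$. In the bandit setting each $m$-subset plays the role of an arm whose value has to be estimated from partial observations.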
