Paper Title

Information-theoretic Characterizations of Generalization Error for the Gibbs Algorithm

Paper Authors

Gholamali Aminian, Yuheng Bu, Laura Toni, Miguel R. D. Rodrigues, Gregory W. Wornell

Paper Abstract

Various approaches have been developed to upper bound the generalization error of a supervised learning algorithm. However, existing bounds are often loose and even vacuous when evaluated in practice. As a result, they may fail to characterize the exact generalization ability of a learning algorithm. Our main contributions are exact characterizations of the expected generalization error of the well-known Gibbs algorithm (a.k.a. Gibbs posterior) using different information measures, in particular, the symmetrized KL information between the input training samples and the output hypothesis. Our result can be applied to tighten existing expected generalization error and PAC-Bayesian bounds. Our information-theoretic approach is versatile, as it also characterizes the generalization error of the Gibbs algorithm with a data-dependent regularizer and that of the Gibbs algorithm in the asymptotic regime, where it converges to the standard empirical risk minimization algorithm. Of particular relevance, our results highlight the role the symmetrized KL information plays in controlling the generalization error of the Gibbs algorithm.
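
For concreteness, here is a hedged sketch of the setup and the exact characterization summarized above, using standard (assumed) notation: $\gamma$ for the inverse temperature, $\pi(w)$ for the prior over hypotheses, $S = (Z_1, \dots, Z_n)$ for the training samples, and $L_E(w, s)$ for the empirical risk. The Gibbs posterior is

$$
P^{\gamma}_{W|S}(w \mid s) = \frac{\pi(w)\, e^{-\gamma L_E(w, s)}}{\mathbb{E}_{\pi}\!\left[ e^{-\gamma L_E(W, s)} \right]},
\qquad
L_E(w, s) = \frac{1}{n} \sum_{i=1}^{n} \ell(w, z_i).
$$

Under this algorithm, the expected generalization error is characterized exactly by the symmetrized KL information between the training samples $S$ and the output hypothesis $W$:

$$
\overline{\mathrm{gen}}\!\left(P^{\gamma}_{W|S}, P_S\right) = \frac{I_{\mathrm{SKL}}(W; S)}{\gamma},
\qquad
I_{\mathrm{SKL}}(W; S) = I(W; S) + L(W; S),
$$

where $I(W;S)$ is the mutual information and $L(W;S)$ is the lautum information. This form makes explicit how the symmetrized KL information controls the generalization error of the Gibbs algorithm.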
