论文标题
VC尺寸和基于无分布样品的测试
VC Dimension and Distribution-Free Sample-Based Testing
论文作者
论文摘要
我们认为,在与标准PAC学习设置相对应的基于无分布样本的模型中,确定哪些功能可以比学到的问题更有效地测试。我们的主要结果表明,虽然VC尺寸本身并不总是在该模型中测试一类功能所需的样本数量的数量,但它可以与与密切相关的变体组合,我们称之为“较低VC”(或LVC)维度,以在此样品复杂性上获得强大的下限。 我们使用此结果来获得强大的样品复杂性上的强大和最佳的下限,以测试间隔,半空间,半空间的相交,多项式阈值函数和决策树的工会。相反,我们表明,可以使用多个样本在多种方面小于PAC学习所需的样本数量的多个样本来测试两个天然函数类别,juntas和单调函数。 最后,我们还使用VC维度和属性测试之间的连接来建立新的下限,以测试半径凝聚力和线性约束系统的可行性。
We consider the problem of determining which classes of functions can be tested more efficiently than they can be learned, in the distribution-free sample-based model that corresponds to the standard PAC learning setting. Our main result shows that while VC dimension by itself does not always provide tight bounds on the number of samples required to test a class of functions in this model, it can be combined with a closely-related variant that we call "lower VC" (or LVC) dimension to obtain strong lower bounds on this sample complexity. We use this result to obtain strong and in many cases nearly optimal lower bounds on the sample complexity for testing unions of intervals, halfspaces, intersections of halfspaces, polynomial threshold functions, and decision trees. Conversely, we show that two natural classes of functions, juntas and monotone functions, can be tested with a number of samples that is polynomially smaller than the number of samples required for PAC learning. Finally, we also use the connection between VC dimension and property testing to establish new lower bounds for testing radius clusterability and testing feasibility of linear constraint systems.