论文标题

主动分类和不确定性比较查询

Active Classification with Uncertainty Comparison Queries

论文作者

Cui, Zhenghang, Sato, Issei

论文摘要

嘈杂的成对比较反馈已被合并,以提高交互式学习二进制分类器的整体查询复杂性。 \ textit {阳性比较oracle}用于提供给定两对数据点的反馈。因为不可能仅使用此Oracle \ textIt {而不知道分类阈值}来推断准确的标签,因此现有方法仍然依赖于传统\ textit {显式标签oracle},该{显式标签oracle}直接在给定数据点的标签上直接响应标签。现有方法对所有数据点进行分类并使用明确的标签Oracle找到分类阈值。但是,当前方法有两个缺点:(1)它们需要不必要的标签推理分类; (2)快速排序天真地适应嘈杂的反馈,并对实际的性能产生负面影响。为了避免这种效率低下并获取分类阈值的信息,我们提出了一个新的成对比较Oracle,涉及不确定性。该Oracle收到两个数据点作为输入和答案,其中一个具有更高的不确定性。然后,我们使用所提出的Oracle和阳性比较Oracle提出了有效的自适应标记算法。此外,我们还解决了与数据集大小相比,标签预算不足的情况,可以通过将所提出的算法插入活跃的学习算法来处理。此外,我们在理论上和经验上证实了提议的甲骨文的可行性以及所提出的算法的性能。

Noisy pairwise comparison feedback has been incorporated to improve the overall query complexity of interactively learning binary classifiers. The \textit{positivity comparison oracle} is used to provide feedback on which is more likely to be positive given a pair of data points. Because it is impossible to infer accurate labels using this oracle alone \textit{without knowing the classification threshold}, existing methods still rely on the traditional \textit{explicit labeling oracle}, which directly answers the label given a data point. Existing methods conduct sorting on all data points and use explicit labeling oracle to find the classification threshold. The current methods, however, have two drawbacks: (1) they needs unnecessary sorting for label inference; (2) quick sort is naively adapted to noisy feedback and negatively affects practical performance. In order to avoid this inefficiency and acquire information of the classification threshold, we propose a new pairwise comparison oracle concerning uncertainties. This oracle receives two data points as input and answers which one has higher uncertainty. We then propose an efficient adaptive labeling algorithm using the proposed oracle and the positivity comparison oracle. In addition, we also address the situation where the labeling budget is insufficient compared to the dataset size, which can be dealt with by plugging the proposed algorithm into an active learning algorithm. Furthermore, we confirm the feasibility of the proposed oracle and the performance of the proposed algorithm theoretically and empirically.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源