SynBench: Task-Agnostic Benchmarking of Pretrained Representations using Synthetic Data

Paper Title

SynBench: Task-Agnostic Benchmarking of Pretrained Representations using Synthetic Data

Authors

Ching-Yun Ko, Pin-Yu Chen, Jeet Mohapatra, Payel Das, Luca Daniel

Abstract

The recent success of fine-tuning large models, pretrained on broad data at scale, on downstream tasks has led to a significant paradigm shift in deep learning, from task-centric model design to task-agnostic representation learning and task-specific fine-tuning. Because the representations of pretrained models serve as a foundation for different downstream tasks, this paper proposes a new task-agnostic framework, SynBench, to measure the quality of pretrained representations using synthetic data. We set up a reference using the theoretically derived robustness-accuracy tradeoff of a class-conditional Gaussian mixture. Given a pretrained model, the representations of data synthesized from the Gaussian mixture are compared with this reference to infer their quality. By comparing the ratio of area-under-curve between the raw data and their representations, SynBench offers a quantifiable score for robustness-accuracy performance benchmarking. Our framework applies to a wide range of pretrained models that take continuous data inputs and is independent of the downstream tasks and datasets. Evaluated with several pretrained vision transformer models, the experimental results show that our SynBench score matches well the actual linear probing performance of the pretrained model when fine-tuned on downstream tasks. Moreover, our framework can be used to inform the design of robust linear probing on pretrained representations to mitigate the robustness-accuracy tradeoff in downstream tasks.
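The idea of comparing area-under-curve on synthetic Gaussian data against a theoretical reference can be illustrated with a small sketch. This is not the paper's score: the abstract does not give the exact definition, so plain classification accuracy across class-separation scales stands in for the robustness-accuracy curve. The encoder (`toy_encoder`, a fixed random tanh layer), the mean-difference linear probe, and all function names and parameters below are assumptions made for illustration only.

```python
import math
import numpy as np


def sample_mixture(n, mu, rng):
    """Draw n points from the balanced mixture x | y ~ N(y * mu, I), y in {-1, +1}."""
    y = rng.choice([-1, 1], size=n)
    x = y[:, None] * mu[None, :] + rng.standard_normal((n, mu.size))
    return x, y


def bayes_accuracy(mu):
    """Closed-form Bayes-optimal accuracy for N(+mu, I) vs N(-mu, I): Phi(||mu||)."""
    return 0.5 * (1.0 + math.erf(float(np.linalg.norm(mu)) / math.sqrt(2.0)))


def probe_accuracy(z_train, y_train, z_test, y_test):
    """Linear probe built from class means (an assumed, very simple probe)."""
    m_pos = z_train[y_train == 1].mean(axis=0)
    m_neg = z_train[y_train == -1].mean(axis=0)
    w = m_pos - m_neg
    b = -0.5 * float(w @ (m_pos + m_neg))
    pred = np.where(z_test @ w + b >= 0, 1, -1)
    return float((pred == y_test).mean())


def toy_encoder(x, weight):
    """Hypothetical stand-in for a pretrained model: a fixed random tanh layer."""
    return np.tanh(x @ weight)


def auc_ratio(scales, dim=16, n=4000, seed=0):
    """Area under the representation accuracy curve divided by the reference area."""
    scales = np.asarray(scales, dtype=float)
    rng = np.random.default_rng(seed)
    weight = rng.standard_normal((dim, 32)) / math.sqrt(dim)
    direction = np.ones(dim) / math.sqrt(dim)  # unit vector, so ||mu|| = scale
    ref, rep = [], []
    for s in scales:
        mu = s * direction
        x_tr, y_tr = sample_mixture(n, mu, rng)
        x_te, y_te = sample_mixture(n, mu, rng)
        ref.append(bayes_accuracy(mu))  # theoretical reference on raw data
        rep.append(probe_accuracy(toy_encoder(x_tr, weight), y_tr,
                                  toy_encoder(x_te, weight), y_te))
    ref, rep = np.array(ref), np.array(rep)

    def area(v):
        # Trapezoid rule over the separation scales.
        return float(np.sum(0.5 * (v[1:] + v[:-1]) * np.diff(scales)))

    return area(rep) / area(ref), ref, rep


if __name__ == "__main__":
    ratio, ref, rep = auc_ratio(np.linspace(0.5, 3.0, 6))
    print("reference accuracy:", np.round(ref, 3))
    print("representation accuracy:", np.round(rep, 3))
    print("AUC ratio:", round(ratio, 3))
```

In this toy setting, a ratio close to 1 would suggest the encoder keeps most of the linearly separable structure of the raw Gaussian data, while a lower ratio would suggest it loses some. Swapping `toy_encoder` for a real pretrained model and adding an adversarial-robustness axis would be needed to approach the benchmark the abstract describes.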

Paper Title