AGQA 2.0: An Updated Benchmark for Compositional Spatio-Temporal Reasoning

Paper Title

AGQA 2.0: An Updated Benchmark for Compositional Spatio-Temporal Reasoning

Authors

Madeleine Grunde-McLaughlin, Ranjay Krishna, Maneesh Agrawala

Abstract

Prior benchmarks have analyzed models' answers to questions about videos in order to measure visual compositional reasoning. Action Genome Question Answering (AGQA) is one such benchmark. AGQA provides a training/test split with balanced answer distributions to reduce the effect of linguistic biases. However, some biases remain in several AGQA categories. We introduce AGQA 2.0, a version of this benchmark with several improvements, most notably a stricter balancing procedure. We then report results on the updated benchmark for all experiments.
