Paper title
AGQA 2.0: An Updated Benchmark for Compositional Spatio-Temporal Reasoning
Paper authors
Paper abstract
Prior benchmarks have analyzed models' answers to questions about videos in order to measure visual compositional reasoning. Action Genome Question Answering (AGQA) is one such benchmark. AGQA provides a training/test split with balanced answer distributions to reduce the effect of linguistic biases. However, some biases remain in several AGQA categories. We introduce AGQA 2.0, a version of this benchmark with several improvements, most notably a stricter balancing procedure. We then report results on the updated benchmark for all experiments.