Paper Title
Objaverse: A Universe of Annotated 3D Objects
Paper Authors
Paper Abstract
Massive data corpora like WebText, Wikipedia, Conceptual Captions, WebImageText, and LAION have propelled recent dramatic progress in AI. Large neural models trained on such datasets produce impressive results and top many of today's benchmarks. A notable omission within this family of large-scale datasets is 3D data. Despite considerable interest and potential applications in 3D vision, datasets of high-fidelity 3D models continue to be mid-sized with limited diversity of object categories. Addressing this gap, we present Objaverse 1.0, a large dataset of objects with 800K+ (and growing) 3D models with descriptive captions, tags, and animations. Objaverse improves upon present-day 3D repositories in terms of scale, number of categories, and in the visual diversity of instances within a category. We demonstrate the large potential of Objaverse via four diverse applications: training generative 3D models, improving tail category segmentation on the LVIS benchmark, training open-vocabulary object-navigation models for Embodied AI, and creating a new benchmark for robustness analysis of vision models. Objaverse can open new directions for research and enable new applications across the field of AI.
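The abstract describes a dataset of 800K+ 3D models with captions, tags, and animations. As a rough illustration of how such a collection might be accessed programmatically, the minimal sketch below uses the `objaverse` Python package distributed alongside the public release; the package and its `load_uids` / `load_annotations` / `load_objects` helpers are assumptions drawn from that release, not anything stated in this abstract.

```python
# Minimal sketch: listing a few Objaverse objects and fetching their metadata
# and 3D assets. Assumes the `objaverse` PyPI package (pip install objaverse);
# the helper names below are assumptions, not taken from the abstract itself.
import objaverse

# UIDs of all objects in the release (800K+ entries).
uids = objaverse.load_uids()
print(f"total objects: {len(uids)}")

# Descriptive metadata (captions, tags, etc.) for a small sample of objects.
sample = uids[:5]
annotations = objaverse.load_annotations(sample)
for uid, meta in annotations.items():
    print(uid, meta.get("name"), meta.get("tags"))

# Download the corresponding GLB files to a local cache and print their paths.
paths = objaverse.load_objects(uids=sample)
print(paths)
```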