Paper Title

Audio-Visual Object Classification for Human-Robot Collaboration

Authors

Xompero, A., Pang, Y. L., Patten, T., Prabhakar, A., Calli, B., Cavallaro, A.

Abstract

Human-robot collaboration requires the contactless estimation of the physical properties of containers manipulated by a person, for example while pouring content in a cup or moving a food box. Acoustic and visual signals can be used to estimate the physical properties of such objects, which may vary substantially in shape, material and size, and also be occluded by the hands of the person. To facilitate comparisons and stimulate progress in solving this problem, we present the CORSMAL challenge and a dataset to assess the performance of the algorithms through a set of well-defined performance scores. The tasks of the challenge are the estimation of the mass, capacity, and dimensions of the object (container), and the classification of the type and amount of its content. A novel feature of the challenge is our real-to-simulation framework for visualising and assessing the impact of estimation errors in human-to-robot handovers.
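The challenge evaluates estimation tasks (mass, capacity, dimensions) through well-defined performance scores. As a minimal illustrative sketch, one common choice for such tasks is a score derived from the relative absolute error; the function below is a hypothetical example, not the exact CORSMAL scoring formula.

```python
# Hypothetical sketch of a relative-error-based score for a container
# property estimate (e.g. capacity in mL or mass in g). This is an
# illustrative assumption, not the official CORSMAL challenge metric.

def relative_error_score(estimate: float, truth: float) -> float:
    """Map an estimate to a score in [0, 1]: 1 means a perfect estimate,
    0 means the relative absolute error is 100% or more."""
    if truth <= 0:
        raise ValueError("ground truth must be positive")
    rel_err = abs(estimate - truth) / truth
    return max(0.0, 1.0 - rel_err)

# Example: estimating a 500 mL container's capacity as 450 mL
# (10% relative error) yields a score of 0.9.
print(relative_error_score(450.0, 500.0))
```

Scores of this form are convenient for averaging across containers of very different sizes, since the error is normalised by each object's ground-truth value.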
