Paper title
Whither the Priors for (Vocal) Interactivity?
Paper authors
Paper abstract
Voice-based communication is often cited as one of the most `natural' ways in which humans and robots might interact, and the recent availability of accurate automatic speech recognition and intelligible speech synthesis has enabled researchers to integrate advanced off-the-shelf spoken language technology components into their robot platforms. Despite this, the resulting interactions are anything but `natural'. It transpires that simply giving a robot a voice doesn't mean that a user will know how (or when) to talk to it, and the resulting `conversations' tend to be stilted, one-sided and short. On the surface, these difficulties might appear to be fairly trivial consequences of users' unfamiliarity with robots (and \emph{vice versa}), and it might seem that any problems would be mitigated by long-term use by the human, coupled with `deep learning' by the robot. However, it is argued here that such communication failures are indicative of a deeper malaise: a fundamental lack of basic principles -- \emph{priors} -- underpinning not only speech-based interaction in particular, but (vocal) interactivity in general. This is evidenced not only by the fact that contemporary spoken language systems already require training data sets that are orders of magnitude larger than the data experienced by a young child, but also by the lack of design principles for creating effective communicative human-robot interaction. This short position paper identifies some of the key areas where theoretical insights might help overcome these shortfalls.