Paper title
Searching for fingerspelled content in American Sign Language
Paper authors
Paper abstract
Natural language processing for sign language video, including tasks like recognition, translation, and search, is crucial for making artificial intelligence technologies accessible to deaf individuals, and has been gaining research interest in recent years. In this paper, we address the problem of searching for fingerspelled keywords or key phrases in raw sign language videos. This is an important task since significant content in sign language is often conveyed via fingerspelling, and to our knowledge the task has not been studied before. We propose an end-to-end model for this task, FSS-Net, that jointly detects fingerspelling and matches it to a text sequence. Our experiments, conducted on a large public dataset of ASL fingerspelling in the wild, show the importance of fingerspelling detection as a component of a search and retrieval model. Our model significantly outperforms baseline methods adapted from prior work on related tasks.