Paper Title

Attention Beam: An Image Captioning Approach

Authors

Anubhav Shrimal, Tanmoy Chakraborty

Abstract

The aim of image captioning is to generate a textual description of a given image. Though seemingly an easy task for humans, it is challenging for machines, as it requires the ability to comprehend the image (computer vision) and consequently generate a human-like description of it (natural language understanding). In recent times, encoder-decoder based architectures have achieved state-of-the-art results for image captioning. Here, we present a heuristic of beam search on top of the encoder-decoder based architecture that gives better quality captions on three benchmark datasets: Flickr8k, Flickr30k and MS COCO.
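For readers unfamiliar with the decoding step the abstract refers to, the following is a minimal sketch of plain beam search, not the heuristic proposed in the paper (the abstract does not describe it). The names `beam_search`, `toy_step` and the `TOY` probability table are hypothetical; in a real captioning system the step function would be the decoder of an encoder-decoder model conditioned on the image features.

```python
import math
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[List[str], float]  # (tokens so far, summed log-probability)


def beam_search(
    step_fn: Callable[[List[str]], Dict[str, float]],
    start: str,
    end: str,
    beam_width: int = 3,
    max_len: int = 10,
) -> Hypothesis:
    """Return the highest-scoring sequence found by plain beam search.

    step_fn maps a partial sequence to next-token probabilities. It is a
    stand-in (assumption) for a caption decoder conditioned on an image.
    """
    beams: List[Hypothesis] = [([start], 0.0)]
    finished: List[Hypothesis] = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates: List[Hypothesis] = []
        for tokens, score in beams:
            for tok, p in step_fn(tokens).items():
                if p > 0.0:
                    candidates.append((tokens + [tok], score + math.log(p)))
        if not candidates:
            break
        # Keep only the beam_width best expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_width]:
            if tokens[-1] == end:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # fall back to unfinished hypotheses at max_len
    return max(finished, key=lambda c: c[1])


# Hypothetical toy "decoder": next-token probabilities depend on the last token.
TOY: Dict[str, Dict[str, float]] = {
    "<s>": {"a": 0.6, "the": 0.4},
    "a": {"dog": 0.5, "cat": 0.5},
    "the": {"dog": 0.9, "cat": 0.1},
    "dog": {"</s>": 1.0},
    "cat": {"</s>": 1.0},
}


def toy_step(tokens: List[str]) -> Dict[str, float]:
    return TOY.get(tokens[-1], {})


caption, score = beam_search(toy_step, "<s>", "</s>", beam_width=2)
greedy_caption, greedy_score = beam_search(toy_step, "<s>", "</s>", beam_width=1)
```

In this toy table, a beam width of 1 (greedy decoding) commits to "a" at the first step and ends with probability 0.3, while a beam width of 2 keeps "the" alive and finds "the dog" with probability 0.36. This is the general motivation for beam search over greedy decoding; how the paper's heuristic modifies the search is not stated in the abstract.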
