G-SemTMO: Tone Mapping with a Trainable Semantic Graph

Paper Title

G-SemTMO: Tone Mapping with a Trainable Semantic Graph

Authors

Abhishek Goswami, Erwan Bernard, Wolf Hauser, Frederic Dufaux, Rafal Mantiuk

Abstract

A Tone Mapping Operator (TMO) is required to render images with a High Dynamic Range (HDR) on media with limited dynamic range. TMOs compress the dynamic range with the aim of preserving the visual perceptual cues of the scene. Previous literature has established the benefits of semantic-aware TMOs, which understand the content of the scene and can therefore preserve those cues better. Expert photographers analyze the semantic and contextual information of a scene and decide on tonal transformations or local luminance adjustments. This process can be considered a manual analogue of tone mapping. In this work, we draw inspiration from an expert photographer's approach and present a Graph-based Semantic-aware Tone Mapping Operator, G-SemTMO. We leverage the semantic information as well as the contextual information of the scene in the form of a graph that captures the spatial arrangement of its semantic segments. Using a Graph Convolutional Network (GCN), we predict intermediate parameters called Semantic Hints and use these parameters to apply tonal adjustments locally to different semantic segments in the image. In addition, we introduce LocHDR, a dataset of 781 HDR images tone mapped manually by an expert photo-retoucher with local tonal enhancements. We conduct ablation studies to show that our approach, G-SemTMO, can learn both global and local tonal transformations from a pair of input linear and manually retouched images by leveraging the semantic graphs, and that it produces better results than both classical and learning-based TMOs. We also conduct ablation experiments to validate the advantage of using a GCN. Code and dataset are to be published with the final version of the manuscript.
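
The pipeline outlined in the abstract (a graph over semantic segments, a graph convolution that produces per-segment parameters, and a local tonal adjustment per segment) can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three-segment scene, the two per-segment features, the untrained random weights, the commonly used symmetric adjacency normalization, and the reading of each output value as a simple luminance gain. The actual Semantic Hints and network architecture of G-SemTMO are not specified in the abstract.

```python
import numpy as np


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, a normalization commonly used in GCNs."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # node degrees including self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, features, weights):
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    return np.maximum(a_norm @ features @ weights, 0.0)


def apply_segment_gains(luminance, labels, gains):
    """Scale every pixel by the gain of the segment it belongs to."""
    return luminance * gains[labels]


# Hypothetical scene with three segments: 0 = sky, 1 = building, 2 = ground.
# An edge means the two segments are spatially adjacent in the image.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)

# Hypothetical per-segment features (e.g. mean log-luminance, relative area).
features = np.array([[0.9, 0.3],
                     [0.4, 0.5],
                     [0.2, 0.2]])

# Untrained random weights; a real model would learn these from image pairs.
rng = np.random.default_rng(0)
weights = rng.normal(size=(2, 1))

a_norm = normalized_adjacency(adj)
hints = gcn_layer(a_norm, features, weights)   # one value per segment, shape (3, 1)
gains = 1.0 + hints[:, 0]                      # illustrative stand-in for a tonal adjustment

luminance = np.array([[4.0, 4.0],
                      [1.0, 0.5]])             # toy 2x2 linear luminance image
labels = np.array([[0, 0],
                   [1, 2]])                    # segment index of each pixel
tone_mapped = apply_segment_gains(luminance, labels, gains)
```

Because each segment's output depends on its neighbors through the normalized adjacency matrix, the adjustment predicted for a segment reflects the context around it and not only its own features, which is the motivation the abstract gives for using a graph.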
