Paper Title
High-Efficiency Lossy Image Coding Through Adaptive Neighborhood Information Aggregation
Paper Authors
Paper Abstract
The quest for learned lossy image coding (LIC) with superior compression performance and computation throughput is challenging. The key factor is how to intelligently explore Adaptive Neighborhood Information Aggregation (ANIA) in the transform and entropy coding modules. To this end, an Integrated Convolution and Self-Attention (ICSA) unit is first proposed to form a content-adaptive transform that dynamically characterizes and embeds neighborhood information of any input. A Multistage Context Model (MCM) is then devised to progressively use available neighbors, following a pre-arranged spatial-channel order, for accurate and parallel probability estimation. ICSA and MCM are stacked under a Variational AutoEncoder (VAE) architecture to derive a rate-distortion-optimized compact representation of the input image via end-to-end learning. Our method achieves state-of-the-art compression performance, surpassing VVC Intra and other prevalent LIC approaches on the Kodak, CLIC, and Tecnick datasets. More importantly, it offers a $>$60$\times$ decoding speedup over the most popular LIC method with a model of comparable size. All materials are made publicly accessible at https://njuvision.github.io/TinyLIC for reproducible research.
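To make the ICSA idea concrete, below is a minimal PyTorch sketch of a unit that fuses a static convolution branch with a content-adaptive self-attention branch. The layer widths, the global (non-windowed) attention, and the residual fusion are assumptions chosen for brevity; this is an illustration of the general conv-plus-attention pattern, not the authors' implementation.

```python
# Minimal sketch of an ICSA-style block: a convolution branch captures local
# neighborhood structure, while a self-attention branch aggregates neighborhood
# information adaptively to the input content. Hypothetical design for illustration.
import torch
import torch.nn as nn


class ICSAUnit(nn.Module):
    def __init__(self, channels: int = 128, num_heads: int = 4):
        super().__init__()
        # Local branch: depthwise + pointwise convolution (content-agnostic kernels).
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels),
            nn.Conv2d(channels, channels, 1),
        )
        # Adaptive branch: multi-head self-attention over spatial tokens.
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.fuse = nn.Conv2d(channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = self.conv(x)
        tokens = self.norm(x.flatten(2).transpose(1, 2))   # (B, H*W, C)
        attn_out, _ = self.attn(tokens, tokens, tokens)
        adaptive = attn_out.transpose(1, 2).reshape(b, c, h, w)
        return x + self.fuse(local + adaptive)              # residual fusion


if __name__ == "__main__":
    x = torch.randn(1, 128, 16, 16)
    print(ICSAUnit()(x).shape)  # torch.Size([1, 128, 16, 16])
```

In practice, such blocks are usually stacked with downsampling (in the encoder) and upsampling (in the decoder) to build the analysis and synthesis transforms of the VAE-style codec; windowed attention is a common choice to keep complexity manageable at high resolutions.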
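The staging idea behind the MCM can likewise be sketched with a toy schedule that partitions latent positions along a fixed spatial-channel order, so that each stage is entropy-coded in parallel while conditioning on previously decoded stages. The 2x2 spatial pattern and two channel groups below are assumptions for illustration, not the paper's exact schedule.

```python
# Toy multistage schedule: boolean masks tile all latent positions; positions in
# one stage are coded in parallel, using earlier stages as context. Hypothetical
# pattern for illustration only.
import torch


def build_stage_masks(c: int, h: int, w: int,
                      spatial_steps: int = 4, channel_groups: int = 2):
    """Return one boolean mask per stage; together they cover every position once."""
    offsets = [(0, 0), (1, 1), (0, 1), (1, 0)][:spatial_steps]  # 2x2 visiting order
    masks = []
    for g in range(channel_groups):
        c_lo, c_hi = g * c // channel_groups, (g + 1) * c // channel_groups
        for dy, dx in offsets:
            m = torch.zeros(c, h, w, dtype=torch.bool)
            m[c_lo:c_hi, dy::2, dx::2] = True
            masks.append(m)
    return masks


if __name__ == "__main__":
    masks = build_stage_masks(c=8, h=4, w=4)
    coverage = torch.stack(masks).sum(0)
    print(len(masks), bool((coverage == 1).all()))  # 8 stages, each position coded once
```

Because every stage contains many positions, the probability parameters for a whole stage can be predicted in one forward pass, which is what enables parallel decoding rather than the strictly serial raster scan of pixel-wise autoregressive context models.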