Paper title
Multi-rate adaptive transform coding for video compression
Paper authors
Paper abstract
Contemporary lossy image and video coding standards rely on transform coding, the process through which pixels are mapped to an alternative representation to facilitate efficient data compression. Despite the impressive performance of end-to-end optimized compression with deep neural networks, the high computational and space demands of these models have prevented them from superseding the relatively simple transform coding found in conventional video codecs. In this study, we propose learned transforms and entropy coding that may serve either as (non)linear drop-in replacements for, or as enhancements to, the linear transforms in existing codecs. These transforms can be multi-rate, allowing a single model to operate along the entire rate-distortion curve. To demonstrate the utility of our framework, we augmented the DCT with learned quantization matrices and adaptive entropy coding to compress intra-frame AV1 block prediction residuals. We report substantial BD-rate and perceptual quality improvements over more complex nonlinear transforms at a fraction of the computational cost.
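As a rough illustration of the transform-coding pipeline the abstract refers to (DCT, quantization matrix, multiple rate points), the following minimal sketch applies an orthonormal 2D DCT to a square residual block, quantizes the coefficients with a per-frequency matrix, and reconstructs the block. Everything named here is an assumption made for illustration: `dct_matrix`, `encode_block`, `decode_block`, the scalar `step` used to move along the rate-distortion curve, and the hand-made `qmatrix` are hypothetical and are not the learned quantization matrices, the entropy coder, or the AV1 integration described by the authors.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix (rows are basis vectors)."""
    k = np.arange(n)[:, None]   # frequency index
    i = np.arange(n)[None, :]   # sample index
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] /= np.sqrt(2.0)     # rescale the DC row so that d @ d.T == I
    return d

def encode_block(residual, qmatrix, step):
    """2D DCT of a square residual block, then quantization to integer levels."""
    d = dct_matrix(residual.shape[0])
    coeffs = d @ residual @ d.T
    return np.rint(coeffs / (qmatrix * step)).astype(np.int64)

def decode_block(levels, qmatrix, step):
    """Dequantize the integer levels and apply the inverse 2D DCT."""
    d = dct_matrix(levels.shape[0])
    return d.T @ (levels * qmatrix * step) @ d

# Synthetic stand-in for a prediction residual (assumption: not real AV1 data).
rng = np.random.default_rng(0)
residual = rng.normal(scale=10.0, size=(8, 8))

# Placeholder quantization matrix: coarser steps at higher frequencies.
# In the paper this matrix is learned; here it is hand-made for illustration.
u = np.arange(8)
qmatrix = 1.0 + 0.25 * (u[:, None] + u[None, :])

# One matrix, several rate points: a larger step means fewer nonzero levels
# to entropy-code, at the price of higher distortion.
for step in (0.5, 2.0, 8.0):
    levels = encode_block(residual, qmatrix, step)
    recon = decode_block(levels, qmatrix, step)
    mse = np.mean((residual - recon) ** 2)
    print(f"step={step}: nonzero levels={np.count_nonzero(levels)}, mse={mse:.3f}")
```

Because the transform is orthonormal, the reconstruction error in the pixel domain equals the quantization error in the coefficient domain, so the distortion at each rate point is controlled entirely by the product `qmatrix * step`. Scaling one matrix by a scalar is only the simplest possible way to obtain several rate points from a single set of parameters; it is shown here as a stand-in, not as the multi-rate mechanism of the paper.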