Paper Title
MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis
Paper Authors
Paper Abstract
Conventional methods for human motion synthesis are either deterministic or struggle with the trade-off between motion diversity and motion quality. In response to these limitations, we introduce MoFusion, a new denoising-diffusion-based framework for high-quality conditional human motion synthesis that can generate long, temporally plausible, and semantically accurate motions based on a range of conditioning contexts (such as music and text). We also present ways to introduce well-known kinematic losses for motion plausibility within the motion diffusion framework through our scheduled weighting strategy. The learned latent space can be used for several interactive motion-editing applications (such as inbetweening, seed conditioning, and text-based editing), thus providing crucial abilities for virtual character animation and robotics. Through comprehensive quantitative evaluations and a perceptual user study, we demonstrate the effectiveness of MoFusion compared to the state of the art on established benchmarks in the literature. We urge the reader to watch our supplementary video and visit https://vcai.mpi-inf.mpg.de/projects/MoFusion.
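The abstract does not include code, so the following is only a minimal PyTorch sketch of the general idea it describes: one training step of a conditional denoising-diffusion model over motion sequences, with a kinematic plausibility loss whose weight follows a schedule over the diffusion timestep. All names (MotionDenoiser, kinematic_loss, lambda_kin), the linear noise schedule, the x0-prediction parameterization, and the exact form of the weighting are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of a diffusion training step for
# motion synthesis with a time-scheduled kinematic loss weight.
import torch
import torch.nn as nn

T = 1000                                       # number of diffusion steps
betas = torch.linspace(1e-4, 2e-2, T)          # linear noise schedule (an assumption)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

class MotionDenoiser(nn.Module):
    """Hypothetical stand-in for the conditional denoising network."""
    def __init__(self, n_joints=24, d_cond=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_joints * 3 + d_cond + 1, 512),
            nn.SiLU(),
            nn.Linear(512, n_joints * 3),
        )

    def forward(self, x_t, t, cond):
        # x_t: (B, F, J*3) noisy motion; t: (B,) timesteps; cond: (B, d_cond)
        B, F, _ = x_t.shape
        t_feat = (t.float() / T).view(B, 1, 1).expand(B, F, 1)
        c_feat = cond.unsqueeze(1).expand(B, F, cond.shape[-1])
        return self.net(torch.cat([x_t, c_feat, t_feat], dim=-1))

def kinematic_loss(x0_pred):
    """Placeholder plausibility term: penalize frame-to-frame acceleration."""
    vel = x0_pred[:, 1:] - x0_pred[:, :-1]
    acc = vel[:, 1:] - vel[:, :-1]
    return acc.pow(2).mean()

def training_step(model, x0, cond):
    B = x0.shape[0]
    t = torch.randint(0, T, (B,))
    a_bar = alphas_cumprod[t].view(B, 1, 1)
    noise = torch.randn_like(x0)
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise

    x0_pred = model(x_t, t, cond)              # predict the clean motion x0

    # Scheduled weighting (an assumed form): trust kinematic terms more at
    # small t, where x0_pred resembles a real motion, and downweight them at
    # large t, where the prediction is dominated by noise.
    lambda_kin = a_bar.view(B).mean()

    return (x0_pred - x0).pow(2).mean() + lambda_kin * kinematic_loss(x0_pred)

# Usage with random stand-in data:
model = MotionDenoiser()
x0 = torch.randn(8, 60, 72)                    # batch of 60-frame, 24-joint motions
cond = torch.randn(8, 512)                     # e.g., pooled music or text features
loss = training_step(model, x0, cond)
loss.backward()
```

The schedule here simply reuses the cumulative noise level as the weight; the paper only states that kinematic losses are introduced through a scheduled weighting strategy, so the precise schedule should be taken from the paper itself.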