Loki: Representation over Architecture for Diffusion-Based Portrait Animation

ArXiv cs.CV, 2026-05-26. Authors: Pouyan Navard, Sernam Lim

Abstract

arXiv:2605.24176v1. Portrait animation transfers a driver clip's facial expression and head pose onto a single reference image while preserving the reference's identity. State-of-the-art diffusion systems address this by stacking trained modules for expression, pose, and identity in turn, paying for it in trainable parameters, proprietary corpora, and residual entanglement between the very axes the system is meant to control independently. This complexity compensates for an upstream choice: learning facial expression and head pose from RGB, a representation in which identity, pose, and expression cannot be separated unless the separation is itself learned. Loki steps out of RGB on the conditioning path. Driver expression and head pose are encoded by a face model whose parameter axes are identity-orthogonal by construction, then rasterised into a spatial map that the diffusion backbone consumes natively.
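
To make the conditioning idea concrete, here is a toy sketch in Python of what "encode expression and pose with a parametric face model, then rasterise into a spatial map" could look like. It is not the paper's implementation: the abstract does not name the face model, the rasteriser, or the map format. The linear blendshape model, the random bases, the orthographic projection, and the names `make_toy_face_model`, `rotation_from_yaw_pitch`, and `conditioning_map` are all assumptions made for illustration.

```python
import numpy as np

def make_toy_face_model(n_vertices=200, n_id=4, n_expr=3, seed=0):
    """Hypothetical linear face model: mean shape plus identity and expression bases."""
    rng = np.random.default_rng(seed)
    mean = rng.uniform(-0.5, 0.5, size=(n_vertices, 3))
    id_basis = 0.05 * rng.standard_normal((n_id, n_vertices, 3))
    expr_basis = 0.05 * rng.standard_normal((n_expr, n_vertices, 3))
    return mean, id_basis, expr_basis

def rotation_from_yaw_pitch(yaw, pitch):
    """3x3 rotation for a head pose given as yaw and pitch, in radians."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    r_yaw = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    r_pitch = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    return r_pitch @ r_yaw

def conditioning_map(model, expr, yaw, pitch, size=32):
    """Rasterise expression and pose into a size x size map.

    Identity coefficients are not an input at all (assumption of this sketch):
    the geometry is built from the mean shape and the expression basis only,
    so the map cannot carry the driver's identity.
    """
    mean, _id_basis, expr_basis = model
    verts = mean + np.tensordot(expr, expr_basis, axes=1)   # (n_vertices, 3)
    verts = verts @ rotation_from_yaw_pitch(yaw, pitch).T   # apply head pose
    # Orthographic projection: drop z, map [-1, 1] to pixel coordinates.
    xy = np.clip((verts[:, :2] + 1.0) * 0.5 * (size - 1), 0, size - 1)
    cols = np.rint(xy[:, 0]).astype(int)
    rows = np.rint(xy[:, 1]).astype(int)
    out = np.zeros((size, size), dtype=np.float32)
    out[rows, cols] = 1.0                                    # mark projected vertices
    return out

model = make_toy_face_model()
neutral = conditioning_map(model, np.zeros(3), yaw=0.0, pitch=0.0)
turned = conditioning_map(model, np.array([1.0, 0.0, 0.0]), yaw=0.5, pitch=0.0)
```

In this sketch, `neutral` and `turned` are single-channel images that a diffusion backbone could take as spatial conditioning, in the same way it takes other image-shaped inputs. The point of the toy is only the data flow: pose and expression enter as a handful of parameters, and identity never reaches the map.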
