Hybrid Autoregressive-Diffusion Model for Real-Time Sign Language Production

ArXiv CS.CV | 2026-06-03 | NEWS | en | Authors: Maoxiao Ye, Xinfeng Ye, Mano Manoharan

Details

Source site: ArXiv CS.CV
Authors: Maoxiao Ye, Xinfeng Ye, Mano Manoharan
Article type: NEWS
Language: en
Publication date: 2026-06-03

Abstract

arXiv:2507.09105v4. Announce type: replace.

Earlier Sign Language Production (SLP) models typically relied on autoregressive decoding, which naturally preserves temporal causality but suffers from error accumulation at inference time. More recent diffusion-based approaches improve generation quality through iterative denoising, yet their sequence-level refinement process introduces substantial latency. To address this trade-off, we propose HybridSign, a hybrid autoregressive-diffusion model for low-latency sign language production that combines causal frame generation with flow-based diffusion refinement. A Multi-Scale Pose Representation module captures fine-grained articulator features, while a Confidence-Aware Causal Attention mechanism leverages joint-level confidence scores to improve robustness under noisy 2D pose observations.
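
The abstract does not spell out how the Confidence-Aware Causal Attention mechanism uses joint-level confidence scores. The NumPy sketch below is only one plausible reading, not the authors' formulation. It assumes that per-joint confidences are averaged into a per-frame score, and that the log of that score is added as a bias to the attention logits of the corresponding key frame, so low-confidence frames receive less attention. A causal mask keeps each frame from attending to later frames. The function name, the averaging step, and the log-bias are all hypothetical choices made for illustration.

```python
import numpy as np


def confidence_aware_causal_attention(q, k, v, joint_conf, eps=1e-6):
    """Illustrative single-head attention (assumed design, not from the paper).

    q, k, v:     arrays of shape (T, d), one row per pose frame.
    joint_conf:  array of shape (T, J) with per-joint confidences in [0, 1].
    Returns (output of shape (T, d), attention weights of shape (T, T)).
    """
    T, d = q.shape

    # Assumption: collapse joint-level confidences to one score per frame.
    frame_conf = joint_conf.mean(axis=1)                      # (T,)

    # Scaled dot-product logits, biased by the log-confidence of each key frame.
    scores = q @ k.T / np.sqrt(d)                             # (T, T)
    scores = scores + np.log(frame_conf + eps)[None, :]

    # Causal mask: frame i may only attend to frames j <= i.
    future = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    # Numerically stable row-wise softmax (the diagonal is always finite).
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)

    return weights @ v, weights


# Usage: 6 frames, 16-dim features, 21 joints with random confidences.
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 16))
conf = rng.uniform(0.2, 1.0, size=(6, 21))
out, attn = confidence_aware_causal_attention(x, x, x, conf)
```

Under this assumed design, the weight on key frame j is proportional to its confidence times the usual exponentiated similarity, so a frame with unreliable 2D keypoints is down-weighted without being masked out entirely.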
