Hybrid Autoregressive-Diffusion Model for Real-Time Sign Language Production

ArXiv CS.CV | 2026-06-03 | Authors: Maoxiao Ye, Xinfeng Ye, Mano Manoharan


Abstract

arXiv:2507.09105v4

Earlier Sign Language Production (SLP) models typically relied on autoregressive decoding, which naturally preserves temporal causality but suffers from error accumulation at inference time. More recent diffusion-based approaches improve generation quality through iterative denoising, yet their sequence-level refinement process introduces substantial latency. To address this trade-off, we propose HybridSign, a hybrid autoregressive-diffusion model for low-latency sign language production that combines causal frame generation with flow-based diffusion refinement. A Multi-Scale Pose Representation module captures fine-grained articulator features, while a Confidence-Aware Causal Attention mechanism leverages joint-level confidence scores to improve robustness under noisy 2D pose observations.
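
The abstract does not say how confidence scores enter the attention computation. As a rough illustration only, the sketch below shows one plausible reading: causal attention in which each past frame's attention logit is shifted by the log of a per-frame confidence value, so that unreliable pose frames receive less weight. The function name `confidence_aware_causal_attention`, the log-confidence bias, and the use of one scalar confidence per frame (for example, a mean over joint confidences) are all assumptions made for this example, not details taken from HybridSign.

```python
import math


def confidence_aware_causal_attention(queries, keys, values, confidences, eps=1e-6):
    """Hypothetical sketch: causal attention with a log-confidence bias on keys.

    queries, keys, values: lists of T vectors (lists of floats), one per frame.
    confidences: list of T scalars in [0, 1], e.g. the mean joint confidence
    of each frame (an assumption, not taken from the paper).
    """
    dim = len(queries[0])
    out_dim = len(values[0])
    outputs = []
    for i, q in enumerate(queries):
        scores = []
        # Causal mask: frame i may only attend to frames 0..i.
        for j in range(i + 1):
            dot = sum(a * b for a, b in zip(q, keys[j]))
            # Low-confidence frames receive a large negative bias.
            scores.append(dot / math.sqrt(dim) + math.log(confidences[j] + eps))
        # Numerically stable softmax over the visible frames.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        outputs.append([
            sum(w * values[j][k] for j, w in enumerate(weights))
            for k in range(out_dim)
        ])
    return outputs


# Toy usage: one-hot values make each output row equal to the attention weights.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
one_hot = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weights = confidence_aware_causal_attention(frames, frames, one_hot, [0.9, 0.05, 0.8])
```

In this toy setup the second frame has a confidence of 0.05, so later frames attend to it less than they would if its confidence were high, while the causal mask keeps every frame from seeing the future.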
