Block-Based Double Decoders (Event)
PRODUCT_LAUNCH | 2026-06-01 | Impact: MEDIUM
arXiv:2605.18807v2 | Announce Type: replace-cross
Abstract: Encoder-decoder models offer substantial inference-time savings over decoder-only models, but their pretraining objectives suffer from sparse supervision and dynamic sequence lengths, keeping them out of practice at scale. We propose block-based double decoders, a novel transformer architecture that utilizes doubly-causal block-based attention masks to train with full loss supervision and static sequence packi