ArcVQ-VAE: A Spherical Vector Quantization Framework with ArcCosine Additive Margin

arXiv cs.CV | 2026-05-28 | Authors: Jaeyung Kim, YoungJoon Yoo

Abstract

arXiv:2605.13517v2

The Vector Quantized Variational Autoencoder (VQ-VAE) has become a fundamental framework for learning discrete representations in image modeling. However, VQ-VAE models must tokenize entire images using a finite set of codebook vectors, and this capacity limitation restricts their ability to capture rich and diverse representations. In this paper, we propose ArcCosine Additive Margin VQ-VAE (ArcVQ-VAE), a novel vector quantization framework that introduces a spherical angular-margin prior (SAMP) for the codebook of a conventional VQ-VAE. The proposed SAMP consists of two components: Ball-Bounded Norm Regularization, which constrains all codebook vectors to lie within a time-dependent Euclidean ball, and ArcCosine Additive Margin Loss, which encourages greater angular separability among latent vectors.
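The abstract names the two SAMP components but gives no formulas for them. As a rough illustration only, the NumPy sketch below makes two assumptions: a squared-hinge penalty on codebook norms that exceed a radius following a hypothetical linear schedule, and an ArcFace-style additive angular margin in which each latent's target is its most-aligned codebook vector. The function names, the schedule and its direction, and the `margin` and `scale` values are all hypothetical and are not taken from the paper.

```python
import numpy as np

def ball_radius(step, r_start=4.0, r_end=1.0, total_steps=1000):
    """Hypothetical linear radius schedule (an assumption, not the paper's)."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return r_start + frac * (r_end - r_start)

def ball_norm_penalty(codebook, radius):
    """Squared hinge on code norms beyond the radius; zero inside the ball."""
    norms = np.linalg.norm(codebook, axis=1)
    excess = np.maximum(norms - radius, 0.0)
    return float(np.mean(excess ** 2))

def arc_margin_loss(latents, codebook, margin=0.2, scale=16.0):
    """ArcFace-style loss (assumed form): target = most-aligned code."""
    z = latents / np.linalg.norm(latents, axis=1, keepdims=True)
    e = codebook / np.linalg.norm(codebook, axis=1, keepdims=True)
    cos = np.clip(z @ e.T, -1.0, 1.0)        # (N, K) cosine similarities
    theta = np.arccos(cos)                   # angles in [0, pi]
    target = np.argmax(cos, axis=1)          # nearest code on the unit sphere
    rows = np.arange(len(latents))
    # Add the margin to the target angle only, capped at pi.
    theta[rows, target] = np.minimum(theta[rows, target] + margin, np.pi)
    logits = scale * np.cos(theta)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[rows, target].mean())

# Toy usage: 8 latent vectors, 5 codebook vectors, dimension 4.
rng = np.random.default_rng(0)
latents = rng.normal(size=(8, 4))
codebook = rng.normal(size=(5, 4))
penalty = ball_norm_penalty(codebook, ball_radius(step=500))
loss = arc_margin_loss(latents, codebook)
```

In this sketch the margin lowers the logit of the assigned code, so the loss falls only when latents move angularly closer to their own code relative to the others. That is one plausible reading of "greater angular separability", not a confirmed description of the method.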