Scene-Centric Unsupervised Video Panoptic Segmentation

ArXiv cs.CV | 2026-06-04 | Authors: Christoph Reich, Oliver Hahn, Nikita Araslanov, Laura Leal-Taixé, Christian Rupprecht, Daniel Cremers, Stefan Roth

Abstract

arXiv:2606.04925v1

Video panoptic segmentation (VPS) aims to jointly detect, segment, and track all objects while partitioning the video into semantically consistent regions. We introduce the task setting of unsupervised VPS, omitting any human supervision. Existing work on unsupervised scene understanding has mainly focused on image segmentation tasks; the video domain remains underexplored. We propose VideoCUPS, the first unsupervised VPS approach. VideoCUPS generates temporally consistent panoptic video pseudo-labels from scene-centric videos by exploiting unsupervised depth, motion, and visual cues. Training on these pseudo-labels using a novel Video DropLoss yields an accurate, unsupervised VPS model. To benchmark progress, we introduce a comprehensive evaluation protocol and four competitive baselines, extending state-of-the-art unsupervised panoptic image and instance video segmentation models to VPS.
