Tetris: Tile-level Sampling for Efficient and High-Fidelity Video Object Tracking

arXiv cs.CV · 2026-05-26 · Authors: Chanwut Kittivorawong, Alena Chao, Charlie Si, Alvin Cheung

Abstract

arXiv:2605.25538v1

Track materialization converts raw video into reusable object tracks that downstream queries can run against without rerunning tracking, but extracting those tracks efficiently and with high fidelity remains expensive. Prior systems reduce cost through temporal frame sampling, erasing the inter-frame motion that fine-grained tracking requires. In stationary video, however, large portions of each frame contain no objects of interest, and the remaining regions tolerate different sampling rates. We present Tetris, a track-extraction system that decomposes videos into a tile-based polyomino data model, enabling fine-grained spatiotemporal pruning that reduces detector calls with minimal fidelity loss.
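The abstract does not spell out the data model, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes a frame is divided into a fixed grid of tiles, that a boolean grid `active` marks tiles that may contain objects, and that each tile has its own sampling stride. The names `polyominoes`, `tiles_to_process`, `active`, and `stride` are hypothetical. Connected active tiles are grouped into polyomino-shaped regions with a plain 4-connected flood fill, and tiles are selected for detection only on frames where their stride is due.

```python
from collections import deque


def polyominoes(active):
    """Group 4-connected active tiles into regions (lists of (row, col)).

    `active` is a hypothetical per-frame grid: truthy means the tile may
    contain an object of interest.
    """
    rows = len(active)
    cols = len(active[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    groups = []
    for r in range(rows):
        for c in range(cols):
            if not active[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited active tile.
            queue = deque([(r, c)])
            seen[r][c] = True
            group = []
            while queue:
                y, x = queue.popleft()
                group.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    in_bounds = 0 <= ny < rows and 0 <= nx < cols
                    if in_bounds and active[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            groups.append(sorted(group))
    return groups


def tiles_to_process(active, stride, frame_idx):
    """Return active tiles whose (assumed) per-tile stride is due at frame_idx."""
    return [
        (r, c)
        for r, row in enumerate(active)
        for c, on in enumerate(row)
        if on and frame_idx % stride[r][c] == 0
    ]


# Toy 3x4 tile grid: one L-shaped region and one isolated tile.
active = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]
# Hypothetical per-tile strides (process every k-th frame).
stride = [
    [1, 1, 2, 1],
    [1, 4, 1, 1],
    [1, 1, 1, 2],
]

regions = polyominoes(active)
due_at_frame_1 = tiles_to_process(active, stride, 1)
```

In this toy grid, only 4 of the 12 tiles are ever candidates for detection, and on frame 1 only one of those is due. That illustrates how spatial and temporal pruning can combine to reduce detector work, under the assumptions stated above.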