Clustering Guided Domain-Specific Pretrained Foundation Model Very High-Resolution Arctic Remote Sensing 文章

ArXiv CS.CV2026-06-01NEWSen作者: Amal S. Perera, Chandi Witharana, Elias Manos, Michael Pimenta, Anna K. Liljedahl

摘要

arXiv:2605.30467v1 Announce Type: new Abstract: This study introduces a novel Arctic-focused remote sensing foundation model (RSFM) by combining diversity-aware regional-scale image curation with masked autoencoder (MAE) self-supervised pretraining of a Vision Transformer (ViT) encoder for very-high-spatial-resolution (VHSR) satellite image analysis. Spectral and acquisition-metadata descriptors were used in a scalable affinity-propagation clustering workflow to select approximately 3 million chips from 267 TB of Vantor VHSR imagery This curation strategy was designed to reduce oversampling of visually repetitive or low-information areas while preserving broad scene diversity across the study domain. We pretrained a ViT-Large encoder on the curated corpus using a domain-adapted MAE reconstruction objective, producing Arctic-specific transformer weights for downstream feature mapping.