Details
- Source site
- ArXiv CS.CV
- Authors
- Riccardo Carraro, Anna Briotto, Endi Hysa, Marco Fiorucci, Lamberto Ballan
- Article type
- NEWS
- Language
- en
- Publication date
- 2026-06-08
Abstract
arXiv:2605.14166v2 (announce type: replace). Face image super-resolution aims to recover high-resolution facial images from severely degraded inputs. Under extreme upscaling factors, fine facial details are often lost, making accurate reconstruction challenging. Existing methods typically rely on heavy network architectures, adversarial training schemes, or separate alignment networks, all of which increase model complexity and computational cost. To address these issues, we propose a lightweight U-Net-based architecture designed to reconstruct $128 \times 128$ facial images from severely degraded $16 \times 16$ inputs, achieving an $8\times$ magnification. A key contribution is a novel auxiliary-training-free supervision strategy that leverages heatmaps generated by YOLO-World, an open-vocabulary object detector, to localize key facial features such as eyes, nose, and mouth.
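
To make the $8\times$ setting and the idea of heatmap-guided supervision concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how the heatmaps enter the training objective, so the per-pixel weighting `1 + lam * heatmap` is an assumption, as are the hypothetical names `downsample_8x`, `heatmap_weighted_l1`, and `lam`. Average pooling stands in for the unspecified degradation, and the heatmap is a plain array in place of a real YOLO-World output.

```python
import numpy as np


def downsample_8x(hr):
    """Average-pool an (H, W, C) image by a factor of 8 (e.g. 128x128 -> 16x16).

    Assumption: the paper's actual degradation model is not given in the abstract.
    """
    h, w, c = hr.shape
    return hr.reshape(h // 8, 8, w // 8, 8, c).mean(axis=(1, 3))


def heatmap_weighted_l1(pred, target, heatmap, lam=1.0):
    """Weighted mean absolute error with per-pixel weights 1 + lam * heatmap.

    Assumption: this weighting is illustrative only. `heatmap` is an (H, W)
    array in [0, 1] that is high on facial features (eyes, nose, mouth).
    """
    weights = 1.0 + lam * heatmap[..., None]          # shape (H, W, 1)
    total_weight = weights.sum() * pred.shape[-1]     # account for all channels
    return float((weights * np.abs(pred - target)).sum() / total_weight)


# Usage: a synthetic 128x128 target and a heatmap with one "feature" region.
target = np.random.default_rng(0).random((128, 128, 3))
low_res = downsample_8x(target)                       # shape (16, 16, 3)
heatmap = np.zeros((128, 128))
heatmap[60:70, 60:70] = 1.0
```

With this weighting, an error of the same magnitude costs more inside the highlighted region than outside it, which is one simple way to push a network toward sharper facial features without adding an auxiliary network at inference time.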