Revisiting LLM Adaptation for 3D CT Report Generation: A Study of Scaling and Diagnostic Priors

ArXiv CS.CV | 2026-06-17 | NEWS | en | Authors: Vanshali Sharma, Andrea M. Bejar, Halil Ertugrul Aktas, Quoc-Huy Trinh, Debesh Jha, Gorkem Durak, Ulas Bagci

Details

Source site
ArXiv CS.CV
Authors
Vanshali Sharma, Andrea M. Bejar, Halil Ertugrul Aktas, Quoc-Huy Trinh, Debesh Jha, Gorkem Durak, Ulas Bagci
Article type
NEWS
Language
en
Publication date
2026-06-17

Abstract

arXiv:2606.17213v1 (announce type: cross)

Recent advances in multimodal learning, including large language models (LLMs) and vision-language models (VLMs), have demonstrated strong adaptability to natural images. However, extending their use to the medical domain, particularly for volumetric (3D) images, is challenging due to high computational complexity, volumetric dependencies and the semantic gap between visual features and clinical terminology. Naively fine-tuning LLMs on limited medical data often leads to overfitting and clinical hallucination, where linguistic fluency is prioritized over clinical factuality. In this study, we investigate parameter-efficient adaptation strategies for volumetric CT report generation and introduce RAD3D-Prefix, a lightweight diagnostic-prior conditioning framework that minimizes the need for extensive parameter training.
