Multimodal LLMs See Sentiment

ArXiv cs.CV, 2026-05-29. Authors: Neemias B. da Silva, John Harrison, Rodrigo Minetto, Myriam R. Delgado, Bogdan T. Nassu, Thiago H. Silva

Abstract

arXiv:2508.16873v3

Understanding how visual content conveys sentiment is increasingly important in a digital landscape dominated by imagery. However, sentiment perception depends on complex scene-level semantics, making this a challenging task for computational models. This paper examines how Multimodal Large Language Models (MLLMs) perform sentiment analysis in images through a systematic, evaluation-driven study encompassing three perspectives: (i) direct sentiment classification from images using MLLMs; (ii) sentiment analysis on MLLM-generated descriptions using pre-trained LLMs; and (iii) fine-tuning these LLMs on sentiment-labeled descriptions to assess performance and generalization. Experiments on a recent benchmark show that a two-stage MLLM description-mediated pipeline can substantially improve prediction accuracy under several evaluation settings, particularly when the LLM component is fine-tuned.
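The two-stage, description-mediated pipeline mentioned in the abstract can be pictured roughly as follows. This is only a minimal sketch of the control flow, not the authors' implementation: the abstract names no models, prompts, or label set, so the function names, the type aliases, the three-way label set, and the keyword-based stand-ins are all assumptions made for illustration. In practice the first callable would wrap an MLLM and the second a pre-trained or fine-tuned LLM.

```python
from typing import Callable

# Hypothetical type aliases; the abstract does not specify any API.
Describer = Callable[[bytes], str]      # stage 1: MLLM, image -> text description
TextClassifier = Callable[[str], str]   # stage 2: LLM, description -> sentiment label


def two_stage_sentiment(image: bytes, describe: Describer, classify: TextClassifier) -> str:
    """Description-mediated pipeline: describe the image, then classify the text."""
    description = describe(image)
    return classify(description)


# Toy stand-ins so the sketch runs without any model. These are NOT real models.
def fake_describe(image: bytes) -> str:
    return "A child smiling while holding a balloon in a sunny park."


def fake_classify(description: str) -> str:
    # Assumed label set (positive / negative / neutral), chosen for illustration.
    text = description.lower()
    if any(word in text for word in ("smiling", "sunny", "happy")):
        return "positive"
    if any(word in text for word in ("crying", "ruins", "sad")):
        return "negative"
    return "neutral"


label = two_stage_sentiment(b"<image bytes>", fake_describe, fake_classify)
```

Passing the two stages in as callables keeps the sketch independent of any particular model, so a fine-tuned text classifier could replace the second stage without touching the first.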
