Personal AI Agent for Camera Roll VQA (Article)

ArXiv CS.CV | 2026-06-05 | NEWS | en | Authors: Thao Nguyen, Krishna Kumar Singh, Donghyun Kim, Yong Jae Lee, Yuheng Li

Abstract

arXiv:2606.05275v1 (announce type: new). We study the personal camera roll visual question answering setting. In this setting, a conversational AI assistant can access a user's personal camera roll and retrieve relevant photos to answer queries, ranging from simple factual questions (e.g., "Name of the food I tried yesterday?") to more open-ended ones (e.g., "Recommend some dishes I have never eaten before"). Given the sheer size of a personal camera roll (i.e., multiple years, hundreds to thousands of photos), a successful AI assistant needs to understand a long-horizon, highly personalized visual content stream in order to navigate it and locate the correct or relevant information. To support this, we collect and manually annotate questions that mimic real-world usage. The final dataset, camroll, contains 50 users, 31,476 images, and 2,500 QA pairs.
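
To make the setting concrete, the following is a minimal sketch of the retrieval step behind a question like "Name of the food I tried yesterday?", plus the per-user averages implied by the dataset totals. It is an illustration only, not the authors' method: the `Photo` record, its field names, the sample photos, and the helper functions are all hypothetical, and the abstract does not describe the dataset's actual schema or how photos are distributed across users.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Photo:
    # Hypothetical record; field names are illustrative, not the dataset's schema.
    path: str
    taken_on: date
    caption: str


def photos_taken_on(roll: list[Photo], day: date) -> list[Photo]:
    """Return the photos in the roll that were taken on the given calendar day."""
    return [photo for photo in roll if photo.taken_on == day]


def average_per_user(total: int, users: int) -> float:
    """Mean count per user; says nothing about how counts are actually distributed."""
    return total / users


# Made-up camera roll for a single user.
today = date(2026, 6, 5)
roll = [
    Photo("img_001.jpg", date(2026, 6, 4), "bowl of pho"),
    Photo("img_002.jpg", date(2026, 6, 1), "beach at sunset"),
]

# "Yesterday" resolves to a concrete date before any visual reasoning happens.
hits = photos_taken_on(roll, today - timedelta(days=1))

# Averages derived from the totals stated in the abstract.
images_per_user = average_per_user(31_476, 50)
qa_pairs_per_user = average_per_user(2_500, 50)
```

The averages work out to roughly 630 images and 50 QA pairs per user, which is consistent with the abstract's "hundreds to thousands of photos" per camera roll. A date filter only narrows the candidates; answering the question would still require a model to interpret the retrieved photos.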

Related events

Personal AI Agent for Camera Roll VQA
2026-06-05 | PRODUCT_LAUNCH | Impact: MEDIUM
