Time-frequency localization of bird calls in dense soundscapes

ArXiv CS.CV | 2026-06-10 | NEWS | en | Authors: Simen Hexeberg, Fanghui Tong, Hari Vishnu, Mandar Chitre

Details

Source site
ArXiv CS.CV
Authors
Simen Hexeberg, Fanghui Tong, Hari Vishnu, Mandar Chitre
Article type
NEWS
Language
en
Publication date
2026-06-10

Abstract

arXiv:2606.10407v1 (cross-listed)

Passive acoustic monitoring enables large-scale observation of wildlife, but most bioacoustic classifiers only predict species presence in a time window without localizing vocalizations precisely in time or frequency, limiting downstream analyses. We formulate bird vocalization detection as an object detection task on spectrograms and train YOLO11 models to localize bird calls in dense tropical soundscapes from Singapore. We additionally introduce an open-source browser-based annotation tool and propose Intersection over Minimum (IoMin), an evaluation metric that better handles ambiguous acoustic boundaries than standard IoU and is better suited to the problem at hand. The best YOLO model nearly doubles baseline performance on in-distribution soundscapes from Singapore (81.8% vs. 42.1% IoMin@50 F1-score) while still outperforming the baseline on unseen out-of-distribution recordings from Hawaii (58.6% vs. 48.6%).
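The abstract names IoMin but does not spell out its formula. The sketch below assumes the reading suggested by the name: the intersection area of two time-frequency boxes divided by the area of the smaller box, compared with standard IoU. The `Box` type, the function names, and the example coordinates are hypothetical and chosen only for illustration; the paper's exact definition and matching procedure may differ.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned box on a spectrogram (hypothetical helper type)."""
    t_min: float  # start time in seconds
    t_max: float  # end time in seconds
    f_min: float  # lower frequency in Hz
    f_max: float  # upper frequency in Hz


def area(b: Box) -> float:
    """Box area in second-hertz units; degenerate boxes have area 0."""
    return max(0.0, b.t_max - b.t_min) * max(0.0, b.f_max - b.f_min)


def intersection(a: Box, b: Box) -> float:
    """Area of the overlap between two boxes (0 if they are disjoint)."""
    dt = min(a.t_max, b.t_max) - max(a.t_min, b.t_min)
    df = min(a.f_max, b.f_max) - max(a.f_min, b.f_min)
    return max(0.0, dt) * max(0.0, df)


def iou(a: Box, b: Box) -> float:
    """Standard Intersection over Union."""
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def iomin(a: Box, b: Box) -> float:
    """Assumed IoMin: intersection divided by the smaller box's area."""
    smaller = min(area(a), area(b))
    return intersection(a, b) / smaller if smaller > 0 else 0.0


# A loosely drawn annotation and a tighter prediction of the same call.
annotated = Box(t_min=0.0, t_max=2.0, f_min=2000.0, f_max=6000.0)
predicted = Box(t_min=0.5, t_max=1.5, f_min=3000.0, f_max=5000.0)

print(f"IoU   = {iou(annotated, predicted):.2f}")
print(f"IoMin = {iomin(annotated, predicted):.2f}")
```

Under this assumed definition, a prediction that sits entirely inside a generously drawn annotation scores low on IoU but reaches the maximum on IoMin, which is consistent with the abstract's claim that IoMin is more tolerant of ambiguous acoustic boundaries. An "IoMin@50" match would then presumably require a score of at least 0.5, by analogy with the usual IoU@50 convention.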
