Unifying Speech Editing Detection and Content Localization via Prior-Enhanced Audio LLMs

Source: ArXiv CS.AI. Published: 2026-05-26. Type: NEWS. Language: en. Authors: Jun Xue, Yi Chai, Yanzhen Ren, Jinshen He, Zhiqiang Tang, Zhuolin Yi, Yihuan Huang, Yuankun Xie, Yujie Chen


Abstract

arXiv:2601.21463v3. Existing speech editing detection (SED) datasets are predominantly constructed using manual splicing or limited editing operations, resulting in restricted diversity and poor coverage of realistic editing scenarios. Meanwhile, current SED methods rely heavily on frame-level supervision to detect observable acoustic anomalies, which fundamentally limits their ability to handle deletion-type edits, where the manipulated content is entirely absent from the signal. To address these challenges, we present a unified framework that bridges speech editing detection and content localization through a generative formulation based on Audio Large Language Models (Audio LLMs). We first introduce AiEdit, https://huggingface.
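
The point about deletion-type edits can be made concrete with a toy example. The sketch below is not the paper's method; the helper `frame_labels` and the frame counts are hypothetical, chosen only to illustrate what frame-level supervision can and cannot mark. It assumes each frame of the edited audio gets a 0/1 label, with 1 meaning the frame lies inside an edited span.

```python
from typing import List, Tuple


def frame_labels(num_frames: int, edited_spans: List[Tuple[int, int]]) -> List[int]:
    """Hypothetical helper: one 0/1 label per frame of the edited audio.

    Each span is a half-open interval [start, end) of frame indices
    that were manipulated and are still present in the signal.
    """
    labels = [0] * num_frames
    for start, end in edited_spans:
        for i in range(max(start, 0), min(end, num_frames)):
            labels[i] = 1
    return labels


# Substitution: frames 3..5 of a 10-frame clip were regenerated,
# so there are frames in the signal that can be marked as edited.
substitution = frame_labels(10, [(3, 6)])

# Deletion: 3 frames were cut at position 3. The clip now has 7 frames,
# and the edited span is empty because the removed content is gone.
deletion = frame_labels(7, [(3, 3)])

# A genuine, unedited 7-frame clip.
genuine = frame_labels(7, [])

print(substitution)
print(deletion == genuine)
```

Under these assumptions the deletion case yields the same all-zero label sequence as the genuine clip, which is one way to read the abstract's claim that frame-level supervision has nothing to point at when the manipulated content is absent.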
