Evaluating using Mock Tool Calls to Quarantine Untrusted Prompt Inputs 事件
PRODUCT_LAUNCH2026-06-01影响: MEDIUM
Evaluating using Mock Tool Calls to Quarantine Untrusted Prompt Inputs arXiv:2605.30521v1 Announce Type: new Abstract: Large language models must frequently process untrusted inputs, such as judging an answer from another model or running tasks like spam and harm classifiers while under adversarial pressure. These inputs are often string-formatted directly into a prompt template, leaving systems fragile to manipulation. Current LLM specs from major providers like OpenAI distinguish trustworthin
相关产品查看全部 (10)
相关报道查看全部 (1)
Evaluating using Mock Tool Calls to Quarantine Untrusted Prompt Inputs
ArXiv CS.CL2026-06-01