APT-Agent: Automated Penetration Testing using Large Language Models

ArXiv CS.AI | 2026-05-26 | Authors: William Guanting Li (University of Queensland), Alsharif Abuadbba (CSIRO Data61), Kristen Moore (CSIRO Data61), Dan Dongseong Kim (University of Queensland)

Abstract

arXiv:2605.24949v1, announce type: cross

Penetration testing is essential to securing modern web infrastructures, yet traditional manual methods struggle to keep pace with their scale and complexity. Large Language Models (LLMs) offer new opportunities for automating these tasks, but existing approaches face two persistent challenges: hallucination of technical entities and insufficient long-term contextual memory. To address these issues, we present APT-Agent, a fully automated LLM-driven penetration testing framework that systematically orchestrates reconnaissance, exploitation, and exfiltration. APT-Agent introduces a hybrid rectification module to recover hallucinated commands and a command-specific memory architecture to preserve operational context across multi-step attack sequences. We evaluate APT-Agent on Metasploitable 2 against seven vulnerable services spanning web, database, and network protocols. APT-Agent achieves an 84.