Classifying phishing URLs using recurrent neural networks (paper)
Abstract
As the technical skill and cost required to deploy phishing attacks decrease, we are witnessing an unprecedented volume of scams, which increases the need for better methods to proactively detect phishing threats. In this work, we explored the use of URLs as input to machine learning models for phishing site prediction. Specifically, we compared a feature-engineering approach followed by a random forest classifier against a novel method based on recurrent neural networks. We determined that the recurrent neural network approach achieves an accuracy of 98.7% without any manual feature creation, outperforming the random forest method by 5%. This makes it suitable as a scalable, fast-acting proactive detection system that does not require full content analysis.
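To make the contrast between the two input representations concrete, the following is a minimal sketch, not the implementation evaluated in this work. It assumes a character-level integer encoding as the input to a recurrent model and a handful of hand-crafted lexical features as the input to a random forest. The vocabulary, the sequence length `MAX_LEN`, the function names, and the specific features are all illustrative assumptions that the abstract does not specify.

```python
import string
from urllib.parse import urlparse

# Assumed vocabulary: printable ASCII characters mapped to ids 1..N.
# Id 0 is shared by padding and by characters outside the vocabulary.
VOCAB = {ch: i + 1 for i, ch in enumerate(string.printable)}
MAX_LEN = 150  # assumed fixed sequence length, not taken from the paper


def encode_url(url, max_len=MAX_LEN):
    """Map a URL to a fixed-length list of character ids (sequence-model input)."""
    ids = [VOCAB.get(ch, 0) for ch in url[:max_len]]  # truncate long URLs
    return ids + [0] * (max_len - len(ids))           # right-pad short URLs


def lexical_features(url):
    """Compute a few illustrative hand-crafted features (tree-model input)."""
    parsed = urlparse(url)
    host = parsed.netloc
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
    }
```

Under these assumptions, the encoded sequence needs no domain knowledge to construct, whereas each entry in the feature dictionary reflects a manual design decision. This is the trade-off the comparison above addresses.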