Exploring Phishing Detection Using Search Engine Optimization and Uniform Resource Locator based Information
Abstract
Phishing is a form of social engineering in which attackers use malicious links, websites, and electronic messages to trick users into revealing sensitive or private information. This thesis explores phishing detection using information derived from uniform resource locators (URLs) and third-party search engine optimization (SEO) tools. A supervised learning approach is used to detect phishing websites. Evaluations are performed on real-world data using a Decision Tree model, which is optimized with the Tree-based Pipeline Optimization Tool (TPOT), an Automated Machine Learning (AutoML) tool. The proposed model achieves a 97% detection rate and outperforms the state-of-the-art models in the literature. To make the proposed model available for use, the best-performing pipeline from TPOT is embedded in a web API for future remote access.
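
To illustrate what URL-based information can look like in practice, the following minimal sketch computes a handful of lexical features from a URL string using only the Python standard library. The function names and the specific features chosen here are assumptions made for illustration; they are not the feature set used in this thesis, and the SEO-based features and the TPOT-optimized model are not shown.

```python
import ipaddress
from urllib.parse import urlparse


def host_is_ip(host: str) -> bool:
    """Return True when the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def extract_url_features(url: str) -> dict:
    """Compute a few illustrative lexical features from a URL string.

    These features are hypothetical examples, not the thesis's feature set.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": "@" in url,
        "host_is_ip": host_is_ip(host),
        "uses_https": parsed.scheme == "https",
        # Number of non-empty path segments, e.g. /login/verify has 2.
        "path_depth": len([part for part in parsed.path.split("/") if part]),
    }


if __name__ == "__main__":
    for example in ("http://192.168.0.1/login/verify",
                    "https://secure-pay.example.com/"):
        print(example, extract_url_features(example))
```

A feature dictionary of this kind could be converted to a numeric vector and passed to any supervised classifier, such as a decision tree, alongside features gathered from other sources.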