STAP Journal of Security Risk Management

ISSN: 3080-9444 (Online)

PhishGuard: AI-Driven Graph-Based Analysis for Smarter Email Security

by Harchana Ramesh; Noris Ismail; Nor Azlina Abd Rahman; Aitizaz Ali

Published: 2026

Abstract

This research presents a phishing detection system that integrates graph analytics and machine learning to improve email security. As phishing tactics become more sophisticated, traditional filters often fail to detect them. The work proposes a dual-model solution: a RoBERTa-based transformer classifies the email body content, while a Neo4j-powered graph model analyses sender-receiver domain relationships using graph metrics such as PageRank, ArticleRank, and Degree Centrality. A rule-based layer combines the predictions of the two models. High-confidence RoBERTa predictions are accepted directly; for the remaining cases, scores from the graph model are applied, with fixed rule-based thresholds handling mid-confidence cases. For real-time detection, a web interface was developed using Streamlit, integrating the Gmail API and Google Apps Script for email quarantine. The system achieved an F1 score above 0.99 in testing, indicating stable performance in identifying spam and phishing emails. By combining content and relational signals, the work advances email security and supports Sustainable Development Goal 9 by fostering innovation and infrastructure in digital safety.
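
The rule-based combination described in the abstract can be illustrated with a minimal Python sketch. It is one possible reading of that description, not the authors' implementation. The threshold values, the function name `fuse`, the assumption that the graph model yields a single score in [0, 1], and the averaging used in the mid-confidence band are all hypothetical, since the abstract does not specify them.

```python
# Hypothetical thresholds: the abstract does not state the actual values.
HIGH_CONFIDENCE = 0.90   # text-model confidence accepted without the graph model
MID_CONFIDENCE = 0.60    # lower bound of the mid-confidence band
DECISION_THRESHOLD = 0.50  # fixed cut-off applied to scores


def fuse(text_prob_phishing: float, graph_score: float) -> str:
    """Combine a text-model probability and a graph-model score into a label.

    text_prob_phishing: assumed probability of "phishing" from the text model.
    graph_score: assumed graph-derived risk score in [0, 1].
    """
    # Confidence is how far the text model is from being undecided.
    text_confidence = max(text_prob_phishing, 1.0 - text_prob_phishing)

    if text_confidence >= HIGH_CONFIDENCE:
        # High confidence: accept the text model's prediction directly.
        return "phishing" if text_prob_phishing >= 0.5 else "legitimate"

    if text_confidence >= MID_CONFIDENCE:
        # Mid confidence: fixed threshold on the mean of both signals
        # (the averaging is an assumption made for this sketch).
        combined = (text_prob_phishing + graph_score) / 2
        return "phishing" if combined >= DECISION_THRESHOLD else "legitimate"

    # Low confidence: defer to the graph-model score.
    return "phishing" if graph_score >= DECISION_THRESHOLD else "legitimate"


if __name__ == "__main__":
    for p, g in [(0.97, 0.10), (0.70, 0.20), (0.55, 0.80)]:
        print(p, g, fuse(p, g))
```

In this sketch a confident text prediction is never overridden by the graph score, which matches the abstract's statement that such results are accepted directly; the relational signal only affects the outcome when the content signal is uncertain.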

Keywords

Phishing Detection; RoBERTa; Graph Analytics; Neo4j; Text Classification; SDG 9
