U-papers :: View publication

Title	Malicious Domain Names Detection with DeepDGA, a Hybrid Character and Word Embeddings Deep Learning Architecture
Authors	Lucas Torrealba, Pedro Casas-Hernandez, Diego García, Javier Bustos-Jiménez, Ivana Bachmann
Publication date	2025
Abstract	The rapid expansion of the Internet has enabled cybercriminal operations at unprecedented scale. A recurring tactic is the use of algorithmically generated domains (AGDs) created by domain generation algorithms (DGAs) to orchestrate botnet command-and-control, host phishing content, and distribute malware. Traditional defenses such as blocklists and heuristic rules are brittle against new domains and evolving attacker strategies. We present DeepDGA, a hybrid deep learning architecture that fuses character-level and word-level representations to detect both pseudo-random and dictionary-based DGAs. Character-level embeddings processed by a BiLSTM capture subword patterns and entropy; word-level embeddings derived from a dom2words tokenization and Word2Vec capture linguistic regularities exploited by dictionary-based DGAs. Evaluations on a public benchmark with more than 670,000 domains, including 25 DGA families and benign top-popular domains, demonstrate the superiority of DeepDGA. The model achieves precision and recall above 0.97 for dictionary-based DGAs, and even higher (above 0.98) for pseudo-random DGAs, consistently outperforming state-of-the-art methods across multiple metrics. DeepDGA's effectiveness, particularly in detecting the more challenging dictionary-based DGAs, highlights the benefit of combining diverse embedding strategies in a single deep learning architecture.
Pages	1-6
Conference name	International Conference on Network and Service Management
Publisher	Austrian Institute of Technology
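
The abstract mentions that character-level features capture entropy. As a rough illustration only, and not the authors' method, the sketch below computes the Shannon entropy of the characters in a domain label. The function name `char_entropy` and the sample labels are hypothetical and are not taken from the paper or its dataset.

```python
import math
from collections import Counter


def char_entropy(label: str) -> float:
    """Shannon entropy, in bits per character, of a domain label.

    Illustrative helper (an assumption, not part of DeepDGA).
    """
    if not label:
        return 0.0
    counts = Counter(label.lower())
    total = len(label)
    # Sum p * log2(1/p) over the distinct characters in the label.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


if __name__ == "__main__":
    # Hypothetical labels: a repetitive one, a short varied one,
    # and a longer one with no repeated characters.
    for label in ["aaaa", "abcd", "xkqzvbnmtr"]:
        print(label, round(char_entropy(label), 3))
```

A label built from real words can have a character distribution close to that of benign names, which is one plausible reason a character-only signal may be insufficient for dictionary-based DGAs and why the abstract also describes word-level embeddings.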