DGA-Detect: Using Machine Learning for Collaborative DGA Detection
2019-10-21, 14:40–15:00, Hollenfels

By combining visit statistics from different sharing partners with domains from DGArchive, we leverage machine learning to pre-filter suspicious domains for further annotation and correlation in a dedicated MISP instance. This open-source stack allows us to pinpoint the domains that are most likely generated by a domain generation algorithm (DGA).


The DGA-Detect project allows us to find suspicious domains most likely generated by a domain generation algorithm (DGA) by collecting and analyzing data from several sharing partners. The system uses the daily count of domains at a sharing partner and the corresponding data from DGArchive to learn a tailor-made classifier which can be used in several ways:

Each contributing sharing partner has its own classifier, and we use these classifiers as an ensemble of experts to lower the false positive rate of the overall system. Because the classifiers act as effective filters, we only need to collect the potentially suspicious domains from the daily update in a dedicated MISP instance, which serves as central data storage for pooling, annotation, and correlation analysis. Finally, we select the suspicious domains for alerting directly from the MISP instance, based on criteria built from the domains' metadata (for example: last seen, number of sightings, ...).
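The filtering and alerting steps described above could be sketched roughly as follows. This is a minimal illustration, not the DGA-Detect implementation: the two toy scoring functions, the `DomainRecord` structure, the majority-style vote, and all thresholds (`threshold`, `min_votes`, `max_age_days`, `min_sightings`) are assumptions made for the example. The real per-partner classifiers are trained on partner statistics and DGArchive data, and the metadata would come from the MISP instance rather than from an in-memory list.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, Dict, List

# Hypothetical type: a partner classifier maps a domain to a DGA score in [0, 1].
Classifier = Callable[[str], float]


@dataclass
class DomainRecord:
    """Hypothetical stand-in for the per-domain metadata stored in MISP."""
    domain: str
    last_seen: date
    sightings: int


def digit_ratio_score(domain: str) -> float:
    """Toy classifier (assumption): share of digits in the first label."""
    label = domain.split(".")[0]
    return sum(c.isdigit() for c in label) / max(len(label), 1)


def length_score(domain: str) -> float:
    """Toy classifier (assumption): first-label length, capped at 20 chars."""
    label = domain.split(".")[0]
    return min(len(label) / 20.0, 1.0)


def ensemble_flags(domain: str, classifiers: Dict[str, Classifier],
                   threshold: float = 0.5, min_votes: int = 2) -> bool:
    """Flag a domain only if enough partner classifiers agree.

    Requiring agreement between experts is one simple way to reduce
    false positives; the actual combination rule is an assumption here.
    """
    votes = sum(1 for clf in classifiers.values() if clf(domain) >= threshold)
    return votes >= min_votes


def select_alerts(records: List[DomainRecord],
                  classifiers: Dict[str, Classifier],
                  today: date,
                  max_age_days: int = 7,
                  min_sightings: int = 3) -> List[str]:
    """Keep flagged domains that were seen recently and often enough."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        r.domain
        for r in records
        if ensemble_flags(r.domain, classifiers)
        and r.last_seen >= cutoff
        and r.sightings >= min_sightings
    ]
```

With one classifier registered per partner, for example `{"partner_a": digit_ratio_score, "partner_b": length_score}`, `select_alerts` returns only those domains that pass both the ensemble vote and the metadata criteria. Raising `min_votes` trades recall for a lower false positive rate.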

See also: DGA-Detect source code