Khodjaeva, Yulduz
2021-12-17
http://hdl.handle.net/10222/81120

The thesis proposes the concept of "entropy of a flow" to augment flow statistical features for DNS tunnelling detection, specifically in DNS over HTTPS traffic. To achieve this, three flow exporters are explored: Argus, DoHlyzer and Tranalyzer2. Flow features are then augmented with the flow entropy, calculated in three different ways: entropy over all packets of a flow, entropy over the first 96 bytes of a flow, and entropy over the first n packets of a flow. These features are provided as input to five machine learning classifiers (Decision Tree, Random Forest, Logistic Regression, Support Vector Machine and Naive Bayes) to detect malicious behaviours in different publicly available datasets. Evaluations show that the Decision Tree algorithm could reach an F-measure of approximately 99.7% when flow statistical features are augmented with the flow entropy of the first four packets. This model is then optimized using TPOT-AutoML, where the Random Forest classifier provided the best pipeline configuration for the same features.

en
flow entropy; DNS tunnelling; Machine Learning; AutoML; DNS over HTTPS
Detecting malicious DNS tunnels via network flow entropy
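
The three flow-entropy variants named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact definition used in the thesis, so this sketch assumes byte-level Shannon entropy computed over the concatenated packet bytes of a flow. The function names (`shannon_entropy`, `flow_entropy_features`), the feature keys, and the representation of a flow as a list of `bytes` objects are hypothetical and chosen only for illustration; the defaults of 96 bytes and four packets follow the values mentioned in the abstract.

```python
import math
from collections import Counter
from typing import Dict, List


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 if empty)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value 0..255
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def flow_entropy_features(
    packets: List[bytes], n: int = 4, first_bytes: int = 96
) -> Dict[str, float]:
    """Compute three entropy features for one flow.

    Assumption: a flow is an ordered list of raw packet byte strings, and
    each entropy is taken over the concatenation of the selected bytes.
    """
    whole_flow = b"".join(packets)
    return {
        # Entropy over all packets of the flow.
        "entropy_all_packets": shannon_entropy(whole_flow),
        # Entropy over the first `first_bytes` bytes of the flow.
        "entropy_first_bytes": shannon_entropy(whole_flow[:first_bytes]),
        # Entropy over the first `n` packets of the flow.
        "entropy_first_n_packets": shannon_entropy(b"".join(packets[:n])),
    }


if __name__ == "__main__":
    # Toy flow: two low-entropy packets followed by one high-entropy packet.
    flow = [b"aaaa", b"abab", bytes(range(256))]
    for name, value in flow_entropy_features(flow, n=2).items():
        print(f"{name}: {value:.3f}")
```

In a pipeline like the one the abstract describes, values such as these would be appended as extra columns to the statistical features produced by a flow exporter before training a classifier. How the thesis extracts the packet bytes from each exporter's output is not stated in the abstract and is not modelled here.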