An Investigation of a Multi-Objective Genetic Algorithm applied to Encrypted Traffic Identification
This work explores the use of a Multi-Objective Genetic Algorithm (MOGA) for both feature selection and cluster count optimization in an unsupervised machine learning technique, K-Means, applied to the identification of encrypted traffic (SSH). The performance of the proposed model is benchmarked against other unsupervised learning techniques from the literature: basic K-Means, semi-supervised K-Means, DBSCAN, and EM. Results show that the proposed MOGA not only outperforms the other models but also provides a good trade-off among detection rate, false positive rate, and the time needed to build and run the model. A hierarchical version of the proposed model is also implemented to observe the gains, if any, obtained by increasing cluster purity by means of a second layer of clusters. Results show that the hierarchical MOGA yields significant gains in the classification performance of the system.
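To make the idea of jointly searching over feature subsets and cluster counts concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical chromosome made of a binary feature mask plus an integer cluster count, and it assumes that each candidate has already been scored by running K-Means and measuring objectives such as detection rate and false positive rate. The names `random_chromosome`, `to_minimized`, `dominates`, and `pareto_front` are illustrative only, and the choice of objectives is an assumption.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical encoding: one bit per candidate feature, plus a cluster count k.
Chromosome = Tuple[Tuple[int, ...], int]


def random_chromosome(n_features: int, k_min: int, k_max: int,
                      rng: random.Random) -> Chromosome:
    """Draw a random feature mask and cluster count (assumed encoding)."""
    mask = tuple(rng.randint(0, 1) for _ in range(n_features))
    if not any(mask):
        # Guarantee that at least one feature is selected.
        i = rng.randrange(n_features)
        mask = mask[:i] + (1,) + mask[i + 1:]
    return mask, rng.randint(k_min, k_max)


def to_minimized(detection_rate: float,
                 false_positive_rate: float) -> Tuple[float, float]:
    """Convert the assumed objectives so that smaller is better for both."""
    return (1.0 - detection_rate, false_positive_rate)


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return indices of candidates that no other candidate dominates."""
    return [
        i for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]
```

In a full genetic algorithm, the non-dominated candidates returned by `pareto_front` would be favored during selection, with crossover and mutation applied to both the mask and the cluster count. Those steps, and the K-Means evaluation itself, are omitted here.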