INVESTIGATING A BEHAVIOUR ANALYSIS-BASED EARLY WARNING SYSTEM TO IDENTIFY BOTNETS USING MACHINE LEARNING ALGORITHMS
Botnets represent one of the more aggressive threats to cyber security, and botnet traffic analysis is one of the main approaches to studying and investigating such threats. Botnets employ different techniques (e.g. fluxing and encryption), topologies (e.g. centralized and decentralized) and communication protocols (e.g. HTTP and DNS) at different stages of their lifecycle. Identifying botnets has therefore become very challenging, given that they can upgrade their methodology automatically at any time. To this end, different approaches have been proposed for botnet traffic analysis and detection based on various botnet behaviours and structures. The main focus of this thesis is to investigate botnet detection approaches according to the technique used and the data available. Specifically, two main categories of solutions are explored: application data analysis-based solutions and network analysis-based solutions. In the application data analysis category, two approaches are explored: one with a priori knowledge and one without any a priori knowledge. In the network analysis-based category, flow-based botnet detection approaches are explored, with a focus on using minimal a priori knowledge. In this case, various feature extraction methods, machine learning algorithms, protocol filtering, non-numeric feature representation, normal behaviour representation and time generalization issues are investigated. Finally, a flow-based early warning system is proposed. The effectiveness of the solutions is shown on several botnet data sets, ranging from IRC botnets to peer-to-peer botnets. The results indicate that the proposed solutions can detect botnet behaviour with good performance. Moreover, two botnet detection systems from the literature and two publicly available malicious behaviour detection systems are employed for further evaluation of the proposed early warning system.
The results indicate that the proposed system outperforms these four systems. Lastly, the proposed system is also evaluated, on an exploratory basis, on botnets in cellular networks, where it demonstrates promising performance as well.