An Investigation on Detecting Applications Hidden in SSL Streams using Machine Learning Techniques
Knowing what type of traffic is flowing through a network is essential to operating it successfully. Traffic shaping, Quality of Service, identification of critical business applications, Intrusion Detection Systems, and network administration activities all require knowledge of what traffic is flowing over a network before any further steps can be taken. With SSL traffic on the rise as applications secure or conceal their traffic, determining what applications are running within a network is becoming increasingly difficult. Researchers have deemed the traditional methods of traffic classification, port numbers and deep packet inspection, inadequate, making way for new methods. The purpose of this thesis is to investigate whether a machine learning approach can be used with flow features to identify SSL in a given network trace. To this end, several machine learning models, including AdaBoost, Naive Bayes, RIPPER, and C4.5, are investigated without the use of port numbers, Internet Protocol addresses, or payload information. The robustness of the results is tested against datasets not seen during training. Moreover, the proposed approach is compared to the Wireshark traffic analysis tool. Results show that the proposed approach is very promising in identifying SSL traffic from a given network trace without using port numbers, Internet Protocol addresses, or payload information.
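To make the idea of classifying flows without port numbers, IP addresses, or payload more concrete, the following is a minimal sketch and not the thesis's implementation. It assumes four illustrative flow features (packet count, mean packet size, duration, mean inter-arrival time) and uses a hand-written Gaussian variant of Naive Bayes. The feature set, the `flow_features` helper, the `GaussianNaiveBayes` class, and all packet values are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, pvariance


def flow_features(packets):
    """Summarize one flow given as a list of (timestamp_seconds, size_bytes).

    Only timing and size are used: no ports, IP addresses, or payload.
    The four features are an assumed, illustrative set.
    """
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    gaps = [b - a for a, b in zip(times, times[1:])] or [0.0]
    return [
        float(len(packets)),    # packet count
        mean(sizes),            # mean packet size
        times[-1] - times[0],   # flow duration
        mean(gaps),             # mean inter-arrival time
    ]


class GaussianNaiveBayes:
    """Naive Bayes with one Gaussian per feature and class (an assumption)."""

    def fit(self, X, y):
        self.stats = {}
        for label in set(y):
            rows = [x for x, lab in zip(X, y) if lab == label]
            cols = list(zip(*rows))
            self.stats[label] = (
                math.log(len(rows) / len(X)),          # log prior
                [mean(c) for c in cols],               # per-feature mean
                [pvariance(c) + 1e-6 for c in cols],   # smoothed variance
            )
        return self

    def predict(self, x):
        def log_posterior(label):
            prior, mus, variances = self.stats[label]
            return prior + sum(
                -0.5 * math.log(2 * math.pi * v) - (xi - mu) ** 2 / (2 * v)
                for xi, mu, v in zip(x, mus, variances)
            )
        return max(self.stats, key=log_posterior)


# Synthetic, made-up flows purely for illustration (not data from the thesis).
ssl_flows = [
    [(0.0, 517), (0.05, 1400), (0.10, 1400), (0.30, 310)],
    [(0.0, 540), (0.04, 1380), (0.09, 1420), (0.25, 290)],
]
other_flows = [
    [(0.0, 80), (1.0, 90), (2.1, 85)],
    [(0.0, 70), (0.9, 95), (2.0, 88)],
]
X = [flow_features(f) for f in ssl_flows + other_flows]
y = ["SSL", "SSL", "non-SSL", "non-SSL"]

model = GaussianNaiveBayes().fit(X, y)
unseen = [(0.0, 530), (0.05, 1390), (0.11, 1410), (0.28, 300)]
label = model.predict(flow_features(unseen))
```

A real evaluation would use labeled traces, a much richer feature set, and separate test datasets; the toy data here only shows how a classifier can operate on flow statistics alone.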