An Investigation on Detecting Applications Hidden in SSL Streams using Machine Learning Techniques
Date
2010-09-09
Authors
McCarthy, Curtis
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
The importance of knowing what type of traffic is flowing through a network is
paramount to its success. Traffic shaping, Quality of Service, identifying critical
business applications, Intrusion Detection Systems, as well as network administra-
tion activities all require the base knowledge of what traffic is flowing over a network
before any further steps can be taken. With SSL traffic on the rise due to applica-
tions securing or concealing their traffic, the ability to determine what applications
are running within a network is getting more and more difficult. Traditional methods
of traffic classification through port numbers or deep packet inspection have been
deemed inadequate by researchers thus making way for new methods. The purpose
of this thesis is to investigate if a machine learning approach can be used with flow
features to identify SSL in a given network trace. To this end, different machine
learning methods are investigated without the use of port numbers, Internet Protocol
addresses, or payload information. Various machine learning models are investigated
including AdaBoost, Naive Bayes, RIPPER, and C4.5. The robustness of the results
are tested against unseen datasets during training. Moreover, the proposed approach
is compared to the Wireshark traffic analysis tool. Results show that the proposed ap-
proach is very promising in identifying SSL traffic from a given network trace without
using port numbers, Internet protocol addresses, or payload information.
Description
Keywords
SSL, Traffic Classification, Machine Learning, TLS, Secure Sockets Layer, Transport Layer Security