dc.contributor.author: Khanchi, Sara
dc.date.accessioned: 2019-12-12T14:22:14Z
dc.date.available: 2019-12-12T14:22:14Z
dc.date.issued: 2019-12-12T14:22:14Z
dc.identifier.uri: http://hdl.handle.net/10222/76773
dc.description.abstract: Algorithms for constructing classification models in streaming data scenarios are attracting growing attention in the era of artificial intelligence and machine learning for data analysis. The huge volumes of streaming data necessitate a learning framework with timely and accurate processing. For a streaming classifier to be deployed in the real world, multiple challenges must be addressed, such as 1) concept drift; 2) imbalanced data; and 3) costly labelling processes. These challenges become more crucial when they occur in sensitive fields of operation such as network security. The objective of this thesis is to provide a team-based genetic programming (GP) framework to explore and address these challenges with regard to network-based services. The GP classifier incrementally introduces changes to the model throughout the course of the stream to adapt to the content of the stream. The framework is based on an active learning approach in which the model is built through interaction with a subset of the data. Thus, the design of the system is founded on the introduction of sampling and archiving policies that decouple the stream distribution from the training data subset. These policies work with no prior information on the distribution of classes or the true labels. Benchmarking is conducted with real-world network security datasets with label budgets on the order of 5 to 0.5 percent and significant class imbalance. The detection of minor classes is also evaluated, since it represents the classifier's behaviour in the case of attacks. Comparisons to current streaming algorithms and, specifically, state-of-the-art network frameworks for stream processing under label budgets demonstrate the effectiveness of the proposed GP framework in addressing the challenges related to streaming data. Furthermore, the applicability of the proposed framework to network and security analytics is demonstrated. (en_US)
dc.language.iso: en (en_US)
dc.subject: Botnet behaviour detection (en_US)
dc.subject: Machine learning (en_US)
dc.subject: Cybersecurity (en_US)
dc.subject: Genetic programming (en_US)
dc.title: Stream Genetic Programming for Botnet Detection (en_US)
dc.date.defence: 2019-11-22
dc.contributor.department: Faculty of Computer Science (en_US)
dc.contributor.degree: Doctor of Philosophy (en_US)
dc.contributor.external-examiner: Dr. Uyen Trang Nguyen (en_US)
dc.contributor.graduate-coordinator: Dr. Michael McAllister (en_US)
dc.contributor.thesis-reader: Dr. Sirini Sampalli (en_US)
dc.contributor.thesis-reader: Dr. Andrew McIntyre (en_US)
dc.contributor.thesis-supervisor: Dr. Malcolm Heywood (en_US)
dc.contributor.thesis-supervisor: Dr. Nur Zincir-Heywood (en_US)
dc.contributor.ethics-approval: Not Applicable (en_US)
dc.contributor.manuscripts: Not Applicable (en_US)
dc.contributor.copyright-release: Not Applicable (en_US)