COUGAR: A System for Clustering Unknown Malware Using Genetic Algorithm Routines
dc.contributor.author | Wilkins, Zachary | |
dc.contributor.copyright-release | Not Applicable | en_US |
dc.contributor.degree | Master of Computer Science | en_US |
dc.contributor.department | Faculty of Computer Science | en_US |
dc.contributor.ethics-approval | Not Applicable | en_US |
dc.contributor.external-examiner | n/a | en_US |
dc.contributor.graduate-coordinator | Michael McAllister | en_US |
dc.contributor.manuscripts | Not Applicable | en_US |
dc.contributor.thesis-reader | Malcolm I. Heywood | en_US |
dc.contributor.thesis-reader | Tami Meredith | en_US |
dc.contributor.thesis-supervisor | Nur Zincir-Heywood | en_US |
dc.contributor.thesis-supervisor | Frédéric Massicotte | en_US |
dc.date.accessioned | 2020-12-08T17:28:20Z | |
dc.date.available | 2020-12-08T17:28:20Z | |
dc.date.defence | 2020-11-12 | |
dc.date.issued | 2020-12-08T17:28:20Z | |
dc.description.abstract | Malicious software is a persistent threat across our digital platforms. With unending malware growth and increasingly high-profile attacks, organizations across the world are ramping up their cyber defence capabilities. Cluster analysis is one tool for understanding the threats faced. By organizing seemingly disconnected samples according to their behaviours, attack patterns can be discerned and defended against. But given the volume of malware, an automated approach is necessary to scale. In this thesis, I design and implement a system called COUGAR, which uses a multi-objective genetic algorithm to automatically optimize clustering algorithms. The clustering algorithms are applied to low-dimensional embeddings derived from high-dimensional malware behavioural data. The system employs function imports extracted from malicious binaries, but is flexible enough to accommodate many other features derived from static or dynamic malware analysis. After the optimization process completes, the system generates signatures for each cluster that prioritize usability and comprehensible signature components. The experiments indicate that any of the chosen clustering algorithms can produce at least satisfactory results, with density-based approaches generating especially successful clusters, achieving an F-Score of 0.79 and V-Measure of 0.88. The resulting signatures are very representative of their respective clusters, with the vast majority achieving representation scores of at least 90%. | en_US |
dc.identifier.uri | http://hdl.handle.net/10222/80075 | |
dc.language.iso | en | en_US |
dc.subject | Cyber security | en_US |
dc.subject | Machine learning | en_US |
dc.subject | Malware | en_US |
dc.subject | Clustering | en_US |
dc.subject | Cyber attack | en_US |
dc.subject | Evolution | en_US |
dc.title | COUGAR: A System for Clustering Unknown Malware Using Genetic Algorithm Routines | en_US |
Files
Original bundle
- Name: Wilkins-Zachary-MCSc-CS-December-2020.pdf
- Size: 11.94 MB
- Format: Adobe Portable Document Format
- Description: Thesis as PDF/A-1b
License bundle
- Name: license.txt
- Size: 1.71 KB
- Format: Item-specific license agreed upon to submission
- Description: