
Comparative analysis and Evaluation of techniques for Generating High-Quality synthetic Datasets for Industrial Control Systems

dc.contributor.author: Swaminath Ganesh, Gautam
dc.contributor.copyright-release: Not Applicable [en_US]
dc.contributor.degree: Master of Computer Science [en_US]
dc.contributor.department: Faculty of Computer Science [en_US]
dc.contributor.ethics-approval: Not Applicable [en_US]
dc.contributor.external-examiner: n/a [en_US]
dc.contributor.graduate-coordinator: Michael McAllister [en_US]
dc.contributor.manuscripts: Not Applicable [en_US]
dc.contributor.thesis-reader: Yujie Tang [en_US]
dc.contributor.thesis-reader: Jaume Manero [en_US]
dc.contributor.thesis-supervisor: Srinivas Sampalli [en_US]
dc.date.accessioned: 2023-08-14T17:44:15Z
dc.date.available: 2023-08-14T17:44:15Z
dc.date.defence: 2023-08-08
dc.date.issued: 2023-08-10
dc.description.abstract: Industrial Control Systems (ICSs) and SCADA networks are vital for managing complex industrial infrastructures, ensuring smooth operations across applications. Rising cyber threats prompt the exploration of machine learning and deep learning techniques, utilizing neural networks to detect and predict attacks. However, limited training data and biased outcomes undermine these models' accuracy. Privacy concerns add complexity. Synthetic data generation emerges as a research focus. The goal is to replicate real data's statistical features for augmentation, privacy, and model development. Balancing realism and confidentiality is crucial. Evaluating synthetic data is challenging. Existing methods cater to specific applications, demanding an unbiased, diverse, standardized evaluation. This thesis performs a comprehensive comparative analysis of synthetic data generation for ICS datasets. It proposes an evaluation framework using visualization and statistics. Three models (GANs, VAEs, and GMMs) are compared, assessing Fidelity, Privacy, Diversity, Interpretability, and Utility. The aim is to guide researchers and practitioners in method selection for ICS applications, promoting diverse, unbiased datasets. The analysis highlights the strengths, limitations, and trade-offs of synthetic data techniques for ICS datasets. Findings aid optimal high-quality synthetic data generation, enabling privacy-preserving research. Diverse synthetic datasets facilitate experimentation and validation, bolstering ICS robustness. This research advances ICS understanding, fostering secure and efficient development. [en_US]
dc.identifier.uri: http://hdl.handle.net/10222/82778
dc.language.iso: en [en_US]
dc.subject: Synthetic data [en_US]
dc.subject: Generative AI [en_US]
dc.subject: GAN model [en_US]
dc.subject: GMM model [en_US]
dc.subject: VAE Model [en_US]
dc.subject: ICS [en_US]
dc.subject: SCADA [en_US]
dc.subject: Comparative Analysis [en_US]
dc.title: Comparative analysis and Evaluation of techniques for Generating High-Quality synthetic Datasets for Industrial Control Systems [en_US]

Files

Original bundle

Name: GautamSwaminathGanesh2023.pdf
Size: 1.03 MB
Format: Adobe Portable Document Format
Description:

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission
Description: