
Evolving Linear Controllers from YOLO State Capture

dc.contributor.author: Hu, Zhengping
dc.contributor.copyright-release: Not Applicable
dc.contributor.degree: Master of Computer Science
dc.contributor.department: Faculty of Computer Science
dc.contributor.ethics-approval: Not Applicable
dc.contributor.external-examiner: Dr. Andrew McIntyre
dc.contributor.manuscripts: Not Applicable
dc.contributor.thesis-reader: Dr. Yannick Marchand
dc.contributor.thesis-supervisor: Dr. Malcolm Heywood
dc.date.accessioned: 2026-04-23T18:21:04Z
dc.date.available: 2026-04-23T18:21:04Z
dc.date.defence: 2026-04-21
dc.date.issued: 2026-04-23
dc.description: Submitted in partial fulfillment of the requirements for the degree of Master of Computer Science.
dc.description.abstract: End-to-end deep reinforcement learning (DRL) has become a prominent paradigm for visual control, with wide application in robotics and autonomous systems. However, its monolithic architecture often presents challenges regarding interpretability, computational overhead, and deployment on resource-constrained edge devices. This thesis investigates a decoupled perception-decision framework that combines real-time object detection (YOLOv11n), Heuristic State Rectification (HSR), and Genetic Algorithms (GA). Unlike standard DRL methods that rely on large convolutional networks to map raw pixels directly to actions, our methodology compresses high-dimensional visual inputs into physically semantic, low-dimensional states, empowering minimalist controllers (e.g., linear models with fewer than 50 parameters) to handle complex non-linear dynamics. To address the inherent heteroscedasticity of visual noise in camera-based perception, we introduce a Dynamic Seed Resampling mechanism. Acting as an adaptive regularization strategy, it prevents the agent from overfitting to specific environmental initializations, thereby enhancing the robustness and generalization of the trained agents. Extensive evaluations against a Proximal Policy Optimization (PPO) baseline across five classic control benchmarks (CartPole, Acrobot, MountainCar, Pendulum, and LunarLander) demonstrate the efficacy of our approach. While the PPO baseline exhibited sensitivity to initialization and encountered bottlenecks in sparse-reward environments, our decoupled framework achieved highly consistent success rates on discrete-action tasks with significantly reduced computational cost and thermal output. Furthermore, through an analysis of spatial drift, we empirically quantify the "Visual Noise Barrier." The results show that while the Vision-HSR-GA framework excels in dynamic macroscopic control, achieving absolute zero-velocity precision is fundamentally bottlenecked by the signal-to-noise ratio (SNR) of the front-end perception module. Ultimately, this research validates the proposed framework as a robust, interpretable, and hardware-friendly alternative for visual control tasks.
dc.identifier.uri: https://hdl.handle.net/10222/86041
dc.language.iso: en
dc.subject: YOLO
dc.subject: Genetic Algorithm
dc.subject: Reinforcement Learning
dc.title: Evolving Linear Controllers from YOLO State Capture
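
The abstract describes collapsing each camera frame into a physically semantic, low-dimensional state via YOLOv11n detection plus Heuristic State Rectification (HSR). The following is a minimal sketch of what such a perception stage could look like using the ultralytics package; the model file, the most-confident-box heuristic, the finite-difference velocity estimate, and the clamp standing in for HSR are all illustrative assumptions, not details taken from the thesis.

```python
# Hypothetical sketch of the decoupled perception stage: a YOLO11-nano
# detector reduces a rendered frame to a low-dimensional physical state.
import numpy as np
from ultralytics import YOLO

model = YOLO("yolo11n.pt")  # pretrained nano detector (downloads on first use)

def capture_state(frame, prev_x, dt):
    """Map an RGB frame to (position, velocity); None if detection fails."""
    boxes = model(frame, verbose=False)[0].boxes
    if len(boxes) == 0:
        return None
    # Take the most confident detection's box center as the object position.
    best = int(boxes.conf.argmax())
    x = float(boxes.xywh[best, 0])
    vx = (x - prev_x) / dt if prev_x is not None else 0.0
    # Toy stand-in for HSR: clamp a physically implausible velocity jump
    # caused by a noisy or spurious detection.
    vx = float(np.clip(vx, -500.0, 500.0))
    return x, vx
```

Because the controller only ever sees this compressed state, the vision model can be swapped or retrained without touching the decision side, which is the point of the decoupling.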

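On the decision side, the abstract pairs a tiny linear controller with a Genetic Algorithm, where Dynamic Seed Resampling draws fresh episode initializations every generation so that no individual can overfit a fixed set of starting conditions. Below is a minimal, hypothetical sketch of that loop on Gymnasium's CartPole-v1; it uses the simulator's four-dimensional state in place of the vision stage, and the population size, mutation scale, and elitist truncation selection are assumptions rather than the thesis's settings.

```python
# Minimal sketch: evolving an 8-parameter linear controller with a GA,
# resampling episode seeds each generation (Dynamic Seed Resampling).
import numpy as np
import gymnasium as gym

def act(weights, obs):
    # Linear controller: score each action as a weighted sum of state features.
    logits = weights.reshape(2, -1) @ obs  # CartPole: 2 actions x 4 features
    return int(np.argmax(logits))

def fitness(weights, seeds):
    env = gym.make("CartPole-v1")
    total = 0.0
    for seed in seeds:
        obs, _ = env.reset(seed=int(seed))
        done = False
        while not done:
            obs, reward, terminated, truncated, _ = env.step(act(weights, obs))
            total += reward
            done = terminated or truncated
    env.close()
    return total / len(seeds)

rng = np.random.default_rng(0)
pop = rng.normal(size=(32, 8))  # 32 candidate linear controllers
for gen in range(20):
    # Dynamic Seed Resampling: fresh episode seeds every generation, so
    # fitness reflects behavior across initializations, not one lucky start.
    seeds = rng.integers(0, 2**31 - 1, size=3)
    scores = np.array([fitness(w, seeds) for w in pop])
    elite = pop[np.argsort(scores)[-8:]]  # keep the top quarter
    # Refill the population with mutated copies of the elites.
    children = elite[rng.integers(0, 8, size=24)] + 0.1 * rng.normal(size=(24, 8))
    pop = np.vstack([elite, children])
    print(f"gen {gen:02d}  best mean return {scores.max():.1f}")
```

Resampling the seeds acts as the regularizer the abstract describes: a controller that merely memorizes one initialization scores poorly on the next generation's draw, so selection pressure favors genuinely robust policies.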
Files

Original bundle

Name:
ZhengpingHu2026.pdf
Size:
6.45 MB
Format:
Adobe Portable Document Format

License bundle

Name:
license.txt
Size:
2.12 KB
Description:
Item-specific license agreed upon at submission