Evolving Linear Controllers from YOLO State Capture
Abstract
End-to-end deep reinforcement learning (DRL) has become a prominent paradigm for visual control, with wide application in robotics and autonomous systems. However, its monolithic architecture often presents challenges regarding interpretability, computational overhead, and deployment on resource-constrained edge devices. This thesis investigates a decoupled perception-decision framework that combines real-time object detection (YOLOv11n), Heuristic State Rectification (HSR), and Genetic Algorithms (GA). Unlike standard DRL methods that rely on large convolutional networks to map raw pixels directly to actions, our methodology compresses high-dimensional visual inputs into physically semantic, low-dimensional states, empowering minimalist controllers (e.g., linear models with fewer than 50 parameters) to handle complex non-linear dynamics.
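The core idea — a minimalist linear controller acting on a physically semantic, low-dimensional state — can be illustrated with a short sketch. This is not the thesis implementation; the feature layout, weights, and action coding are illustrative assumptions (a CartPole-like four-feature state, as might be derived from YOLO bounding-box geometry).

```python
# Illustrative sketch (not the thesis code): a linear controller with a
# handful of parameters mapping a low-dimensional state to a discrete action.
# Feature layout and weight values here are assumptions for demonstration.

def linear_policy(weights, state):
    """Choose a discrete action from the sign of a weighted sum of features."""
    score = sum(w * s for w, s in zip(weights, state))
    return 1 if score > 0 else 0  # e.g., push-right vs. push-left

# A CartPole-like state: [position, velocity, angle, angular_velocity].
# In the proposed framework, a Genetic Algorithm evolves these weights.
weights = [0.1, 0.5, 1.0, 0.8]
state = [0.0, 0.2, 0.05, -0.1]
action = linear_policy(weights, state)
```

With only as many parameters as state features, such a policy is trivially interpretable (each weight is the controller's sensitivity to one physical quantity) and cheap enough for edge deployment.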
To address the inherent heteroscedasticity of visual noise in camera-based perception, we introduce a Dynamic Seed Resampling mechanism. Acting as an adaptive regularization strategy, it prevents the agent from overfitting to specific environmental initializations, thereby enhancing the robustness and generalization of the trained agents.
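Dynamic Seed Resampling can be sketched as redrawing the pool of environment seeds used for fitness evaluation at every generation, so no controller can specialize to a fixed set of initial conditions. The pool size, seed range, and fitness aggregation below are assumptions for illustration, not the thesis's exact configuration.

```python
import random

# Illustrative sketch of Dynamic Seed Resampling (details are assumptions):
# each generation, the environment seeds used for fitness evaluation are
# redrawn, acting as a regularizer against overfitting to initializations.

def resample_seeds(generation, pool_size=5):
    """Draw a fresh, reproducible pool of environment seeds per generation."""
    rng = random.Random(generation)
    return [rng.randrange(1_000_000) for _ in range(pool_size)]

def generation_fitness(population, generation, rollout):
    """Mean episode return of each controller over this generation's seeds.

    `rollout(controller, seed)` is a placeholder for running one episode
    of the environment initialized with `seed` under `controller`.
    """
    seeds = resample_seeds(generation)  # redrawn every generation
    return [sum(rollout(c, s) for s in seeds) / len(seeds)
            for c in population]
```

Because the seed pool changes each generation, a controller must perform well across varying initial states to survive selection, which is the regularizing effect described above.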
Extensive evaluations against a Proximal Policy Optimization (PPO) baseline across five classic control benchmarks (CartPole, Acrobot, MountainCar, Pendulum, and LunarLander) demonstrate the efficacy of our approach. While the PPO baseline exhibited sensitivity to initialization and encountered bottlenecks in sparse-reward environments, our decoupled framework achieved highly consistent success rates on discrete-action tasks with significantly reduced computational cost and thermal output. Furthermore, through an analysis of spatial drift, we empirically quantify the "Visual Noise Barrier." The results elucidate that while the Vision-HSR-GA framework excels in dynamic macroscopic control, achieving absolute zero-velocity precision is fundamentally bottlenecked by the Signal-to-Noise Ratio (SNR) of the frontend perception module. Ultimately, this research validates the proposed framework as a robust, interpretable, and hardware-friendly alternative for visual control tasks.
Description
Submitted in partial fulfillment of the requirements for the degree of Master of Computer Science.
Keywords
YOLO, Genetic Algorithm, Reinforcement Learning
