dc.contributor.author: Kelly, Stephen
dc.date.accessioned: 2018-06-21T16:04:28Z
dc.date.available: 2018-06-21T16:04:28Z
dc.date.issued: 2018-06-21T16:04:28Z
dc.identifier.uri: http://hdl.handle.net/10222/73979
dc.description.abstract: Algorithms that learn through environmental interaction and delayed rewards, or reinforcement learning, increasingly face the challenge of scaling to dynamic, high-dimensional environments. Video games model these types of real-world decision-making and control scenarios while being simple enough to implement within experiments. This work demonstrates how emergent modularity and open-ended evolution allow genetic programming (GP) to discover strategies for difficult gaming scenarios while maintaining relatively low model complexity. Two related learning algorithms are considered: Policy Trees and Tangled Program Graphs (TPG). In the case of Policy Trees, a methodology for transfer learning is proposed which specifically leverages both structural and behavioural modularity in the learner representation. The utility of the approach is empirically evaluated in two challenging task domains: RoboCup Soccer and Ms. Pac-Man. In RoboCup, decision-making policies are first evolved for simple subtasks and then reused within a policy hierarchy in order to learn the more complex task of Half-Field Offense. The same methodology is applied to Ms. Pac-Man, in which case the use of task-agnostic diversity maintenance enables the automatic discovery of suitable sub-policies, removing the need for a prior human-specified task decomposition. In both task domains, the final GP decision-making policies reach state-of-the-art levels of play while being significantly less complex than solutions from temporal difference methods and neuroevolution. TPG takes a more open-ended approach to modularity, emphasizing the ability to adaptively complexify policies through interaction with the task environment. The challenging Atari video game environment is used to show that this approach builds decision-making policies that broadly match the quality of several deep learning methods while being several orders of magnitude less computationally demanding, both in terms of sample efficiency and model complexity. Finally, the approach is capable of evolving solutions to multiple game titles simultaneously with no additional computational cost. In this case, agent behaviours for an individual game as well as single agents capable of playing up to 5 games emerge from the same evolutionary run. [en_US]
dc.language.iso: en [en_US]
dc.subject: genetic programming [en_US]
dc.subject: emergent modularity [en_US]
dc.subject: cooperative coevolution [en_US]
dc.subject: reinforcement learning [en_US]
dc.subject: multi-task learning [en_US]
dc.subject: video games [en_US]
dc.title: Scaling Genetic Programming to Challenging Reinforcement Tasks through Emergent Modularity [en_US]
dc.date.defence: 2018-06-08
dc.contributor.department: Faculty of Computer Science [en_US]
dc.contributor.degree: Doctor of Philosophy [en_US]
dc.contributor.external-examiner: Dr. Wolfgang Banzhaf [en_US]
dc.contributor.graduate-coordinator: Dr. Norbert Zeh [en_US]
dc.contributor.thesis-reader: Dr. Thomas Trappenberg [en_US]
dc.contributor.thesis-reader: Dr. Dirk Arnold [en_US]
dc.contributor.thesis-supervisor: Dr. Malcolm Heywood [en_US]
dc.contributor.ethics-approval: Not Applicable [en_US]
dc.contributor.manuscripts: Not Applicable [en_US]
dc.contributor.copyright-release: Not Applicable [en_US]