dc.contributor.author: Kelly, Stephen
dc.date.accessioned: 2018-06-21T16:04:28Z
dc.date.available: 2018-06-21T16:04:28Z
dc.date.issued: 2018-06-21T16:04:28Z
dc.identifier.uri: http://hdl.handle.net/10222/73979
dc.description.abstract: Algorithms that learn through environmental interaction and delayed rewards, or reinforcement learning, increasingly face the challenge of scaling to dynamic, high-dimensional environments. Video games model these types of real-world decision-making and control scenarios while being simple enough to implement within experiments. This work demonstrates how emergent modularity and open-ended evolution allow genetic programming (GP) to discover strategies for difficult gaming scenarios while maintaining relatively low model complexity. Two related learning algorithms are considered: Policy Trees and Tangled Program Graphs (TPG). In the case of Policy Trees, a methodology for transfer learning is proposed which specifically leverages both structural and behavioural modularity in the learner representation. The utility of the approach is empirically evaluated in two challenging task domains: RoboCup Soccer and Ms. Pac-Man. In RoboCup, decision-making policies are first evolved for simple subtasks and then reused within a policy hierarchy in order to learn the more complex task of Half-Field Offense. The same methodology is applied to Ms. Pac-Man, in which case the use of task-agnostic diversity maintenance enables the automatic discovery of suitable sub-policies, removing the need for a prior human-specified task decomposition. In both task domains, the final GP decision-making policies reach state-of-the-art levels of play while being significantly less complex than solutions from temporal difference methods and neuroevolution. TPG takes a more open-ended approach to modularity, emphasizing the ability to adaptively complexify policies through interaction with the task environment. The challenging Atari video game environment is used to show that this approach builds decision-making policies that broadly match the quality of several deep learning methods while being several orders of magnitude less computationally demanding, both in terms of sample efficiency and model complexity. Finally, the approach is capable of evolving solutions to multiple game titles simultaneously with no additional computational cost. In this case, agent behaviours for an individual game as well as single agents capable of playing up to 5 games emerge from the same evolutionary run. [en_US]
dc.language.iso: en [en_US]
dc.subject: genetic programming [en_US]
dc.subject: emergent modularity [en_US]
dc.subject: cooperative coevolution [en_US]
dc.subject: reinforcement learning [en_US]
dc.subject: multi-task learning [en_US]
dc.subject: video games [en_US]
dc.title: Scaling Genetic Programming to Challenging Reinforcement Tasks through Emergent Modularity [en_US]
dc.date.defence: 2018-06-08
dc.contributor.department: Faculty of Computer Science [en_US]
dc.contributor.degree: Doctor of Philosophy [en_US]
dc.contributor.external-examiner: Dr. Wolfgang Banzhaf [en_US]
dc.contributor.graduate-coordinator: Dr. Norbert Zeh [en_US]
dc.contributor.thesis-reader: Dr. Thomas Trappenberg [en_US]
dc.contributor.thesis-reader: Dr. Dirk Arnold [en_US]
dc.contributor.thesis-supervisor: Dr. Malcolm Heywood [en_US]
dc.contributor.ethics-approval: Not Applicable [en_US]
dc.contributor.manuscripts: Not Applicable [en_US]
dc.contributor.copyright-release: Not Applicable [en_US]