EVALUATING SIMPLE REACTIVE AGENTS IN VISUAL REINFORCEMENT LEARNING TASKS

Bayer, Caleidgh Grace

dc.contributor.author	Bayer, Caleidgh Grace
dc.contributor.copyright-release	Not Applicable	en_US
dc.contributor.degree	Master of Computer Science	en_US
dc.contributor.department	Faculty of Computer Science	en_US
dc.contributor.ethics-approval	Not Applicable	en_US
dc.contributor.external-examiner	n/a	en_US
dc.contributor.graduate-coordinator	Dr. Michael McAllister	en_US
dc.contributor.manuscripts	Not Applicable	en_US
dc.contributor.thesis-reader	Dr. Xiao Luo	en_US
dc.contributor.thesis-reader	Dr. Garnett Wilson	en_US
dc.contributor.thesis-supervisor	Dr. Malcolm Heywood	en_US
dc.date.accessioned	2023-08-25T14:33:34Z
dc.date.available	2023-08-25T14:33:34Z
dc.date.defence	2023-08-21
dc.date.issued	2023-08-25
dc.description.abstract	Visual formulations of reinforcement learning tasks are potentially challenging because (1) the state space is large and composed of pixels (so is unlikely to be directly correlated with actions), (2) the underlying task might be partially observable despite the high dimensionality, and (3) rewards can be sparse, so they do not necessarily discriminate between useful and unhelpful decisions. In this thesis we compare the classic deep Q-network (DQN), a temporal difference reinforcement learning approach, with tangled program graphs (TPG), a genetic programming approach, on fully and partially observable visual reinforcement learning tasks from ViZDoom. We demonstrate that TPG is particularly effective at imposing structure on the partially observable task (resulting in a general policy for navigating a labyrinth), but is relatively poor at solving a fully observable (aiming) task. Conversely, DQN is very effective when presented with the complete-information aiming task, but is unable to discover general solutions to the partially observable navigation task. We attribute these preferences to the different approaches TPG and DQN take to representation/feature construction versus credit assignment.	en_US
dc.identifier.uri	http://hdl.handle.net/10222/82837
dc.language.iso	en	en_US
dc.subject	reinforcement learning	en_US
dc.subject	genetic programming	en_US
dc.title	EVALUATING SIMPLE REACTIVE AGENTS IN VISUAL REINFORCEMENT LEARNING TASKS	en_US

Files

Original bundle

Name: CaleidghGraceBayer2023.pdf
Size: 145.86 MB
Format: Adobe Portable Document Format
Description: Main Article

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission
Description:

Collections

Faculty of Graduate Studies Online Theses