QTRB: TEAM-BASED REGION BUILDING USING Q-LEARNING TO DERIVE POLICY ON PROGRAMS PARAMETERIZED BY LOCAL REWARD SIGNAL

Sealy, Noah

dc.contributor.author	Sealy, Noah
dc.contributor.copyright-release	Not Applicable	en_US
dc.contributor.degree	Master of Computer Science	en_US
dc.contributor.department	Faculty of Computer Science	en_US
dc.contributor.ethics-approval	Not Applicable	en_US
dc.contributor.external-examiner	n/a	en_US
dc.contributor.graduate-coordinator	Dr. Michael McAllister	en_US
dc.contributor.manuscripts	Not Applicable	en_US
dc.contributor.thesis-reader	Dr. Vlado Keselj	en_US
dc.contributor.thesis-reader	Dr. Garnett Wilson	en_US
dc.contributor.thesis-reader	Dr. Dirk Arnold	en_US
dc.contributor.thesis-supervisor	Dr. Malcolm Heywood	en_US
dc.date.accessioned	2023-04-26T14:35:24Z
dc.date.available	2023-04-26T14:35:24Z
dc.date.defence	2023-04-11
dc.date.issued	2023-04-24
dc.description.abstract	While attempting to solve 2-dimensional grid world maze tasks, it was observed that genetic programming is limited by its random initialization and its lack of use of local reward. This thesis proposes a hybrid algorithm called QTRB (team-based region building with Q-learning), which attempts to integrate genetic programming and reinforcement learning so that local reward is used during evolution. During evolution, QTRB constructs programs based directly on local environmental reward; these programs are then passed to a reinforcement learning agent, which learns on them as a model. QTRB was tested on 2-dimensional maze tasks of various sizes, under the hypothesis that policy can be derived from an agent learning from this model. The results suggest that QTRB can derive policy on the given tasks, with fewer direct environment queries than traditional Q-learning as the task size scales.	en_US
dc.identifier.uri	http://hdl.handle.net/10222/82533
dc.language.iso	en	en_US
dc.subject	genetic programming	en_US
dc.subject	reinforcement learning	en_US
dc.subject	local reinforcement	en_US
dc.subject	hybrid algorithms	en_US
dc.subject	qtrb	en_US
dc.title	QTRB: TEAM-BASED REGION BUILDING USING Q-LEARNING TO DERIVE POLICY ON PROGRAMS PARAMETERIZED BY LOCAL REWARD SIGNAL	en_US
dc.type	Thesis	en_US

Files

Original bundle

Name: NoahSealy2023.pdf
Size: 14.82 MB
Format: Adobe Portable Document Format
Description: Main thesis

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission
Description:

Collections

Faculty of Graduate Studies Online Theses