
Providing Real-Valued Actions for Tangled Program Graphs Under the CartPole Benchmark


Authors

Wright, Matthew


Abstract

The Tangled Program Graph (TPG) framework is a genetic programming approach to reinforcement learning. Canonical TPG is limited to performing discrete actions. This thesis investigates mechanisms by which TPG might perform real-valued actions and proposes two approaches. In the first, a decision-making network extracts state from TPG's internal structure, and a gradient-based learning method tailors the network to this representation. In the second, TPG is modified to generate a state representation in an external matrix visible to the decision-making network. No additional learning algorithm is used to configure the decision-making network; instead, TPG adapts to use the default configuration. Both approaches are applied to a modified version of the classic CartPole environment that accepts real-valued actions, which enables a comparison between the discrete-action configurations of the task and the real-valued formulation. Results indicate that TPG solutions are no more complex under real-valued actions than under discrete actions.
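To make the distinction between discrete and real-valued actions concrete, the following is a minimal sketch of one way a CartPole step might accept either kind of action. It is not the thesis's implementation. The physical constants and the Euler update follow the commonly used CartPole formulation, and the names `discrete_force`, `continuous_force`, and `step` are hypothetical. The mapping of a real-valued action in [-1, 1] to a force is also an assumption, since the abstract does not say how the environment was modified.

```python
import math

# Assumed constants from the commonly used CartPole formulation,
# not values taken from the thesis.
GRAVITY = 9.8
MASS_CART = 1.0
MASS_POLE = 0.1
TOTAL_MASS = MASS_CART + MASS_POLE
HALF_POLE_LENGTH = 0.5
POLE_MASS_LENGTH = MASS_POLE * HALF_POLE_LENGTH
FORCE_MAG = 10.0   # maximum force magnitude in newtons
TAU = 0.02         # seconds between state updates


def discrete_force(action):
    """Discrete formulation: action 1 pushes right, anything else pushes left."""
    return FORCE_MAG if action == 1 else -FORCE_MAG


def continuous_force(action):
    """Real-valued formulation (assumed): clip to [-1, 1], then scale."""
    return FORCE_MAG * max(-1.0, min(1.0, action))


def step(state, force):
    """Advance (x, x_dot, theta, theta_dot) by one Euler step under `force`."""
    x, x_dot, theta, theta_dot = state
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    temp = (force + POLE_MASS_LENGTH * theta_dot ** 2 * sin_t) / TOTAL_MASS
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        HALF_POLE_LENGTH * (4.0 / 3.0 - MASS_POLE * cos_t ** 2 / TOTAL_MASS)
    )
    x_acc = temp - POLE_MASS_LENGTH * theta_acc * cos_t / TOTAL_MASS
    return (
        x + TAU * x_dot,
        x_dot + TAU * x_acc,
        theta + TAU * theta_dot,
        theta_dot + TAU * theta_acc,
    )
```

Under these assumptions the dynamics are identical in both formulations and only the action-to-force mapping differs: `step(s, discrete_force(1))` and `step(s, continuous_force(1.0))` give the same next state, while `continuous_force(0.25)` applies a quarter of the maximum force.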

Description

Extending TPG to perform real-valued actions.

Keywords

genetic programming, machine learning, reinforcement learning

Citation