This repo contains the code for my master's thesis, "Synthesis of reward functions from natural language descriptions".
The core workflow it implements is:
- The user describes the task in English.
- A language model translates the user's description into a reward machine, a finite-state machine that specifies the reward function.
- QRM (Q-learning for Reward Machines) uses the reward machine from the previous step to train an agent on the task described in the first step.
- The trained agent's behavior is shown to the user visually over a small number of episodes.
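
To make the middle two steps of the workflow concrete, here is a minimal sketch of a reward machine and a tabular QRM-style update. It is an illustration under stated assumptions, not the code in this repo: the `RewardMachine` class, the `qrm_update` function, the event labels (`"coffee"`, `"office"`), and the dictionary-based Q-table are all hypothetical names chosen for the example, and the language-model translation step is not shown.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RewardMachine:
    """Hypothetical reward machine: a finite-state machine whose transitions emit rewards."""
    states: tuple
    initial: str
    terminal: frozenset
    delta: dict  # (rm_state, event) -> (next_rm_state, reward)

    def step(self, u, event):
        # Assumption: events without a listed transition keep the state and give zero reward.
        return self.delta.get((u, event), (u, 0.0))


def qrm_update(q, rm, s, a, s_next, event, actions, alpha=0.1, gamma=0.9):
    """One tabular QRM-style update: every non-terminal RM state learns
    from the same environment experience (s, a, s_next, event)."""
    for u in rm.states:
        if u in rm.terminal:
            continue
        u_next, r = rm.step(u, event)
        if u_next in rm.terminal:
            target = r  # no bootstrapping past the end of the task
        else:
            target = r + gamma * max(q[(u_next, s_next, b)] for b in actions)
        q[(u, s, a)] += alpha * (target - q[(u, s, a)])


# Illustrative task: "get coffee, then bring it to the office".
rm = RewardMachine(
    states=("u0", "u1", "done"),
    initial="u0",
    terminal=frozenset({"done"}),
    delta={
        ("u0", "coffee"): ("u1", 0.0),
        ("u1", "office"): ("done", 1.0),
    },
)

q = defaultdict(float)  # keyed by (rm_state, env_state, action)
actions = ("left", "right")

# The agent moves right from environment state 0 to 1 and observes "office".
qrm_update(q, rm, s=0, a="right", s_next=1, event="office", actions=actions)
```

After this single update, only the Q-value for RM state `u1` changes, because `"office"` is rewarded only once the coffee has been picked up. Sharing each experience across all reward machine states is what distinguishes this style of update from running ordinary Q-learning on the current RM state alone.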