Research Questions

The main questions I wish to explore are why non-deterministic elements (i.e., randomness) are not an inherent property of all Environments, and whether non-deterministic elements are beneficial in Reinforcement Learning.

Introducing non-deterministic elements into Reinforcement Learning is important because it is almost impossible to predict every outcome or the exact placement of objects in a real-time scenario, and forcing an Agent into a scenario where everything goes perfectly every time prevents the Agent from properly learning how to deal with variation. For example, if you were teaching an Agent how to make a turn on a road, your Environment would not be able to account for every real-life situation that could arise, such as animals being in the way, other cars, or people trying to cross the road at the same time.

For this project, I will be using Model-Free Agents, meaning they do not rely on a pre-built model of the Environment's dynamics and instead learn their Environments from scratch through interaction. Environments are the play areas that Agents interact with in order to gain rewards. Rewards are given to the Agent when it picks an action that moves it closer to the end goal, and taken away when it does not. For example, in the hill climbing Environment, the Agent gains reward the higher it climbs up the hill and loses reward for each second that passes, incentivizing it to reach the top of the hill as quickly as possible.
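As a rough illustration of that reward structure, here is a minimal sketch of a reward function for a hill climbing Environment. This is not the reward used by any particular pre-made Environment; the function name, goal threshold, and penalty values are my own illustrative assumptions.

def hill_climb_reward(position: float, goal_position: float = 0.5) -> float:
    """Hypothetical reward for a hill climbing Environment.

    The Agent gains reward in proportion to how high it has climbed,
    pays a small penalty every timestep (so finishing faster is better),
    and gets a bonus for reaching the goal. All constants are assumptions.
    """
    time_penalty = -1.0                                   # lose reward for each step that passes
    height_bonus = position                               # gain more reward the higher the Agent climbs
    goal_bonus = 100.0 if position >= goal_position else 0.0
    return time_penalty + height_bonus + goal_bonus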

In short, an Agent picks an Action, which is passed to the Environment. The Environment processes this information and returns a reward and a new state to the Agent. This feedback loop is what allows the Agent to learn as it interacts with the Environment. By introducing non-deterministic elements, I hope to gain a better understanding of why they aren't the usual standard in most Environments, and whether non-deterministic elements should even be considered for Reinforcement Learning.
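That interaction loop can be sketched in a few lines of Python, assuming a Gymnasium-style Environment. The specific Environment name (MountainCar, which I take to be the hill climbing Environment mentioned above) and the random Action are placeholders for whatever Agent and Environment are actually used.

import gymnasium as gym

# Hypothetical example: any Gymnasium-style Environment would work here.
env = gym.make("MountainCar-v0")

observation, info = env.reset(seed=42)
for _ in range(1000):
    # A real Agent would pick an Action from its learned policy;
    # a random Action stands in for that here.
    action = env.action_space.sample()

    # The Environment processes the Action and returns a new state
    # (observation) and a reward, which the Agent learns from.
    observation, reward, terminated, truncated, info = env.step(action)

    if terminated or truncated:
        observation, info = env.reset()

env.close()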

By the end of this project, I hope to have introduced non-deterministic elements into a Custom Environment, and to have injected non-deterministic elements into a separate pre-made Environment, so I can see how a Model that I had trained previously fares in a slightly different Environment.
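One way injecting non-deterministic elements into a pre-made Environment could look is a wrapper that occasionally perturbs the Agent's Actions, again assuming a Gymnasium-style Environment. The wrapper name, the noise probability, and the chosen Environment are illustrative assumptions, not the actual design this project will use.

import gymnasium as gym
import numpy as np


class RandomActionWrapper(gym.ActionWrapper):
    """Hypothetical wrapper: with probability `noise_prob`, replace the
    Agent's chosen Action with a random one, adding non-determinism to
    an otherwise deterministic pre-made Environment."""

    def __init__(self, env: gym.Env, noise_prob: float = 0.1):
        super().__init__(env)
        self.noise_prob = noise_prob
        self.rng = np.random.default_rng()

    def action(self, action):
        # Occasionally ignore the Agent's Action and act randomly instead.
        if self.rng.random() < self.noise_prob:
            return self.env.action_space.sample()
        return action


# Usage: wrap a pre-made Environment, then evaluate a previously trained Model on it.
env = RandomActionWrapper(gym.make("MountainCar-v0"), noise_prob=0.1)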